Image Text Extraction Using OpenAI GPT-4

This Python script extracts text from images using OpenAI's GPT-4 API. The script supports both image URLs and base64-encoded images as inputs. Extracted text can be saved to a local file for further use.

Features

Text Extraction from Image URLs: Extract text directly from an image available online.
Text Extraction from Base64 Images: Extract text from locally stored images by converting them into base64 encoding.
Accurate Text Parsing: Ensures completeness by carefully extracting all text elements, including faint or small text, and preserving the original structure.
Output to File: Saves extracted text to a specified file.

Requirements

Python 3.7+
OpenAI Python SDK
Base64 library (standard library)

Installation

Clone this repository or download the script.
Install the required dependencies:
```
pip install openai
```
Ensure you have an OpenAI API key.

Usage

Extracting Text from an Image URL

Uncomment the relevant section in the script:

# image_url = "https://example.com/image.jpg"
# extracted_text = image_to_text_from_url(image_url)
# output_file_path = "path/to/extracted_text.txt"
# with open(output_file_path, "a", encoding="utf-8") as text_file:
#     text_file.write(extracted_text)
#     text_file.write("\n")

Replace the image_url with the URL of your image.
Run the script and view the extracted text in the specified file.

Extracting Text from a Local Image

Provide the path to your local image:
```
local_image_path = "path/to/image.png"
```
The script will automatically convert the image to base64 encoding and extract text.
View the extracted text in the console or in the specified output file.

Code Structure

Functions

image_to_base64(image_path)
- Converts a local image to base64 encoding.
image_to_text_from_url(image_url)
- Extracts text from an image using its URL.
image_to_text_from_base64(image_base64)
- Extracts text from an image provided in base64 format.

Example Workflow

Input: Image URL or local image file.
Process:
- For URLs: Direct API call.
- For local files: Convert to base64, then API call.
Output: Extracted text saved to a file.

Configuration

Update your OpenAI API key in the client initialization:

client = OpenAI(
    api_key="your-openai-api-key"
)

Specify the file path for saving the extracted text.

Limitations

The script requires an active OpenAI API key with appropriate permissions.
Limited by the capabilities of the GPT-4 model for OCR tasks.

Contributing

Feel free to fork this repository and submit pull requests for new features or improvements.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Image Text Extraction Using OpenAI GPT-4

Features

Requirements

Installation

Usage

Extracting Text from an Image URL

Extracting Text from a Local Image

Code Structure

Functions

Example Workflow

Configuration

Limitations

Contributing

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

Image Text Extraction Using OpenAI GPT-4

Features

Requirements

Installation

Usage

Extracting Text from an Image URL

Extracting Text from a Local Image

Code Structure

Functions

Example Workflow

Configuration

Limitations

Contributing

License