Image Text Extraction Using OpenAI GPT-4

This Python script extracts text from images using OpenAI's GPT-4 API. The script supports both image URLs and base64-encoded images as inputs. Extracted text can be saved to a local file for further use.

Features

Text Extraction from Image URLs: Extract text directly from an image available online.
Text Extraction from Base64 Images: Extract text from locally stored images by converting them into base64 encoding.
Accurate Text Parsing: Ensures completeness by carefully extracting all text elements, including faint or small text, and preserving the original structure.
Output to File: Saves extracted text to a specified file.

Requirements

Python 3.7+
OpenAI Python SDK
Base64 library (standard library)

Installation

Clone this repository or download the script.
Install the required dependencies:
```
pip install openai
```
Ensure you have an OpenAI API key.

Usage

Extracting Text from an Image URL

Uncomment the relevant section in the script:

# image_url = "https://example.com/image.jpg"
# extracted_text = image_to_text_from_url(image_url)
# output_file_path = "path/to/extracted_text.txt"
# with open(output_file_path, "a", encoding="utf-8") as text_file:
#     text_file.write(extracted_text)
#     text_file.write("\n")

Replace the image_url with the URL of your image.
Run the script and view the extracted text in the specified file.

Extracting Text from a Local Image

Provide the path to your local image:
```
local_image_path = "path/to/image.png"
```
The script will automatically convert the image to base64 encoding and extract text.
View the extracted text in the console or in the specified output file.

Code Structure

Functions

image_to_base64(image_path)
- Converts a local image to base64 encoding.
image_to_text_from_url(image_url)
- Extracts text from an image using its URL.
image_to_text_from_base64(image_base64)
- Extracts text from an image provided in base64 format.

Example Workflow

Input: Image URL or local image file.
Process:
- For URLs: Direct API call.
- For local files: Convert to base64, then API call.
Output: Extracted text saved to a file.

Configuration

Update your OpenAI API key in the client initialization:

client = OpenAI(
    api_key="your-openai-api-key"
)

Specify the file path for saving the extracted text.

Limitations

The script requires an active OpenAI API key with appropriate permissions.
Limited by the capabilities of the GPT-4 model for OCR tasks.

Contributing

Feel free to fork this repository and submit pull requests for new features or improvements.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
README.md		README.md
image_to_text_gpt4omini.py		image_to_text_gpt4omini.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Image Text Extraction Using OpenAI GPT-4

Features

Requirements

Installation

Usage

Extracting Text from an Image URL

Extracting Text from a Local Image

Code Structure

Functions

Example Workflow

Configuration

Limitations

Contributing

License

About

Releases

Packages

Languages

ceodaniyal/Image_To_Text_GPT-4o-mini

Folders and files

Latest commit

History

Repository files navigation

Image Text Extraction Using OpenAI GPT-4

Features

Requirements

Installation

Usage

Extracting Text from an Image URL

Extracting Text from a Local Image

Code Structure

Functions

Example Workflow

Configuration

Limitations

Contributing

License

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages