An innovative AI tool that generates narrated short stories based on the content of uploaded images. It leverages GenAI models from Hugging Face and OpenAI, orchestrated with LangChain, and is deployed on both Streamlit Cloud and Hugging Face Spaces.
- **Image to Text**: Uses Hugging Face's image-to-text transformer model (`Salesforce/blip-image-captioning-base`) to analyze the uploaded image and generate descriptive text.
- **Text to Story**: Uses OpenAI's GPT-3.5-Turbo model to create a short, imaginative story (default: 50 words) from the descriptive text.
- **Story to Speech**: Converts the story into a narrated audio file using Hugging Face's text-to-speech model (`espnet/kan-bayashi_ljspeech_vits`).
- **User-Friendly Interface**: Built with Streamlit for easy image uploading and playback of the generated audio.

Code sketches of the three model stages are shown below.
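First, the captioning stage. This is a minimal sketch using the `transformers` pipeline API with the model named in the feature list; the function name `image_to_text` and the sample file name are illustrative, not taken from the project code.

```python
# Minimal captioning sketch using the transformers pipeline API.
# The model name comes from the feature list; everything else is illustrative.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

def image_to_text(image_path: str) -> str:
    # The pipeline returns a list of dicts, e.g. [{"generated_text": "a dog ..."}].
    return captioner(image_path)[0]["generated_text"]

if __name__ == "__main__":
    print(image_to_text("example.jpg"))  # path to any local image
```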
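Next, the story stage. This sketch uses the classic (pre-0.1) LangChain API, since LangChain is named in the intro; the prompt wording, temperature, and function name are assumptions, not the project's actual values.

```python
# Story-generation sketch using the classic LangChain API.
# Prompt wording, temperature, and function name are assumptions.
import os
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["scenario"],
    template=(
        "You are a storyteller. Write a short, imaginative story of about "
        "50 words based on this scene: {scenario}"
    ),
)
llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0.9,
    openai_api_key=os.getenv("OPENAI_API_KEY"),
)
chain = LLMChain(llm=llm, prompt=prompt)

def text_to_story(scenario: str) -> str:
    return chain.run(scenario=scenario)
```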
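Finally, the speech stage. Since `requests` appears in the dependency list, one plausible reading is a call to the Hugging Face Inference API; the endpoint URL pattern, the FLAC output format, and the output file name are assumptions.

```python
# Text-to-speech sketch via the Hugging Face Inference API (an assumption;
# the project may invoke the model differently). The output path mirrors
# the img-audio folder mentioned below.
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/espnet/kan-bayashi_ljspeech_vits"

def story_to_speech(story: str, out_path: str = "img-audio/story.flac") -> str:
    headers = {"Authorization": f"Bearer {os.getenv('HUGGINGFACE_API_TOKEN')}"}
    response = requests.post(API_URL, headers=headers, json={"inputs": story})
    response.raise_for_status()
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "wb") as f:
        f.write(response.content)  # the API returns raw audio bytes
    return out_path
```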
The generated audio file is available in the `img-audio` folder.
The following libraries and tools are required:

- `os` (Python standard library)
- `python-dotenv`
- `transformers`
- `torch`
- `langchain`
- `openai`
- `requests`
- `streamlit`
- Obtain personal API tokens for Hugging Face and OpenAI.
- Save the tokens in a `.env` file with the following format:

```
OPENAI_API_KEY=<your-api-key-here>
HUGGINGFACE_API_TOKEN=<your-access-token-here>
```
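Once the `.env` file exists, the tokens can be loaded at startup with `python-dotenv`, which is already in the dependency list. A minimal sketch:

```python
# Minimal sketch of loading the API tokens with python-dotenv.
import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory

openai_key = os.getenv("OPENAI_API_KEY")
hf_token = os.getenv("HUGGINGFACE_API_TOKEN")
```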
Set up a virtual environment (venv) and install the dependencies:

```bash
pip install -r requirements.txt
```

Then run the app:

```bash
streamlit run app.py
```
The app will:

- Generate descriptive text for the uploaded image.
- Turn the description into a short story.
- Provide a playable audio file of the narrated story.
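A minimal sketch of how such a Streamlit front end can be wired together, assuming helper functions like those sketched above; the names and stub bodies here are illustrative, not the project's actual code.

```python
# Illustrative Streamlit front end; the helper stubs stand in for the
# captioning, story, and text-to-speech stages sketched earlier.
import streamlit as st

def image_to_text(image_path: str) -> str:
    return "a dog running through a sunny field"  # stub for the BLIP stage

def text_to_story(scenario: str) -> str:
    return f"Once upon a time, there was {scenario}."  # stub for GPT-3.5-Turbo

def story_to_speech(story: str) -> bytes:
    return b""  # stub for the VITS stage; real code would return audio bytes

st.title("Image to Story Converter")
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    st.image(uploaded)
    scenario = image_to_text(uploaded.name)
    st.write("Caption:", scenario)
    story = text_to_story(scenario)
    st.write("Story:", story)
    audio = story_to_speech(story)
    if audio:
        st.audio(audio, format="audio/flac")
```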
Clone the repository, install the dependencies, and launch the app:

```bash
git clone https://github.com/alimdsaif3/Image-to-Story-Converter.git
cd Image-to-Story-Converter
pip install -r requirements.txt
streamlit run app.py
```
## ©️ License

This project is distributed under the MIT License. For details, see the LICENSE file in the repository.

## 🤝 Contributions

If you like this project, please ⭐ the repository! Contributions are welcome. Submit a pull request if you have suggestions or enhancements.