Jarvis now has eyes:) - minor bugs fixes as well #7

Merged 3 commits into main on Apr 25, 2024

FotieMConstant
Member

This PR addresses the following:

  • Jarvis can now see you, and NO, neither your image feed nor any other data is sent to a server.
  • Fixed a minor issue raised in "install steps not clear" (#6), where the model isn't found when you run main.py.
  • Models are now loaded from the local .env file; see .env.example for an example.
  • Started working on whisper integration for better speech-to-text :)
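
As a rough illustration of the .env-based model loading mentioned above, here is a minimal sketch using only the standard library. The variable names JARVIS_MODEL and VISION_MODEL are assumptions made for this example (check .env.example for the real ones), and the actual main.py may well use a package such as python-dotenv instead of a hand-written parser.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_model_names(env):
    """Return (jarvis_model, vision_model) or fail with a clear message."""
    # Hypothetical variable names, used here for illustration only.
    names = ("JARVIS_MODEL", "VISION_MODEL")
    resolved = {name: env.get(name) or os.environ.get(name) for name in names}
    missing = [name for name, value in resolved.items() if not value]
    if missing:
        raise RuntimeError("Missing in .env: " + ", ".join(missing))
    return resolved["JARVIS_MODEL"], resolved["VISION_MODEL"]
```

Failing early with the names of the missing variables is one way to avoid the "model isn't found" confusion reported in #6.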

…Vision models

feat(README.md): add important note about adding necessary env variables before running

feat(main.py):
- import necessary modules for Vision, OnlineOps, CurrentDateTeller, CurrentTimeTeller
- load environment variables for Jarvis and Vision models
- add support for Jarvis model in main loop
- add support for Vision module to describe images
- add support for "go to sleep" or "goodbye" commands to exit the program
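
The exit commands listed above could be detected along these lines. This is only a sketch: the function name is_exit_command and the substring matching are assumptions, not necessarily how main.py handles it.

```python
import string

# Phrases taken from the commit message; everything else is illustrative.
EXIT_PHRASES = ("go to sleep", "goodbye")


def is_exit_command(transcript):
    """Return True if the transcribed speech contains an exit phrase."""
    # Lowercase, drop punctuation, and collapse repeated whitespace.
    cleaned = transcript.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    normalized = " ".join(cleaned.split())
    return any(phrase in normalized for phrase in EXIT_PHRASES)
```

In a main loop this would typically be checked right after speech-to-text and before the transcript is sent to the model, so the program can break out cleanly.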

feat(modules/speech_to_text.py): add code to listen to audio using whisper module

feat(modules/text_to_speech.py): add support for ElevenLabs text-to-speech API

feat(modules/vibranium/vision/vision.py): add Vision class to generate image descriptions using a specified model
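
For readers who have not opened vision.py, the idea of a Vision class can be sketched as below. This is not the PR's implementation: it assumes a local Ollama server on its default port (11434) and talks to Ollama's /api/generate endpoint directly with the standard library, whereas the real class may use a different client. The default model and prompt are also assumptions.

```python
import base64
import json
import urllib.request


class Vision:
    """Illustrative wrapper that asks a local vision model to describe an image."""

    def __init__(self, model="llava", host="http://localhost:11434"):
        self.model = model  # assumed default; the PR reads this from .env
        self.host = host    # Ollama's default local address

    def build_payload(self, image_bytes, prompt="Describe what you see."):
        # Ollama expects images as base64-encoded strings.
        return {
            "model": self.model,
            "prompt": prompt,
            "images": [base64.b64encode(image_bytes).decode("ascii")],
            "stream": False,
        }

    def describe(self, image_bytes, prompt="Describe what you see."):
        # The request stays on localhost, so no image leaves the machine.
        body = json.dumps(self.build_payload(image_bytes, prompt)).encode("utf-8")
        request = urllib.request.Request(
            self.host + "/api/generate",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())["response"]
```

Calling describe() with the bytes of a captured webcam frame would return the model's text description, provided the Ollama server is running and the model has been pulled.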

feat(requirements.txt): add required dependencies for the project

feat(speech_to_text.py): add example code to transcribe audio using whisper module

feat(whisper.cpp): add whisper module as a submodule
The README.md file was updated to document a new feature: users can ask Jarvis what it sees through the webcam. The feature relies on the llava vision model, which is downloaded and started with the "ollama run llava" command. Users can ask questions like "what is this?", "what are you looking at?", "tell me what you see", "describe this", or even "describe what you see".
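
One way the vision questions quoted in the README could be routed to the vision module is sketched below. The phrase list comes from the text above; the function name is_vision_request and the exact-match strategy are assumptions for illustration.

```python
import string

# Phrases quoted in the README update, stored without punctuation.
VISION_PHRASES = {
    "what is this",
    "what are you looking at",
    "tell me what you see",
    "describe this",
    "describe what you see",
}


def is_vision_request(transcript):
    """Return True if the whole utterance is one of the known vision prompts."""
    cleaned = transcript.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(cleaned.split()) in VISION_PHRASES
```

Exact matching keeps ordinary questions from triggering the webcam by accident, at the cost of missing paraphrases; a looser match would trade in the other direction.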
… directory

fix(images): remove image.jpg file as it is no longer needed
FotieMConstant changed the title from "Jarvis can now see you:)" to "Jarvis can now see you:) - minor bugs fixes as well" on Apr 25, 2024
FotieMConstant changed the title from "Jarvis can now see you:) - minor bugs fixes as well" to "Jarvis now has eyes:) - minor bugs fixes as well" on Apr 25, 2024
FotieMConstant merged commit b369006 into main on Apr 25, 2024