Commit

Merge branch 'main' into jeromehardaway/update-code
jeromehardaway authored Oct 24, 2024
2 parents 4cc2e6f + e4aaafd commit 262fb31
Showing 21 changed files with 134 additions and 49 deletions.
Binary file removed .DS_Store
Binary file not shown.
4 changes: 4 additions & 0 deletions .coveragerc
@@ -0,0 +1,4 @@
[run]
omit =
    tests/*
    */tests/*
23 changes: 23 additions & 0 deletions .github/workflows/unit-test.yml
@@ -0,0 +1,23 @@
name: Run Unit Tests with Pytest

on: [ push, pull_request ]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run tests with coverage report
        run: |
          pytest --cov --cov-report=term-missing
2 changes: 2 additions & 0 deletions .gitignore
@@ -8,6 +8,8 @@ __pycache__/
# C extensions
*.so

.DS_Store

# Distribution / packaging
.Python
build/
94 changes: 46 additions & 48 deletions README.md
@@ -7,12 +7,11 @@ VetsAI is an AI-powered virtual assistant designed to help veterans navigate emp
- **Chat Assistant**: Ask questions and receive advice on job searching and career transitions.
- **Military Job Code Translation**: Provide a military job code (e.g., MOS, AFSC) to get suggestions for related civilian careers.
- **Document Upload**: Upload employment-related documents (PDF or DOCX), and VetsAI will process the content to assist with career suggestions.
- **OpenAI Integration**: Uses OpenAIs GPT-4 to generate responses based on the conversation context.
- **OpenAI Integration**: Uses OpenAI's GPT-4 to generate responses based on the conversation context.

## Prerequisites

To run this application, ensure you have the following installed:

- Python 3.8 or later
- A virtual environment (recommended)

@@ -22,62 +21,61 @@ To run this application, ensure you have the following installed:
```bash
git clone <repository-url>
cd <repository-directory>
```

2. Set up a virtual environment:

python -m venv venv
source venv/bin/activate # For macOS/Linux
.\venv\Scripts\activate # For Windows


3. Install dependencies:

pip install -r requirements.txt


4. Set up environment variables:
• Create a .env file in the root of your project.
• Add your OpenAI API key to the .env file:

OPENAI_API_KEY=your-openai-api-key


2. **Set up a virtual environment**:
```bash
python -m venv venv
source venv/bin/activate # For macOS/Linux
.\venv\Scripts\activate # For Windows
```

Running the App
3. **Install dependencies**:
```bash
pip install -r requirements.txt
```

1. Run the Streamlit app:
4. **Set up environment variables**:
- Create a .env file in the root of your project.
- Add your OpenAI API key to the .env file:
```
OPENAI_API_KEY=your-openai-api-key
```
streamlit run app.py
## Running the App
1. **Run the Streamlit app**:
```bash
streamlit run app.py
```

2. Access the app:
Open your web browser and navigate to http://localhost:8501.
2. **Access the app**:
Open your web browser and navigate to http://localhost:8501.

Usage
## Usage

Chat: Ask questions about job searching, resume building, and military job code translations.
Upload Resume: Upload a resume (PDF or DOCX), and VetsAI will process the text for further assistance.
Military Job Codes: Enter your military job code (e.g., MOS, AFSC) to get suggestions for civilian careers.
- **Chat**: Ask questions about job searching, resume building, and military job code translations.
- **Upload Resume**: Upload a resume (PDF or DOCX), and VetsAI will process the text for further assistance.
- **Military Job Codes**: Enter your military job code (e.g., MOS, AFSC) to get suggestions for civilian careers.

File Structure
## File Structure

app.py: Main application script.
data/employment_transitions/job_codes/: Directory containing military job code files.
requirements.txt: Python package dependencies.
- `app.py`: Main application script.
- `data/employment_transitions/job_codes/`: Directory containing military job code files.
- `requirements.txt`: Python package dependencies.

Dependencies
## Dependencies

The following Python libraries are required to run this app:

• streamlit: For the web interface.
• httpx: To make HTTP requests to OpenAI’s API.
• nest-asyncio: To allow nested event loops for async operations.
• better-profanity: To filter profane language.
• PyPDF2: For extracting text from PDF files.
• python-docx: For reading DOCX files.
• python-dotenv: To load environment variables from a .env file.
• openai: To interact with OpenAI’s API.

License

This project is licensed under the MIT License.
- `streamlit`: For the web interface.
- `httpx`: To make HTTP requests to OpenAI's API.
- `nest-asyncio`: To allow nested event loops for async operations.
- `better-profanity`: To filter profane language.
- `PyPDF2`: For extracting text from PDF files.
- `python-docx`: For reading DOCX files.
- `python-dotenv`: To load environment variables from a .env file.
- `openai`: To interact with OpenAI's API.

## License

This project is licensed under the MIT License.
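
Step 4 of the README stores the OpenAI key in a `.env` file as `OPENAI_API_KEY`. The diff does not show how the app reads it, so the following is only a minimal sketch, assuming the variable has already been loaded into the process environment (for example by `python-dotenv`, which the README lists as a dependency). The helper name `get_openai_api_key` is hypothetical and does not appear in the repository.

```python
import os


def get_openai_api_key(environ=os.environ):
    """Return the OpenAI key from the environment, failing early if it is absent.

    `environ` is injectable so the lookup can be exercised without touching
    the real process environment. This helper is illustrative only.
    """
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; add it to .env or export it in the shell"
        )
    return key


# Usage with a stand-in mapping rather than the real environment.
print(get_openai_api_key({"OPENAI_API_KEY": " your-openai-api-key "}))
```

Failing at startup with a clear message tends to be easier to diagnose than an authentication error surfacing later from the first API call.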
Binary file removed data/.DS_Store
Binary file not shown.
Binary file removed data/employment_transitions/.DS_Store
Binary file not shown.
Binary file removed data/employment_transitions/job_codes/.DS_Store
Binary file not shown.
2 changes: 1 addition & 1 deletion requirements.txt
@@ -15,4 +15,4 @@ tiktoken>=0.5.1
cachetools>=5.3.2
dataclasses-json>=0.6.1
asyncio>=3.4.3
aiohttp>=3.9.1
aiohttp>=3.9.1
1 change: 1 addition & 0 deletions streamlit_app.py
@@ -428,5 +428,6 @@ def main():
save_feedback(feedback)
st.success("Thank you for your feedback!")


if __name__ == "__main__":
    main()
Empty file added tests/__init__.py
Empty file.
25 changes: 25 additions & 0 deletions tests/conftest.py
@@ -0,0 +1,25 @@
import io
import pytest
import os

TEST_RESOURCE_DIR = f"{os.path.dirname(__file__)}/resources"


def load_resource_file(file_name):
    with open(file_name, "rb") as file:
        data = io.BytesIO(file.read())
    return data


@pytest.fixture(scope="module")
def file_resources():
    library = {}
    for filename in os.listdir(TEST_RESOURCE_DIR):
        library[filename.split(".")[0]] = load_resource_file(f"{TEST_RESOURCE_DIR}/{filename}")
    yield library


def pytest_configure(config):
    config.addinivalue_line(
        "markers", "slow: marks tests as slow (deselect with '-m \"not slow\"')"
    )
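
The `file_resources` fixture above keys each resource by the part of its file name before the first `.` and stores the contents as an `io.BytesIO`. The standalone sketch below reproduces that mapping outside pytest so the resulting keys are easy to see. The function name `build_library` and the throwaway files are assumptions made for illustration; they are not part of the commit.

```python
import io
import os
import tempfile


def build_library(resource_dir):
    """Map each file's base name (text before the first '.') to an in-memory copy."""
    library = {}
    for filename in sorted(os.listdir(resource_dir)):
        path = os.path.join(resource_dir, filename)
        with open(path, "rb") as file:
            library[filename.split(".")[0]] = io.BytesIO(file.read())
    return library


# Demonstration with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in [("docx_blank.docx", b""), ("pdf_text_only.pdf", b"%PDF-")]:
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(payload)
    library = build_library(tmp)

print(sorted(library))  # ['docx_blank', 'pdf_text_only']

# A BytesIO keeps its read position, so a second read returns nothing
# unless the stream is rewound with seek(0) first.
first_read = library["pdf_text_only"].read()
second_read = library["pdf_text_only"].read()
print(first_read, second_read)  # b'%PDF-' b''
```

Because the real fixture is module-scoped, every test in a module receives the same stream objects. If a resource is ever consumed by more than one test, rewinding it with `seek(0)` before reuse may be necessary, depending on how the extractor reads the stream.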
Binary file added tests/resources/docx_blank.docx
Binary file not shown.
Binary file added tests/resources/docx_text_and_media.docx
Binary file not shown.
Binary file added tests/resources/docx_text_only.docx
Binary file not shown.
Binary file added tests/resources/docx_unicode_sample.docx
Binary file not shown.
Binary file added tests/resources/pdf_blank.pdf
Binary file not shown.
Binary file added tests/resources/pdf_text_and_media.pdf
Binary file not shown.
Binary file added tests/resources/pdf_text_only.pdf
Binary file not shown.
Binary file added tests/resources/pdf_unicode_sample.pdf
Binary file not shown.
32 changes: 32 additions & 0 deletions tests/test_streamlit_app.py
@@ -0,0 +1,32 @@
from streamlit_app import extract_text_from_pdf, extract_text_from_word
import pytest


class TestDOCXExtraction:
    def test_extract_text_from_word_with_only_text(self, file_resources):
        assert extract_text_from_word(file_resources["docx_text_only"]) == "This document has text!"

    def test_extract_text_from_word_with_empty_file(self, file_resources):
        assert extract_text_from_word(file_resources["docx_blank"]) == ""

    def test_extract_text_from_word_with_non_text_contents(self, file_resources):
        assert extract_text_from_word(file_resources["docx_text_and_media"]) == "This document has text!"

    def test_extract_text_from_word_with_special_characters(self, file_resources):
        assert extract_text_from_word(file_resources["docx_unicode_sample"])


class TestPDFExtraction:
    def test_extract_text_from_pdf_with_only_text(self, file_resources):
        assert extract_text_from_pdf(file_resources["pdf_text_only"]) == "This document has text!"

    def test_extract_text_from_pdf_with_empty_file(self, file_resources):
        assert extract_text_from_pdf(file_resources["pdf_blank"]) == ""

    def test_extract_text_from_pdf_with_non_text_contents(self, file_resources):
        # PyPDF2 will pull the text from charts also, so we cannot use == to compare
        assert "This document has text!" in extract_text_from_pdf(file_resources["pdf_text_and_media"])

    @pytest.mark.slow
    def test_extract_text_from_pdf_with_special_characters(self, file_resources):
        assert extract_text_from_pdf(file_resources["pdf_unicode_sample"])
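
The comment in `test_extract_text_from_pdf_with_non_text_contents` explains that the PDF extractor also returns text from charts, which is why that test uses `in` rather than `==`. The sketch below illustrates the difference with a made-up extractor output. The `extracted` string is hypothetical and is not taken from the real `pdf_text_and_media.pdf` resource.

```python
EXPECTED = "This document has text!"

# Hypothetical extractor output: the body text followed by chart labels.
extracted = "This document has text!\nQ1 Q2 Q3\n10 20 30"

exact_match = extracted == EXPECTED        # fails once extra text is present
contains_expected = EXPECTED in extracted  # tolerates the extra chart text

print(exact_match, contains_expected)  # False True
```

The substring check is looser than equality: it confirms that the body text was extracted but says nothing about what else came back with it.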
