build: update build files
Add Dockerfile.

Update conda environment.yml

Update README.md
dvdblk committed Jan 10, 2024
1 parent 8c46f5f commit b14c367
Showing 5 changed files with 542 additions and 18 deletions.
12 changes: 12 additions & 0 deletions Dockerfile
@@ -0,0 +1,12 @@
# syntax=docker/dockerfile:1

FROM python:3.11.5

WORKDIR /app

ADD app app

COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

CMD [ "python", "-m", "streamlit", "run", "app/main.py" ]
56 changes: 48 additions & 8 deletions README.md
@@ -1,16 +1,27 @@
# Hack4Good - OECD
# Hack4Good - NLP for policy trend analysis (OECD)

[![Code style](https://img.shields.io/badge/code%20style-black-000000.svg)](pyproject.toml)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](.pre-commit-config.yaml)

* https://hackmd.io/g5AYgepnQrqrk4V26DrQMg
This project was created for the [Hack4Good 2023](https://www.analytics-club.org/hack4good) hackathon in collaboration with [OECD](https://www.oecd.org/switzerland/).

* https://docs.google.com/spreadsheets/d/1pKb_1Je4hD2X8IfYrFXYqBhWfPg5lgPY/edit?usp=sharing&ouid=110500414719598262605&rtpof=true&sd=true
## GUI Quickstart

## Getting started
Environment variables need to be set in order to run the code.
Create a `.env` file in the root of the repo (you can use `cp .env.default .env`) with the following contents:

### Conda Environment
To run the code in this repo create a [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html) environment with the required dependencies:
| Environment Variable | Description |
| --- | --- |
| `ADOBE_CLIENT_ID` | Create an Adobe Developer account and select "Get credentials" [here](https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/) |
| `ADOBE_CLIENT_SECRET` | Copy from "Get credentials" [here](https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/) as with `ADOBE_CLIENT_ID` |
| `OPENAI_API_KEY` | Get the [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-api-key) |
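
The three variables in the table above can be checked before starting the app. The following is a minimal sketch and not part of this commit: the helper name `missing_env_vars` is hypothetical, and it inspects only the process environment. It does not read the `.env` file itself, so it assumes the variables have already been loaded (for example by `python-dotenv`, which is listed in `requirements.in`).

```python
import os

# Variable names taken from the README table above.
REQUIRED_ENV_VARS = ("ADOBE_CLIENT_ID", "ADOBE_CLIENT_SECRET", "OPENAI_API_KEY")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or empty.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    and can be replaced by any mapping, which makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Report which variables still need to be set in the current environment.
print("Missing:", missing_env_vars())
```

An empty list means all three variables are present; any names returned still need a value in `.env` or in the shell.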

After setting the environment variables, you can run the code in one of two ways:

<details>
<summary><b><font size="+1">Conda Environment</font></b></summary>

1. Create a [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html) environment with the required dependencies:

To **create** a conda environment after cloning the repo:
```
@@ -22,7 +33,36 @@ conda activate hack4good
conda deactivate
```

(Optional) To **update** the conda environment after pulling latest changes:
```
conda activate hack4good
conda env update -f environment.yml --prune
```

(Optional) To **remove** the conda environment:
```
conda deactivate
conda env remove -n hack4good
```

2. Run the streamlit app
```
python -m streamlit run app/main.py
```
</details>

<details>
<summary><b><font size="+1">Docker</font></b></summary>

1. Pull (or build) the docker image

To **pull** the latest docker image:
```
docker pull ghc
```
</details>

## Contributing
1. Install [pre-commit](https://pre-commit.com/#installation).
2. `pre-commit install`
3. Add changes, commit and pull request to `main` branch.
2. Run `pre-commit install` to apply the repo's pre-commit hooks to your local git repo.
3. Add your changes, commit and create a pull request with `main` branch as the target.
11 changes: 1 addition & 10 deletions environment.yml
@@ -5,13 +5,4 @@ dependencies:
- pip=23.2.1
- python=3.11.5
- pip:
- pdfplumber
- pdfminer.six
- tqdm
- torch
- nougat-ocr
- streamlit
- langchain
- python-dotenv
- openai
- python-multipart
- -r requirements.txt
12 changes: 12 additions & 0 deletions requirements.in
@@ -0,0 +1,12 @@
pdfplumber
pdfminer.six
tqdm
torch
nougat-ocr
streamlit
langchain
python-dotenv
openai
python-multipart
pdfservices-sdk
requests