This script will process all documents in a directory using the Unstructured Serverless API JavaScript SDK.
- Clone the repo
git clone git@github.com:Unstructured-IO/js-client-batch.git
- Install dependencies
npm i
- Set environment variables in .env
UNSTRUCTURED_API_URL=https://api.unstructuredapp.io || <the URL of your choice>
UNSTRUCTURED_API_KEY=<Your Unstructured API Key>
STRATEGY='fast' || 'hi_res'
SPLIT_PAGES='true' || 'false'
By default this script looks in ./sample_data for files to process. To use a different directory, set DOCS_PATH in .env:
...
DOCS_PATH=full/path/to/documents
By default this script writes partitioned results to ./output. To use a different directory, set OUTPUT_PATH in .env:
...
OUTPUT_PATH=full/path/to/output
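To make the settings above concrete, here is a minimal sketch of how they could be read and validated in Node.js. It is not taken from this repository: `loadConfig` is a hypothetical helper, and the fallback to 'fast', the fallback API URL, and the error on a missing key are assumptions. Only the ./sample_data and ./output defaults come from this README.

```javascript
const path = require("node:path");

// Values this README lists for STRATEGY.
const ALLOWED_STRATEGIES = ["fast", "hi_res"];

// Hypothetical helper (not part of js-client-batch): turns environment
// variables into a plain config object.
function loadConfig(env = process.env) {
  if (!env.UNSTRUCTURED_API_KEY) {
    // Assumption: a missing key is treated as a hard error.
    throw new Error("UNSTRUCTURED_API_KEY is not set");
  }

  // Assumption: fall back to 'fast' when STRATEGY is unset.
  const strategy = env.STRATEGY || "fast";
  if (!ALLOWED_STRATEGIES.includes(strategy)) {
    throw new Error(`STRATEGY must be one of: ${ALLOWED_STRATEGIES.join(", ")}`);
  }

  return {
    // Assumption: use the URL shown in this README when none is given.
    apiUrl: env.UNSTRUCTURED_API_URL || "https://api.unstructuredapp.io",
    apiKey: env.UNSTRUCTURED_API_KEY,
    strategy,
    // Environment values are strings, so compare against "true" explicitly.
    splitPages: env.SPLIT_PAGES === "true",
    // Defaults documented in this README.
    docsPath: path.resolve(env.DOCS_PATH || "./sample_data"),
    outputPath: path.resolve(env.OUTPUT_PATH || "./output"),
  };
}
```

Passing the environment in as an argument keeps the helper easy to exercise without touching the real process.env, for example `loadConfig({ UNSTRUCTURED_API_KEY: "key", STRATEGY: "hi_res" })`.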
To run the script:
npm run start
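As a rough picture of what processing a directory involves, the sketch below pairs each file in the documents directory with a result file in the output directory. It is illustrative only: `planOutputs` is a hypothetical helper, and the `<name>.json` naming and the choice to skip subdirectories are assumptions, not the script's confirmed behavior. The call to the Unstructured API itself is left out.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper (not part of js-client-batch): lists the regular files
// directly inside docsPath and pairs each with an output path.
function planOutputs(docsPath, outputPath) {
  return fs
    .readdirSync(docsPath, { withFileTypes: true })
    // Assumption: only top-level files are processed; subdirectories are skipped.
    .filter((entry) => entry.isFile())
    .map((entry) => ({
      input: path.join(docsPath, entry.name),
      // Assumption: results are stored as "<original file name>.json".
      output: path.join(outputPath, `${entry.name}.json`),
    }))
    // Sort for a stable order, since directory listing order is not guaranteed.
    .sort((a, b) => a.input.localeCompare(b.input));
}
```

Each `{ input, output }` pair could then be handed to whatever function sends the file to the API and writes the partitioned result.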