WISYKI-API

The WISYKI-API is a tool developed as part of the WISY@KI project that provides capabilities to predict ESCOv1.2.0, GRETAv2, DKZ, DigComp 2.2 (ESCO) and DigCompEdu skills based on given course descriptions or learning outcomes. Predictions include learning outcomes, prerequisites, keywords and competency levels.

The application leverages two fine-tuned models:

Finetuned Embedding Model: A derivative of the embedding model intfloat/multilingual-e5-base, this model was optimized to retrieve the most relevant skills from a vectordatabase based on german course descriptions.
Finetuned Reranker: This model, a fine-tuned version of the cross-encoder model BAAI/bge-reranker-base, is employed for the validation and reranking of skill predictions.

The fine-tuning of these models was executed using a diverse set of training data. This data, which includes learning outcomes along with positive and negative labels representing ESCO and GRETA skills, was sourced from the INVITE projects GRETA, MYEdULife, and WISY@KI. Additionally, synthetic data, generated through the LLM validation model (mistral-medium) available in this API, was incorporated into the training process to enhance diversity, especially considering the scarcity of human-validated training data. While earlier versions used synthetic training data to increase diversity, the update models were trained on human validated data exclusively.

In addition to the above models, the WISYKI-API also uses other models via their APIs for learning outcome extraction and LLM validation. Depending on the request, the following models are used:

For learning outcome extraction, one of the following models is used:

Llama 3 8B (hosted in SH by myLab@THL)
mistral-small (API key required)
gpt-3.5-turbo-1106 (API key required)

For more challenging tasks like validation, one of the following models is used:

Mixtral-8x7B (hosted in SH by myLab@THL)
mistral-large (API key required)
gpt-4o (API key required)

Please note that some of these models require an API key for access.

Features

Predict ESCOv1.2.0, GRETAv2, DKZ, DigComp 2.2 (ESCO) and DigCompEdu Skills based on course descriptions or learning outcomes.
Predict competency levels.
Create embeddings.
Provide validated training data.

Installation

To get the API up and running, follow these steps:

Prerequisites

You will need to have Docker installed on your machine. If you don't have Docker installed, you can download it from the official docker website.

Environment Variables

The following environment variables are required:

POSTGRES_PASSWORD: The password for the PostgreSQL database.
PGADMIN_EMAIL: The email for PgAdmin.
PGADMIN_PASSWORD: The password for PgAdmin.
HOST_PORT: The host port.
POSTGRES_PORT: The port for the PostgreSQL database.
PGADMIN_PORT: The port for PgAdmin.
ROOT_PATH: This variable is used when the API is running behind a proxy. If the API is served under a subpath, such as /api/v1, then ROOT_PATH should be set to that subpath (/api/v1 in this case). If the API is not behind a proxy or is served from the root path, this variable can be left empty.

Steps

Clone the repository:
```
git clone <repository-url>
```
Navigate to the project directory:
```
cd wisyki-api
```

Download required models:

git clone https://huggingface.co/isy-thl/multilingual-e5-base-course-skill-tuned data/models/multilingual-e5-base-course-skill-tuned
git clone https://huggingface.co/hkunlp/instructor-large data/models/instructor-large
git clone https://huggingface.co/isy-thl/wisyki-complevel-classifier data/models/comp_level_model

Download the vectorstore:

Download the skill vectorstore archive from the release page.
Extract the vectorstore to the correct location:

Unzip the downloaded file into a directory at data/stores/skill_vectorstore.
Copy and setup enviroment variables:
```
cp .env.example .env
```

Run the Docker compose command:

docker-compose up -d

or run for local development

docker compose -f docker-compose.dev.yml up -d

API Documentation

For more details on how to use the API, please refer to the API Documentation.

Acknowledgements

This service uses the ESCO classification of the European Commission.

This API was developed as part of the project WISY@KI by the Institut für interakive Systeme, Technical University of Applied Sciences Lübeck and was funded by the Federal Ministry of Education and Research.

Support

If you encounter any issues or require further assistance, feel free to raise an issue in this repository.

Contributing

We welcome contributions from the community.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

WISYKI-API

Features

Installation

Prerequisites

Environment Variables

Steps

API Documentation

Acknowledgements

Support

Contributing

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

WISYKI-API

Features

Installation

Prerequisites

Environment Variables

Steps

API Documentation

Acknowledgements

Support

Contributing

License