Implementation of a virtual student assistant using large language models

Goal

Research the effectiveness of- and develop a working prototype of a virtual student assistant and make it available to students for testing.

Background

This project was created for Østfold University College as a bachelor's thesis. It was a collaboration between @Klattet, @JegHeterTobias, @khadijo and @OmriJam. The project was chosen from a list of available projects via vote by the project collaborators. Østfold University College lent us a Jetson AGX Orin machine to help run our applications on, and we used it as a test platform for debugging and LLM testing.

The project did not only involve the implementation and testing of a prototype, but also research on how to create a persona to manipulate the language model to give responses in a certain way.

Multiple open source language models that were popular at the time were tested for their generation speed and the accuracy of their responses. We found that Llama2 by Meta and Orca2 by Microsoft had the best performance on our hardware, and we used Orca2 for the prototype.

The prototype was voluntarily alpha-tested by over 30 Programming-2 students at the campus, to varying degrees of success. We found that the quality of the LLM responses was highly dependent on the quality of the user's prompt. A common trend was that users initiated the conversation with a few or even a single word, leading to the generation of a poor or blank response. This meant that the effectiveness of the chatbot assistant was highly dependent on the user's prompting efficacy.

Dependencies

Library	Usecase
disnake	Controlling the bot user through Discord's API
haystack-ai	Tools for creating a LLM prompt pipeline
jsonschema	Ensuring valid JSON format
llama-cpp-haystack	LlamaCPP integration with Haystack.
pdfminer.six	Parsing text out of PDF files
python-docx	Parsing text out of docx files
websockets	Handling socket requests asynchronously

After cloning this repo, I recommend creating a virtual environment with:

python -m venv .venv

Then activating it with:

source .venv/bin/activate

Run either of the commands below to install dependencies.

pip install disnake haystack-ai jsonschema llama-cpp-haystack pdfminer.six python-docx websockets

pip install -r requirements.txt

It can be a bit challenging to get llama_cpp to work depending on how you want to run the LLMs and what hardware you have. You may need to build it with custom parameters. See here.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
discord_interface		discord_interface
haystack_server		haystack_server
llamacpp_server		llamacpp_server
parsing		parsing
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.toml		config.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Implementation of a virtual student assistant using large language models

Goal

Background

Dependencies

About

Languages

License

Klattet/StudassBot

Folders and files

Latest commit

History

Repository files navigation

Implementation of a virtual student assistant using large language models

Goal

Background

Dependencies

About

Topics

Resources

License

Stars

Watchers

Forks

Languages