Hey all, this project was a little experiment in wanting to run Llama locally on my own machine using the Next.js App Router, node-llama-cpp, SQLite and more! See the full write-up of the project and technical considerations/decisions here: https://harry.is-a.dev/projects/llama-chat/.

- First, download a quantised GGUF Llama 2 Chat model from TheBloke on HuggingFace: https://huggingface.co/TheBloke. I used https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF, but choose whichever model best suits your needs.
- Then, create a folder in the root directory called `llama`, and add the downloaded model to it.
- Now, create an `.env` file in the root directory, and add the path of the model to `LLAMA_MODEL_PATH`, for example:

  ```
  LLAMA_MODEL_PATH=llama/llama-2-7b-chat.Q2_K.gguf
  ```

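  The app reads this variable when it loads the model. As a rough sketch of what that looks like (assuming the node-llama-cpp v2.x API; the actual code in this repo may differ):

  ```ts
  import path from "node:path";
  import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

  // Resolve the model path from .env, relative to the project root.
  const model = new LlamaModel({
      modelPath: path.join(process.cwd(), process.env.LLAMA_MODEL_PATH!),
  });
  const context = new LlamaContext({ model });
  const session = new LlamaChatSession({ context });

  console.log(await session.prompt("Hi there, how are you?"));
  ```
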
- Now run `pnpm install`.
- (Optional) If you want to enable CUDA to speed up Llama 2, run the following command:

  ```
  pnpm dlx node-llama-cpp download --cuda
  ```

  For this to work, you must have the CUDA Toolkit installed, the cmake-js dependencies, and CMake version 3.26 or higher.

  To get this working on Windows, I also had to modify the `compileLlamaCpp.js` file in the `node-llama-cpp` package, changing:

  ```js
  if (cuda && process.env.CUDA_PATH != null && await fs.pathExists(process.env.CUDA_PATH))
      cmakeCustomOptions.push("CMAKE_GENERATOR_TOOLSET=" + process.env.CUDA_PATH);
  ```

  to this:

  ```js
  if (cuda && process.env.CUDA_PATH != null && await fs.pathExists(process.env.CUDA_PATH))
      cmakeCustomOptions.push("CMAKE_VS_PLATFORM_TOOLSET_CUDA=" + process.env.CUDA_PATH);
  ```

  I also had to switch the npm VS Build Tools to 2022, as 2017 wasn't working for me:

  ```
  npm config set msvs_version 2022 --global
  ```

  For more information on how to install CUDA and to check the other requirements, see the official docs for `node-llama-cpp`.

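  Note that a CUDA build alone won't necessarily move work onto the GPU; layers typically have to be offloaded when the model is loaded. A minimal sketch, again assuming the node-llama-cpp v2.x API (the `gpuLayers` value here is illustrative and depends on your VRAM):

  ```ts
  import { LlamaModel } from "node-llama-cpp";

  const model = new LlamaModel({
      modelPath: process.env.LLAMA_MODEL_PATH!,
      gpuLayers: 64, // illustrative: offload as many layers as fit in VRAM
  });
  ```
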
- Now, you can run the development server:

  ```
  pnpm dev
  ```

  Open http://localhost:3000 with your browser to see the result. The API route to interact with Llama 2 is at `/api/chat`.

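  As a quick smoke test, you can call the route directly. A hypothetical example (the request body shape here is an assumption; check the route handler in this repo for the real contract):

  ```ts
  // Hypothetical request shape; see the /api/chat route handler for the actual contract.
  const res = await fetch("http://localhost:3000/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ message: "Hello, Llama!" }),
  });
  console.log(await res.text());
  ```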