
Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development! #2635

**Open** · MekkCyber wants to merge 26 commits into `main`

Conversation

**MekkCyber** (Contributor) commented:

Adding a blog post about how to create an on-device chatbot app to chat with small LLMs using React Native and GGUF models from the Hub:

Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development!

@Vaibhavs10 Vaibhavs10 self-requested a review January 31, 2025 13:32
@SunMarc SunMarc requested review from pcuenca and ngxson February 7, 2025 13:51
**@ngxson** (Member) left a comment:


Nice, this is awesome. Thanks a lot!!

I just did a quick pass; I'll do another pass later.


- [`llama.rn`](https://github.com/mybigday/llama.rn): a binding for [`llama.cpp`](https://github.com/ggerganov/llama.cpp) tailored for React Native apps.
- `react-native-fs`: allows us to manage the device's file system in a React Native environment.
- `axios`: a library for sending requests to the Hugging Face Hub API.
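
For reference, the install step would look something like this (assumed command; the excerpt doesn't pin exact package versions):

```bash
npm install llama.rn react-native-fs axios
```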
**Member** commented:

Can we use the more modern fetch API? AFAIK it's natively supported by rn

**MekkCyber** (Contributor, author) replied:

Yes, we can, but I prefer axios because it's more comprehensive.
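
For reference, a minimal sketch of the `fetch`-based alternative; the Hub's `/api/models/{repoId}` endpoint and its `siblings` field are real, but the helper name and exact shape here are illustrative:

```typescript
// Hypothetical helper: list the GGUF files available in a Hub repository
// using React Native's built-in fetch instead of axios.
const listGGUFFiles = async (repoId: string): Promise<string[]> => {
  const response = await fetch(`https://huggingface.co/api/models/${repoId}`);
  if (!response.ok) {
    throw new Error(`Hub request failed with status ${response.status}`);
  }
  const data = await response.json();
  // The Hub lists repo files in a `siblings` array of { rfilename } objects
  return data.siblings
    .map((sibling: { rfilename: string }) => sibling.rfilename)
    .filter((name: string) => name.toLowerCase().endsWith('.gguf'));
};
```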

Comment on lines 529 to 531
We store the selected model format in state, clear the list of GGUF files left over from previous selections, and then fetch the new list of GGUF files for the selected format.
The screen should look like this on your device:
![Model selection screen](assets/blog_images/model_selection_start.png)
**Member** commented:

You may need to add a new line between each paragraph, otherwise they will all be on the same line.

**@SunMarc** (Member) left a comment:

Added a few comments! I wasn't able to run the code due to a hardware issue on my side, but I will keep trying.

@@ -0,0 +1,726 @@
---
title: "Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development!"
thumbnail: /blog/assets/deepseek-r1-aws/thumbnail.png
**Member** commented:

same

Comment on lines 8 to 10
Did you ever wonder how you can create a mobile app to chat with LLMs locally? Have you tried to understand the code in some open-source projects but found it too complex? Well, this blog is for you! Inspired by the great [Pocket Pal](https://github.com/a-ghorbani/pocketpal-ai) app, we will help you build a simple React Native app to chat with LLMs downloaded from the [**Hugging Face**](https://huggingface.co/) Hub; everything is private and runs on-device!

---
**Member** commented:

Right after this introduction, maybe you can share a short video demo!

Comment on lines 333 to 345
### **Model Download Implementation**

Now let's implement the model download functionality in the `handleDownloadModel` function which should be called when the user clicks on the download button. This will download the selected GGUF file from Hugging Face and store it in the app's Documents directory:

```typescript
const handleDownloadModel = async (file: string) => {
  const downloadUrl = `https://huggingface.co/${
    HF_TO_GGUF[selectedModelFormat as keyof typeof HF_TO_GGUF]
  }/resolve/main/${file}`;
  // set isDownloading to show the progress bar, and reset progress to 0
  setIsDownloading(true);
  setProgress(0);
  // ... (rest of the function is elided in this excerpt)
};
```

**Member** commented:

Same here. How can the user verify that they're doing it right?

**@MekkCyber** (Contributor, author) replied on Mar 3, 2025:

Tried my best to add checkpoints, screenshots, explanations, and ways to verify that the app is working properly.


The `downloadModel` function, located in `src/api`, accepts three parameters: `modelName`, `downloadUrl`, and a `progress` callback that is triggered throughout the download to report its progress. The `RNFS` module, part of the `react-native-fs` library, provides file system access for React Native applications, letting developers read, write, and manage files on the device's storage. In this case, the model is stored in the app's Documents folder using `RNFS.DocumentDirectoryPath`, ensuring that the downloaded file is accessible to the app. The progress bar is updated to reflect the current download status; the progress bar component itself is defined in the `components` folder.
**Member** commented:

Do we need to copy-paste the function into `src/api` from your repo?


Let's create `src/api/model.ts` and copy the code from the [`src/api/model.ts`](https://github.com/MekkCyber/EdgeLLM/blob/main/EdgeLLMBasic/src/api/model.ts) file in the repo. The logic should be simple to understand. The same goes for the progress bar component in the [`src/components`](https://github.com/MekkCyber/EdgeLLM/blob/main/EdgeLLMBasic/src/components/ProgressBar.tsx) folder.
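
For readers who want to see the shape of that function without opening the repo, here is a minimal sketch consistent with the three parameters described above, assuming `react-native-fs`'s `RNFS.downloadFile` API (the repo version may differ in details):

```typescript
import RNFS from 'react-native-fs';

// Sketch of src/api/model.ts: download a GGUF file into the app's
// Documents directory, reporting progress as a 0-100 percentage.
export const downloadModel = async (
  modelName: string,
  downloadUrl: string,
  onProgress: (progress: number) => void,
): Promise<string> => {
  const destPath = `${RNFS.DocumentDirectoryPath}/${modelName}`;
  const { promise } = RNFS.downloadFile({
    fromUrl: downloadUrl,
    toFile: destPath,
    progress: ({ bytesWritten, contentLength }) =>
      onProgress(Math.floor((bytesWritten / contentLength) * 100)),
  });
  await promise; // resolves once the download completes
  return destPath;
};
```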

### **Model Loading and Initialization**
**Member** commented:

same

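(The loading code itself is elided in this excerpt. As a rough sketch, initialization with `llama.rn` typically looks like the following; `initLlama` is llama.rn's entry point, but treat the exact options as assumptions:)

```typescript
import { initLlama, LlamaContext } from 'llama.rn';

// Sketch: create a llama.cpp context from the downloaded GGUF file.
const loadModel = async (modelPath: string): Promise<LlamaContext> => {
  const context = await initLlama({
    model: modelPath, // path returned by downloadModel above
    n_ctx: 2048,      // context window size
    n_gpu_layers: 0,  // CPU-only; raise on devices with GPU support
  });
  return context;
};
```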

### **Chat Implementation**
**Member** commented:

same
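
(The chat code is likewise elided in this excerpt. For orientation, a hedged sketch of the completion call, assuming llama.rn's `completion` API and an OpenAI-style message shape:)

```typescript
// Sketch: send the conversation to the model and stream tokens back;
// `context` is the LlamaContext returned by initLlama above.
const sendMessage = async (context: LlamaContext, userMessage: string) => {
  const result = await context.completion(
    {
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: userMessage },
      ],
      n_predict: 400, // cap on the number of generated tokens
    },
    (data) => {
      // Streaming callback: data.token holds the newly generated text piece.
      console.log(data.token);
    },
  );
  return result.text; // the full generated reply
};
```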

Comment on lines 479 to 490
### **The UI & Logic**

Now that we have the core functionality implemented, we can focus on the UI. The UI is straightforward: a model selection screen with a list of models, and a chat interface with a conversation history and a user input field. During the model download phase, a progress bar is displayed. We intentionally avoid adding many screens to keep the app simple and focused on its core functionality. To keep track of which part of the app is in use, we will use another state variable called `currentPage`; it will be a string that can be either `modelSelection` or `conversation`. We add it to the `App.tsx` file.

```typescript
const [currentPage, setCurrentPage] = useState<
  'modelSelection' | 'conversation'
>('modelSelection'); // Navigation state
```
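
For orientation, one possible sketch of how this state gates the two screens; the component names here are hypothetical, since the actual markup lives inline in `App.tsx`:

```typescript
// Illustrative only: render one "page" or the other based on currentPage.
// SafeAreaView comes from 'react-native'; the two screen components are
// placeholders for the corresponding JSX in App.tsx.
return (
  <SafeAreaView style={styles.container}>
    {currentPage === 'modelSelection' && <ModelSelectionScreen />}
    {currentPage === 'conversation' && <ConversationScreen />}
  </SafeAreaView>
);
```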
For the CSS, we will use the same styles as in the [EdgeLLMBasic](https://github.com/MekkCyber/EdgeLLM/blob/main/EdgeLLMBasic/App.tsx) repo; you can copy the styles from there.

We will start by working on the model selection screen in the `App.tsx` file, adding a list of model formats (you need to do the necessary imports):
**Member** commented:

I feel like it's a bit hard to digest everything, all the more so because we can't test anything. We have to copy-paste everything and hope that it works at the end. Maybe create more paragraphs and allow users to see the results.

**@SunMarc** (Member) left a comment:

Misclick.

**@pcuenca** (Member) left a comment:

Very cool! I just did a quick pass, and agree with @SunMarc's comments.

I don't think it's necessary to show all the code blocks in the post. Instead, I'd maybe show a screenshot of the app, point readers to the repo, explain (high level) how everything works, and then call attention towards interesting portions of the code. One potential idea (only if it's easy) could be to identify an easy new feature that does not exist in the repo, and show how to go about creating it.


First, let's install the required packages. We aim to load models from the [Hugging Face Hub](https://huggingface.co/) and run them locally. To achieve this, we need to install:

- [`llama.rn`](https://github.com/mybigday/llama.rn): a binding for [`llama.cpp`](https://github.com/ggerganov/llama.cpp) tailored for React Native apps.
**Member** commented:

Maybe we can explain that we'll be using llama.cpp models in the intro to the post, and explain how to select interesting GGUF files from the Hub.

**MekkCyber** (Contributor, author) replied:

Done

Comment on lines 267 to 282
```typescript
const modelFormats = [
  {label: 'Llama-3.2-1B-Instruct'},
  {label: 'Qwen2-0.5B-Instruct'},
  {label: 'DeepSeek-R1-Distill-Qwen-1.5B'},
  {label: 'SmolLM2-1.7B-Instruct'},
];

const HF_TO_GGUF = {
  'Llama-3.2-1B-Instruct': 'bartowski/Llama-3.2-1B-Instruct-GGUF',
  'DeepSeek-R1-Distill-Qwen-1.5B':
    'bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF',
  'Qwen2-0.5B-Instruct': 'Qwen/Qwen2-0.5B-Instruct-GGUF',
  'SmolLM2-1.7B-Instruct': 'bartowski/SmolLM2-1.7B-Instruct-GGUF',
};
```
**Member** commented:

Why these? A preliminary section could be about:

- Useful archs, model sizes, and quants.
- Interesting examples (these ones).
- Additional options: how to search the Hub for additional models.

**MekkCyber** (Contributor, author) replied:

Done

**@Vaibhavs10** (Member) left a comment:

Lightweight review; sorry for dropping the ball on this earlier, I've been swamped.
IMO we need to contextualise this and make it a bit less intimidating, more like a follow-along tutorial. All the pieces are there; it's just about putting them together.

We should also put the big code snippets in collapsible sections or link them to the GH repo.

Essentially, find the absolute minimum lines of code you need to run the bare-bones app, walk readers through those, and showcase the most fleshed-out version via the GH repo, etc.

_blog.yml Outdated
@@ -5427,3 +5427,15 @@
tags:
- aws
- partnerships

- local: edgellm-reactnative-guide
**Member** commented:

Suggested change: replace `- local: edgellm-reactnative-guide` with `- local: LLM-inference-on-edge`

Maybe something like this? Makes it sound much better!

**MekkCyber** (Contributor, author) replied:

Done

_blog.yml Outdated
@@ -5427,3 +5427,15 @@
tags:
- aws
- partnerships

- local: edgellm-reactnative-guide
title: "Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development!"
**Member** commented:

Suggested change: replace the title `"Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development!"` with `"LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on an iPhone!"`

Or something along those lines

**MekkCyber** (Contributor, author) replied:

Yes, sounds good!

@@ -0,0 +1,726 @@
---
title: "Create Your Own LLM Chatbot App: A Fun and Easy Guide to React Native Development!"
**Member** commented:

Same as above suggestion re: title/slug.

- user: medmekk
---

Did you ever wonder how you can create a mobile app to chat with LLMs locally? Have you tried to understand the code in some open-source projects but found it too complex? Well, this blog is for you! Inspired by the great [Pocket Pal](https://github.com/a-ghorbani/pocketpal-ai) app, we will help you build a simple React Native app to chat with LLMs downloaded from the [**Hugging Face**](https://huggingface.co/) Hub; everything is private and runs on-device!
**Member** commented:

I'd start a bit differently and maybe first paint the picture around how LLMs are becoming smaller and smarter, to the point that they can run directly on your phone. You can use DeepSeek R1 Distill Qwen 1.5B as an example.

This would contextualise your blogpost well.

**MekkCyber** (Contributor, author) replied:

Done

- Is interested in privacy-focused AI applications that run completely offline

By the end of this guide, you'll have a working app to chat with your favorite small Hub models.

**Member** commented:

Right about here, I'd put a video of the app - this would be a nice hook!

**MekkCyber** (Contributor, author) replied:

Done

MekkCyber and others added 20 commits March 3, 2025 10:58
**@MekkCyber** (Contributor, author) commented:

Thank you all for the feedback, I think I addressed most of it 🤗 Feel free to re-check.

**@Vaibhavs10** (Member) left a comment:

Did a quick pass to check blog requirements! Please make sure to test the markdown on hf.co/new-blog so that it renders properly!

- local: llm-inference-on-edge
title: "LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on an iPhone!"
author: medmekk
thumbnail: /blog/assets/deepseek-r1-aws/thumbnail.png
**Member** commented:

Reminder to update both the date and the thumbnail

@@ -0,0 +1,1098 @@
---
title: "LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on an iPhone!"
thumbnail: /blog/assets/deepseek-r1-aws/thumbnail.png
**Member** commented:

to update too?

As LLMs continue to evolve, they are becoming smaller and smarter, enabling them to run directly on your phone. Take, for instance, DeepSeek R1 Distill Qwen 2.5 with 1.5 billion parameters: this model showcases how advanced AI can now fit in the palm of your hand!
In this blog, we will guide you through creating a mobile app that allows you to chat with these powerful models locally. If you've ever felt overwhelmed by the complexity of open-source projects, fear not! Inspired by the innovative [Pocket Pal](https://github.com/a-ghorbani/pocketpal-ai) app, we will help you build a straightforward React Native application that downloads LLMs from the [**Hugging Face**](https://huggingface.co/) Hub, ensuring everything remains private and runs on your device. We will utilize `llama.rn`, a binding for `llama.cpp`, to load GGUF files efficiently!

## Why Should You Follow This Tutorial?
**Member** commented:

It might be a good idea to link the codebase shortly after too, i.e. before the 0, 1, 2, 3, 4 points.

To find additional GGUF models on Hugging Face:

1. Visit [huggingface.co/models](https://huggingface.co/models)
2. Use the search filters:
**Member** commented:

You can just link it like this: https://huggingface.co/models?library=gguf

**@SunMarc** (Member) left a comment:

Thanks for iterating. This is much better!

Comment on lines +42 to +61
#### Legacy Quants (Q4_0, Q4_1, Q8_0)
- Basic, straightforward quantization methods
- Each block is stored with:
  - Quantized values (the compressed weights)
  - One (`_0`) or two (`_1`) scaling constants
- Fast but less efficient than newer methods, so they are not widely used anymore

#### K-Quants (Q3_K_S, Q5_K_M, ...)
- Introduced in this [PR](https://github.com/ggml-org/llama.cpp/pull/1684)
- Smarter bit allocation than legacy quants
- The K in "K-quants" refers to a mixed quantization format, meaning some layers get more bits for better accuracy.
- Suffixes like `_XS`, `_S`, or `_M` refer to specific mixes of quantization (smaller = more compression), for example:
  - Q3_K_S uses Q3_K for all tensors
  - Q3_K_M uses Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, and Q3_K for the rest.
  - Q3_K_L uses Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, and Q3_K for the rest.

#### I-Quants (IQ2_XXS, IQ3_S, ...)
- Still block-based quantization, but with some new features inspired by QuIP
- Smaller file sizes, but may be slower on some hardware
- Best for devices with strong compute power but limited memory
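
As a rough size sanity check (our back-of-envelope estimate, not from the original post): at about 4.8 bits per weight, roughly what Q4_K_M averages, a 1.5B-parameter model needs on the order of 1.5 × 10⁹ × 4.8 / 8 ≈ 0.9 GB, which matches the roughly 1 GB Q4 GGUF files you'll see for the 1.5B models on the Hub.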

```bash
npm i @react-native-community/cli
```
> **Note:** If you are prompted to install CocoaPods, it's not necessary if you are using a virtual device, as we are not going to be using Xcode.
**Member** commented:

Are you sure this is not needed?
