High memory usage when embedding large texts #222
There might be a few things at play here:
Could you give estimates for three things:
But the main problem I have is that after a batch is processed, the memory is not released; it only keeps growing until the process gets killed. I think the problem is in the ONNX session when it is used continuously without being recreated. When I use the model directly from Hugging Face, there is no memory growth and no problem.
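If the growth really is tied to a long-lived ONNX session, one possible (untested) mitigation is to rebuild the model object periodically so the session and its buffers are released. A minimal sketch, assuming the fastembed `TextEmbedding` API; the model name, recreation interval, and the `batches` placeholder are illustrative:

```python
import gc

from fastembed import TextEmbedding

MODEL_NAME = "intfloat/multilingual-e5-large"
RECREATE_EVERY = 100  # arbitrary interval; tune to your workload


def batches():
    # Stand-in for your real stream of text batches.
    while True:
        yield ["some long document text"] * 32


model = TextEmbedding(model_name=MODEL_NAME)
for i, batch in enumerate(batches(), start=1):
    vectors = list(model.embed(batch))
    # ... hand `vectors` off to your sink (e.g. a vector DB upsert) here ...
    if i % RECREATE_EVERY == 0:
        # Drop the model (and with it the ONNX session) and rebuild it,
        # so any memory the session has accumulated can be reclaimed.
        del model
        gc.collect()
        model = TextEmbedding(model_name=MODEL_NAME)
```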
Thanks for sharing specifics @Barney241 — very helpful.
This'd be among my areas to investigate as well. Thanks for sharing. I'll look into this as energy permits! I'm leaving this issue open and marking it as a bug because of the impact it has on how usable FastEmbed is.
Hi, any updates on this? We've experienced the same issue.
Hi @johnreyev, could you tell us the version of fastembed you're using and maybe share a reproducible code snippet?
Hi @joein, you can try this:
For the device spec, I'm using a 14-inch MacBook Pro with an M3 Pro.
@johnreyev
Try consuming the embeddings lazily rather than collecting them all at once: you are embedding the documents in batches, but saving every vector into a single variable can explode your memory once you increase the document length.
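A minimal sketch of the two patterns, assuming the current fastembed `TextEmbedding` API (the model name, corpus, and batch size here are illustrative):

```python
from fastembed import TextEmbedding

model = TextEmbedding(model_name="intfloat/multilingual-e5-large")
documents = ["some document text"] * 10_000  # placeholder corpus

# Memory-friendly: embed() returns a generator, so consume each vector
# as it arrives and let it be freed once you have stored it.
for embedding in model.embed(documents, batch_size=256):
    ...  # process / persist the single vector here

# Memory-hungry: list() keeps every vector resident at once, so usage
# grows with corpus size and document length.
all_embeddings = list(model.embed(documents, batch_size=256))
```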
Hi, I am experiencing high memory usage, which caused my pod to be killed for exceeding its limits. After some experiments I found it is related to the length of the texts I am trying to embed.
When using models such as e5 large or paraphrase base v2, the program starts at about 1.5 GB of RAM usage, which is expected for these models, but after some iterations the process uses 16 GB of virtual memory and 6 GB of RAM, which is a lot.
I tried other, smaller models as well, and it happens for them too, just on a smaller scale. My intuition was that I was simply using texts that were too long, so I cut them all down to at most 100 characters. That slowed the growth in RAM usage, but when not using a constant batch of text it still kept growing until the program was eventually killed.
For more context, I am building a vector API that returns embeddings for texts, so the model session is active from the start.
To reproduce, you can use this array of texts with the e5 large model in the default example, running in an infinite loop.
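A sketch of such a reproduction, with a placeholder list standing in for the original array of texts (which is not included above) and `psutil` used only to print resident memory each iteration:

```python
import os

import psutil  # third-party; used only to report resident memory
from fastembed import TextEmbedding

# Placeholder: substitute the original array of long texts here.
texts = ["a fairly long passage of text " * 50] * 256

model = TextEmbedding(model_name="intfloat/multilingual-e5-large")
process = psutil.Process(os.getpid())

while True:  # infinite loop, as in the report
    _ = list(model.embed(texts, batch_size=32))
    print(f"RSS: {process.memory_info().rss / 1024 ** 2:.0f} MiB")
```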