[Bug]: Saved conversation structure collapse #4761

Closed
1 task done
quacrobat opened this issue Nov 20, 2024 · 8 comments · Fixed by #4778
Labels
🐛 bug Something isn't working

Comments

@quacrobat

What happened?

I had a long conversation saved last week. It had occasional branching where I edited my questions, never more than 3 or 4 branches wide. Now it shows as if the entire conversation has one message at the trunk and 42 branches on the same level. When I click < and > to navigate between these branches, it actually changes the root message, which is sometimes mine and sometimes the LLM's answer!

I tried both a local instance running on Docker and one on Hugging Face. They both use the same DB, so I guess there is corruption at the DB level, WHICH IS VERY SCARY.

image

Steps to Reproduce

I noticed this today when I tried to find a message in that conversation. I searched for a term in the search bar, and it listed the relevant messages and the correct conversation. When I clicked on it, I found that the normal tree structure of the conversation had totally collapsed into a single trunk with 42 branches.

What browsers are you seeing the problem on?

Chrome

Relevant log output

The log shows 200 messages loaded, but I don't see them except by navigating with the < 40/42 > prompt paginator.

Screenshots

image

Code of Conduct

  • I agree to follow this project's Code of Conduct
@quacrobat quacrobat added the 🐛 bug Something isn't working label Nov 20, 2024
@quacrobat
Author

PS: I updated my local instance to the latest version a few days ago, but the Hugging Face one hasn't been updated (unless it updates automatically when I rebuild). They both present the same issue.
After updating the local/Docker version to the latest version, I still see the same issue.

@quacrobat
Author

Here's a sketch of the structure I see. I hope it makes sense!

WhatsApp Image 2024-11-19 at 22 16 13

@nickmahdavi

Also having this issue. I'm not sure if it's the number of messages or the chat length, but I noticed it kicking in with one of my chats every time it hit 20 messages or so. The collapse seems to happen if I export or fork the chat, but not in the original thread. What does tend to happen in the original thread (past another length threshold) is #3813, which is likely related.

I also don't see any corruption in the actual JSON, for example when I export the original chat.

@xyqyear
Contributor

xyqyear commented Nov 22, 2024

I'm also experiencing the same issue. It seems that if you fork when there are more than 16 messages (user + assistant), the newly forked conversation will be messed up.

I've come up with a pretty consistent way to reproduce:

  1. keep the conversation going for 9 rounds. The easiest way is to tell the model in the system prompt to just repeat whatever the user sends.
    image
  2. fork the chat at the last response (or the 9th user query).
  3. the newly created conversation will be messed up.

There would be no problem if you fork at any point before that.

Another thing I found is that if the original conversation has branches, the original conversation might get corrupted as well. Say that after 8 rounds of conversation I change the 8th query to another message, which creates a branch with a new response. The total number of messages in this conversation then exceeds 16. In this case, if I create a fork at any point, the ORIGINAL conversation will be messed up as well. If the newly created conversation contains more than 16 messages, it too will be messed up.

So here is my summary:

  1. if there are no branches in the original conversation, the original conversation is left alone when forking.
  2. if the newly forked conversation has more than 16 messages, that new conversation will be corrupted.
  3. if the original conversation has any branching and has more than 16 messages, the ORIGINAL conversation will be messed up when forking. Rule 2 applies to the new conversation too.

Edit:
After further investigation, I found that when a conversation has more than 16 messages and any branching, refreshing the page is enough to mess it up.
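
The observations above can be written down as a small predicate. This is only a sketch of the reported behaviour, not code from the project: the function name `predictForkOutcome`, its parameter names, and the constant `OBSERVED_THRESHOLD` are made up for illustration, and the threshold of 16 is the empirically observed number from this thread.

```javascript
// Empirical threshold reported in this thread (not a documented constant).
const OBSERVED_THRESHOLD = 16;

// Hypothetical helper that encodes the three observed rules.
function predictForkOutcome({ originalCount, originalHasBranches, forkedCount }) {
  return {
    // Rules 1 and 3: the original only breaks if it has branches AND is over the threshold.
    originalCorrupted: originalHasBranches && originalCount > OBSERVED_THRESHOLD,
    // Rule 2: the fork breaks whenever it is over the threshold.
    forkCorrupted: forkedCount > OBSERVED_THRESHOLD,
  };
}

// The reproduction above: 9 rounds (18 messages), no branches, fork at the last response.
const outcome = predictForkOutcome({
  originalCount: 18,
  originalHasBranches: false,
  forkedCount: 18,
});
console.log(outcome); // { originalCorrupted: false, forkCorrupted: true }
```

Forking after 8 rounds gives a fork of exactly 16 messages, which stays under the threshold and matches the "no problem if you fork at any point before that" observation.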

@xyqyear
Contributor

xyqyear commented Nov 22, 2024

Continuing the conversation at #4772 (comment)

After some investigation, I found that the message list is sorted by createdAt in

async function getMessages(filter, select) {
  try {
    if (select) {
      return await Message.find(filter).select(select).sort({ createdAt: 1 }).lean();
    }
    return await Message.find(filter).sort({ createdAt: 1 }).lean();
    // (the quoted snippet ends here; the rest of the function is not shown)

But after forking, all the newly created messages have their createdAt field set to the current time (is this intentional? @danny-avila), so the ordering of the message list when querying is unpredictable. As a result, the tree built in buildTree.ts is also incorrect.
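
To illustrate why the message order matters, here is a minimal sketch of a tree builder that makes a single pass over the list and assumes each parent arrives before its children. It is not the project's buildTree.ts. The function name `buildTreeSinglePass` and the field names `messageId` and `parentMessageId` are assumptions for illustration.

```javascript
// Hypothetical single-pass builder: assumes every parent appears before its children.
function buildTreeSinglePass(messages) {
  const byId = new Map();
  const roots = [];
  for (const msg of messages) {
    const node = { ...msg, children: [] };
    byId.set(node.messageId, node);
    const parent = byId.get(node.parentMessageId);
    if (parent) {
      parent.children.push(node);
    } else {
      // No parent, or the parent has not been seen yet: the node becomes a root.
      roots.push(node);
    }
  }
  return roots;
}

// A linear chain a -> b -> c -> d.
const chain = [
  { messageId: 'a', parentMessageId: null },
  { messageId: 'b', parentMessageId: 'a' },
  { messageId: 'c', parentMessageId: 'b' },
  { messageId: 'd', parentMessageId: 'c' },
];

const inOrder = buildTreeSinglePass(chain);
const reversed = buildTreeSinglePass([...chain].reverse());

console.log(inOrder.length);  // 1
console.log(reversed.length); // 4
```

With the list in the wrong order, every message whose parent has not been seen yet ends up at the root level. That would look a lot like the "one trunk with 42 branches" symptom, although whether the real builder fails in exactly this way is not confirmed here.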

My guess is that MongoDB only guarantees insertion order when querying a small number of objects, hence the magic number 16.
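
Whatever the exact reason for the threshold, one way to make tree building robust is to stop depending on the query order at all. The sketch below registers every message first and links parents and children in a second pass. It is a hypothetical alternative, not the fix that was merged. The name `buildTreeTwoPass` and the fields `messageId` and `parentMessageId` are assumptions.

```javascript
// Hypothetical two-pass builder: the result does not depend on the input order.
function buildTreeTwoPass(messages) {
  const byId = new Map();
  // Pass 1: create a node for every message so that all parents are known.
  for (const msg of messages) {
    byId.set(msg.messageId, { ...msg, children: [] });
  }
  // Pass 2: attach each node to its parent, or treat it as a root.
  const roots = [];
  for (const node of byId.values()) {
    const parent = byId.get(node.parentMessageId);
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

// The chain a -> b -> c -> d, deliberately out of order.
const shuffled = [
  { messageId: 'd', parentMessageId: 'c' },
  { messageId: 'b', parentMessageId: 'a' },
  { messageId: 'a', parentMessageId: null },
  { messageId: 'c', parentMessageId: 'b' },
];

const roots = buildTreeTwoPass(shuffled);
console.log(roots.length);       // 1
console.log(roots[0].messageId); // a
```

Sibling order still follows the input order in this sketch, so a real implementation would probably also sort each `children` array by a stable key.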

@quacrobat
Copy link
Author

Fantastic work @xyqyear - and I realize I had a similar bug in a different project due to createdAt in mongodb...
@danny-avila hopefully this means this is a front-end issue only, and once fixed the corrupted conversations will appear correctly again?

@danny-avila
Owner

> Fantastic work @xyqyear - and I realize I had a similar bug in a different project due to createdAt in mongodb... @danny-avila hopefully this mean this is a front-end issue only and once fixed the corrupted conversations will appear correctly again?

Unfortunately it's a backend issue. The folding conversations can be fixed, but it would be a lot of overhead to fix the issue retroactively. Maybe I can make a script to fix specific conversations?
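
As a rough idea of what such a repair script could do, the sketch below rewrites createdAt for one conversation so that every parent is strictly older than its children. It works on plain objects only and does not touch a database. The function name `reassignTimestamps` and the fields `messageId` and `parentMessageId` are assumptions, and the approach discards the original timestamps, which may not be acceptable in practice.

```javascript
// Hypothetical repair helper: returns copies of the messages with createdAt
// rewritten so that every parent is strictly older than its children.
function reassignTimestamps(messages, startMs = Date.now(), stepMs = 1) {
  const ids = new Set(messages.map((m) => m.messageId));
  const childrenOf = new Map();
  const queue = [];
  for (const msg of messages) {
    if (ids.has(msg.parentMessageId)) {
      if (!childrenOf.has(msg.parentMessageId)) {
        childrenOf.set(msg.parentMessageId, []);
      }
      childrenOf.get(msg.parentMessageId).push(msg);
    } else {
      queue.push(msg); // no known parent: treat as a root
    }
  }
  const repaired = [];
  let t = startMs;
  // Breadth-first walk: a parent is always stamped before its children.
  for (let i = 0; i < queue.length; i++) {
    const msg = queue[i];
    repaired.push({ ...msg, createdAt: new Date(t) });
    t += stepMs;
    for (const child of childrenOf.get(msg.messageId) ?? []) {
      queue.push(child);
    }
  }
  return repaired;
}

// A forked chain where every message received the same timestamp.
const now = new Date();
const broken = [
  { messageId: 'c', parentMessageId: 'b', createdAt: now },
  { messageId: 'a', parentMessageId: null, createdAt: now },
  { messageId: 'd', parentMessageId: 'c', createdAt: now },
  { messageId: 'b', parentMessageId: 'a', createdAt: now },
];

const repaired = reassignTimestamps(broken, now.getTime());
console.log(repaired.map((m) => m.messageId).join(' ')); // a b c d
```

After this, sorting by createdAt returns parents before children again. Messages that are part of a parent cycle are never reached by the walk and would be dropped, so a real script would need to detect and report them.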

@quacrobat
Author

> Maybe I can make a script to fix specific conversations?

yes please! 🙏
