Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cut answers are still present in the history #396

Open
Lovenoreus opened this issue Jan 21, 2025 · 2 comments
Open

Cut answers are still present in the history #396

Lovenoreus opened this issue Jan 21, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@Lovenoreus
Copy link

This contains "container voice recording 2" with clear audio

This show how normal sentences said by the user are broken up into multiple messages.

It contains examples of sentences cutting off as well

container_call_2.mp4

container_log_2_splitting_of_messages.txt

Transcript.written.before.the.call.that.i.will.follow.t.txt

Image
Image

@clemlesne
Copy link
Collaborator

clemlesne commented Jan 21, 2025

Sentences are cuted into multiple messages because the Assistant started answering in between.

From my perspective, the issue is Why Assistant didn't spoke between 00:34 and 00:52? It appears that the issue happened once in the conversation, at the first message. The following one seems fine.

Each time that the speech timeout is triggered (250 ms by default), voice is stored then LLM is called, and the answer is forwarded piece by piece to the TTS. This process can be interrupted at any time, but it isn't rollbacked. So, if the LLM is interrupted, for example, after the LLM response but before the TTS generation, it'll be stored into the history.

A mitigation would be to implement a "transaction", technically a rollback mechanism on "answer aborted" scenario. It would remove this fractioned history with never heard responses. Do you this this would fit your need?

Step by step:

  1. The app did well generated and sent the text to the TTS: Hello! Thank you for calling Contoso. [...].
2025-01-21T15:13:23.725786Z [info     ] Playing TTS: Hello!            call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message= call.phone_number=+4570726778
2025-01-21T15:13:23.873943Z [info     ] Playing TTS: Thank you for calling Contoso. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message= call.phone_number=+4570726778
2025-01-21T15:13:24.122839Z [info     ] Playing TTS: I'm Amélie, your virtual assistant. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message= call.phone_number=+4570726778
2025-01-21T15:13:24.246056Z [info     ] Playing TTS: How can I assist you today? call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message= call.phone_number=+4570726778
  1. You continued to spoke, so the app stoped the generation before even starting to speak but after generated and stored the sentence
2025-01-21T15:13:33.390035Z [info     ] Stoping TTS after 250 ms       call.id=af0f6372-0532-44fd-89cc-66235aaa49a3 call.phone_number=+4570726778
  1. You stopped talking for long enough and the bot stored your message
2025-01-21T15:13:59.353412Z [info     ] Voice stored: Claim about me driving off the road on my motorbike. call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.phone_number=+4570726778
  1. It answered
2025-01-21T15:14:00.170562Z [info     ] Playing TTS: I understand, you had an accident while driving your motorbike off the road. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Claim about me driving off the road on my motorbike. call.phone_number=+4570726778
2025-01-21T15:14:00.327616Z [info     ] Playing TTS: Let's make sure we have all the details for your claim. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Claim about me driving off the road on my motorbike. call.phone_number=+4570726778
2025-01-21T15:14:00.494044Z [info     ] Playing TTS: First, could you please tell me the date and time of the incident? call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Claim about me driving off the road on my motorbike. call.phone_number=+4570726778
2025-01-21T15:14:00.619422Z [info     ] Playing TTS: And, were there any injuries? call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Claim about me driving off the road on my motorbike. call.phone_number=+4570726778
  1. You talked
2025-01-21T15:14:15.710649Z [info     ] Stoping TTS after 250 ms       call.id=812d0408-b07d-4e39-9a64-536dab9b2ea1 call.phone_number=+4570726774
  1. Voice stored
2025-01-21T15:14:16.670697Z [info     ] Voice stored: Hello. Add.      call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.phone_number=+4570726778
  1. It answered
2025-01-21T15:14:17.172073Z [info     ] Playing TTS: Hello!            call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Hello. Add. call.phone_number=+4570726778
2025-01-21T15:14:17.333241Z [info     ] Playing TTS: Let's continue with your claim. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=Hello. Add. call.phone_number=+4570726778
  1. You talked
2025-01-21T15:14:17.634999Z [info     ] Stoping TTS after 250 ms       call.id=812d0408-b07d-4e39-9a64-536dab9b2ea1 call.phone_number=+4570726774
  1. Voice stored
2025-01-21T15:14:20.313089Z [info     ] Voice stored: yes this was yesterday call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.phone_number=+4570726778
  1. You talked
2025-01-21T15:14:21.072215Z [info     ] Stoping TTS after 250 ms       call.id=812d0408-b07d-4e39-9a64-536dab9b2ea1 call.phone_number=+4570726774
  1. Answer called a tool
2025-01-21T15:14:21.219311Z [info     ] Playing TTS: I am updating the claim with the date of the incident. call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=yes this was yesterday call.phone_number=+4570726778 tool.args={"updates": [{"field": "incident_datetime", "value": "2025-01-20"}], "customer_response": "I am updating the claim with the date of the incident."} tool.name=updated_claim
2025-01-21T15:14:21.223099Z [info     ] Executed function updated_claim ({'updates': [{'field': 'incident_datetime', 'value': '2025-01-20'}], 'customer_response': 'I am updating the claim with the date of the incident.'}): # Updated fields
- U... value "2025-01-20". call.channel=voice call.id=18167bc5-df67-454a-b32b-e6eeef6f626c call.message=yes this was yesterday call.phone_number=+4570726778 tool.args={"updates": [{"field": "incident_datetime", "value": "2025-01-20"}], "customer_response": "I am updating the claim with the date of the incident."} tool.name=updated_claim

@clemlesne clemlesne added the bug Something isn't working label Jan 21, 2025
@Lovenoreus
Copy link
Author

Yes a rollback mechanism the way you describe it sounds reasonable in my ears when you explain it like that

I thought it would be easier and sufficient just still having the option to interrupt the bot but even then also saving the full sentences into the conversation history. You could even mark what parts were pronounced to the user and what part of the ending sentence was used by the application but cut off from the audio.

@clemlesne clemlesne changed the title Sentences are broken up into multiple messages, container call 2 Cutted answers are still prrsent in the history Jan 21, 2025
@Lovenoreus Lovenoreus changed the title Cutted answers are still prrsent in the history Cut answers are still present in the history Jan 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants