Completed traces locked/frozen in pending status. Option to delete individual traces? #453
I just started using LangSmith, and it's awesome, but certain traces are locking up in "pending" status even after they complete successfully. It's happening on runs with higher concurrency parameters/more nested runs. Is this something you've run into before?
Also, there doesn't seem to be an option to delete individual traces from a project, so to clear these pending "lock ups" I have to delete and recreate the whole project. Could deleting traces maybe be a future feature request?
Comments
What version of langsmith are you using? (And langchain, if you're using it.)
@hinthornw I haven't updated anything since last week, I think. Current versions are langchain 0.1.5 and langsmith 0.0.87. Another thing I noticed: if I did a keyboard interrupt on a run, it also froze in pending status. I will do an update on Monday.
We fixed a couple of issues with pending runs in more recent versions. Could you let me know if the issue persists after upgrading?
@hinthornw I'm getting pip dependency errors with langchain when trying to update langsmith to 0.1.2, or any version above 0.0.87. I updated langchain to 0.1.7, then had to downgrade it to 0.1.6 and downgrade langchain-community to 0.0.19 due to the current pwd import issue with the PebbloSafeLoader in langchain_community.document_loaders (langchain-ai/langchain#17514). Then when I try to update langsmith I get this:
My code still works fine with langsmith 0.1.2 though, and I have not had any issues with pending runs yet. I just ran the same script that locked up in pending on Friday with the same concurrency parameters, and the run completed successfully. I did get a new error while running that script, though:
This error popped up twice during a processing step that iterates through a ton of text and makes ~30 concurrent calls to OpenAI on each iteration. The traces from the nested runs were logged successfully though, so maybe everything's all good? Thank you for the help.
I am having the same issue. Several runs and traces are frozen in "pending" status. The status hasn't changed even after a couple of days. Here's the version info:
And,
@vishal-git This is because the span end never made it to the server. The runs will never be marked finished unless they are patched with an end time.
Okay, that makes sense. Thank you for replying. Can you please advise how to set an end time to avoid this situation? This is happening way too often, and we are stuck with so many traces (and runs) in 'pending' mode.
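Not from this thread, but for anyone stuck here: a minimal sketch of how pending runs could be closed out manually with the Python client, assuming `Client.update_run` accepts an `end_time` and that runs can be filtered by status. The filter string and project name below are assumptions, not something confirmed by the maintainers.

```python
from datetime import datetime, timezone

from langsmith import Client

client = Client()  # reads LANGCHAIN_API_KEY / LANGCHAIN_ENDPOINT from the environment

# The status filter string is an assumption based on LangSmith's run-filter syntax.
pending_runs = client.list_runs(
    project_name="my-project",  # hypothetical project name
    filter='eq(status, "pending")',
)

for run in pending_runs:
    # Patch each stuck run with an end time so it stops showing up as pending.
    client.update_run(run.id, end_time=datetime.now(timezone.utc))
```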
I'm running into a similar issue as well, even using the latest version of the Node.js langsmith package (0.1.8).
If it's helpful, I'm also seeing some of these logs (I'm not sure whether they can be safely ignored or not):
Hmm, I'll forward this to our JS folks; I'm less familiar with some of the corner cases in Node.
@hinthornw Thanks for checking back! For me, this has since resolved 👌
Though, if it's helpful, I'm still seeing the logs mentioned above.
I am getting a similar pending RunnableSequence issue with langsmith 0.1.38 and langchain 0.1.13, but this time with this error:
Also, calculating start_time on langsmith.Client() looks deprecated.
I have a similar issue: Failed to batch ingest runs: LangSmithConnectionError('Connection error caused failure to post https://api.smith.langchain.com/runs/batch in LangSmith API. Please confirm your LANGCHAIN_ENDPOINT. ConnectionError(MaxRetryError("HTTPSConnectionPool(host='api.smith.langchain.com', port=443): Max retries exceeded with url: /runs/batch (Caused by ProtocolError('Connection aborted.', timeout('The write operation timed out')))"))')
I have the same issue and have not been able to solve it yet.
I have a similar issue:
Thank you all for your patience. @sergiovadyen and @thmedata, do you have a code snippet I can use to reproduce your 422s?
@hinthornw By just importing langsmith and calling:
The error happened after I upgraded the langsmith package.
@athmedata That sounds like an unrelated error, probably related to API keys? [edited] OK, interesting. So it's the same code with different langsmith versions, and now it's getting a 422.
@hinthornw
This is still happening for us, but only after upgrading to 0.2.6 from 0.0.333: Failed to batch ingest runs: LangSmithError('Failed to POST https://api.smith.langchain.com/runs/batch in LangSmith API. HTTPError('422 Client Error: unknown for url: https://api.smith.langchain.com/runs/batch', '{"detail":"start_time must be an ISO 8601 timestamp"}')'). Same API keys, etc. LangSmith version 0.1.82.
@RobertCorwin-AustinAI Could you share a code snippet to help us debug? Pending runs occur when the run patch event doesn't make it to the server. The reason for that is usually connectivity- or rate-limiting-related, but in this case it seems like some issue with how the timestamp is being passed to the client. There's clearly something I need to fix or clarify; I'm having a hard time reproducing this particular case though, since all the timestamps we create in the lib use
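As an aside, here is a hedged sketch of what explicitly passing timezone-aware timestamps might look like with the Python client, in case it helps anyone comparing against their own code. The run and project names are made up, and `create_run`/`update_run` accepting `start_time`/`end_time` keyword arguments here is my assumption rather than something confirmed in this thread.

```python
import uuid
from datetime import datetime, timezone

from langsmith import Client

# Sketch only: create and close a run with explicit timezone-aware datetimes so the
# serialized values are valid ISO 8601 (e.g. "2024-07-01T12:00:00+00:00") rather than
# naive datetimes or preformatted strings.
client = Client()
run_id = uuid.uuid4()

client.create_run(
    id=run_id,
    name="debug-run",                       # hypothetical run name
    run_type="chain",
    inputs={"question": "hello"},
    project_name="my-project",              # hypothetical project name
    start_time=datetime.now(timezone.utc),  # tz-aware start time (assumed kwarg)
)

client.update_run(
    run_id,
    outputs={"answer": "hi"},
    end_time=datetime.now(timezone.utc),    # tz-aware end time
)
```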
I know you'll hate hearing this, but it appears to happen at random... although we're using an agent and also a chain in a tool. It doesn't seem to happen for the "AgentExecutor", but it does happen in the "RunnableSequence", which involves calling parallel search functions via RunnableParallel. Maybe the parallelism has something to do with it? Is there some way we can monitor the actual calls to the LangSmith API? It seems to me that might help a lot. I actually tried to do that using a packet sniffer on my network card, but it was taking too much time. Is there some way to log the actual calls in a verbose debug mode or something? Thanks.
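On logging the actual API calls: I'm not aware of a LangSmith-specific verbose flag, but as a rough sketch, under the assumption that the Python client sends its requests through requests/urllib3, standard-library HTTP debug logging can surface every call to api.smith.langchain.com without a packet sniffer:

```python
import http.client
import logging

# Rough sketch: print low-level HTTP request lines and headers from http.client,
# which requests/urllib3 use under the hood. Note this affects all HTTP traffic in
# the process, not just calls to the LangSmith API.
http.client.HTTPConnection.debuglevel = 1

# Also emit urllib3's own debug logs (connection setup, retries, response status).
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)
logging.getLogger("urllib3").propagate = True
```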
Going to close as stale and likely overlapping with #808 |