Rate limited #2397

Open
xXsheroXx opened this issue Mar 4, 2025 · 2 comments

Comments

@xXsheroXx

I'm getting "Rate limited on the OpenAI embeddings API, sleeping before retrying..." when trying to deploy to Azure. Is there any solution?

@pamelafox
Collaborator

You will typically see a few of those messages, but the script should eventually succeed in computing the embeddings after it pauses. One way to see fewer of those messages is to increase the TPM (tokens-per-minute) capacity on your embedding model deployment - we default to requesting 30K TPM, but you may have more quota available.
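For this sample, that capacity is usually set through an azd environment variable before (re)provisioning. The variable name and unit below are assumptions based on the repo's conventions (Azure OpenAI deployment capacity is expressed in units of 1K TPM); check `infra/main.parameters.json` for the exact name:

```shell
# Assumed variable name - verify against infra/main.parameters.json.
# Capacity is in units of 1K TPM, so 120 requests 120K tokens-per-minute.
azd env set AZURE_OPENAI_EMB_DEPLOYMENT_CAPACITY 120
azd up
```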

Or did none of your embedding calls succeed at all? If so, please share the full log.
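The "sleep then retry" behavior described above can be sketched in plain Python. This is an illustrative, stdlib-only stand-in, not the repo's actual code (the traceback in the next comment shows it really uses tenacity's `AsyncRetrying`); `create_embedding_batch`, `with_retries`, and all numbers here are hypothetical:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for openai.RateLimitError (HTTP 429)."""


def create_embedding_batch(state):
    # Simulates the embeddings API: fails twice with a 429, then succeeds.
    state["calls"] += 1
    if state["calls"] < 3:
        raise RateLimitError("Rate limit is exceeded. Try again later.")
    return [0.1, 0.2, 0.3]


def with_retries(fn, state, max_attempts=5, base_delay=0.01, max_delay=0.1):
    """Retry fn on RateLimitError with capped, jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(state)
        except RateLimitError:
            if attempt == max_attempts:
                raise  # give up, analogous to tenacity raising RetryError
            # Sleep longer after each failure, capped at max_delay, with jitter.
            time.sleep(min(base_delay * 2 ** attempt, max_delay) * random.random())


state = {"calls": 0}
embedding = with_retries(create_embedding_batch, state)
```

With the simulated API above, the first two attempts raise, the backoff sleeps in between, and the third attempt returns the embedding.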

@xXsheroXx
Author

xXsheroXx commented Mar 8, 2025

I'm getting:

[13:55:26] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[13:57:15] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[13:57:43] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[13:59:32] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:00:13] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:01:48] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:02:14] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:04:05] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:04:46] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:05:28] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:07:18] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:07:45] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:09:09] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:09:59] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:11:45] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:12:18] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:13:29] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:15:19] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:15:50] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
Traceback (most recent call last):
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 112, in create_embedding_batch
    emb_response = await client.embeddings.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\resources\embeddings.py", line 243, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1856, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1550, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1636, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1683, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1636, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1683, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1651, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Rate limit is exceeded. Try again in 60 seconds.'}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocs.py", line 439, in <module>
    loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\asyncio\base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocs.py", line 244, in main
    await strategy.run()
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\filestrategy.py", line 107, in run
    await search_manager.update_content(sections, blob_image_embeddings, url=file.url)
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\searchmanager.py", line 288, in update_content
    embeddings = await self.embeddings.create_embeddings(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 149, in create_embeddings
    return await self.create_embedding_batch(texts, dimensions_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 105, in create_embedding_batch
    async for attempt in AsyncRetrying(
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self.retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 153, in iter
    result = await action(retry_state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\_utils.py", line 99, in inner
    return call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\__init__.py", line 419, in exc_check
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x15558254310 state=finished raised RateLimitError>]
[14:18:30] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:20:54] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:23:23] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
[14:26:01] INFO Rate limited on the OpenAI embeddings API, sleeping before retrying... embeddings.py:63
Traceback (most recent call last):
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 112, in create_embedding_batch
    emb_response = await client.embeddings.create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\resources\embeddings.py", line 243, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1856, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1550, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1636, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1683, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1636, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1683, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\openai\_base_client.py", line 1651, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Rate limit is exceeded. Try again in 60 seconds.'}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocs.py", line 439, in <module>
    loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\asyncio\base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocs.py", line 244, in main
    await strategy.run()
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\filestrategy.py", line 107, in run
    await search_manager.update_content(sections, blob_image_embeddings, url=file.url)
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\searchmanager.py", line 288, in update_content
    embeddings = await self.embeddings.create_embeddings(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 149, in create_embeddings
    return await self.create_embedding_batch(texts, dimensions_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\app\backend\prepdocslib\embeddings.py", line 105, in create_embedding_batch
    async for attempt in AsyncRetrying(
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self.retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\asyncio\__init__.py", line 153, in iter
    result = await action(retry_state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\_utils.py", line 99, in inner
    return call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sherv\Desktop\openai-demo\.venv\Lib\site-packages\tenacity\__init__.py", line 419, in exc_check
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x274eb90add0 state=finished raised RateLimitError>]

Deploying services (azd deploy)

(x) Failed: Deploying service backend

ERROR: error executing step command 'deploy --all': failed deploying service 'backend': POST https://management.azure.com/subscriptions/c645d3fb-d531-43b4-b061-945f80bf073c/resourceGroups/rg-openai-demo-dev/providers/Microsoft.ContainerRegistry/registries/openaidemodevacrm67nlh27l373u/scheduleRun

RESPONSE 400: 400 Bad Request
ERROR CODE: TasksOperationsNotAllowed

{
  "error": {
    "code": "TasksOperationsNotAllowed",
    "message": "ACR Tasks requests for the registry openaidemodevacrm67nlh27l373u and c645d3fb-d531-43b4-b061-945f80bf073c are not permitted. Please file an Azure support request at http://aka.ms/azuresupport for assistance.",
    "target": "request"
  }
}

TraceID: 2cb858acbf3c2f1b5c2e2ad41471a5a0
