
Rate Limiting with Redis #83

Merged
merged 22 commits into from
Jan 30, 2025

Conversation

fredygerman
Collaborator

Description

This PR adds a rate-limiting system using Redis to manage user-specific and global limits.

Key updates:

  • Redis-backed rate limiting for scalability.
  • Scheduler to reset rate limits periodically.
  • CLI command to manually reset rate limits in Redis.
  • Token usage tracking for daily limits.
  • Added Redis to docker-compose for local development.
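
The counter-plus-periodic-reset design listed above can be sketched as follows. This is an illustration only: the class and key names (`RateLimiter`, `rate_limit:user:<wa_id>`) are hypothetical, and a plain dict stands in for Redis so the sketch runs without a server; the actual service would use redis-py `INCR`/`DELETE` calls against a live instance.

```python
# Illustrative sketch only -- names are hypothetical, and a dict stands in
# for Redis so this runs without a server.

class RateLimiter:
    def __init__(self, user_limit: int = 50, global_limit: int = 1000):
        self.store: dict[str, int] = {}  # stand-in for Redis counters
        self.user_limit = user_limit
        self.global_limit = global_limit

    def allow(self, wa_id: str) -> bool:
        """Count one message and report whether it is within both limits."""
        user_key = f"rate_limit:user:{wa_id}"  # per-user counter
        self.store[user_key] = self.store.get(user_key, 0) + 1
        self.store["rate_limit:global"] = self.store.get("rate_limit:global", 0) + 1
        return (self.store[user_key] <= self.user_limit
                and self.store["rate_limit:global"] <= self.global_limit)

    def reset(self) -> None:
        """What the scheduler (or the manual CLI command) would do periodically."""
        self.store.clear()

limiter = RateLimiter(user_limit=2)
print([limiter.allow("123") for _ in range(3)])  # [True, True, False]
limiter.reset()
print(limiter.allow("123"))  # True again after the scheduled reset
```

Checking both the per-user and the global counter in one call is what makes the "user-specific and global limits" from the description work together.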

@fredygerman fredygerman added bug Something isn't working enhancement New feature or request labels Jan 28, 2025
@fredygerman fredygerman requested a review from jurmy24 January 28, 2025 03:47
@fredygerman fredygerman self-assigned this Jan 28, 2025
@jurmy24 jurmy24 requested a review from alvaro-mazcu January 28, 2025 20:18
@jurmy24 jurmy24 added this to the Release v0.1.0 milestone Jan 28, 2025
Collaborator

@alvaro-mazcu alvaro-mazcu left a comment


Great PR! I've left some comments for improving the code. The discussion can also be centralised in Discord.

Review comments (since resolved) were left on:

  • app/cli.py (2)
  • app/config.py (1)
  • app/scheduler.py (2)
  • app/services/llm_service.py (2)
  • app/services/rate_limit_service.py (2)
) -> JSONResponse:
logger.debug(f"Responding with rate-limit message. message_key={message_key}")
user = await db.get_or_create_user(wa_id=phone_number)
assert user.id is not None
Collaborator

I'm skeptical about using asserts in production code. I think it's better to add an if statement with the corresponding handling (raising an exception or reporting the error).
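
The suggested change could look something like this (a hedged sketch, not the actual fix: the `User` dataclass and `require_user_id` helper are invented for illustration):

```python
# Illustrative only: replacing an assert with explicit error handling.
# The User type and helper name here are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    id: Optional[int]
    wa_id: str

def require_user_id(user: User) -> int:
    # An assert would be stripped under `python -O`; an explicit
    # check with a raised exception always runs.
    if user.id is None:
        raise ValueError(f"User {user.wa_id} has no database id")
    return user.id

print(require_user_id(User(id=7, wa_id="1234")))  # 7
```

This also gives callers a concrete exception type to catch or report, rather than a bare `AssertionError`.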

Member

Yeah, I agree. I can modify this later today or tomorrow morning. We started using them because we increased our type-safety requirements in our IDE from none to basic.

@jurmy24
Member

jurmy24 commented Jan 29, 2025

I made quite a few modifications to make this all work with Render in production as well, and set it up so that we use Redis key TTLs instead of a cron job. As a result, I removed Redis from the docker-compose file.

To run it locally, you can run the official Redis Docker image on your machine and set the environment to either staging or production in your .env file to enable rate limiting.

Sorry for doing it so hastily, but I want to make a lot of progress before I go to Paris.
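
The TTL-based approach described here can be sketched like this. It is illustrative only: `FakeRedis` mimics just the two Redis commands involved (`INCR` and `EXPIRE`) so the example runs without a server, and the function and key names are assumptions rather than the actual code.

```python
# Sketch of TTL-based rate limiting: instead of a scheduler resetting
# counters, the counter key simply expires on its own. FakeRedis is a
# minimal in-memory stand-in; the real code would use redis-py.
import time

class FakeRedis:
    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def incr(self, key):
        value, expires_at = self._data.get(key, (0, None))
        if expires_at is not None and time.time() >= expires_at:
            value, expires_at = 0, None  # key expired: start over
        value += 1
        self._data[key] = (value, expires_at)
        return value

    def expire(self, key, seconds):
        if key in self._data:
            value, _ = self._data[key]
            self._data[key] = (value, time.time() + seconds)

def is_rate_limited(r, wa_id, limit=10, window_seconds=86400):
    """True once the user exceeds `limit` hits inside the TTL window."""
    key = f"rate_limit:user:{wa_id}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)  # the window starts on the first hit
    return count > limit

r = FakeRedis()
print([is_rate_limited(r, "1234", limit=3) for _ in range(5)])
# [False, False, False, True, True]
```

Letting Redis expire the key removes the cron job and any risk of the scheduler silently not running in production.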

@jurmy24 jurmy24 requested a review from alvaro-mazcu January 29, 2025 15:44
@jurmy24 jurmy24 removed their request for review January 29, 2025 20:42
@jurmy24
Member

jurmy24 commented Jan 29, 2025

@alvaro-mazcu I implemented your comments and am now just looking for an approval.

I'll update the assert statements to use more conventional ifs with error handling across the whole codebase in a future PR.

And @fredygerman, tomorrow morning I'll need your help solving the Flow error we're getting (I'm quite confident it isn't our code; it's probably the environment variables I might have written incorrectly).

Again, happy birthday @fredygerman !!!

@jurmy24
Member

jurmy24 commented Jan 29, 2025

Though I must admit that the current rate-limiting design only prevents a user from overusing our chat in the message-count or LLM-token sense. Even if a user sends more messages than their limit, we still respond with a default rate-limit message, so if a user keeps sending messages and we keep responding, we'll still get overloaded.

Because WhatsApp resends messages if we don't respond with a 200 OK, I implemented the limiting neither as middleware nor as a decorator. Take a look at main.py if you think you have a workaround that would make this even safer.
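
The shape of that design is roughly the following (a hypothetical sketch, not the actual main.py: `handle_incoming_message`, `CountLimiter`, and the message text are all invented for illustration):

```python
# Sketch of the always-return-200 design: the handler acknowledges every
# webhook delivery so WhatsApp does not retry, and only swaps the LLM
# reply for a canned rate-limit message. All names here are hypothetical.

RATE_LIMIT_MESSAGE = "You have reached your daily message limit. Please try again tomorrow."

class CountLimiter:
    def __init__(self, limit: int):
        self.limit, self.counts = limit, {}

    def allow(self, wa_id: str) -> bool:
        self.counts[wa_id] = self.counts.get(wa_id, 0) + 1
        return self.counts[wa_id] <= self.limit

def handle_incoming_message(wa_id: str, text: str, limiter) -> tuple[int, str]:
    if not limiter.allow(wa_id):
        # Still HTTP 200: a non-2xx status would make WhatsApp resend the
        # message, which is why this check lives inside the handler rather
        # than in middleware or a decorator.
        return 200, RATE_LIMIT_MESSAGE
    reply = f"(LLM reply to: {text})"  # placeholder for the real LLM call
    return 200, reply

limiter = CountLimiter(limit=1)
print(handle_incoming_message("1234", "hi", limiter))   # (200, "(LLM reply to: hi)")
print(handle_incoming_message("1234", "more", limiter)) # (200, RATE_LIMIT_MESSAGE)
```

As the comment above notes, this still costs one outbound message per over-limit inbound message, which is the residual overload risk being discussed.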

@jurmy24 jurmy24 merged commit 795b379 into development Jan 30, 2025
4 checks passed
@jurmy24 jurmy24 deleted the feature/rate-limiting branch January 30, 2025 12:07
@jurmy24 jurmy24 mentioned this pull request Jan 30, 2025
Development

Successfully merging this pull request may close these issues.

[BUG]: no token limit for chat messages Add restrictions/usage limits
3 participants