Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test expiration #11

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Test expiration #11

wants to merge 4 commits into from

Conversation

jmsanders
Copy link

I don't think expiration works as intended (or at least as I would expect it).

The response keys are correctly expired, but nothing is ever removed from the idempotency keys set. This has two consequences:

  • once a response key expires, requests using the same idempotency key will return 409 forever
  • the idempotency keys set grows without bound

Instead, I would expect expiration to also apply to the idempotency keys set. Once the expiration window has passed, a new request with the same idempotency key should issue a new request.

This commit only introduces the tests without implementing a solution. For MemoryBackend, I'd propose storing the expiry alongside the idempotency key. For RedisBackend, I'd propose something along the lines of redis/redis#135 (comment) using a sorted set to purge expired idempotency keys. But I wanted to run the change by you before taking action on either approach.

I don't think expiration works as intended (or at least as I would
expect it to).

The response keys are correctly expired, but nothing is ever removed
from the idempotency keys set. This has two consequences:

- once a response key expires, requests using the same idempotency key
  will return 409 forever
- the idempotency keys set grows without bound

Instead, I would expect expiration to also apply to the idempotency keys
set. Once the expiration window has passed, a new request with the same
idempotency key should issue a new request.

This commit only introduces the tests without implementing a solution.
For MemoryBackend, I'd propose storing the expiry alongside the
idempotency key. For RedisBackend, I'd propose something along the lines
of redis/redis#135 (comment) using
a sorted set to purge expired idempotency keys. But I wanted to run the
change by you before taking action on either approach.
@sondrelg
Copy link
Member

sondrelg commented Jan 3, 2023

Yeah I agree this seems like an implementation we've forgotten to account for. I like the look of the redis suggestion, so feel free to finish the implementation along those lines 👍 Just ping me when you consider this ready and I'll take another look 🙂

@jmsanders
Copy link
Author

Great - I'll throw something up later this week.

This reverts commit 9057f07.
@jmsanders jmsanders force-pushed the jordan/expiry branch 2 times, most recently from 92ebda6 to 44eee09 Compare January 4, 2023 16:26
Previously, idempotency keys were never being expired. This had two
consequences:

- once a response key expires, requests using the same idempotency
  key will return 409 forever
- the idempotency keys set grows without bound

There's a new `expire_idempotency_keys` abstract method that
implementations of Backend are responsible for implementing. The
middleware calls it on every request.

For the Memory backend, we now store idempotency keys in a dict intead
of a set. The value of each key is its expiration time.

We then iterate over all of the items in the dict and
remove the ones with an expiration earlier than the current time. This
operation isn't very efficient because it iterates over the entire dict,
but given that the Memory backend is only meant to be used in local
testing, I don't think it's worth optimizing further.

For the RedisBackend, we now store idempotency keys in a sortedset
instead of a set. The "score" of each key is its expiration time. This
matches the Redis project's official recommendation on how to expire
items in a set: redis/redis#135 (comment)

We then delete any items from the sorted set with a "score" lower than
the current time. The Redis docs describe this a potentially slow
operation: https://redis.io/commands/zremrangebyscore/

If we don't want to call this on every request, we could instead add
some sort of random fuzzing to only occassionally call it. Or we could
abandon using a sorted set entirely and just store the idempotency key
as its own key in Redis using the standard Redis expiration
functionality. This would mean each request would store potentially
three keys (idempotency key, response key, status code key) instead of
the current two (response key, status code key).

Users who upgrade to a version of the library that includes this change
will see a failure in Redis when it tries to use a sortedset operation
on an existing set. At a minimum, this should be called out in a
changelog. Additionally, we could change the default idempotency key
name from `idempotency-key-keys` to something else to avoid a collision.
@jmsanders
Copy link
Author

I've added a new expire_idempotency_keys abstract method thatimplementations of Backend are responsible for implementing. The middleware calls it on every request.

For the Memory backend, we now store idempotency keys in a dict intead of a set. The value of each key is its expiration time.

We then iterate over all of the items in the dict and remove the ones with an expiration earlier than the current time. This operation isn't very efficient because it iterates over the entire dict, but given that the Memory backend is only meant to be used in local testing, I don't think it's worth optimizing further.

For the RedisBackend, we now store idempotency keys in a sortedset instead of a set. The "score" of each key is its expiration time. This matches the Redis project's official recommendation on how to expire items in a set: redis/redis#135 (comment)

We then delete any items from the sorted set with a "score" lower than the current time. The Redis docs describe this a potentially slow operation: https://redis.io/commands/zremrangebyscore/

If we don't want to call this on every request, we could instead add some sort of random fuzzing to only occassionally call it. Or we could abandon using a sorted set entirely and just store the idempotency key as its own key in Redis using the standard Redis expiration functionality. This would mean each request would store potentially three keys (idempotency key, response key, status code key) instead of the current two (response key, status code key).

Users who upgrade to a version of the library that includes this change will see a failure in Redis when it tries to use a sortedset operation on an existing set. At a minimum, this should be called out in a changelog. Additionally, we could change the default idempotency key name from idempotency-key-keys to something else to avoid a collision.

after the response expires.
"""
if self.expiry:
await self.redis.zremrangebyscore(self.KEYS_KEY, '-inf', time.time())
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned it in the main PR thread, but here's where it might make sense to instead store each idempotency key as its own key in Redis and avoid the potentially expensive ZREMRANGEBYSCORE call entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very concerned with this being a slow operation, as a slow redis operation is probably much faster for this type of use-case than making just one additional round-trip to the cache. At least that's what I'd expect.

If we wanted to store them separately and save the round-trips, I suppose we could use pipelining or a Lua script to perform the operations in one go. I'll leave that up to you 🙂

@sondrelg
Copy link
Member

sondrelg commented Jan 6, 2023

Users who upgrade to a version of the library that includes this change will see a failure in Redis when it tries to use a sortedset operation on an existing set. At a minimum, this should be called out in a changelog. Additionally, we could change the default idempotency key name from idempotency-key-keys to something else to avoid a collision.

Is that because we re-use the same key value for the new data structure? Could we get around the error by changing the key?

I guess upgrading will still effectively drop existing data, which isn't that great. Could this be mitigated if we stored multiple values?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants