You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
built-in redis rate limiter script has room for performance tuning
Describe the bug
The builtin request_rate_limiter.lua script is used for rate limiting, implementing Token Bucket Algorithm, the script's main logic is ok, but has some flaws in performance due to some details.
The logics of assigning values to variables last_tokens, last_refreshed are a little redundant, take assigning to last_tokens for example,
1.1. first by trying to get value via redis.get and assign it to last_tokens
1.2. then judging if it's nil, then assign the value of capacity to last_tokens as a fallback
local last_tokens = tonumber(redis.call("get", tokens_key))
if last_tokens == nil then
last_tokens = capacity
end
but according to load testing and luau compiler results, doing so could be inefficient since the whole chunk could be replaced by
local last_tokens = tonumber(redis.call("get", tokens_key)) or capacity
As you can see in the compiled code of the chunk in request_rate_limiter.lua below (here I replaced redis.get with a simple to_number call since I ran luau in local lua env, where redis was not configured and the point is that in compiled code, there is redundancy compared to the refined version), we have a line L0: JUMPXEQKNIL R0 L1 NOT which judges if the variable in R0 register(which is last_tokens) is Nil, if so, jump to label L1 where it's assigned the value of capacity, which is the line LOADN R0 10000
Function 0 (??):
REMARK builtin tonumber/1
2: local last_tokens = tonumber("100000")
LOADK R1 K0 ['100000']
FASTCALL1 62 R1 L0
GETIMPORT R0 2 [tonumber]
CALL R0 1 1
3: if last_tokens == nil then
L0: JUMPXEQKNIL R0 L1 NOT
4: last_tokens = capacity
LOADN R0 10000
6:
L1: RETURN R0 0
While in the compiled code of the refined chunk below, by using or we only produce an ORK operation, which means we check the value of R1 register, if it's Nil, then assign the value of K0 to R1, and then assign the value of R1 to R0 and return.
In request_rate_limiter.lua, a local variable allowed_num indicating whether the current request is allowed is unnecessarily created, and the assignment of local variable new_tokens can also be similarly optimized.
local allowed = filled_tokens >= requested
local new_tokens = filled_tokens
local allowed_num = 0
if allowed then
new_tokens = filled_tokens - requested
allowed_num = 1
end
...
return { allowed_num, new_tokens }
while the whole logic can also be replaced by simplified and..or clause
local allowed = filled_tokens >= requested
local new_tokens = allowed and filled_tokens - requested or filled_tokens
...
return { allowed and 1 or 0, new_tokens }
To sum up, the full optimized request_rate_limiter.lua is as follows:
redis.replicate_commands()
local tokens_key = KEYS[1]
local timestamp_key = KEYS[2]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3]) or redis.call('TIME')[1]
local requested = tonumber(ARGV[4])
local fill_time = capacity / rate
local ttl = math.floor(fill_time * 2)
local last_tokens = tonumber(redis.call("get", tokens_key)) or capacity
local last_refreshed = tonumber(redis.call("get", timestamp_key)) or 0
local delta = math.max(0, now-last_refreshed)
local filled_tokens = math.min(capacity, last_tokens+(delta*rate))
local allowed = filled_tokens >= requested
local new_tokens = allowed and filled_tokens - requested or filled_tokens
if ttl > 0 then
redis.call("setex", tokens_key, ttl, new_tokens)
redis.call("setex", timestamp_key, ttl, now)
end
return { allowed and 1 or 0, new_tokens }
The load test using JMeter in the image below, by using request_rate_limiter.lua I get max qps of 7913.2, while by changing the script to the refined version I get max qps of 8479.2
Given if put to use, the execution count of the lua script can be very large, a little optimization may bring in a significant improvement in performance.
built-in redis rate limiter script has room for performance tuning
Describe the bug
The builtin request_rate_limiter.lua script is used for rate limiting, implementing Token Bucket Algorithm, the script's main logic is ok, but has some flaws in performance due to some details.
The logics of assigning values to variables
last_tokens, last_refreshed
are a little redundant, take assigning tolast_tokens
for example,1.1. first by trying to get value via
redis.get
and assign it tolast_tokens
1.2. then judging if it's nil, then assign the value of capacity to
last_tokens
as a fallbackbut according to load testing and luau compiler results, doing so could be inefficient since the whole chunk could be replaced by
As you can see in the compiled code of the chunk in request_rate_limiter.lua below (here I replaced redis.get with a simple
to_number
call since I ran luau in local lua env, where redis was not configured and the point is that in compiled code, there is redundancy compared to the refined version), we have a lineL0: JUMPXEQKNIL R0 L1 NOT
which judges if the variable inR0
register(which islast_tokens
) isNil
, if so, jump to label L1 where it's assigned the value ofcapacity
, which is the lineLOADN R0 10000
While in the compiled code of the refined chunk below, by using
or
we only produce anORK
operation, which means we check the value ofR1
register, if it'sNil
, then assign the value ofK0
toR1
, and then assign the value of R1 to R0 and return.In request_rate_limiter.lua, a local variable
allowed_num
indicating whether the current request is allowed is unnecessarily created, and the assignment of local variablenew_tokens
can also be similarly optimized.while the whole logic can also be replaced by simplified and..or clause
To sum up, the full optimized request_rate_limiter.lua is as follows:
And the revised script passes all test cases in RedisRateLimiterLuaScriptTests.java
Sample
The load test using JMeter in the image below, by using request_rate_limiter.lua I get max qps of 7913.2, while by changing the script to the refined version I get max qps of 8479.2
Given if put to use, the execution count of the lua script can be very large, a little optimization may bring in a significant improvement in performance.
So far I have done below tests.
Please help check what further testing needs to be done if you choose to involve this change.
The text was updated successfully, but these errors were encountered: