create table rate_limiting_token_bucket (
id text primary key,
tk bigint not null,
ts timestamptz not null);
- id identifies the user
- tk is the number of available tokens for this user
- ts records the last time a token was retrieved

In YugabyteDB the table is sharded on a hash of the first column, so the rate limiting is distributed per user.
On the first call, tk is initialized to window_seconds*refill_per_sec-1, where window_seconds is the maximum window (I'll use 60 seconds) and refill_per_sec is the number of tokens allowed per second. The -1 is the token taken for the current call.

On subsequent calls, tk is updated to tk-1+refill_per_sec*extract(epoch from clock_timestamp()-ts). The tk-1 takes the token for the current call. extract(epoch from clock_timestamp()-ts) is the number of seconds since the last call for this user, which I multiply by the refill rate refill_per_sec. The new value of tk is bounded between -1 (there's no debt) and the maximum window_seconds*refill_per_sec with the greatest() and least() functions (there's no unlimited refill). A value of -1 means that there are no remaining tokens; the application may refuse the next calls, or wait for new tokens to be refilled. A positive value allows the API access for the user identified by id, because the requested token was acquired.
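To make the clamped refill arithmetic concrete, here is a small Python sketch of the same formula. The names mirror the SQL columns and parameters; this is illustrative only, the real logic runs inside the database.

```python
def new_token_count(tk: float, elapsed_seconds: float,
                    refill_per_sec: int, window_seconds: int) -> float:
    """Take one token, add the tokens refilled since the last call,
    and clamp the result between -1 and the window capacity."""
    refilled = tk - 1 + refill_per_sec * elapsed_seconds
    capacity = window_seconds * refill_per_sec
    return max(min(refilled, capacity), -1)

# With 1 token/sec and a 10-second window:
print(new_token_count(tk=5, elapsed_seconds=0.1,
                      refill_per_sec=1, window_seconds=10))   # 4.1
print(new_token_count(tk=0, elapsed_seconds=0.0,
                      refill_per_sec=1, window_seconds=10))   # -1 (no debt below -1)
print(new_token_count(tk=10, elapsed_seconds=60,
                      refill_per_sec=1, window_seconds=10))   # 10 (capped at the window)
```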
This read-then-write operation must be atomic. On a call, I attempt the UPDATE first (the most probable case) and, when the row is not found, insert it. This must run in a SERIALIZABLE transaction. Be careful if you implement this in another database: Oracle does not provide this isolation level. I've written the first post of this series to clarify this.

create or replace function rate_limiting_token_bucket_request
(rate_id text, refill_per_sec int default 2
, window_seconds int default 60)
returns int as $$
declare
new_tk int := null; -- stores the remaining tokens to return
begin
-- take 1 token and add refilled ones
-- (returns the remaining tokens, or NULL if id is not found)
update rate_limiting_token_bucket
set ts=now(), tk=greatest(least(
tk-1+refill_per_sec*extract(epoch from clock_timestamp()-ts)
,window_seconds*refill_per_sec),-1)
where rate_limiting_token_bucket.id=rate_id
returning tk into new_tk;
-- initialize a full window on the first call (id was not found)
if new_tk is null then
insert into rate_limiting_token_bucket (id, tk,ts)
values (rate_id,window_seconds*refill_per_sec-1,clock_timestamp())
returning tk into new_tk;
end if;
-- return the remaining tokens
return new_tk;
end; $$ language plpgsql;
yugabyte=# truncate table rate_limiting_token_bucket;
TRUNCATE TABLE
yugabyte=# set default_transaction_isolation=serializable;
SET
yugabyte=# select rate_limiting_token_bucket_request('user1',1,10);
\watch 0.1
rate_limiting_token_bucket_request
------------------------------------
9
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
8
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
7
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
6
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
5
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
4
(1 row)
Mon 03 Jan 2022 10:52:58 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
3
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
2
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
1
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
0
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
-1
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
-1
(1 row)
Mon 03 Jan 2022 10:52:59 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
-1
(1 row)
^CCancel request sent
yugabyte=# -- waiting 4 seconds
yugabyte=# \watch 0.1
rate_limiting_token_bucket_request
------------------------------------
3
(1 row)
Mon 03 Jan 2022 10:53:05 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
2
(1 row)
Mon 03 Jan 2022 10:53:05 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
1
(1 row)
Mon 03 Jan 2022 10:53:05 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
0
(1 row)
Mon 03 Jan 2022 10:53:05 AM GMT (every 0.1s)
rate_limiting_token_bucket_request
------------------------------------
-1
(1 row)
postgres=# set default_transaction_isolation=serializable;
postgres=# select rate_limiting_token_bucket_request('user1',1000,3600);
postgres=# \watch 0.001
yugabyte=# set default_transaction_isolation=serializable;
yugabyte=# select rate_limiting_token_bucket_request('user1',1000,3600);
yugabyte=# \watch 0.001
Operation expired: Transaction ... expired or aborted by a conflict: 40001
or ERROR: All transparent retries exhausted. Operation failed. Try again: Value write after transaction start
or Operation expired: Transaction aborted: kAborted
The \watch command stops on the first exception, but an application should be able to retry when encountering those errors. The YugabyteDB server or the client driver may retry some operations transparently, but some transaction conflicts must be managed by the application. The reason is that we don't want to wait on pessimistic locking: that cannot scale in a distributed database, where the goal is to avoid a single point of contention. A transaction conflict (two calls for the same user taking a token at the exact same time) has a low probability of happening. It must be handled when encountered, but without slowing down all other calls with pessimistic locking.
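As a sketch of such application-side handling, here is a minimal retry loop in Python. It assumes a psycopg2-style driver where exceptions expose the SQLSTATE in a pgcode attribute; the function name and backoff values are illustrative, not a prescribed implementation.

```python
import random
import time

RETRYABLE_SQLSTATES = {"40001"}  # serialization_failure

def call_with_retry(fn, max_attempts: int = 5):
    """Call fn(), retrying on retryable SQLSTATEs with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as e:
            code = getattr(e, "pgcode", None)  # psycopg2-style SQLSTATE
            if code not in RETRYABLE_SQLSTATES or attempt == max_attempts - 1:
                raise
            # exponential backoff with a little jitter before retrying
            time.sleep((2 ** attempt) * 0.01 + random.random() * 0.01)
```

The backoff with jitter avoids two conflicting calls retrying in lockstep and colliding again, which matters precisely because we chose optimistic concurrency over pessimistic locking.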