-
-
Save beginor/9d9f90bc58e1313f6aecd107f8296732 to your computer and use it in GitHub Desktop.
CREATE SEQUENCE public.global_id_seq; | |
ALTER SEQUENCE public.global_id_seq OWNER TO postgres; | |
CREATE OR REPLACE FUNCTION public.id_generator() | |
RETURNS bigint | |
LANGUAGE 'plpgsql' | |
AS $BODY$ | |
DECLARE | |
our_epoch bigint := 1314220021721; | |
seq_id bigint; | |
now_millis bigint; | |
-- the id of this DB shard, must be set for each | |
-- schema shard you have - you could pass this as a parameter too | |
shard_id int := 1; | |
result bigint:= 0; | |
BEGIN | |
SELECT nextval('public.global_id_seq') % 1024 INTO seq_id; | |
SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis; | |
result := (now_millis - our_epoch) << 23; | |
result := result | (shard_id << 10); | |
result := result | (seq_id); | |
return result; | |
END; | |
$BODY$; | |
ALTER FUNCTION public.id_generator() OWNER TO postgres; | |
-- SELECT public.id_generator() |
@davidshepherd7 , what implementation did you guys settle with? We are also in the same boat and deciding which id generation method to use out of the plethora of options available out there! What I like about this, as compared to pgflake, is that we dont need to build this. We can just create the function public.id_generator
when we are creating our db/tables on a new postgres instance.
But am worried about "generates non-monotonic IDs" part you mentioned. We already have a created_at
column in each table and use that when we need "proper" sorting. With that in mind can you elaborate on "we'll switch to a simpler ID format, something like (shard << 10) | sequence." part.?Did you go with that design, if yes, did you find any issue?
Thanks in advance!
We're in an awkward situation where we have since tables that need a monotonic Id still using sequential IDs and others using the approach in the snippet. Resolving this hasn't been a priority yet, I'm not sure where we'll end up.
JFYI, mausimag/pgflake#3
Want to point out that this generates ids with Instagram's layout which is slightly different from Twitter's (Instagram uses 13 bits for shard ID + 10 for seq number and Twitter uses 10 for a machine ID + 12 for a seq number). So if you want to generate snowflakes w/ the Twitter layout you should change the seq number to % 4096
(2^12) and change the shifts to << 22
and << 12
.
There's another extension for this here: https://github.com/pgEdge/snowflake-sequences I'm not sure if/how it's different to pgflake.
Can somebody hint how to get now_millis
from stored ID?
Since the row creation time is encoded in snowflake ID, it should be retrievable. I need it for example to "get only records created today".
Oh neat, it looks like pgflake probably does it right (from a quick glance). The part inside
if (last_time == curr_time)
looks the same as twitter's implementation.