Skip to content

Instantly share code, notes, and snippets.

@beginor
Last active November 20, 2024 09:41
Show Gist options
  • Save beginor/9d9f90bc58e1313f6aecd107f8296732 to your computer and use it in GitHub Desktop.
Save beginor/9d9f90bc58e1313f6aecd107f8296732 to your computer and use it in GitHub Desktop.
Twitter Snowflake ID for PostgreSQL
CREATE SEQUENCE public.global_id_seq;
ALTER SEQUENCE public.global_id_seq OWNER TO postgres;
CREATE OR REPLACE FUNCTION public.id_generator()
RETURNS bigint
LANGUAGE 'plpgsql'
AS $BODY$
DECLARE
our_epoch bigint := 1314220021721;
seq_id bigint;
now_millis bigint;
-- the id of this DB shard, must be set for each
-- schema shard you have - you could pass this as a parameter too
shard_id int := 1;
result bigint:= 0;
BEGIN
SELECT nextval('public.global_id_seq') % 1024 INTO seq_id;
SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
result := (now_millis - our_epoch) << 23;
result := result | (shard_id << 10);
result := result | (seq_id);
return result;
END;
$BODY$;
ALTER FUNCTION public.id_generator() OWNER TO postgres;
-- SELECT public.id_generator()
@davidshepherd7
Copy link

Oh neat, it looks like pgflake probably does it right (from a quick glance). The part inside if (last_time == curr_time) looks the same as twitter's implementation.

@gautamborad
Copy link

@davidshepherd7 , what implementation did you guys settle with? We are also in the same boat and deciding which id generation method to use out of the plethora of options available out there! What I like about this, as compared to pgflake, is that we dont need to build this. We can just create the function public.id_generator when we are creating our db/tables on a new postgres instance.

But am worried about "generates non-monotonic IDs" part you mentioned. We already have a created_at column in each table and use that when we need "proper" sorting. With that in mind can you elaborate on "we'll switch to a simpler ID format, something like (shard << 10) | sequence." part.?Did you go with that design, if yes, did you find any issue?

Thanks in advance!

@davidshepherd7
Copy link

We're in an awkward situation where we have since tables that need a monotonic Id still using sequential IDs and others using the approach in the snippet. Resolving this hasn't been a priority yet, I'm not sure where we'll end up.

@gautamborad
Copy link

@hazelmeow
Copy link

Want to point out that this generates ids with Instagram's layout which is slightly different from Twitter's (Instagram uses 13 bits for shard ID + 10 for seq number and Twitter uses 10 for a machine ID + 12 for a seq number). So if you want to generate snowflakes w/ the Twitter layout you should change the seq number to % 4096 (2^12) and change the shifts to << 22 and << 12.

@davidshepherd7
Copy link

There's another extension for this here: https://github.com/pgEdge/snowflake-sequences I'm not sure if/how it's different to pgflake.

@srigi
Copy link

srigi commented Apr 17, 2024

Can somebody hint how to get now_millis from stored ID?
Since the row creation time is encoded in snowflake ID, it should be retrievable. I need it for example to "get only records created today".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment