ULIDs are awesome

Learn how ULIDs are a better alternative to UUIDs for unique identifiers, and all the hidden benefits they provide.

Nadeesha Cabral on 06-02-2025

ULIDs are awesome. There, I said it. They're one of those things that, once you start using them, make you wonder how you ever lived without them.

If you're familiar with UUIDs, ULIDs are very similar. UUIDs differ subtly between versions: v4, the one I've seen most people use, is almost entirely random, while v1 combines a timestamp with the MAC address of the machine that generated the UUID.

ULIDs, on the other hand, have a more straightforward spec. They're 128-bit identifiers, written as 26 characters of Crockford's Base32.

  • First 48 bits are the timestamp (milliseconds since the Unix epoch)
  • Next 80 bits are random
 01AN4Z07BY      79KA1307SR9X4MV3
|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits
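To make the layout concrete, here's a minimal sketch of how such an id could be put together. The function names (encodeTime, encodeRandom, generateUlid) are made up for this illustration, and Math.random is used only to keep the sketch short; a real generator should use a cryptographically secure random source.

```typescript
const ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // Crockford's Base32, 5 bits per character
const TIME_MAX = Math.pow(2, 48) - 1; // largest value that fits in 48 bits

// Encodes a millisecond timestamp as 10 characters, most significant first.
function encodeTime(ms: number): string {
  if (!Number.isInteger(ms) || ms < 0 || ms > TIME_MAX) {
    throw new Error("timestamp out of range");
  }
  let out = "";
  let rest = ms;
  for (let i = 0; i < 10; i++) {
    out = ENCODING.charAt(rest % 32) + out;
    rest = Math.floor(rest / 32);
  }
  return out;
}

// Encodes 80 random bits as 16 characters (16 x 5 = 80).
// ASSUMPTION: `rand` returns a number in [0, 1). The Math.random default is
// NOT cryptographically secure and is here for illustration only.
function encodeRandom(rand: () => number = Math.random): string {
  let out = "";
  for (let i = 0; i < 16; i++) {
    out += ENCODING.charAt(Math.floor(rand() * 32));
  }
  return out;
}

// Timestamp part followed by random part: 10 + 16 = 26 characters.
function generateUlid(now: number = Date.now(), rand: () => number = Math.random): string {
  return encodeTime(now) + encodeRandom(rand);
}

console.log(generateUlid());
```

With this sketch, encodeTime(1465824320894) gives 01AN4Z07BY, the timestamp part of the example above.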

They are sortable

Because the first 48 bits are the timestamp, ULIDs are lexicographically sortable. You can generate a bunch of ULIDs, sort them by the id itself, and they'll come out in the order they were generated. (This holds down to the millisecond; ids created within the same millisecond are only ordered if the generator is monotonic.) This is a huge benefit when you're dealing with time-series data and the storage systems that hold it.
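A quick way to see this, as a sketch: the three ids below carry time prefixes that decode to three consecutive milliseconds. Only the first id is the example from above; the other two are made up for illustration, with random parts deliberately chosen to sort in the opposite direction.

```typescript
// Illustrative rows. The time prefixes (01AN4Z07BY, 01AN4Z07BZ, 01AN4Z07C0)
// are consecutive milliseconds; the second and third ids are made up.
const rows = [
  { id: "01AN4Z07C0" + "09KA1307SR9X4MV3", label: "third" },
  { id: "01AN4Z07BY" + "79KA1307SR9X4MV3", label: "first" },
  { id: "01AN4Z07BZ" + "39KA1307SR9X4MV3", label: "second" },
];

// Plain string comparison is enough. localeCompare is avoided because its
// collation depends on the locale.
const sorted = [...rows].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
const sortedLabels = sorted.map((row) => row.label);

console.log(sortedLabels.join(", "));
```

The time prefix wins the comparison before the random part is ever looked at, so the labels come out as first, second, third.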

Efficient indexing

When you store ULIDs as a unique identifier or a primary key, you reduce index fragmentation. New ULIDs are always "newer" than the existing ones, so they can be appended to the end of the index instead of being inserted at random positions. It's those random writes to the index that cause fragmentation.

If you're working with a lot of writes and/or a lot of indexes, this can be a huge benefit.
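As a rough illustration only, the sketch below uses a sorted array as a stand-in for an index (a real database index is a B-tree or similar, not an array) and counts how many inserts are pure appends. The keys are plain numbers standing in for ids: an increasing counter for time-ordered ids, and a small deterministic pseudo-random generator for fully random ids. All names here are made up for the example.

```typescript
// Binary search: the position where `key` must go to keep `index` sorted.
function insertPosition(index: number[], key: number): number {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Inserts keys one by one and counts how many landed at the very end.
function countAppends(keys: number[]): number {
  const index: number[] = [];
  let appends = 0;
  for (const key of keys) {
    const pos = insertPosition(index, key);
    if (pos === index.length) appends++;
    index.splice(pos, 0, key);
  }
  return appends;
}

// Stand-in for time-ordered ids: every new key is larger than the last.
const timeOrdered = Array.from({ length: 1000 }, (_, i) => i);

// Stand-in for random ids: a linear congruential generator with a fixed
// seed, so the run is repeatable.
let seed = 42;
const nextRandom = (): number => {
  seed = (seed * 1664525 + 1013904223) % 4294967296;
  return seed;
};
const randomKeys = Array.from({ length: 1000 }, () => nextRandom());

const orderedAppends = countAppends(timeOrdered);
const randomAppends = countAppends(randomKeys);

console.log({ orderedAppends, randomAppends });
```

Every one of the 1000 time-ordered inserts is an append, while only a small fraction of the random ones are; the rest land somewhere in the middle of the structure.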

Easier cursor-based pagination implementation

Cursor-based pagination is a way to paginate through a list of items without having to know the total number of items. For example, if you've loaded the first 100 messages and need to load the next 100, you can use the id of the last message as the cursor.

SELECT * FROM messages WHERE id > $1 ORDER BY id ASC LIMIT 100

This is much easier to get right than offset-based pagination, where the database has to skip over every preceding row and pages can shift when rows are inserted between requests. And because newly generated ULIDs sort after the ones you've already seen, new rows only ever show up after your cursor, so > is a safe operator to use.

sequenceDiagram
    participant Client
    participant Server
    participant DB
    Client->>Server: Request Page 1
    Server->>DB: SELECT * FROM messages ORDER BY id LIMIT 100
    DB-->>Server: First 100 messages
    Server-->>Client: Messages + Last ULID
    Client->>Server: Request Next (Last ULID)
    Server->>DB: SELECT * FROM messages WHERE id > last_ulid ORDER BY id LIMIT 100
    DB-->>Server: Next 100 messages
    Server-->>Client: Messages + New Last ULID
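The same flow can be sketched in a few lines of TypeScript. This is an in-memory stand-in, not real database code: fetchPage plays the role of the SQL query, fetchAll plays the client, and the Message type and all ids except the first are made up for the example.

```typescript
interface Message {
  id: string; // a ULID
  body: string;
}

// Stand-in for: SELECT * FROM messages WHERE id > $1 ORDER BY id ASC LIMIT $2
// A null cursor means "start from the beginning".
function fetchPage(messages: Message[], cursor: string | null, limit: number): Message[] {
  return messages
    .filter((m) => cursor === null || m.id > cursor)
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .slice(0, limit);
}

// Walks every page, feeding the last id of each page back in as the cursor.
function fetchAll(messages: Message[], limit: number): Message[][] {
  const pages: Message[][] = [];
  let cursor: string | null = null;
  while (true) {
    const page = fetchPage(messages, cursor, limit);
    if (page.length === 0) break;
    pages.push(page);
    cursor = page[page.length - 1].id;
  }
  return pages;
}

// Illustrative data, deliberately out of order.
const messages: Message[] = [
  { id: "01AN4Z07C079KA1307SR9X4MV3", body: "c" },
  { id: "01AN4Z07BY79KA1307SR9X4MV3", body: "a" },
  { id: "01AN4Z07C279KA1307SR9X4MV3", body: "e" },
  { id: "01AN4Z07BZ79KA1307SR9X4MV3", body: "b" },
  { id: "01AN4Z07C179KA1307SR9X4MV3", body: "d" },
];

const pages = fetchAll(messages, 2);
console.log(pages.map((page) => page.map((m) => m.body).join("")));
```

The server never needs an offset or a total count; the only state the client carries between requests is the last id it saw.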

Uniqueness

80 bits of randomness means that the probability of a collision is extremely low. ULIDs are perfectly suitable for use cases where you need a unique identifier across multiple distributed systems, even at a rate of thousands of messages (perhaps even more) per millisecond. Two ULIDs generated in the same millisecond share the first 48 bits (the timestamp), but they only collide if all 80 random bits match as well.
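To put a rough number on "extremely low", here's a back-of-the-envelope sketch using the standard birthday-bound approximation. It assumes every id generated within a given millisecond draws its 80 random bits independently and uniformly; collisionProbability is a name made up for this example.

```typescript
// Approximate probability that at least two of `idsPerMs` ids generated in
// the same millisecond share the same random part:
//   p ≈ 1 - exp(-n(n-1) / (2 * 2^bits))
function collisionProbability(idsPerMs: number, randomBits: number = 80): number {
  const space = Math.pow(2, randomBits);
  const exponent = (idsPerMs * (idsPerMs - 1)) / (2 * space);
  // -expm1(-x) computes 1 - e^(-x) without losing precision for tiny x.
  return -Math.expm1(-exponent);
}

const atThousandPerMs = collisionProbability(1000);
const atMillionPerMs = collisionProbability(1000000);

console.log({ atThousandPerMs, atMillionPerMs });
```

Under those assumptions the result is on the order of 1e-19 at a thousand ids per millisecond and 1e-13 at a million ids per millisecond.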

Embeds time

You can recover the timestamp from a ULID, which is useful for debugging and auditing. If you keep a created_at DEFAULT NOW() column on all your tables, you can instead use the ULID as the primary key and decode its timestamp to get the created_at value when you need it. A trivial TypeScript implementation looks like this:

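const ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ" // Crockford's Base32
const ENCODING_LEN = ENCODING.length
const TIME_MAX = Math.pow(2, 48) - 1 // largest 48-bit timestamp
const TIME_LEN = 10 // characters in the timestamp part
const RANDOM_LEN = 16 // characters in the random part

// Returns the creation time in milliseconds since the Unix epoch.
// new Date(decodeTime(id)) turns it into a Date.
export function decodeTime(id: string): number {
  if (id.length !== TIME_LEN + RANDOM_LEN) {
    throw new Error("malformed ulid")
  }
  // Walk the timestamp characters from least to most significant
  const time = id
    .slice(0, TIME_LEN)
    .split("")
    .reverse()
    .reduce((carry, char, index) => {
      const encodingIndex = ENCODING.indexOf(char)
      if (encodingIndex === -1) {
        throw new Error("invalid character found: " + char)
      }
      return carry + encodingIndex * Math.pow(ENCODING_LEN, index)
    }, 0)
  if (time > TIME_MAX) {
    throw new Error("malformed ulid, timestamp too large")
  }
  return time
}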

How we use ULIDs

We use ULIDs as the primary key for all our tables. Since we deal with a lot of time-series data (messages, events, etc.), this is a huge benefit. We use the ulid package to generate ULIDs at the application layer, rather than relying on the database to generate them. This helps us preserve the correct ordering of ULIDs when we do batch inserts.
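Rows in a batch are often created within the same millisecond, where the random part alone decides the order. One way to keep such a batch ordered is to increment the random part instead of redrawing it, which is how the ULID spec describes monotonic generation (the ulid package exposes a monotonicFactory for this). The sketch below is a hand-rolled illustration of that idea, not the package's code; incrementBase32 and batchIds are names made up for the example.

```typescript
const ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // Crockford's Base32
const TIME_LEN = 10; // characters in the timestamp part

// Adds 1 to a Base32 string, carrying to the left where needed.
function incrementBase32(value: string): string {
  const chars = value.split("");
  for (let i = chars.length - 1; i >= 0; i--) {
    const index = ENCODING.indexOf(chars[i]);
    if (index === -1) throw new Error("invalid character found: " + chars[i]);
    if (index < ENCODING.length - 1) {
      chars[i] = ENCODING.charAt(index + 1);
      return chars.join("");
    }
    chars[i] = ENCODING.charAt(0); // was the last symbol: wrap and carry
  }
  throw new Error("random part overflowed");
}

// Builds `count` ids for one batch: the first id is used as given, and each
// following id keeps the timestamp and bumps the random part by one.
function batchIds(first: string, count: number): string[] {
  const ids: string[] = [];
  let current = first;
  for (let i = 0; i < count; i++) {
    if (i > 0) {
      current = current.slice(0, TIME_LEN) + incrementBase32(current.slice(TIME_LEN));
    }
    ids.push(current);
  }
  return ids;
}

const batch = batchIds("01AN4Z07BY79KA1307SR9X4MV3", 3);
console.log(batch);
```

The ids in the batch differ only in their last characters (…MV3, …MV4, …MV5), so sorting by id returns the rows in the order they were added to the batch.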
