Notes on Signal’s Double Ratchet Algorithm
Future Proof Notebook, Volume 1
Written by Val, Co-Founder of Future Proof
August 3, 2023
Conclusions
The Double Ratchet algorithm developed by the creators of Signal is core to how Signal keeps DMs private. It uses a new secret key to encrypt every message so that an attacker who steals a single key can not decrypt previous or future messages.
Each party shares public keys and use their private keys to generate their own Diffie-Hellman “ratchet”. The ratchet gives them shared secrets from which they can generate a root key chain, that allows them to then derive a sending key chain and a receiving key chain. The cool thing is that these chains are identical for each person. The sending chain of Asia corresponds to the receiving chain of Beau and vice versa. This means that symmetric key encryption can be used where Asia and Beau are generating the same exact keys to encrypt and decrypt with each other, but for each message they are using a new key.
Moxie Marlinspike (creator of Signal, who recently stepped away from the company) wrote this paper and Trevor Perrin edited it but they acknowledge many others who helped them as well. It is a quite accessible read and also includes implementation advice. I think it is definitely worth giving the implementation a go.
Notes
Introduction
Every Double Ratchet message users new keys so that if an attacker has one key they cannot use it to calculate any of the previous keys used and therefore can not decrypt earlier messages. They also send new public keys with each message and generate new Diffie-Hellman symmetric keys from the new public key and the previous private key. This ensures that if an attacker has one key, they can’t use it to calculate a future key and therefore can not decrypt later messages. So basically we have forward security and break-in recovery.
2.1. KDF chains
KDF is key derivation function. It takes a secret, random key called the KDF key and some input and produces an output. They say if for some reason the KDF key is not both secret and random the KDF “should still provide a secure cryptographic hash of its key and input data”. This implies KDF is deterministic. Beyond that I don’t really understand. Is KDF a hash function?
A KDF chain has the key and input and produces an output where some of the output is a key, that can be used for other things, and some of the output is a key that can be used as another KDF key. So that new KDF key can be combined with another input and produce another output, and so on.
They talk about output keys appearing to be random but if the attacker knows what they are looking for and is in the position to get an output key from the same place they get the KDF key it seems like it wouldn’t matter how random it appears.
2.2. Symmetric-key ratchet
A constant and a KDF key into a KDF function gives a new KDF key and a message key. The message key is used as a symmetric key for encryption/decryption. Got it.
2.3. Diffie-Hellman ratchet
At this point, it’s not clear if symmetric-key ratchet is used in its normal form. Also not clear when a new DH keypair is generated. Is it on every single message? Or only when the sender switches? Also DH is used for the input into KDF to get out the sending and receiving chain keys, but do those chains also use DH to get out a message key (and the next KDF key in the chain) for encrpytion/decryption? Lastly, do we key new sending and receiving chains on each new sender? Things are a bit jumbled now but at least I understand this much:
I just don’t understand if this root chain ever continues between the parties and if so when.
2.4. Double Ratchet
Where does the initial root key come from?
Some questions answered: The root chain advances when a new public key is received. Every time the root chain advances, new sending and receiving chains are created. Sending and receiving chains don’t seem to use DH. So what is their input for KDF that accompanies the chain key? Is it a constant?
2.6. Out-of-order messages
Skipped messages can be tracked because each message header includes the message’s number based on where it is in the sending chain and the length of the previous sending chain. Message keys should be stored temporarily in case of delayed or out of order messages.
Speaking of skipped messages, shouldn’t this be section 2.5?
3.1. External functions
Function definitions.
Don’t understand this sentence about the MAX_SKIP value. Requires more brain power than I have right now: “It should be set high enough to tolerate routine lost or delayed messages, but low enough that a malicious sender can’t trigger excessive recipient computation.”
3.2. State variables
Just some vars that each party tracks (in the application code).
3.3. Initialization
“Prior to initialization both parties must use some key agreement protocol to agree on a 32-byte shared secret key SK and Bob’s ratchet public key.” This kind of answers my question about the initial root chain key. But practically, how do they determine a shared secret for this initial root chain key?
3.4. Encrypting messages
Unclear why we concat ad with the message header.
4.1. Overview
We can encrypt headers as well so that an eavesdropping does not know the order of messages within a “session” or which messages are in which session. Great. I don’t know what a session is but sounds good.
Ahh so KDF can actually give us 3 keys, one being a key for encrypting the next header.
4…
Much of section 4 is just pseudocode and explanations on how encrypting headers works. I believe it. Let’s keep going!
5.1. Integration with X3DH
Ahh finally! The mystery of the initial root key is solved. They use extended triple Diffie-Hellman (X3DH)
6.1. Secure deletion
As far as I understand, if an attacker is recording encrypted messages (without a way to view them) and then gets their hands on a key, pretty much all messages except 1 should remain un-decryptable and private. But if message keys or plaintext can be recovered we have bigger issues and less privacy guarantees with the double ratchet system.
6.4. Deletion of skipped message keys
This kind of answers my earlier question. To prevent denial-of-service by consuming storage space, set a limit on how many skipped message keys are stored. Also delete them periodically just in case so an attacker can’t get them.
6.7. Implementation fingerprinting
This section talks about what to do if the parties are anonymous and potentially using different applications/implementations. This makes it clear that Double Ratchet is not designed for a trustless environment because they state specific aspects that should be identical among parties and there’s probably no way to guarantee this (or there probably is a way but who knows what that could be).