lnbook/10_onion_routing.asciidoc

[[onion_routing]]
== Onion Routing

In this chapter we will describe the Lightning Network's _Onion Routing_ mechanism. The invention of _onion routing_ precedes the Lightning Network by 25 years! Onion routing was invented by U.S. Navy researchers as a communications security protocol. Onion routing is most famously used by _Tor_, the onion routed internet overlay that allows researchers, activists, intelligence agents and everyone else to use the internet privately and anonymously.

In this chapter we are focusing on the "Source based Onion Routing (SPHINX)" part of the Lightning protocol architecture, highlighted by a double outline in the center (Routing Layer) of <<LN_protocol_onion_highlight>>:

[[LN_protocol_onion_highlight]]
.The Lightning Network Protocol Suite
image::images/LN-protocol-onion-highlight.png["The Lightning Network Protocol Suite"]

Onion routing describes a method of encrypted communication where a message sender builds successive _nested layers of encryption_ that are "peeled" off by each intermediary node, until the innermost layer is delivered to the intended recipient. The name "onion routing" describes this use of layered encryption that is peeled off one layer at a time, like the skin of an onion.

Each of the intermediary nodes can only "peel" one layer and see who is next in the communications path. Onion routing ensures that no one except the sender knows the destination or length of the communication path. Each intermediary only knows the previous and next hop.

The Lightning Network uses an implementation of onion routing protocol based on _Sphinx_footnote:[http://www0.cs.ucl.ac.uk/staff/G.Danezis/papers/sphinx-eprint.pdf[George Danezis and Ian Goldberg. Sphinx: A compact and provably secure mix format. In IEEE Symposium on Security and Privacy, pp 269–282. IEEE, 2009.]] developed in 2009 by George Danezis and Ian Goldberg.

The implementation of onion routing in the Lightning Network is defined in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md[BOLT #4 - Onion Routing Protocol]

=== A physical example illustrating onion routing

There are many ways to describe onion routing, but one of the easiest is to use the physical equivalent of sealed envelopes. An envelope represents a layer of encryption, allowing only the named recipient to open it and read the contents.

Let's say Alice wants to send a secret letter to Dina, indirectly via some intermediaries.

==== Selecting a path

The Lightning Network uses _source routing_, which means that the payment path is selected and specified by the sender, and only the sender. In this example, our Alice's secret letter to Dina will be the equivalent of a payment. To make sure the letter reaches Dina, Alice will create a path from her to Dina, using Bob and Chan as intermediaries.

[TIP]
====
There may be many paths that make it possible for Alice to reach Dina. We will explain the process of selecting the _optimum_ path in <<path_finding>>. For now, we'll assume that the path selected by Alice uses Bob and Chan as intermediaries to get to Dina.
====

As a reminder, the path selected by Alice is shown in <<alice_dina_path>>, below:

[[alice_dina_path]]
image::images/alice_dina_path.png["Alice to Bob to Chan to Dina"]

Let's see how Alice can use this path without revealing information to intermediaries Bob and Chan.

.Source-based routing
****
Source-based routing is not how packets are typically routed on the internet today, though source routing was possible in the early days.
Internet routing is based on _packet switching_ at each intermediary routing node. An IPv4 packet, for example, includes the sender and recipient's IP address and every other IP routing node decides how to forward each packet towards the destination.
However, the lack of privacy in such a routing mechanism, where every intermediary node sees the sender and recipient, make this a poor choice for use in a payment network.
****

==== Building the layers

Alice starts by writing a secret letter to Dina.  She then seals the letter inside an envelope and writes "To Dina" on the outside (see <<dina_envelope>>). The envelope represents encryption with Dina's public key, so that only Dina can open the envelope and read the letter.

[[dina_envelope]]
.Dina's secret letter, sealed in an envelope
image::images/dina_envelope.png[Dina's secret letter, sealed in an envelope]

Dina's letter will be delivered to Dina by Chan, who is immediately before Dina in the "path". So, Alice puts Dina's envelope inside an envelope addressed to Chan (see <<chan_envelope>>). The only part that Chan can read is the destination (routing instructions): "To Dina". Sealing this inside an envelope addressed to Chan represents encrypting it with Chan's public key so that only Chan can read the envelope address. Chan still can't open Dina's envelope. All he sees is the instructions on the outside (the address).

[[chan_envelope]]
.Chan's envelope, containing Dina's sealed envelope
image::images/chan_envelope.png[Chan's envelope, containing Dina's sealed envelope]

Now, this letter will be delivered to Chan by Bob. So Alice puts it inside an envelope addressed to Bob (see <<bob_envelope>>). As before, the envelope represents a message encrypted to Bob that only Bob can read. Bob can only read the outside of Chan's envelope (the address), so he knows to send it to Chan.

[[bob_envelope]]
.Bob's envelope, containing Chan's sealed envelope
image::images/bob_envelope.png[Bob's envelope, containing Chan's sealed envelope]

Now, if we could look through the envelopes (with X-rays!) we would see the envelopes nested one inside the other, as shown in <<nested_envelopes>>, below:

[[nested_envelopes]]
.Nested envelopes
image::images/nested_envelopes.png[Nested envelopes]

==== Peeling the layers

Alice now has an envelope that says "To Bob" on the outside. It represents an encrypted message that only Bob can open (decrypt). Alice will now begin the process by sending this to Bob. The entire process is shown in <<sending_nested_envelopes>> below:

[[sending_nested_envelopes]]
.Sending the envelopes
image::images/sending_nested_envelopes.png[Sending the envelopes]

As you can see, Bob receives the envelope from Alice. He knows it came from Alice, but doesn't know if Alice is the original sender or just someone forwarding envelopes. He opens it to find an envelope inside that says "To Chan". Since this is addressed to Chan, Bob can't open it. He doesn't know what's inside it and doesn't know if Chan is getting a letter or another envelope to forward. Bob doesn't know if Chan is the ultimate recipient or not. Bob forwards the envelope to Chan.

Chan receives the envelope from Bob. He doesn't know that it came from Alice. He doesn't know if Bob is an intermediary or the sender of a letter. Chan opens the envelope and finds another envelope inside addressed "To Dina", which he can't open. Chan forwards it to Dina, not knowing if Dina is the final recipient.

Dina receives an envelope from Chan. Opening it she finds a letter inside, so now she knows she's the intended recipient of this message. She reads the letter, knowing that none of the intermediaries know where it came from and no one else has read her secret letter!

This is the essence of onion routing. The sender wraps a message in layers, specifying exactly how it will be routed and preventing any of the intermediaries from gaining any information about the path or payload. Each intermediary peels one layer, sees only a forwarding address and doesn't know anything other than the previous and next hop in the path.

Now, let's look at the details of the onion routing implementation in the Lightning Network.

=== Introduction to onion routing of HTLCs

Onion routing in the Lightning Network appears complex at first glance, but once you understand the basic concept is really quite simple.

From a practical perspective, Alice is telling every intermediary node what HTLC to set up with the next node in the path.

The first node, which is the payment sender or Alice in our example, is called the _-_origin node_.  The last node, which is the payment recipient or Dina in our example, is called the _final node_.

Each intermediary node, or Bob and Chan in our example, is called a _hop_. Every hop must set up an _outgoing HTLC_ to the next hop. The information communicated to each hop by Alice is called the _hop payload_ or _hop data_. The message that is routed from Alice to Dina is called an _onion_ and consists of encrypted _hop payload_ or _hop data_ messages encrypted to each hop.

Now that we know the terminology used in Lightning Onion Routing, let's restate Alice's task: Alice must construct an _onion_ with _hop data_, telling each _hop_ how to construct an _outgoing HTLC_ in order to send a payment to the _final node_ (Dina).

==== Alice selects the path

From <<routing>> we know that Alice will send a 50,000 satoshi payment to Dina via Bob and Chan. This payment is transmitted via a series of HTLCs, as shown in <<alice_dina_htlc_path>>, below:

[[alice_dina_htlc_path]]
.Payment path with HTLCs from Alice to Dina
image::images/alice-dina-htlc.png[Payment path with HTLCs from Alice to Dina]

As we see in <<gossip>>, Alice is able to construct this path to Dina because Lightning nodes announce their channels to the entire Lightning Network using the _Lightning Gossip Protocol_. After the initial channel announcement, Bob and Chan each sent out an additional "channel update" message with their routing fee and timelock expectations for payment routing.

From the announcements and updates, Alice knows the following information about the channels between Bob, Chan and Dina:

* A +short_channel_id+ (short channel ID) for each channel, that Alice can use to reference the channel, when constructing the path

* An +cltv_expiry_delta+ (timelock delta) which Alice can add to the expiry time for each HTLC

* A +fee_base_msat+ and +fee_proportional_millionths+ which Alice can use to calculate the total routing fee expected by that node for relay on that channel.

In practice, other information is also exchanged such as the largest (`htlc_maximum_msat`) and smallest (`htlc_minimum_msat`) HTLCs a channel will carry, but these aren't used as directly during onion route construction as the above fields are.

This information is used by Alice to identify the nodes, channels, fees, and timelocks for the following detailed path, shown in <<alice_dina_path_detail>>:

[[alice_dina_path_detail]]
.A detailed path constructed from gossiped channel and node information
image::images/alice_dina_path_detail.png[A path constructed from gossiped channel and node information]

Alice already knows her own channel to Bob, and therefore doesn't need this info to construct the path. Note also that Alice didn't need a channel update from Dina, because she has the update from Chan for that last channel in the path.

==== Alice constructs the payloads

There are two possible formats that Alice can use for the information communicated to each hop: A legacy fixed-length format called the _hop data_ and a more flexible Type-Length-Value (TLV) based format called the _hop payload_. The TLV message format is explained in more detail in <<tlv>>. It offers flexibility by allowing fields to be added to the protocol at will. Both formats are specified in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#packet-structure[BOLT #4 - Onion Routing - Packet Structure]

Alice will start building the hop data from the end of the path backwards: Dina, Chan, Bob.

===== Final node payload for Dina

Alice first builds the payload that will be delivered to Dina. Dina will not be constructing an "outgoing HTLC", because Dina is the final node and payment recipient. For this reason, the payload for Dina is different that all the others (uses all zeros for the `short_channel_id`), but only Dina will know this since it will be encrypted in the innermost layer of the onion. Essentially, this is the "secret letter to Dina" we saw in our physical envelope example.

The hop payload for Dina must match the information in the invoice generated by Dina for Alice and will contain (at least) the following fields in Type-Lenth-Value (TLV) format:

amt_to_forward:: The amount of this payment in milli-satoshis. If this is only one part of a multi-part payment, the amount is less than the total. Otherwise, this is a single full payment and it is equal to the invoice amount and +total_msat+ value.

outgoing_cltv_value:: The payment expiry timelock set to the value +min_final_cltv_expiry+ in the invoice.

payment_secret:: A special 256-bit secret value from the invoice, allowing Dina to recognize this incoming payment, and also prevents a class of probing that previously made 0-value invoices insecure. Probing by intermediate nodes is mitigated, as this value is encrypted to _only_ the recipient, meaning they can't reconstruct a final packet that "looks" legitimate.

total_msat:: The total amount matching the invoice. This may be omitted if there is only one part, in which case it is assumed to match +amt_to_forward+ and must equal the invoice amount.

The invoice Alice received from Dina specified the amount as 50,000 satoshis, which is 50,000,000 milli-satoshis. Dina specified the minimum expiry for the payment +min_final_cltv_expiry+ as 18 blocks (3 hours, given 10-minute on average Bitcoin blocks). At the time Alice is attempting to make the payment, let's say the Bitcoin blockchain has recorded 700,000 blocks. So Alice must set the +outgoing_cltv_value+ to a *minimum* block height of 700,018.

Alice constructs the hop payload for Dina as follows:

----
amt_to_forward : 50,000,000
outgoing_cltv_value: 700,018
payment_secret: fb53d94b7b65580f75b98f10...03521bdab6d519143cd521d1b3826
total_msat: 50,000,000
----

Alice serializes it in TLV format as shown (simplified) <<dina_onion_payload>> below:

[[dina_onion_payload]]
.Dina's payload is constructed by Alice
image::images/dina_onion_payload.png[Dina's payload is constructed by Alice]

===== Hop payload for Chan

Next, Alice constructs the hop payload for Chan. This will tell Chan how to setup an outgoing HTLC to Dina.

The hop payload for Chan includes three fields: +short_channel_id+, +amt_to_forward+ and +outgoing_cltv_value+:

----
short_channel_id: 010002010a42be
amt_to_forward: 50,000,000
outgoing_cltv_value: 700,018
----

Alice serializes this payload in TLV format, as shown (simplified) in <<chan_onion_payload>> below:

[[chan_onion_payload]]
.Chan's payload is constructed by Alice
image::images/chan_onion_payload.png[Chan's payload is constructed by Alice]

===== Hop payload for Bob

Finally, Alice constructs the hop payload for Bob, which also contains the same three fields as the hop payload for Chan, but with different values:

----
short_channel_id: 000004040a61f0
amt_to_forward: 50,100,000
outgoing_cltv_value: 700,038
----

As you can see, the +amt_to_forward+ field is 50,100,000 milli-satoshis, or 50,100 satoshis. That's because Chan expects a fee of 100 satoshis to route a payment to Dina. In order for Chan to "earn" that routing fee, Chan's incoming HTLC must be 100 satoshis more than Chan's outgoing HTLC. Since Chan's incoming HTLC is Bob's outgoing HTLC, the instructions to Bob reflect the fee Chan earns. In simple terms, Bob needs to be told to send 50,100 satoshi to Chan, so that Chan can send 50,000 satoshi and keep 100 satoshi.

Similarly, Chan expects a timelock delta of 20 blocks. SO Chan's incoming HTLC must expire 20 blocks *later* than Chan's outgoing HTLC. To achieve this, Alice tells Bob to make his outgoing HTLC to Chan expire at block height 700,038 - 20 blocks later than Chan's HTLC to Dina.

[TIP]
====
Fees and timelock delta expectations for a channel are set by the difference between incoming and outgoing HTLCs. Since the incoming HTLC is created by the _preceding node_, the fee and timelock delta is set in the onion payload to that preceding node. Bob is told how to make an HTLC that meets Chan's fee and timelock expectations.
====

Alice serializes this payload in TLV format, as shown (simplified) in <<bob_onion_payload>> below:

[[bob_onion_payload]]
.Bob's payload is constructed by Alice
image::images/bob_onion_payload.png[Bob's payload is constructed by Alice]

===== Finished hop payloads

Alice has now built the three hop payloads that will be wrapped in an onion. A simplified view of the payloads is shown in <<onion_hop_payloads>>, below:

[[onion_hop_payloads]]
.Hop payloads for all the hops
image::images/onion_hop_payloads.png[Hop payloads for all the hops]

==== Key generation

Alice must now generate several keys that will be used to encrypt the various layers in the onion.

Remember the goals of onion routing, that Alice can achieve with these keys:

* Alice can encrypt each layer of the onion so that only the intended recipient can read it.
* Every intermediary can check that the message is not modified.
* No one in the path will know who sent this onion or where it is going. Alice doesn't reveal her identity as the sender or Dina's identity as the recipient of the payment.
* Each hop only learns about the previous and next hop.
* No one can know how long the path is, or where  in the path they are.

[WARNING]
====
Like a chopped onion, the following technical details may bring tears to your eyes. Feel free to skip to the next section if you get confused. Come back to this and read BOLT #4 if you want to learn more.
====


The basis for all the keys used in the onion, is a _shared secret_ that Alice and Bob can both generate independently using the Elliptic Curve Diffie-Hellman (ECDH) algorithm. From the shared secret (ss), they can independently generate four additional keys named rho, mu, um and pad:

rho:: Used to generate a stream of random bytes from a stream cipher (used as a
CSPRNG). These bytes are used to encrypt/decrypt the message body as well as
filler zero bytes during sphinx packet processing.

mu:: Used in the Hash-Based Message Authentication Code (HMAC) for integrity/authenticity verification.

um:: Used in error reporting.

pad:: Used to generate filler bytes for padding the onion to a fixed length.

The relationship between the various keys and how they are generated is shown in the diagram <<onion_keygen>>, below:

[[onion_keygen]]
.Onion Key Generation
image::images/onion_keygen.png[Onion Key Generation]

[[session_key]]
===== Alice's session key

To avoid revealing her identity, Alice does not use her own node's public key in building the onion. Instead, Alice creates a temporary 32-byte (256-bit) key called the _session private key_ and corresponding _session public key_. This serves as a temporary "identity" and key *for this onion only*. From this session key, Alice will build all the other keys that will be used in this onion.

[[keygen_details]]
===== Key generation details
The key generation, random byte generation, ephemeral keys and how they are used in packet construction are specified in three sections of BOLT #4:

* https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#key-generation[Key Generation]
* https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#pseudo-random-byte-stream[Random Byte Stream]
* https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#packet-construction[Packet Construction]

For simplicity and to avoid getting too technical, we have not included these details in the book. See the links above if you want to see the inner workings.

[[shared_secret]]
===== Shared secret generation

One important detail that seems almost magical is the ability for Alice to create a _shared secret_ with another node simply by knowing their public keys. This is based on the invention of Diffie-Hellman key exchange (DH) in the 1970s that revolutionized cryptography. Lightning Onion Routing uses Elliptic Curve Diffie-Hellman (ECDH) on Bitcoin's +secp256k1+ curve. It's such a cool trick that we try to explain it in simple terms in <<ecdh_explained>>


// To editor: Maybe put this in an appendix instead of a sidebar?

[[ecdh]]
.Elliptic Curve Diffie-Hellman (ECDH) explained
****
Assume Alice's private key is +a+ and Bob's private key is +b+. Using the Elliptic Curve, they multiply each private key by the generator point +G+ to produce their public keys +A+ and +B+ respectively:

 A = aG

 B = bG

Now Alice and Bob can create a shared secret +ss+, a value that they can both calculate independently without exchanging any information, such that

 ss = aB = bA

Follow along, as we demonstrate the math that proves this is possible:

 ss

 = aB

calculated by Alice who knows both +a+ and +B+

 = a(bG)

because we know

 = (ab)G

because of associativity

 = (ba)G

because the curve is an abelian group

 = b(aG)

because of associativity

 = bA

can be calculated by Bob who knows +b+ and +A+


We have therefore shown that

 ss = aB = bA

Alice can multiple her private key with Bob's public key to calculate +ss+

Bob can multiply his private key with Alice's public key to calculate +ss+

Thus, they will both get the same result which they can use as a shared key to symmetrically encrypt secrets between the two of them without communicating the shared secret.

****

[[wrapping_the_onion]]
=== Wrapping the onion layers

The process of wrapping the onion is detailed in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#packet-construction[BOLT #4 - Onion Routing - Packet Construction].

In this section we will describe this process at a high-level and somewhat simplified - omitting certain details.


[[fixed_length_onions]]
==== Fixed-length Onions

We've mentioned the fact that none of the "hop" nodes know how long the path is, or where they are in the path. How is this possible?

If you have a set of directions, even if encrypted, can't you tell how far you are from the beginning or end simply by looking at *where* in the list of directions you are?

The "trick" used in onion routing is to always make the path (the list of directions), the same length for every node. This is achieved by keeping the onion packet the same length at every step.

At each hop, the hop payload appears at the beginning of the onion payload, followed by _what seem to be_ 19 more hop payloads. Every hop sees itself as the first of 20 hops.

[TIP]
====
The onion payload is 1300 bytes. Each hop payload is 65 bytes or less (padded to 65 bytes if less). So the total onion payload can fit 20 hop payloads (1300 = 20 * 65). The maximum onion routed path is therefore 20 hops.
====

As each layer is "peeled off", more filler data (essentially junk) is added at the end of the onion payload so the next hop gets an onion of the same size and is once again the "first hop" in the onion.

The onion size is 1366 bytes structured as shown in <<onion_packet>> below:

[[onion_packet]]
.The onion packet
image::images/onion_packet.png[]

* 1 byte: A version byte
* 33 bytes: A compressed public session key (<<session_key>>) from which the per-hop shared secret (<<shared_secret>>) can be generated without revealing Alice's identity
* 1300 bytes: The actual _onion payload_ containing the instructions for each hop
* 32 bytes: An HMAC integrity checksum

A unique trait of Sphinx as a mix-net packet format is that rather than include a distinct session key for each hop in the route, which would increase the size of the mix-net packet dramatically, instead a clever _blinding_ scheme is used deterministically randomize the session key at each hop.

In practice, this little trick allows us to keep the onion packet as compact as possible while still retaining the desired security properties.


==== Wrapping the onion (outlined)

Here is the process of wrapping the onion, outlined below. Come back to this list, as we explore each step with our real-world example.

For each hop the sender (Alice) repeats the same process:

1. Alice generates the per-hop shared secret and the rho, mu, and pad keys

1. Alice generates 1300 bytes of filler and fills the 1300-byte onion payload field with this filler.

1. Alice calculates the HMAC for the hop payload (zeros for the final hop).

1. Alice calculates the length of the hop payload + HMAC + space to store the length itself

1. Alice _right shifts_ the onion payload by the calculated space needed to fit the hop payload. The rightmost "filler" data is discarded, making enough space on the left for the payload.

1. Alice inserts the length + hop payload + HMAC at the front of the payload field in the space made from shifting the filler.

1. Alice uses the _rho_ key to generate a 1300 byte one-time-pad.

1. Alice obfuscates the entire onion payload by XOR-ing with the bytes generated from rho.

1. Alice calculates the HMAC of the onion payload, using the mu key.

1. Alice adds the session public key (so that the hop can calculate the shared secret)

1. Alice adds the version number.

Next, Alice repeats the process. The new keys are calculated, the onion payload is shifted (dropping more junk), the new hop payload is added to the front and the whole onion payload encrypted with the rho byte-stream for the next hop.

For the final hop, the HMAC included in step #3 over the plaintext instructions is actually _all zero_.
The final hop uses this signal to determine that it is indeed the final hop in the route.
Alternatively, the fact that the `short_chan_id` included in the payload to denote the "next hop" is all zero can be used as well.

Note that at each phase the _mu_ key is used to generate an HMAC over the _encrypted_ (from the PoV of the node processing the payload) onion packet, as well as over the contents of the packet with a single layer of encryption removed.
This outer HMAC allows the node processing the packet to verify the integrity of the onion packet (no bytes modified).
The inner HMAC is then revealed during the inverse of the "shift and encrypt" routine described above, which serves as the _outer_ HMAC for the next hop.

==== Wrapping Dina's hop payload

As a reminder, the onion is wrapped by starting at the end of the path from Dina, the final node or recipient. Then the path is built in reverse all the way back to the sender, Alice.

Alice starts with an empty 1300 byte field, the fixed-length _onion payload_. Then, Alice fills the onion payload with a pseudo-random byte stream "filler", that is generated from the +pad+ key.

[NOTE]
====
Random byte-stream generation uses the ChaCha20 algorithm, as a Cryptographic Secure Pseudo-Random Number Generator (CSPRNG). Such an algorithm will generate a deterministic, long non-repeating stream of seemingly random bytes from an initial seed. The details are specified in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#pseudo-random-byte-stream[BOLT #4 - Onion Routing - Pseudo Random Byte Stream].
====

This is shown in <<onion_payload_filler>>:

[[onion_payload_filler]]
.Filling the onion payload with a random byte-stream
image::images/onion_payload_filler.png[]

Alice will now insert Dina's hop payload into the left side of the 1300 byte array, shifting the filler to the right and discarding anything that overflows. This is visualized in <<onion_add_dina>>:

[[onion_add_dina]]
.Adding Dina's hop payload
image::images/onion_add_dina.png[]

Another way to look at this is that Alice measures the length of Dina's hop payload, shifts the filler right to create an equal space in the left side of the onion payload and inserts Dina's payload in that space.

Next row down we see the result: the 1300 byte onion payload contains Dina's hop payload and then the filler byte-stream filling up the rest of the space.

Next, Alice obfuscates the entire onion payload so that *only Dina* can read it.

To do this, Alice generates a byte-stream using the +rho+ key (which Dina also knows). Alice uses a bitwise exclusive-or (XOR) between the bits of the onion payload and the byte-stream created from +rho+. The result appears like a random (or encrypted) byte stream of 1300 bytes length. This step is shown in <<onion_obfuscate_dina>>:

[[onion_obfuscate_dina]]
.Obfuscating the onion payload
image::images/onion_obfuscate_dina.png[]

One of the properties of XOR is that if you do it twice you get back to the original data. As we will see in the "unwrapping" section, if Dina applies the same XOR operation with the byte-stream generated from +rho+, it will reveal the original onion payload.

[TIP]
====
XOR is an _involutory_ function which means that if it is applied twice it undoes itself. Specifically XOR(XOR(a, b), b) = a. This property is used extensively in symmetric-key cryptography.
====

Since only Alice and Dina have the +rho+ key (derived from Alice and Dina's shared secret), only they can do this. Effectively, this encrypts the onion payload for Dina's eyes only.

Finally, Alice calculates a Hash-based Message Authentication Code (HMAC) for Dina's payload, which uses the _mu_ key as it's initialization key. This is shown in <<dina_hop_payload_hmac>>:

[[dina_hop_payload_hmac]]
.Adding an HMAC integrity checksum to Dina's hop payload
image::images/dina_hop_payload_hmac.png[]

===== Onion Routing Replay Protection & Detection

The HMAC acts as a secure checksum and helps Dina verify the integrity of the hop payload. The 32-byte HMAC is appended to Dina's hop payload.
Note that we compute the HMAC over the _encrypted_ data rather then over the plaintext data.
This is known as "encrypt-then-mac" and is the recommended way to use a MAC, as it provides both plaintext _and_ cihpertext integrity.

Modern authenticated encryption also allows for the use of an optional set of plaintext bytes to also be authenticated known as "associated data".
In practice, this is usually something like a plaintext packet header or other auxiliary information.
By including this associated data in the payload to be authenticated (MAC'd), the verifier of the MAC ensures that this associated data hasn't been tampered with (eg: swapping out the plaintext header on an encrypted packet).

In the context of the Lightning Network, this associated data is used to _strengthen_ the replay protection of this scheme.
As we'll learn more below, replay protection ensures that an attacker can't _retransmit_ (replay) a packet into the network and observe its the resulting path.
Instead, intermediate nodes are able to use the defined replay protection measures to detect, and reject a replayed packet.
The base Sphinx packet format uses a log of all the ephemeral secret keys used to detect replays.
If a secret key is ever used again, then the node can detect it, and reject the packet.

The nature of HTLCs in the Lightning Network allows us to further strengthen the replay protection by adding an additional _economic_ incentive.
Remember that the payment hash of an HTLC can only ever safely be used (for a complete payment) once.
If a payment hash is used again, and traverses a node that has already seen the payment secret fo that hash, then they can simply pull the funds, and collect the entire payment amount without forwarding!
We can use this fact to strengthen the replay protection by requiring that the _payment hash_ is included in our HMAC computation as the associated data.
With this added step, attempting to replay an onion packet also requires the sender to commit to using the _same_ payment hash.
As a result, on top of the normal replay protection, an attacker also stands to lose the entire amount of the HTLC replayed as well.

One consideration with the ever-increasing set of session keys stored for replay protection is: are nodes able to reclaim this space?
In the context of the Lightning Network, the answer is: yes!
Once again, due to the uniquer attributes of the HTLC construct, we can make a further improvement over the base Sphinx protocol.
Given that HTLCs are _time locked_ contracts based on the absolute block height, once an HTLC has expired, then the contract is effectively permanently closed.
As a result, nodes can use this CLTV expiration height as an indicator to know when it's safe to garbage collect an entry in the anti-replay log.

==== Wrapping Chan's hop payload

In <<chan_onion_wrapping>> we see the steps used to wrap Chan's hop payload in the onion. These are the same steps Alice used to wrap Dina's hop payload.

[[chan_onion_wrapping]]
.Wrapping the onion for Chan
image::images/chan_onion_wrapping.png[]

Alice starts with the 1300 onion payload created for Dina. The first 65 (or less) bytes of this are Dina's payload obfuscated and the rest is filler. Alice must be careful not to overwrite Dina's payload.

Alice adds an inner HMAC checksum to Chan's payload and inserts it at the "front" (left side) of the onion payload, shifting the existing payload to the right by an equal amount.
Remember that there are effectively _two_ HMACs used in the scheme: the outer and the inner HMAC.
In this case, Chan's _inner_ HMAC is actually Dina's _outer_ HMAC.

Now Chan's payload is in the front of the onion. When Chan sees this he has no idea how many payloads came before or after. It looks like the 1st of 20 hops always!

Next, Alice obfuscates the entire payload by XOR with the byte-stream generated from the Alice-Chan +rho+ key. Only Alice and Chan have this +rho+ key and only they can produce the byte stream to obfuscate and de-obfuscate the onion.
Finally, as we did in the earlier step, we compute Chan's outer HMAC, which is what she'll use to verify the integrity of the encrypted onion packet.

==== Wrapping Bob's hop payload

In <<bob_onion_wrapping>> we see the steps used to wrap Bob's hop payload in the onion.

All right, by now this is easy!

[[bob_onion_wrapping]]
.Wrapping the onion for Bob
image::images/bob_onion_wrapping.png[]

Start with the onion payload (obfuscated) containing Chan's and Dina's hop payloads.

Include the prior hop's outer HMAC as this hop's inner HMAC.
Insert Bob's hop payload at the beginning and shift everything else over to the right, dropping a Bob-hop-payload-size chunk from the end (it was filler anyway).

Obfuscate the whole thing XOR with the _rho_ key from the Alice-Bob-shared-secret so that only Bob can unwrap this.

Calculate the outer HMAC and stick it on the end of Bob's hop payload.


==== The final onion packet

The final onion payload is read to send to Bob. Alice doesn't need to add any more hop payloads.

Alice calculates an HMAC for the onion payload, to cryptographically secure it with a checksum that Bob can verify. Alice adds a 33-byte public session key that will be used by each hop to generate a shared-secret and the rho, mu, and pad keys. Finally Alice puts the onion version number (+0+ currently) in the front. This allows for future upgrades of the onion packet format.

The result can be seen in <<onion_packet_2>> below:

[[onion_packet_2]]
.The onion packet
image::images/onion_packet.png[]

=== Sending the onion

In this section we will look at how the onion packet is forwarded and how HTLCs are deployed along the path.

==== The update_add_htlc message

Onion packets are sent as part of the +update_add_htlc+ message. If you recall from <<update_add_htlc>>, in <<channel_operations>>, we saw the contents of the +update_add_htlc+ message were as follows:

----
[channel_id:channel_id]
[u64:id]
[u64:amount_msat]
[sha256:payment_hash]
[u32:cltv_expiry]
[1366*byte:onion_routing_packet]
----

You will recall that this message is sent by one channel partner to ask the other channel partner to add an HTLC. This is how Alice will ask Bob to add an HTLC to pay Dina. Now you understand the purpose of the last field +onion_routing_packet+, which is 1366 bytes long. It's the fully-wrapped onion packet we just constructed!

==== Alice sends the onion to Bob

Alice will send the +update_add_htlc+ message to Bob. Let's look at what this message will contain:

channel_id:: This field will contain the Alice-Bob channel ID, which in our example is +0000031e192ca1+ (see <<alice_dina_path_detail>>).

id:: The ID of this HTLC in this channel, starting at +0+.

amount_msat:: The amount of the HTLC, 50,200,000 milli-satoshis.

payment_hash:: The RIPEMD160(SHA256) payment hash, +9e017f6767971ed7cea17f98528d5f5c0ccb2c71+.

cltv_expiry:: The expiry timelock for the HTLC will be 700,058. Alice adds 20 blocks to the expiry set in Bob's payload according to Bob's negotiated +cltv_expiry_delta+.

onion_routing_packet:: The final onion packet Alice constructed with all the hop payloads!

==== Bob checks the onion

As we saw in <<channel_operations>>, Bob will add the HTLC to the commitment transactions and update the state of the channel with Alice.

Bob will unwrap the onion he received from Alice as follows:

1. Bob takes the session key from the onion packet and derives the Alice-Bob shared secret.

1. Bob generates the +mu+ key from the shared secret and uses it to verify the onion packet HMAC checksum.

Now that Bob has generated the shared key and verified the HMAC, he can start unwrapping the 1300 byte onion payload inside the onion packet. The goal is for Bob to retrieve his own hop payload and then forward the remaining onion to the next hop.

If Bob extracts and removes his hop payload, the remaining onion will not be 1300 bytes, it will be shorter! So the next hop will know that they are not the first hop and will be able to detect how long the path is. To prevent this, Bob needs to add more filler to refill the onion.

==== Bob generates filler

Bob generates filler in a slightly different way than Alice, but following the same general principle.

First, Bob *extends* the onion payload by 1300 bytes and filles them with +0+ values. Now the onion packet is 2600 bytes long, with the first half containing the data Alice sent and the next half containing zeroes. This operation is shown in <<bob_extends>>:

[[bob_extends]]
.Bob extends the onion payload by 1300 (zero-filled) bytes
image::images/bob_extends.png[Bob extends the onion payload by 1300 (zero-filled) bytes]

This empty space will become obfuscated and turn into "filler", by the same process that Bob uses to deobfuscate his own hop payload. Let's see how that works.

==== Bob de-obfuscates his hop payload

Next, Bob will generate the +rho+ key from the Alice-Bob shared key. He will use this to generate a 2600 byte stream, using the +ChaCha20+ algorithm.

[TIP]
====
The first 1300 bytes of the byte stream generated by Bob are exactly the same as those generated by Alice using the +rho+ key.
====

Next, Bob applies the 2600 bytes of +rho+ byte stream to the 2600 bytes onion payload with a bitwise XOR operation.

The first 1300 bytes will become de-obfuscated by this XOR operation, because it is the same operation Alice applied and XOR is involutory. So Bob will _reveal_ his hop payload followed by some data that seems scrambled.

At the same time, applying the +rho+ byte stream to the 1300 zeroes that were added to the onion payload will turn them into seemingly random filler data. This operation is shown in <<bob_deobfuscates>>:

[[bob_deobfuscates]]
.Bob de-obfuscates the onion, obfuscates the filler
image::images/bob_deobfuscates.png[Bob de-obfuscates the onion, obfuscates the filler]

==== Bob extracts the outer HAMC for the next hop

Remember that an inner HMAC is included for each hop, then will then become the outer HMAC for the _next_ hop.
In this case, Bob extracts the inner HMAC (he's already verified the integrity of the encrypted packet w/ the outer HMAC), and puts it aside as he'll append it to the deobfuscated packet to allow Chan to verify the HMAC of her encrypted packet.

==== Bob removes his payload and left shifts the onion

Now Bob can remove his hop payload from the front of the onion and left-shift the remaining data. An amount of data equal to Bob's hop payload from the second-half 1300 bytes of filler will now shift into the onion payload space. This is shown in <<bob_removes_shifts>>:

[[bob_removes_shifts]]
.Bob removes the hop payload and left-shifts the rest, filling the gap with new filler
image::images/bob_removes_shifts.png[Bob removes the hop payload and left-shifts the rest, filling the gap with new filler]

Now Bob can keep the first half 1300 bytes, and discard the extended (filler) 1300 bytes.

Bob now has a 1300 byte onion packet to send to the next hop. It is almost identical to the onion payload that Alice had created for Chan, except that the last 65 or so bytes of filler was put there by Bob and will be different.

No one can tell the difference between filler put there by Alice and filler put there by Bob. Filler is filler! It's all random bytes anyway.

==== Bob constructs the new onion packet

Bob now copies the onion payload into the onion packet, appends the outer HMAC for chan, re-randomizes the session key with the Elliptic Curve multiplication operation, and appends a fresh version byte.

////
TODO

Clarify how the HMAC and session key are computed for this onion packet

////

==== Bob verifies the HTLC details

Bob's hop payload contains the instructions needed to create an HTLC for Chan.

In the hop payload, Bob finds a +short_channel_id+, +amt_to_forward+, and +cltv_expiry+.

First, Bob checks to see if he has a channel with that short ID. He finds that he has such a channel with Chan.

Next, Bob confirms that the outgoing amount (50,100 satoshis) is less than the incoming amount (50,200 satoshis) and therefore Bob's fee expectations are met.

Similarly, Bob checks that the outgoing cltv_expiry is less than the incoming cltv_expiry giving Bob enough time to claim the incoming HTLC if there is a breach.

==== Bob sends the update_add_htlc to Chan

Bob now constructs and HTLC to send to Chan, as follows:

channel_id:: This field will contain the Bob-Chan channel ID, which in our example is +000004040a61f0+ (see <<alice_dina_path_detail>>).

id:: The ID of this HTLC in this channel, starting at +0+.

amount_msat:: The amount of the HTLC, 50,100,000 milli-satoshis.

payment_hash:: The RIPEMD160(SHA256) payment hash, +9e017f6767971ed7cea17f98528d5f5c0ccb2c71+. This is the same as the payment hash from Alice's HTLC.

cltv_expiry:: The expiry timelock for the HTLC will be 700,038.

onion_routing_packet:: The onion packet Bob reconstructed after removing his hop payload.

==== Chan forwards the onion

Chan repeats the exact same process as Bob:

1. Chan receives the +update_add_htlc+ and processes the HTLC request, adding it to commitment transactions.

1. Chan generates the Alice-Chan shared key and the +mu+ subkey

1. Chan verifies the onion packet HMAC, then extracts the 1300 onion payload

1. Chan extends the onion payload by 1300 extra bytes, filling it with zeroes.

1. Chan uses the +rho+ key to produce 2600 bytes.

1. Chan uses the generated byte stream to XOR and de-obfuscate the onion payload. Simultaneously the XOR operation obfuscates the extra 1300 zeroes, turning them into filler

1. Chan extracts the inner HMAC in the payload, which will become the outer HAMC for Dina.

1. Chan removes his hop payload and left-shifts the onion payload by the same amount. Some of the filler generated in the 1300 extended bytes move into the first-half 1300 bytes becoming part of the onion payload.

1. Chan constructs the onion packet for Dina with this onion payload.

1. Chan builds an +update_add_htlc+ message for Dina and inserts the onion packet into it.

1. Chan sends the +update_add_htlc+ to Dina

==== Dina receives the final payload

When Dina receives the +update_add_htlc+ message from Chan, she knows from the +payment_hash+ that this is a payment for her. She knows she is the last hop in the onion.

Dina follows the exact same process as Bob and Chan to verify and unwrap the onion, except she doesn't construct new filler and doesn't forward anything. Instead, Dina responds to Chan with +update_fulfill_htlc+ to redeem the HTLC. The +update_fulfill_htlc+ will flow backwards along the path until it reaches Alice. All the HTLCs are redeemed and channel balances are updated. The payment is complete!

=== Returning errors

This far we've looked at the forward propagation of the onion establishing the HTLCs, and the backwards propagation of the payment secret unwinding the HTLCs once payment is successful.

There is another very important function of onion routing which is _error return_. If there is a problem with the payment, onion, or hops, we must propagate an error backwards to inform all nodes of the failure and unwind any HTLCs.

Errors generally fall into 3 categories: onion failures, node failures and channel failures. These furthermore may be subdivided into permanent and transient errors. Finally, some errors contain channel updates to help with future payment delivery attempts.

[NOTE]
====
Unlike messages in the Peer-to-Peer (P2P) protocol (defined in BOLT #2), errors are not sent as P2P messages but wrapped inside onion return packets and following the reverse of the onion path (back-propagating).
====

Error return is defined in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#returning-errors[BOLT #4 - Onion Routing - Returning Errors].

Errors are encoded by the returning node (the one that discovered an error) encoded in a _return packet_ a follows:

----
    [32*byte:hmac]
    [u16:failure_len]
    [failure_len*byte:failuremsg]
    [u16:pad_len]
    [pad_len*byte:pad]
----

The return packet HMAC verification checksum is calculated with the +um+ key, generated from the shared secret established by the onion.

[TIP]
====
The "um" key name is the reverse of the "mu" name, indicating the same use but in the opposite direction (back-propagation)
====

Next, the returning node generates a +ammag+ (inverse of the word "gamma") key and obfuscates the return packet using an XOR operation with a byte-stream generated from +ammag+.

Finally the return node sends the return packet to the hop from which it received the original onion.

Each hop receiving an error, will generate an +ammag+ key and obfuscate the return packet again using an XOR operation with the byte-stream from +ammag+.

Eventually, the sender (origin node) receives a return packet. It will then generate +ammag+ and +um+ keys for each hop and XOR de-bfuscate the return error iteratively until it reveals the return packet.

==== Failure Messages

The +failuremsg+ is defined in https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#failure-messages[BOLT #4 - Onion Routing - Failure Messages].

A failure message consists of a 2-byte +failure code+ followed by the data applicable to that failute type.

The top byte of the +failure_code+ is a set of binary flags that can be combined (with binary OR):


* 0x8000 (BADONION): unparsable onion encrypted by sending peer
* 0x4000 (PERM): permanent failure (otherwise transient)
* 0x2000 (NODE): node failure (otherwise channel)
* 0x1000 (UPDATE): new channel update enclosed


The following failure types are currently defined:

include::failure_types_table.asciidoc[]

[[stuck_payments]]
===== Stuck payments

In the current implementation of the Lightning Network there is a possibility that a payment attempt becomes "stuck", neither fulfilled nor cancelled by an error. This can happen due to a bug on an intermediary node, a node going offline while handling HTLCs, or a malicious node holding HTLCs without reporting an error. In all of these cases, the HTLC cannot be resolved until it expires. The timelock (CLTV) that is set on every HTLC helps resolve this condition (among other possible HTLC routing and channel failures).

However, this means that the sender of the HTLC has to wait until expiry, and the funds committed to that HTLC remain unavailable until the HTLC expires. Furthermore, the sender *cannot retry* that same payment, because if they do they run the risk of *both* the original and the retried payment succeeding - the recipient gets paid twice. This is because once sent an HTLC cannot be "cancelled" by the sender - it either has to fail or expire. Stuck payments, while rare, create an unwanted user experience, where the user's wallet cannot pay or cancel a payment.

One proposed solution to this problem is called "stuckless payments", and it depends on Point Time-Locked Contracts (PTLCs), which are payment contract that use a different cryptographic primitive than HTLCs (i.e. point addition on the elliptic curve instead of a hash and secret pre-image). PTLCs are cumbersome using ECDSA but much easier with Bitcoin's Taproot and Schnorr Signature features which were recently locked in for activation in November 2021. It is expected that PTLCs will be implemented in the Lightning Network after these Bitcoin features become activated.

=== Conclusion

The Lightning Network's onion routing protocol is adapted from the Sphinx Protocol to better serve the needs of a payment network. As such, it offers a huge improvement in privacy and counter-surveillance compared to the public and transparent Bitcoin blockchain.