diff --git a/wire_protocol.asciidoc b/wire_protocol.asciidoc new file mode 100644 index 0000000..461a000 --- /dev/null +++ b/wire_protocol.asciidoc @@ -0,0 +1,1085 @@ += Wire Protocol: Framing & Extensibility + +== Intro + +In this chapter, we'll dive into the wire protocol of the Lightning Network, +and also cover the various extensibility levers that have been built into +the protocol. By the end of this chapter, an aspiring reader should be able to +write their very own wire protocol parser for the Lightning Network. In addition +to being able to write a custom wire protocol parser, a reader of this chapter +will gain a deep understanding of the various upgrade mechanisms +that have been built into the protocol. + +== Wire Framing + +First, we begin by describing the high-level structure of the wire _framing_ +within the protocol. When we say framing, we mean the way that the bytes are +packed on the wire to _encode_ a particular protocol message. Without knowledge +of the framing system used in the protocol, a string of bytes on the wire would +resemble a series of random bytes, as no structure has been imposed. By applying +proper framing to decode these bytes on the wire, we'll be able to extract +structure and finally parse this structure into protocol messages within our +higher-level language. + +It's important to note that as the Lightning Network is an _end to end +encrypted_ protocol, the wire framing is itself encapsulated within an +_encrypted_ message transport layer. As we learned in chapter XXX, the Lightning +Network uses Brontide, a custom variant of the Noise protocol, to handle +transport encryption. Within this chapter, whenever we give an example of wire +framing, we assume the encryption layer has already been stripped away (when +decoding), or that we haven't yet encrypted the set of bytes before we send +them on the wire (encoding). 
+ +=== High-Level Wire Framing + +With that said, we're ready to begin describing the high-level schema used to +encode messages on the wire: + + * Messages on the wire begin with a _2 byte_ type field, followed by a + message payload. + * The message payload itself can be up to 65 KB in size. + * All integers are encoded in big-endian (network order). + * Any bytes that follow after a defined message can be safely ignored. + +Yep, that's it. As the protocol relies on an _encapsulating_ transport protocol +encryption layer, we don't need an explicit length for each message type. This +is due to the fact that transport encryption works at the _message_ level, so +by the time we're ready to decode the next message, we already know the total +number of bytes of the message itself. Using 2 bytes for the message type +(encoded in big-endian) means that the protocol can have up to `2^16` or +`65536` distinct message types. Continuing, as we know all messages _MUST_ be below +65 KB, this simplifies our parsing as we can use a _fixed_ sized buffer, and +maintain strong bounds on the total amount of memory required to parse an +incoming wire message. + +The final bullet point allows for a degree of _backwards_ compatibility, as new +nodes are able to provide information in the wire messages that older nodes +(which may not understand them) can safely ignore. As we'll see below, this +feature combined with a very flexible wire message extensibility format also +allows the protocol to achieve _forwards_ compatibility as well. + +=== Type Encoding + +With this high-level background provided, we'll now start at the most primitive +layer: parsing primitive types. In addition to encoding integers, the Lightning +Protocol also allows for encoding of a vast array of types including: variable +length byte slices, elliptic curve public keys, Bitcoin addresses, and +signatures. 
When we describe the _structure_ of wire messages later in this +chapter, we'll refer to the high-level type (the abstract type) rather than the +lower-level representation of said type. In this section, we'll peel back this +abstraction layer to ensure our future wire parser is able to properly +encode/decode any of the higher-level types. + +In the following table, we'll map the higher-level name of a given type to the +framing used to encode/decode the type. + +.High-level message types +[options="header"] +|================================================================================ +| High Level Type | Framing | Comment +| `node_alias` | A 32-byte fixed-length byte slice. | When decoding, reject if contents are not a valid UTF-8 string. +| `channel_id` | A 32-byte fixed-length byte slice that maps an outpoint to a 32 byte value. | Given an outpoint, one can convert it to a `channel_id` by taking the txid of the outpoint and XOR'ing it with the index (interpreted as the lower 2 bytes). +| `short_chan_id` | An unsigned 64-bit integer (`uint64`) | Composed of the block height (24 bits), transaction index (24 bits), and output index (16 bits) packed into 8 bytes. +| `milli_satoshi` | An unsigned 64-bit integer (`uint64`) | Represents one thousandth of a satoshi. +| `satoshi` | An unsigned 64-bit integer (`uint64`) | The base unit of bitcoin. +| `pubkey` | A secp256k1 public key encoded in _compressed_ format, occupying 33 bytes. | Occupies a fixed 33-byte length on the wire. +| `sig` | An ECDSA signature of the secp256k1 Elliptic Curve. | Encoded as a _fixed_ 64-byte byte slice, packed as `R \|\| S` +| `uint8` | An 8-bit integer. | +| `uint16` | A 16-bit integer. | +| `uint64` | A 64-bit integer. | +| `[]byte` | A variable length byte slice. | Prefixed with a 16-bit integer denoting the length of the bytes. +| `color_rgb` | RGB color encoding. 
| Encoded as a series of 8-bit integers. +| `net_addr` | The encoding of a network address. | Encoded with a 1 byte prefix that denotes the type of address, followed by the address body. +|================================================================================ + +In a later section, we'll describe the structure of each of the wire messages +including the prefix type of the message along with the contents of its message +body. + +=== Type Length Value (TLV) Message Extensions + +Earlier in this chapter we mentioned that messages can be up to 65 KB in size, +and that if extra bytes are left over while parsing a message, then those bytes +are to be _ignored_. At an initial glance, this requirement may appear to be +somewhat arbitrary; however, upon close inspection, +this requirement allows for decoupled, desynchronized evolution of the Lightning +Protocol itself. We'll opine further upon this notion towards the end of the +chapter. First, we'll turn our attention to exactly what those "extra bytes" at +the end of a message can be used for. + +==== The Protocol Buffer Message Format + +The Protocol Buffer (protobuf) message serialization format started out as an +internal format used at Google, and has blossomed into one of the most popular +message serialization formats used by developers globally. The protobuf format +describes how a message (usually some sort of data structure related to an API) +is to be encoded on the wire and decoded on the other end. Several "protobuf +compilers" exist in dozens of languages, which act as a bridge that allows any +language to encode a protobuf that can be decoded by a compliant decoder +in another language. Such cross-language data structure compatibility allows +for a wide range of innovation, as it's possible to transmit structured and even +typed data across language and abstraction boundaries. 
+ +Protobufs are also known for their _flexibility_ with respect to how they +handle changes in the underlying message structure. As long as the field +numbering schema is adhered to, it's possible for a _newer_ writer of +protobufs to include information within a protobuf that may be unknown to any +older readers. When the old reader encounters the new serialized format, if +there are types/fields that it doesn't understand, then it simply _ignores_ +them. This allows old clients and new clients to _co-exist_, as all clients can +parse _some_ portion of the newer message format. + +==== Forwards & Backwards Compatibility + +Protobufs are extremely popular amongst developers as they have built-in +support for both _forwards_ and _backwards_ compatibility. Most developers are +likely familiar with the concept of backwards compatibility. In simple terms, +the principle states that any changes to a message format or API should be +done in a manner that doesn't _break_ support for older clients. Within our +protobuf extensibility examples above, backwards compatibility is achieved by +ensuring that new additions to the proto format don't break the known portions +of older readers. Forwards compatibility, on the other hand, is just as important +for desynchronized updates; however, it's less commonly known. For a change to +be forwards compatible, clients are to simply _ignore_ any information +they don't understand. The soft fork mechanism of upgrading the Bitcoin +consensus system can be said to be both forwards and backwards compatible: any +clients that don't update can still use Bitcoin, and if they encounter any +transactions they don't understand, then they simply ignore them as their funds +aren't using those new features. 
+ +==== Lightning's Protobuf Inspired Message Extension Format: `TLV` + +In order to be able to upgrade messages in both a forwards and backwards +compatible manner, in addition to feature bits (more on that later), the LN +utilizes a _custom_ message serialization format plainly called Type Length +Value, or TLV for short. The format was inspired by the widely used protobuf +format and borrows many concepts while significantly simplifying the +implementation as well as the software that interacts with message parsing. A +curious reader might ask "why not just use protobufs"? In response, the +Lightning developers would reply that we're able to have the best of the +extensibility of protobufs while also having the benefit of a smaller +implementation and thus attack surface in the context of Lightning. As of +version v3.15.6, the protobuf compiler weighs in at over 656,671 lines of code. +In comparison, lnd's implementation of the TLV message format weighs in at only +2.3k lines of code (including tests). + +With the necessary background presented, we're now ready to describe the TLV +format in detail. A TLV message extension is said to be a _stream_ of +individual TLV records. A single TLV record has three components: the type of +the record, the length of the record, and finally the opaque value of the +record: + + * `type`: An integer representing the name of the record being encoded. + * `length`: The length of the record. + * `value`: The opaque value of the record. + +Both the `type` and `length` are encoded using a variable sized integer that's +inspired by the variable sized integer (varint) used in Bitcoin's p2p protocol. +This variant is called `BigSize` for short. In its fullest form, a `BigSize` +integer can represent values up to 64 bits. In contrast to Bitcoin's varint +format, the `BigSize` format instead encodes integers using a _big endian_ byte +ordering. + +The `BigSize` varint has two components: the discriminant and the body. 
In the +context of the `BigSize` integer, the discriminant communicates to the decoder +the _size_ of the variable size integer. Remember that the unique thing about +variable sized integers is that they allow a parser to use fewer bytes to encode +smaller integers than larger ones. This allows message formats to save space, as +they're able to minimally encode numbers from 8 to 64 bits. Encoding a `BigSize` +integer can be defined using a piece-wise function that branches based on the +size of the integer to be encoded. + + * If the value is _less than_ `0xfd` (`253`): + * Then the discriminant isn't really used, and the encoding is simply the + integer itself. + + * This allows us to encode very small integers with no additional + overhead. + + * If the value is _less than or equal to_ `0xffff` (`65535`): + * Then the discriminant is encoded as `0xfd`, which indicates that the body + that follows is at least `0xfd`, but no larger than `0xffff`. + + * The body is then encoded as a _16-bit_ integer. Including the + discriminant, we can encode a value from 253 up to + 65535 using _3 bytes_. + + * If the value is _less than or equal to_ `0xffffffff` (`4294967295`): + * Then the discriminant is encoded as `0xfe`. + + * The body is encoded using a _32-bit_ integer. Including the discriminant, + we can encode a value up to `4,294,967,295` using _5 + bytes_. + + * Otherwise, the discriminant is encoded as `0xff` and the body is encoded as a full _64-bit_ integer, for a total of _9 bytes_. + + +Within the context of a TLV message, +`type` values below `2^16` are said to be _reserved_ for future use. Values beyond this +range are to be used for "custom" message extensions used by higher-level +application protocols. The `value` is defined in terms of the `type`. In other +words, it can take any form, as parsers will attempt to coalesce it into a +higher-level type (such as a signature) depending on the context of the type +itself. 
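The piece-wise `BigSize` rules above can be sketched in a few lines of Python. This is a minimal illustration rather than any implementation's actual code: the function names `encode_bigsize` and `decode_bigsize` are hypothetical, and the decoder's rejection of encodings that use more bytes than necessary is a choice made here to keep encodings unambiguous.

```python
import struct
from typing import Tuple


def encode_bigsize(value: int) -> bytes:
    """Encode an unsigned integer below 2**64 using the BigSize rules."""
    if not 0 <= value < 2**64:
        raise ValueError("BigSize only covers unsigned 64-bit integers")
    if value < 0xFD:
        return struct.pack(">B", value)            # no discriminant, 1 byte
    if value <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", value)  # 3 bytes in total
    if value <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", value)  # 5 bytes in total
    return b"\xff" + struct.pack(">Q", value)      # 9 bytes in total


def decode_bigsize(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes_consumed), rejecting non-minimal encodings."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < 0xFD:
        return first, 1
    # discriminant -> (body size, struct format, smallest value allowed)
    size, fmt, minimum = {
        0xFD: (2, ">H", 0xFD),
        0xFE: (4, ">I", 0x10000),
        0xFF: (8, ">Q", 0x100000000),
    }[first]
    body = data[1:1 + size]
    if len(body) != size:
        raise ValueError("truncated BigSize body")
    (value,) = struct.unpack(fmt, body)
    if value < minimum:
        raise ValueError("non-minimal BigSize encoding")
    return value, 1 + size
```

Returning the number of bytes consumed lets a caller walk a buffer that holds several `BigSize` values back to back, which is how the `type` and `length` of a TLV record are laid out.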
+ +One issue with the protobuf format is that encodings of the same message may +output an entirely different set of bytes when encoded by two different +versions of the compiler. Such instances of a non-canonical encoding are not +acceptable within the context of Lightning, as many messages contain a +signature of the message digest. If it's possible for a message to be encoded +in two different ways, then it would be possible to break the authentication of +a signature inadvertently by re-encoding a message using a slightly different +set of bytes on the wire. + +In order to ensure that all encoded messages are canonical, the following +constraints are defined when encoding: + + * All records within a TLV stream MUST be encoded in order of strictly + increasing type. + + * All records must _minimally encode_ the `type` and `length` fields. In + other words, the smallest BigSize representation for an integer MUST be + used at all times. + + * Each `type` may only appear _once_ within a given TLV stream. + +In addition to these writing requirements, a series of higher-level +interpretation requirements are also defined based on the _parity_ of a given +`type` integer. We'll dive further into these details towards the end of the +chapter once we've talked about how the Lightning Protocol is upgraded in practice +and in theory. + + +=== Wire Messages + +In this section, we'll outline the precise structure of each of the wire +messages within the protocol. We'll do so in two parts: first we'll enumerate +all the currently defined wire message types along with the message name +corresponding to that type, we'll then double back and define the structure of +each of the wire messages (partitioned into logical groupings). 
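Every message in the enumeration that follows shares the same outer framing: a 2-byte big-endian type followed by the payload. As a rough sketch, assuming the transport layer has already decrypted the message, that outer layer could be handled as shown here. The names `build_message`, `split_message`, and `MAX_MESSAGE_SIZE` are hypothetical, and the exact byte bound is an assumption derived from the "65 KB" limit mentioned earlier.

```python
import struct
from typing import Tuple

# Assumed upper bound for one decrypted message (type field plus payload),
# based on the "up to 65 KB" limit described earlier in the chapter.
MAX_MESSAGE_SIZE = 65535


def build_message(msg_type: int, payload: bytes) -> bytes:
    """Prefix a payload with its 2-byte big-endian message type."""
    message = struct.pack(">H", msg_type) + payload
    if len(message) > MAX_MESSAGE_SIZE:
        raise ValueError("message exceeds the maximum size")
    return message


def split_message(message: bytes) -> Tuple[int, bytes]:
    """Split a decrypted message into (type, payload)."""
    if len(message) < 2:
        raise ValueError("message is too short to hold a type field")
    if len(message) > MAX_MESSAGE_SIZE:
        raise ValueError("message exceeds the maximum size")
    (msg_type,) = struct.unpack(">H", message[:2])
    return msg_type, message[2:]
```

A parser would typically dispatch on the returned type integer and hand the payload to a message-specific decoder.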
+ +First, we'll lead with an enumeration of all the currently defined types: + +.Message Types +[options="header"] +|============================================================================== +| Type Integer | Message Name | Category +| 16 | `init` | Connection Establishment +| 17 | `error` | Error Communication +| 18 | `ping` | Connection Liveness +| 19 | `pong` | Connection Liveness +| 32 | `open_channel` | Channel Funding +| 33 | `accept_channel` | Channel Funding +| 34 | `funding_created` | Channel Funding +| 35 | `funding_signed` | Channel Funding +| 36 | `funding_locked` | Channel Funding + Channel Operation +| 38 | `shutdown` | Channel Closing +| 39 | `closing_signed` | Channel Closing +| 128 | `update_add_htlc` | Channel Operation +| 130 | `update_fulfill_htlc` | Channel Operation +| 131 | `update_fail_htlc` | Channel Operation +| 132 | `commit_sig` | Channel Operation +| 133 | `revoke_and_ack` | Channel Operation +| 134 | `update_fee` | Channel Operation +| 135 | `update_fail_malformed_htlc` | Channel Operation +| 136 | `channel_reestablish` | Channel Operation +| 256 | `channel_announcement` | Channel Announcement +| 257 | `node_announcement` | Channel Announcement +| 258 | `channel_update` | Channel Announcement +| 259 | `announce_signatures` | Channel Announcement +| 261 | `query_short_chan_ids` | Channel Graph Syncing +| 262 | `reply_short_chan_ids_end` | Channel Graph Syncing +| 263 | `query_channel_range` | Channel Graph Syncing +| 264 | `reply_channel_range` | Channel Graph Syncing +| 265 | `gossip_timestamp_range` | Channel Graph Syncing +|============================================================================== + +In the above table, the `Category` field allows us to quickly categorize a +message based on its functionality within the protocol itself. At a high level, +we place a message into one of 8 (non exhaustive) buckets including: + + * *Connection Establishment*: Sent when a peer to peer connection is first + established. 
Also used in order to negotiate the set of _features_ supported + by a new connection. + + * *Error Communication*: Used by peers to communicate the occurrence of + protocol level errors to each other. + + * *Connection Liveness*: Used by peers to check that a given transport + connection is still live. + + * *Channel Funding*: Used by peers to create a new payment channel. This + process is also known as the channel funding process. + + * *Channel Operation*: The act of updating a given channel _off-chain_. This + includes sending and receiving payments, as well as forwarding payments + within the network. + + * *Channel Announcement*: The process of announcing a new public channel to + the wider network so it can be used for routing purposes. + + * *Channel Graph Syncing*: The process of downloading and verifying the channel + graph. + + +Notice how messages that belong to the same category typically share an +adjacent _message type_ as well. This is done on purpose in order to group +semantically similar messages together within the specification itself. With +this roadmap laid out, we'll now visit each message category in order to define +the precise structure and semantics of all defined messages within the LN +protocol. + +==== Connection Establishment Messages + +Messages in this category are the very first messages sent between peers once +they establish a transport connection. At the time of writing of this chapter, +there exists only a single message within this category, the `init` message. +The `init` message is sent by _both_ sides of the connection once it has been +first established. No other messages are to be sent before the `init` message +has been sent by both parties. 
+ +The structure of the `init` message is defined as follows: + +`init` message: + + * type: `16` + * fields: + * `uint16`: `global_features_len` + * `global_features_len*byte`: `global_features` + * `uint16`: `features_len` + * `features_len*byte`: `features` + * `tlv_stream`: `tlvs` + +Structurally, the `init` message is composed of two variable size byte slices +that each store a set of _feature bits_. As we'll see later, feature bits are a +primitive used within the protocol in order to advertise the set of protocol +features a node either understands (optional features), or demands (required +features). + +Note that modern node implementations will only use the `features` field, with +items residing within the `global_features` vector primarily for _historical_ +purposes (backwards compatibility). + +What follows after the core message is a series of TLV, or Type Length Value, +records which can be used to extend the message in a forwards and backwards +compatible manner in the future. We covered what TLV records are and how +they're used earlier in this chapter. + +An `init` message is then examined by a peer in order to determine if the +connection is well defined based on the set of optional and required feature +bits advertised by both sides. + +An optional feature means that a peer knows about a feature, but they don't +consider it critical to the operation of a new connection. An example of one +would be something like the ability to understand the semantics of a newly +added field to an existing message. + +On the other hand, a required feature indicates that if the other peer doesn't +know about the feature, then the connection isn't well defined. An example of +such a feature would be a theoretical new channel type within the protocol: if +your peer doesn't know of this feature, then you don't want to keep the +connection as they're unable to open your new preferred channel type. 
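To make the `init` layout concrete, the following sketch decodes the payload of an `init` message (the bytes after the 2-byte type) into its two feature vectors and the trailing TLV bytes. The names `InitMessage`, `read_len_prefixed`, and `parse_init` are hypothetical, and the TLV stream is deliberately returned undecoded, since interpreting it is a separate step.

```python
import struct
from typing import NamedTuple, Tuple


class InitMessage(NamedTuple):
    global_features: bytes
    features: bytes
    tlv_stream: bytes  # raw bytes that follow the core fields, left undecoded


def read_len_prefixed(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read a uint16 length followed by that many bytes; return (body, new_offset)."""
    if offset + 2 > len(buf):
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("truncated field body")
    return buf[start:end], end


def parse_init(payload: bytes) -> InitMessage:
    """Decode the body of an `init` message (type prefix already removed)."""
    global_features, offset = read_len_prefixed(payload, 0)
    features, offset = read_len_prefixed(payload, offset)
    return InitMessage(global_features, features, payload[offset:])
```

For instance, `parse_init(b"\x00\x00\x00\x02\x0a\x8a")` yields an empty `global_features`, a two-byte `features` vector, and an empty TLV stream.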
+ +==== Error Communication Messages + +Messages in this category are used to send connection level errors between two +peers. As we'll see later, another type of error exists in the protocol: an +HTLC forwarding level error. Connection level errors may signal things like +feature bit incompatibility, or the intent to force close (unilaterally +broadcast the latest signed commitment). + +The sole message in this category is the `error` message: + + * type: `17` + * fields: + * `channel_id`: `chan_id` + * `uint16`: `data_len` + * `data_len*byte`: `data` + +An `error` message can be sent within the scope of a particular channel by +setting the `channel_id` to the `channel_id` of the channel undergoing this +new error state. Alternatively, if the error applies to the connection in +general, then the `channel_id` field should be set to all zeroes. This all-zero +`channel_id` is also known as the connection level identifier for an error. + +Depending on the nature of the error, sending an `error` message to a peer you +have a channel with may indicate that the channel cannot continue without +manual intervention, so the only option at that point is to force close the +channel by broadcasting the latest commitment state of the channel. + +==== Connection Liveness + +Messages in this section are used to probe whether a connection is +still live or not. As the LN protocol somewhat abstracts over the underlying +transport being used to transmit the messages, a set of protocol level `ping` +and `pong` messages are defined. + +First, the `ping` message: + + * type: `18` + * fields: + * `uint16`: `num_pong_bytes` + * `uint16`: `ping_body_len` + * `ping_body_len*byte`: `ping_body` + +Next, its companion, the `pong` message: + + * type: `19` + * fields: + * `uint16`: `pong_body_len` + * `pong_body_len*byte`: `pong_body` + +A `ping` message can be sent by either party at any time. 
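The two field layouts above can be exercised with a short sketch that builds a `ping` payload, decodes it, and derives the matching `pong` payload. The helper names are hypothetical, the payloads exclude the 2-byte message type, and filling the `pong` body with zero bytes is simply a placeholder choice for this example.

```python
import struct
from typing import Tuple


def encode_ping(num_pong_bytes: int, ping_body: bytes) -> bytes:
    """Build a `ping` payload: num_pong_bytes, ping_body_len, ping_body."""
    return struct.pack(">HH", num_pong_bytes, len(ping_body)) + ping_body


def decode_ping(payload: bytes) -> Tuple[int, bytes]:
    """Return (num_pong_bytes, ping_body) from a `ping` payload."""
    if len(payload) < 4:
        raise ValueError("truncated ping header")
    num_pong_bytes, body_len = struct.unpack_from(">HH", payload, 0)
    body = payload[4:4 + body_len]
    if len(body) != body_len:
        raise ValueError("truncated ping body")
    return num_pong_bytes, body


def make_pong(ping_payload: bytes) -> bytes:
    """Build the `pong` payload requested by a `ping`: pong_body_len, pong_body."""
    num_pong_bytes, _ignored_body = decode_ping(ping_payload)
    # The body content is filler; zero bytes are used here as a placeholder.
    return struct.pack(">H", num_pong_bytes) + bytes(num_pong_bytes)
```

Note that the `ping_body` is decoded only to validate the framing; its contents play no role in the reply.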
+ +The `ping` message includes a `num_pong_bytes` field that is used to instruct +the receiving node with respect to how large the payload it sends in its `pong` +message should be. The `ping` message also includes a `ping_body` opaque set of bytes +which can be safely ignored. It only serves to allow a sender to pad out the `ping` +messages they send, which can be useful in attempting to thwart certain +de-anonymization techniques based on packet sizes on the wire. + +A `pong` message should be sent in response to a received `ping` message. The +receiver should read a set of `num_pong_bytes` random bytes to send back as the +`pong_body` field. Clever use of these fields/messages may allow a privacy +conscious routing node to attempt to thwart certain classes of network +de-anonymization attempts, as they can create a "fake" transcript that +resembles other messages based on the packet sizes sent across. Remember that by +default the LN uses an _encrypted_ transport, so a passive network monitor +cannot read the plaintext bytes, and thus only has timing and packet sizes to go +off of. + +==== Channel Funding + +As we go on, we enter into the territory of the core messages that govern the +functionality and semantics of the Lightning Protocol. In this section, we'll +explore the messages sent during the process of creating a new channel. We'll +only describe the fields used, as we'll leave an in-depth analysis of the +funding process to chapter XXX. + +Messages that are sent during the channel funding flow belong to the following +set of 5 messages: `open_channel`, `accept_channel`, `funding_created`, +`funding_signed`, `funding_locked`. We'll leave a description of the precise +protocol flow involving these messages for chapter XXX. In this section, +we'll simply enumerate the set of fields and briefly describe each one. 
+ +The `open_channel` message: + + * type: `32` + * fields: + * `chain_hash`: `chain_hash` + * `32*byte`: `temp_chan_id` + * `uint64`: `funding_satoshis` + * `uint64`: `push_msat` + * `uint64`: `dust_limit_satoshis` + * `uint64`: `max_htlc_value_in_flight_msat` + * `uint64`: `channel_reserve_satoshis` + * `uint64`: `htlc_minimum_msat` + * `uint32`: `feerate_per_kw` + * `uint16`: `to_self_delay` + * `uint16`: `max_accepted_htlcs` + * `pubkey`: `funding_pubkey` + * `pubkey`: `revocation_basepoint` + * `pubkey`: `payment_basepoint` + * `pubkey`: `delayed_payment_basepoint` + * `pubkey`: `htlc_basepoint` + * `pubkey`: `first_per_commitment_point` + * `byte`: `channel_flags` + * `tlv_stream`: `tlvs` + +This is the first message sent when a node wishes to execute a new funding flow +with another node. This message contains all the necessary information required +for both peers to construct both the funding transaction as well as the +commitment transaction. + +At the time of writing of this chapter, a single TLV record is defined within +the set of optional TLV records that may be appended to the end of a defined +message: + + * type: 0 + * data: `upfront_shutdown_script` + +The `upfront_shutdown_script` is a variable sized byte slice that MUST be a +valid public key script as accepted by the Bitcoin network's consensus +algorithm. By providing such an address, the sending party is able to +effectively create a "closed loop" for their channel, as neither side will sign +off on a cooperative closure transaction that pays to any other address. In +practice, this address is usually one derived from a cold storage wallet. + +The `channel_flags` field is a bitfield of which, at the time of writing, only +the _first_ bit has any sort of significance. If this bit is set, then this +denotes that this channel is to be advertised to the public network as a +routable channel. Otherwise, the channel is considered to be unadvertised, also +commonly referred to as a "private" channel. 
+ +The `accept_channel` message is the response to the `open_channel` message: + + * type: `33` + * fields: + * `32*byte`: `temp_chan_id` + * `uint64`: `dust_limit_satoshis` + * `uint64`: `max_htlc_value_in_flight_msat` + * `uint64`: `channel_reserve_satoshis` + * `uint64`: `htlc_minimum_msat` + * `uint32`: `minimum_depth` + * `uint16`: `to_self_delay` + * `uint16`: `max_accepted_htlcs` + * `pubkey`: `funding_pubkey` + * `pubkey`: `revocation_basepoint` + * `pubkey`: `payment_basepoint` + * `pubkey`: `delayed_payment_basepoint` + * `pubkey`: `htlc_basepoint` + * `pubkey`: `first_per_commitment_point` + * `tlv_stream`: `tlvs` + +The `accept_channel` message is the second message sent during the funding flow +process. It serves to acknowledge an intent to open a channel with a new remote +peer. The message mostly echoes the set of parameters that the responder wishes +to apply to their version of the commitment transaction. Later in Chapter XXX, +when we go into the funding process in detail, we'll do a deep dive to explore +the implications of the various parameters that can be set when opening a new +channel. + +In response, the initiator will send the `funding_created` message: + + * type: `34` + * fields: + * `32*byte`: `temp_chan_id` + * `32*byte`: `funding_txid` + * `uint16`: `funding_output_index` + * `sig`: `commit_sig` + +Once the initiator of a channel receives the `accept_channel` message from the +responder, they have all the materials they need in order to construct the +commitment transaction, as well as the funding transaction. As channels by +default are single funder (only one side commits funds), only the initiator +needs to construct the funding transaction. As a result, in order to allow the +responder to sign a version of a commitment transaction for the initiator, the +initiator only needs to send the funding outpoint of the channel. 
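With the funding outpoint known, both sides can compute the `channel_id` described in the type table earlier: the funding txid XOR'd with the output index in its lowest 2 bytes. The following is a minimal sketch of that derivation. The function name `derive_channel_id` is hypothetical, and the sketch assumes the txid is supplied in the same byte order in which it appears in the `funding_txid` field.

```python
def derive_channel_id(funding_txid: bytes, funding_output_index: int) -> bytes:
    """XOR the big-endian output index into the last 2 bytes of the txid."""
    if len(funding_txid) != 32:
        raise ValueError("funding_txid must be exactly 32 bytes")
    if not 0 <= funding_output_index <= 0xFFFF:
        raise ValueError("funding_output_index must fit in 16 bits")
    index_bytes = funding_output_index.to_bytes(2, "big")
    tail = bytes(a ^ b for a, b in zip(funding_txid[30:], index_bytes))
    return funding_txid[:30] + tail
```

Because most funding transactions place the channel output at index 0 or 1, the resulting `channel_id` usually differs from the txid in at most the final byte.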
+ +To conclude, the responder sends the `funding_signed` message: + + * type: `35` + * fields: + * `channel_id`: `channel_id` + * `sig`: `signature` + +After the responder receives the `funding_created` message, they +now own a valid signature of the commitment transaction by the initiator. With +this signature they're able to exit the channel at any time by signing their +half of the multi-sig funding output, and broadcasting the transaction. This is +referred to as a force close. In order to give the initiator the ability to do +so as well, before the channel can be used, the responder then signs the +initiator's commitment transaction as well. + +Once this message has been received by the initiator, it's safe for them to +broadcast the funding transaction, as they're now able to exit the channel +agreement unilaterally. + +Once the funding transaction has received enough confirmations, the +`funding_locked` message is sent: + + * type: `36` + * fields: + * `channel_id`: `channel_id` + * `pubkey`: `next_per_commitment_point` + +Once the funding transaction obtains a `minimum_depth` number of confirmations, +then the `funding_locked` message is to be sent by both sides. Only after this +message has been both sent and received can the channel begin to be used. + +==== Channel Closing + +The `shutdown` message: + + * type: `38` + * fields: + * `channel_id`: `channel_id` + * `uint16`: `len` + * `len*byte`: `scriptpubkey` + +The `closing_signed` message: + + * type: `39` + * fields: + * `channel_id`: `channel_id` + * `uint64`: `fee_satoshis` + * `sig`: `signature` + +==== Channel Operation + +In this section, we'll briefly describe the set of messages used to allow +nodes to operate a channel. By operation, we mean being able to send, receive, +and forward payments for a given channel. + +In order to send, receive, or forward a payment over a channel, an HTLC must +first be added to both commitment transactions that comprise a channel link. 
+ +The `update_add_htlc` message allows either side to add a new HTLC to the +opposite commitment transaction: + + * type: `128` + * fields: + * `channel_id`: `channel_id` + * `uint64`: `id` + * `uint64`: `amount_msat` + * `sha256`: `payment_hash` + * `uint32`: `cltv_expiry` + * `1366*byte`: `onion_routing_packet` + +Sending this message allows one party to initiate either sending a new payment, +or forwarding an existing payment that arrived via an incoming channel. The +message specifies the amount (`amount_msat`) along with the payment hash that +unlocks the payment itself. The set of forwarding instructions of the next hop +are onion encrypted within the `onion_routing_packet` field. In Chapter XXX on +multi-hop HTLC forwarding, we cover the onion routing protocol used in the +Lightning Network in detail. + +Note that each HTLC sent uses an auto-incrementing ID which is used by any +message which modifies an HTLC (settle or cancel) to reference the HTLC in a +unique manner scoped to the channel. + +The `update_fulfill_htlc` message allows redemption (receipt) of an active HTLC: + + * type: `130` + * fields: + * `channel_id`: `channel_id` + * `uint64`: `id` + * `32*byte`: `payment_preimage` + +This message is sent by the HTLC receiver to the proposer in order to redeem an +active HTLC. The message references the `id` of the HTLC in question, and also +provides the pre-image (which unlocks the HTLC) as well. + +The `update_fail_htlc` message is sent to remove an HTLC from a commitment transaction: + + * type: `131` + * fields: + * `channel_id`: `channel_id` + * `uint64`: `id` + * `uint16`: `len` + * `len*byte`: `reason` + +The `update_fail_htlc` message is the opposite of the `update_fulfill_htlc` message as +it allows the receiver of an HTLC to remove the very same HTLC. This message is +typically sent when an HTLC cannot be properly routed upstream, and needs to be +sent back to the sender in order to unravel the HTLC chain. 
As we'll explore in +Chapter XX, the message contains an _encrypted_ failure reason (`reason`) which +may allow the sender to either adjust their payment route, or terminate if the +failure itself is a terminal one. + +The `commitment_signed` message is used to stamp the creation of a new commitment transaction: + + * type: `132` + * fields: + * `channel_id`: `channel_id` + * `sig`: `signature` + * `uint16`: `num_htlcs` + * `num_htlcs*sig`: `htlc_signature` + +In addition to sending a signature for the next commitment transaction, the +sender of this message also needs to send a signature for each HTLC that's +present on the commitment transaction. This is due to the existence of the + + +The `revoke_and_ack` message is sent to revoke a dated commitment: + + * type: `133` + * fields: + * `channel_id`: `channel_id` + * `32*byte`: `per_commitment_secret` + * `pubkey`: `next_per_commitment_point` + +As the Lightning Network uses a replace-by-revoke commitment transaction, after +receiving a new commitment transaction via the `commitment_signed` message, a party +must revoke their past commitment before they're able to receive another one. +While revoking a commitment transaction, the revoker also provides the +next commitment point that's required to allow the other party to send them a +new commitment state. + +The `update_fee` message is sent to update the fee on the current commitment +transactions: + + * type: `134` + * fields: + * `channel_id`: `channel_id` + * `uint32`: `feerate_per_kw` + +This message can only be sent by the initiator of the channel, as they're the one +that pays the commitment fee of the channel for as long as it's open. + +The `update_fail_malformed_htlc` message is sent to remove a corrupted HTLC: + + * type: `135` + * fields: + * `channel_id`: `channel_id` + * `uint64`: `id` + * `sha256`: `sha256_of_onion` + * `uint16`: `failure_code` + +This message is similar to the `update_fail_htlc` message, but it's rarely used in +practice.
As mentioned above, each HTLC carries an onion encrypted routing +packet that also covers the integrity of portions of the HTLC itself. If a +party receives an onion packet that has somehow been corrupted along the way, +then it won't be able to decrypt the packet. As a result it also can't properly +forward the HTLC, so it sends this message to signal back to the sender that the HTLC +was corrupted somewhere along the route. + +==== Channel Announcement + +Messages in this category are used to announce components of the Channel Graph +authenticated data structure to the wider network. The Channel Graph has a +series of unique properties due to the condition that all data added to the +channel graph MUST also be anchored in the base Bitcoin blockchain. As a +result, in order to add a new entry to the channel graph, an agent must pay an +on-chain transaction fee. This serves as a natural spam deterrent for the +Lightning Network. + +The `channel_announcement` message is used to announce a new channel to the wider +network: + + * type: `256` + * fields: + * `sig`: `node_signature_1` + * `sig`: `node_signature_2` + * `sig`: `bitcoin_signature_1` + * `sig`: `bitcoin_signature_2` + * `uint16`: `len` + * `len*byte`: `features` + * `chain_hash`: `chain_hash` + * `short_channel_id`: `short_channel_id` + * `pubkey`: `node_id_1` + * `pubkey`: `node_id_2` + * `pubkey`: `bitcoin_key_1` + * `pubkey`: `bitcoin_key_2` + +The series of signatures and public keys in the message serves to create a +_proof_ that the channel actually exists within the base Bitcoin blockchain. As +we'll detail in Chapter XXX, each channel is uniquely identified by a locator +that encodes its _location_ within the blockchain. This locator is called the +`short_channel_id` and can fit into a 64-bit integer.
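
To make the idea of a 64-bit locator concrete, here is a minimal Python sketch of how a `short_channel_id` could be packed and unpacked. The split into a 3-byte block height, a 3-byte transaction index, and a 2-byte output index is taken from BOLT 7 rather than from the text above, and the helper names (`ShortChannelId`, `encode_scid`, `decode_scid`) are hypothetical.

```python
from typing import NamedTuple


class ShortChannelId(NamedTuple):
    """Location of a funding output in the chain (layout assumed from BOLT 7)."""
    block_height: int   # most significant 3 bytes
    tx_index: int       # next 3 bytes
    output_index: int   # least significant 2 bytes


def encode_scid(scid: ShortChannelId) -> int:
    """Pack the three components into a single 64-bit integer."""
    if not 0 <= scid.block_height < 1 << 24:
        raise ValueError("block_height must fit in 3 bytes")
    if not 0 <= scid.tx_index < 1 << 24:
        raise ValueError("tx_index must fit in 3 bytes")
    if not 0 <= scid.output_index < 1 << 16:
        raise ValueError("output_index must fit in 2 bytes")
    return (scid.block_height << 40) | (scid.tx_index << 16) | scid.output_index


def decode_scid(value: int) -> ShortChannelId:
    """Unpack a 64-bit integer back into its three components."""
    return ShortChannelId(
        block_height=(value >> 40) & 0xFFFFFF,
        tx_index=(value >> 16) & 0xFFFFFF,
        output_index=value & 0xFFFF,
    )


def scid_to_wire(scid: ShortChannelId) -> bytes:
    """Serialize as 8 big-endian bytes, matching the protocol's integer encoding."""
    return encode_scid(scid).to_bytes(8, "big")
```

Decoding the wire form is the reverse: `decode_scid(int.from_bytes(raw, "big"))`.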
+ + +The `node_announcement` message allows a node to announce/update its vertex within the +greater Channel Graph: + + * type: `257` + * fields: + * `sig`: `signature` + * `uint16`: `flen` + * `flen*byte`: `features` + * `uint32`: `timestamp` + * `pubkey`: `node_id` + * `3*byte`: `rgb_color` + * `32*byte`: `alias` + * `uint16`: `addrlen` + * `addrlen*byte`: `addresses` + +Note that if a node doesn't have any advertised channel within the Channel +Graph, then this message is ignored in order to ensure that adding an item to +the Channel Graph bears an on-chain cost. In this case, the on-chain cost will +be the cost of creating the channel which this node is connected to. + +In addition to advertising its feature set, this message also allows a node to +announce/update the set of network `addresses` that it can be reached at. + +The `channel_update` message is sent to update the properties and policies of +an active channel edge within the Channel Graph: + + * type: `258` + * fields: + * `signature`: `signature` + * `chain_hash`: `chain_hash` + * `short_channel_id`: `short_channel_id` + * `uint32`: `timestamp` + * `byte`: `message_flags` + * `byte`: `channel_flags` + * `uint16`: `cltv_expiry_delta` + * `uint64`: `htlc_minimum_msat` + * `uint32`: `fee_base_msat` + * `uint32`: `fee_proportional_millionths` + * `uint64`: `htlc_maximum_msat` + +In addition to being able to enable/disable a channel, this message allows a +node to update its routing fees as well as other fields that shape the type of +payment that is permitted to flow through this channel.
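
As an illustration of how the fixed-size `channel_update` fields map onto bytes, the following Python sketch unpacks a payload with the standard `struct` module. It assumes the 2-byte message type has already been stripped, that the signature is 64 bytes and the chain hash 32 bytes, and that `htlc_maximum_msat` is present as a 64-bit value (as in BOLT 7). The names `ChannelUpdate`, `parse_channel_update`, and `forwarding_fee_msat` are hypothetical, and signature verification is omitted entirely.

```python
import struct
from typing import NamedTuple

# ">" selects big-endian with no padding, matching the wire encoding.
CHANNEL_UPDATE_FORMAT = ">64s32sQIBBHQIIQ"


class ChannelUpdate(NamedTuple):
    signature: bytes
    chain_hash: bytes
    short_channel_id: int
    timestamp: int
    message_flags: int
    channel_flags: int
    cltv_expiry_delta: int
    htlc_minimum_msat: int
    fee_base_msat: int
    fee_proportional_millionths: int
    htlc_maximum_msat: int


def parse_channel_update(payload: bytes) -> ChannelUpdate:
    """Decode the fixed fields; any trailing bytes are ignored."""
    size = struct.calcsize(CHANNEL_UPDATE_FORMAT)
    if len(payload) < size:
        raise ValueError(f"payload too short: need {size} bytes, got {len(payload)}")
    return ChannelUpdate(*struct.unpack_from(CHANNEL_UPDATE_FORMAT, payload))


def forwarding_fee_msat(update: ChannelUpdate, amount_msat: int) -> int:
    """Fee charged for forwarding amount_msat: base fee plus proportional part."""
    proportional = amount_msat * update.fee_proportional_millionths // 1_000_000
    return update.fee_base_msat + proportional
```

With a base fee of 1000 msat and a proportional fee of 100 millionths, forwarding 1,000,000 msat under this sketch costs 1100 msat.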
+ +The `announcement_signatures` message is exchanged by channel peers in order to +assemble the set of signatures required to produce a `channel_announcement` +message: + + * type: `259` + * fields: + * `channel_id`: `channel_id` + * `short_channel_id`: `short_channel_id` + * `sig`: `node_signature` + * `sig`: `bitcoin_signature` + +After the `funding_locked` message has been sent, if both sides wish to +advertise their channel to the network, then they'll each send the +`announcement_signatures` message, which allows both sides to assemble the four +signatures required to generate a `channel_announcement` message. + +==== Channel Graph Syncing + +The `query_short_channel_ids` message allows a peer to obtain the channel information +related to a series of short channel IDs: + + * type: `261` + * fields: + * `chain_hash`: `chain_hash` + * `u16`: `len` + * `len*byte`: `encoded_short_ids` + * `query_short_channel_ids_tlvs`: `tlvs` + +As we'll learn in Chapter XXX, these channel IDs may refer to channels +that are new to the sender, or whose information is out of date, which allows the sender to +obtain the latest set of information for a set of channels. + +The `reply_short_channel_ids_end` message is sent after a peer finishes responding +to a prior `query_short_channel_ids` message: + + * type: `262` + * fields: + * `chain_hash`: `chain_hash` + * `byte`: `full_information` + +This message signals to the receiving party that if they wish to send another +query message, they can now do so. + +The `query_channel_range` message allows a node to query for the set of channels +opened within a block range: + + * type: `263` + * fields: + * `chain_hash`: `chain_hash` + * `u32`: `first_blocknum` + * `u32`: `number_of_blocks` + * `query_channel_range_tlvs`: `tlvs` + + +As channels are represented using a short channel ID that encodes the location +of a channel in the chain, a node on the network can use a block height as a +sort of _cursor_ to seek through the chain in order to discover a set of newly +opened channels.
In Chapter XXX, we'll go through the protocol peers use to +sync the channel graph in more detail. + + +The `reply_channel_range` message is the response to `query_channel_range` and +includes the set of short channel IDs for known channels within that range: + + * type: `264` + * fields: + * `chain_hash`: `chain_hash` + * `u32`: `first_blocknum` + * `u32`: `number_of_blocks` + * `byte`: `sync_complete` + * `u16`: `len` + * `len*byte`: `encoded_short_ids` + * `reply_channel_range_tlvs`: `tlvs` + +As a response to `query_channel_range`, this message sends back the set of +channels that were opened within that range. This process can be repeated with +the requester advancing their cursor further down the chain in order to +continue syncing the Channel Graph. + +The `gossip_timestamp_range` message allows a peer to start receiving new +incoming gossip messages on the network: + + * type: `265` + * fields: + * `chain_hash`: `chain_hash` + * `u32`: `first_timestamp` + * `u32`: `timestamp_range` + +Once a peer has synced the channel graph, they can send this message if they +wish to receive real-time updates on changes in the Channel Graph. They can +also set the `first_timestamp` and `timestamp_range` fields if they wish to +receive a backlog of updates they may have missed while they were down. + + +== Feature Bits & Protocol Extensibility + +As the Lightning Network is a decentralized system, no one entity can enforce a +protocol change or modification upon all the users of the system. This +characteristic is also seen in other decentralized networks such as Bitcoin. +However, unlike Bitcoin, overwhelming consensus *is not* required to change a +subset of the Lightning Network. Lightning is able to evolve at will without a +strong requirement of coordination, as unlike Bitcoin, there is no *global* +consensus required in the Lightning Network.
Due to this fact and the several +upgrade mechanisms embedded in the Lightning Network, at most, only the +participants that wish to use these new Lightning Network features need to +upgrade, and then they are able to interact with each other. + +In this section, we'll explore the various ways that developers and users are +able to design, roll out, and deploy new features to the Lightning Network. The +designers of the original Lightning Network knew at the time of drafting the +initial specification that there were many possible future directions the +network could evolve towards. As a result, they made sure to emplace several +extensibility mechanisms within the network which can be used to upgrade the +network partially or fully in a decoupled, desynchronized, decentralized +manner. + +=== Feature Bits as an Upgrade Discoverability Mechanism + +An astute reader may have noticed the various locations that "feature bits" are +included within the Lightning Protocol. A "feature bit" is a single bit within a bitfield that can +be used to advertise understanding or adherence to a possible network protocol +update. Feature bits are commonly assigned in *pairs*, meaning that each +potential new feature/upgrade always defines *two* bits within the bitfield. +One bit signals that the advertised feature is _optional_, meaning that the +node knows about the feature, and can use it if compelled, but doesn't +consider it required for normal operation. The other bit signals that the +feature is instead *required*, meaning that the node will not continue +operation if a prospective peer doesn't understand that feature.
+ +Using these two bits, optional and required, we can construct a simple +compatibility matrix that nodes/users can consult in order to determine if a +peer is compatible with a desired feature: + +.Feature Bit Compatibility Matrix +[options="header"] +|======================================================== +|Bit Type|Remote Optional|Remote Required|Remote Unknown +|Local Optional|✅|✅|✅ +|Local Required|✅|✅|❌ +|Local Unknown|✅|❌|❌ +|======================================================== + +From this simplified compatibility matrix, we can see that as long as the other +party *knows* about our feature bit, then we can interact with them using the +protocol. If one party *requires* a feature and the other party doesn't even +know what bit is being referred to, then the two are incompatible. Within the +network, optional features are signalled using an _odd bit number_ while +required features are signalled using an _even bit number_. As an example, if a +peer signals that they know of a feature that uses bit _15_, then we know that +this is an _optional_ feature, and we can interact with them or respond to +their messages even if we don't know about the feature. On the other hand, if +they instead signalled the feature using bit _16_, then we know this is a +required feature, and we can't interact with them unless our node also +understands that feature. + +The Lightning developers have come up with an easy to remember phrase that +encodes this matrix: "it's OK to be odd". This simple rule set allows for a +rich set of interactions within the protocol, as a simple bitmask operation +between two feature bit vectors allows peers to determine if certain +interactions are compatible with each other or not. In other words, feature +bits are used as an upgrade discoverability mechanism: they allow two +peers to easily understand if they are compatible or not based on the concepts of +optional, required, and unknown feature bits.
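
The "it's OK to be odd" rule can be sketched in a few lines of Python. This is an illustrative sketch rather than any implementation's actual logic: it assumes that bit 0 is the least significant bit of the big-endian feature vector (as in BOLT 1), that a feature's optional and required bits form an adjacent even/odd pair, and that the bits a node advertises are the features it knows. The function names are hypothetical.

```python
from typing import Set


def feature_bits(features: bytes) -> Set[int]:
    """Return the set bit numbers of a big-endian feature vector (bit 0 = LSB)."""
    value = int.from_bytes(features, "big")
    return {i for i in range(value.bit_length()) if (value >> i) & 1}


def unknown_required_bits(their_bits: Set[int], our_bits: Set[int]) -> Set[int]:
    """Even (required) bits they set whose feature pair we don't know at all."""
    our_pairs = {bit // 2 for bit in our_bits}
    return {
        bit for bit in their_bits
        if bit % 2 == 0 and bit // 2 not in our_pairs
    }


def compatible(local_bits: Set[int], remote_bits: Set[int]) -> bool:
    """Two peers are compatible if neither requires a feature unknown to the other."""
    return (
        not unknown_required_bits(remote_bits, local_bits)
        and not unknown_required_bits(local_bits, remote_bits)
    )
```

For example, a remote peer that sets only bit 15 is compatible with a node that has never heard of that feature, while a remote peer that sets bit 16 is compatible only with a node that knows bit 16 or its odd counterpart, bit 17.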
+ +Feature bits are found in the `node_announcement`, `channel_announcement`, and +`init` messages within the protocol. As a result, these three messages can be +used to *signal* the knowledge and/or understanding of in-flight protocol +updates within the network. The feature bits found in the `node_announcement` +message can allow a peer to determine if their _connections_ are compatible or +not. The feature bits within the `channel_announcement` message allow a peer +to determine if a given payment type or HTLC can transit through a given peer or +not. The feature bits within the `init` message allow peers to understand if +they can maintain a connection, and also which features are negotiated for the +lifetime of a given connection. + +=== Utilizing TLV Records for Forwards+Backwards Compatibility + +As we learned earlier in the chapter, Type-Length-Value (TLV) records can be +used to extend messages in a forwards and backwards compatible manner. +Over time, these records have been used to _extend_ existing messages without +breaking the protocol by utilizing the "undefined" area within a message beyond +the set of known bytes. + +As an example, the original Lightning Protocol didn't have a concept of the +_largest_ HTLC that could traverse through a channel as dictated by a routing +policy. Later on, the `htlc_maximum_msat` field was added to the `channel_update` +message to phase in such a concept over time. Peers that held a +`channel_update` that set such a field but didn't even know the upgrade existed +were unaffected by the change, but may see their HTLCs rejected if they are +beyond the said limit. Newer peers on the other hand are able to parse, verify +and utilize the new field at will. + +Those familiar with the concept of soft-forks in Bitcoin may now see some +similarities between the two mechanisms. Unlike Bitcoin consensus-level +soft-forks, upgrades to the Lightning Network don't require overwhelming +consensus in order to adopt.
Instead, at minimum, only two peers within the +network need to understand a new upgrade in order to start utilizing it without +any permission. Commonly these two peers may be the receiver and sender of a +payment, or they may be the initiator and responder of a new payment channel. + +=== A Taxonomy of Upgrade Mechanisms + +Rather than there being a single widely utilized upgrade mechanism within the +network (such as soft forks for base layer Bitcoin), there exists a wide +gradient of possible upgrade mechanisms within the Lightning Network. In this +section, we'll enumerate the various upgrade mechanisms within the network, and +provide a real-world example of their usage in the past. + +==== Internal Network Upgrades + +We'll start with the upgrade type that requires the most extra protocol-level +coordination: internal network upgrades. An internal network upgrade is +characterized as one that requires *every single node* within a prospective +payment path to understand the new feature. Such an upgrade is similar to any +upgrade within the known internet that requires hardware level upgrades within +the core relay portion of the network. In the context of LN however, we deal +with pure software, so such upgrades are easier to deploy, yet they still +require much more coordination than any other upgrade type utilized within the +network. + +One example of such an upgrade within the network was the move to using a TLV +encoding for the routing information encoded within the onion encrypted routing +packets utilized within the network. The prior format used a hand encoded +format to communicate information such as the next hop to send the payment to. +As this format was _fixed_ it meant that new protocol-level upgrades such as +extensions that allowed features such as packet switching weren't possible +without replacing the format itself.
The move to encoding the information using the more flexible TLV +format meant that after this single upgrade, any sort of feature that +modified the _type_ of information communicated at each hop could be rolled out +at will. + +It's worth mentioning that the TLV onion upgrade was a sort of "soft" internal +network upgrade, in that if a payment wasn't using any _new_ feature beyond +the new routing information encoding, then the payment could be transmitted +using a _mixed_ set of nodes, as no new information would be transmitted that +is required to forward the payment. However, if a new upgrade type instead +changed the _HTLC_ format, then the entire path would need to be upgraded, +otherwise the payment wouldn't be able to be fulfilled. + +==== End to End Upgrades + +To contrast the internal network upgrade, in this section we'll describe the +_end to end_ network upgrade. This upgrade type differs from the internal +network upgrade in that it only requires the "ends" of the payment, the sender +and receiver, to upgrade in order to be utilized. This type of upgrade allows +for a wide array of unrestricted innovation within the network, as due to the +onion encrypted nature of payments within the network, those forwarding HTLCs +within the center of the network may not even know that new features are being +utilized. + +One example of an end to end upgrade within the network was the roll out of +MPP, or multi-path payments. MPP is a protocol-level feature that enables a +single payment to be split into multiple parts or paths, to be assembled at the +receiver for settlement. The roll out of MPP was coupled with a new +`node_announcement` level feature bit that indicates that the receiver knows +how to handle partial payments. Assuming a sender and receiver know about each +other (possibly via a BOLT 11 invoice), then they're able to use the new +feature without any further negotiation.
+ +Another example of an end to end upgrade is the various types of +_spontaneous_ payments deployed within the network. One early type of +spontaneous payment called "keysend" worked by simply placing the pre-image of +a payment within the encrypted onion packet that is only decrypted by the +destination of the payment. Upon receipt, the destination would decrypt the +pre-image, then use that to settle the payment. As the entire packet is end to +end encrypted, this payment type was safe, since none of the intermediate nodes +are able to fully unwrap the onion to uncover the payment pre-image that +corresponded to that payment hash. + +==== Channel Construction Level Updates + +The final broad category of updates within the network are those that happen at +the channel construction level, but which don't modify the structure of the +HTLC used widely within the network. When we say channel construction, we mean +_how_ the channel is funded or created. As an example, the eltoo channel type +can be rolled out within the network using a new `node_announcement` level +feature bit as well as a `channel_announcement` level feature bit. Only the two +peers on the sides of the channel need to understand and advertise these new +features. This channel pair can then be used to forward any payment type +granted the channel supports it. + +The "anchor outputs" channel format, which allows the commitment fee to be +bumped via CPFP, and second-level HTLCs to be aggregated amongst other transactions, +was rolled out in such a manner. Using the implicit feature bit negotiation, if +both sides established a connection, and advertised knowledge of the new +channel type, then it would be used for any future channel funding attempts +over that connection.