lnbook/bitcoin-fundamentals-review.asciidoc

== Bitcoin Fundamentals Review

// TODO Fixes #584

The Lightning Network is capable of running above multiple blockchains, but is primarily anchored on Bitcoin. To understand LN, you need a fundamental understanding of Bitcoin and its building blocks.

There are many good resources that you can use to learn more about Bitcoin, including the "companion" book _Mastering Bitcoin 2nd Edition_, written by Andreas M. Antonopoulos, which you can find on GitHub under an open source license. However, you do not need to read a whole other book to be ready for this one!

In this chapter, we've collected the most important concepts you need to know about Bitcoin and explained them in the context of the Lightning Network. This way you can learn exactly what you need to know in order to grasp the Lightning Network without any distractions.

This chapter covers several important concepts from Bitcoin, including:

* Keys and digital signatures
* Bitcoin transactions and their structure
* Bitcoin transaction chaining
* Bitcoin Script - locking and unlocking scripts
* Basic locking scripts
* Complex and conditional locking scripts
* Timelocks
* Hash functions

=== Keys and digital signatures

((("cryptography", "defined")))((("cryptography", see="also keys and addresses")))You may have heard that bitcoin is based on _cryptography_, which is a branch of mathematics used extensively in computer security. Cryptography can also be used to prove knowledge of a secret without revealing that secret (digital signature), or prove the authenticity of data (digital fingerprint). These types of cryptographic proofs are the mathematical tools critical to bitcoin and used extensively in bitcoin applications.

((("digital keys", see="keys and addresses")))((("keys and addresses", "overview of", id="KAover04")))((("digital signatures", "purpose of")))Ownership of bitcoin is established through _digital keys_, _bitcoin addresses_, and _digital signatures_. The digital keys are not actually stored in the network, but are instead created and stored by users in a file, or simple database, called a _wallet_. The digital keys in a user's wallet are completely independent of the bitcoin protocol and can be generated and managed by the user's wallet software without reference to the blockchain or access to the internet.

Most bitcoin transactions require a valid digital signature to be included in the blockchain, which can only be generated with a secret key; therefore, anyone with a copy of that key has control of the bitcoin.  ((("witnesses")))The digital signature used to spend funds is also referred to as a _witness_, a term used in cryptography. The witness data in a bitcoin transaction testifies to the true ownership of the funds being spent. ((("public and private keys", "key pairs")))((("public and private keys", see="also keys and addresses")))Keys come in pairs consisting of a private (secret) key and a public key. Think of the public key as similar to a bank account number and the private key as similar to the secret PIN.

==== Private and public keys

((("keys and addresses", "overview of", "private key generation")))((("warnings and cautions", "private key protection")))A private key is simply a number, picked at random. In practice, and to make managing many keys easy, most bitcoin wallets generate a sequence of private keys from a single random _seed_, using a deterministic derivation algorithm. Simply put, a single random number is used to produce a repeatable sequence of seemingly random numbers that are used as private keys. This allows users to only backup the seed and be able to _derive_ all the keys they need from that seed.

Bitcoin, like many other cryptocurrencies and blockchains, uses _elliptic curves_ for security. In Bitcoin, elliptic curve multiplication on the _secp256k1_ elliptic curve is used as a _one-way function_. Simply put, the nature of elliptic curve math makes it trivial to calculate the scalar multiplication of a point but impossible to calculate the inverse ("division", or "discrete logarithm").

Each private key has a corresponding _public key_, which is calculated from the private key, using scalar multiplication on the elliptic curve. In simple terms, with a private key +k+, we can multiply it with a constant +G+ to produce a public key +K+:

----
K = k × G
----

It is impossible to reverse this calculation. Given a public key +K+, one cannot calculate the private key +k+. Division by +G+ is not possible in elliptic curve math. Instead, one would have to try all possible values of +k+ in an exhaustive process called a _brute force attack_. Because +k+ is a 256-bit number, exhausting all possible values with any classical computer would require more time and energy than available in this universe.

==== Hashes

Another important tool used extensively in Bitcoin, and in the Lightning Network, are _cryptographic hash functions_ and specifically the +SHA-256+ hash function.

A hash function also known as a _digest function_ is a function that takes arbitrary length data and transforms it into a fixed length result, called the _hash_, _digest_, or _fingerprint_. Importantly, hash functions are _one-way_ functions meaning that you can't reverse them and calculate the input data from the fingerprint.

[[SHA256]]
.The SHA-256 cryptographic hash algorithm
image::images/sha256.png["The SHA-256 cryptographic hash algorithm"]


For example, if we use a command-line terminal to feed the text "Mastering the Lightning Network" into the SHA256 function it will produce a fingerprint as follows:

----
$ echo -n "Mastering the Lightning Network" | shasum -a 256

ce86e4cd423d80d054b387aca23c02f5fc53b14be4f8d3ef14c089422b2235de  -
----

[TIP]
====
The input used to calculate a hash is also called a _pre-image_.
====

The length of the input can be much bigger of course. Let's try the same thing with the PDF file of the Bitcoin whitepaper from Satoshi Nakamoto:

----
$ wget http://bitcoin.org/bitcoin.pdf
$ cat bitcoin.pdf | shasum -a 256
b1674191a88ec5cdd733e4240a81803105dc412d6c6708d53ab94fc248f4f553  -
----

While it takes longer than a single sentence, the SHA256 function processes the 9-page PDF, "digesting" it into a 256-bit fingerprint.

Now at this point you might be wondering how it is possible for a function that digests data of unlimited size to produce a unique fingerprint that is a fixed-size number?

In theory, since there an infinite number of possible pre-images (inputs) and only a finite number of fingerprints, there must be many pre-images that produce the same 256-bit fingerprint. when two pre-images produce the same hash, this is known as a _collision_.

In practice, a 256-bit number is so large that you will never find a collision on purpose. Cryptographic hash functions work on the basis that a search for a collision is a brute-force effort that takes so much energy and time that it is not practically possible.

Cryptographic hash functions are broadly used in a variety of applications because they have some useful features. They are:

Deterministic:: The same input always produces the same hash.

Irreversible:: It is not possible to compute the pre-image of a hash.

Collission-Proof:: It is computationally infeasible to find two messages that have the same hash.

Uncorrelated:: A small change in the input produces such a big change in the output that the output seems uncorrelated to the input.

Uniform/Random:: A cryptographic hash function produces hashes that are uniformly distributed across the entire 256-bit space of possible outputs. The output of a hash appears to be random, though it is not truly random.

Using these features of cryptographic hashes, we can do build some interesting applications:

Fingerprints:: A hash can be used to fingerprint a file or message so that it can be uniquely identified. Hashes can be used as universal identifiers of any data set.

Integrity Proof:: A fingerprint of a file or message demonstrates its integrity, as the file or message cannot be tampered with or modified in any way without changing the fingeprirnt. This is often use to ensure software has not been tampered with before installing it on your computer.

Commitment/Non-repudiation:: You can commit to a specific preimage (e.g. a number or message) without revealing it, by publishing its hash. Later, you can reveal the secret and everyone can verify that it is the same thing you committed to earlier because it produces the published hash.

Proof-of-Work/Hash Grinding:: You can use a hash to prove you have done computational work, by showing a non-random pattern in the hash which can only be produced by repeated guesses at a pre-image. For example, the hash of a Bitcoin block header starts with a lot of zero bits. The only way to produce it is by changing a part of the header and hashing it trillions of times until it produces that pattern by chance.

Atomicity:: You can make a secret pre-image a condition of spending funds in several linked transactions. If any one of the parties reveals the pre-image in order to spend one of the transactions, all the other parties can now spend their transactions too. All or none become spendable, achieving atomicity across several transactions.
e to alter the message and still have the same hash.

==== Digital signatures

The private key is used to create signatures that are required to spend bitcoin by proving ownership of funds used in a transaction.

A digital signature is a number that is calculated from the application of the private key to a specific message.

Given a message m and a private key k, a signature function F_sig_ can produce a signature S:

latexmath:[ S = F{sign}(m, k) ]

This signature S can be independently verified by anyone who has the public key K (corresponding to private key k), and the message:

latexmath:[ S' = F{verify}(m, K, S) ]

If S' matches S, then the verifier can confirm that the message m was signed by someone who had access to the private key k. Importantly, the digital signature proves the possession of the private key k at the time of signing, without revealing k.

Digital signatures use a cryptographic hash algorithm. The signature is applied to a hash of the message, so that the message m is "summarized" to a fixed-length hash H(m) that serves as a fingerprint.

=== Bitcoin transactions

Transactions are data structures that encode the transfer of value between participants in the bitcoin system.

==== Inputs and outputs

The fundamental building block of a bitcoin transaction is a transaction output. Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain, and recognized as valid by the entire network. A transaction spends "inputs" and creates "outputs". Transaction inputs are simply references to outputs of previously recorded transactions. This way, each transaction spends the outputs of a previous transactions and creates new outputs.

[[transaction_structure]]
.A transaction transfers value from inputs to outputs
image::images/tx1.png["transaction inputs and outputs"]

Bitcoin full nodes track all available and spendable outputs, known as _unspent transaction outputs_, or UTXO. The collection of all UTXO is known as the UTXO set and currently numbers in the millions of UTXO. The UTXO set grows as new UTXO is created and shrinks when UTXO is consumed. Every transaction represents a change (state transition) in the UTXO set, by consuming one or more UTXO as _transaction inputs_ and creating one or more UTXO as its _transaction outputs_.

For example, let's assume that a user Alice has a 100,000 satoshi UTXO that she can spend. Alice can pay Bob 100,000 satoshi, by constructing a transaction with one input (consuming her existing 100,000 satoshi input) and one output that "pays" Bob 100,000 satoshi. Now Bob has a 100,000 satoshi UTXO that he can spend, creating a new transaction that consumes this new UTXO and spends it to another UTXO as a payment to another user, and so on.

[[alice_100ksat_to_bob]]
.Alice pays 100,000 satoshis to Bob
image::images/tx2.png["Alice pays 100,000 satoshis to Bob"]

A transaction output can have an arbitrary (integer) value denominated as a multiple of satoshis. Just as dollars can be divided down to two decimal places as cents, bitcoin can be divided down to eight decimal places as satoshis. Although an output can have any arbitrary value, once created it is indivisible. This is an important characteristic of outputs that needs to be emphasized: outputs are discrete and indivisible units of value, denominated in integer satoshis. An unspent output can only be consumed in its entirety by a transaction.

So what if Alice wants to pay Bob 50,000 satoshi, but only has an indivisible 100,000 satoshi UTXO? Alice will need to create a transaction that consumes (as its input) the 100,000 satoshi UTXO and has two outputs: one paying 50,000 satoshi to Bob and one paying 50,000 satoshi *back* to Alice as "change".

[[alice_50ksat_to_bob_change]]
.Alice pays 50k sat to Bob and 50k sat to herself as change
image::images/tx3.png["Alice pays 50,000 satoshis to Bob and 50,000 satoshis to herself as change"]

[TIP]
====
There's nothing "special" about a change output or any way to distinguish it from any other output. It doesn't have to be the last output. There could be more than one change output, or no change outputs. Only the creator of the transaction knows which outputs are to others and which outputs are to addresses they own and therefore "change".
====

Similarly, if Alice wants to pay Bob 85,000 satoshi but has two 50,000 satoshi UTXO available, she has to create a transaction with two inputs (consuming both her 50,000 satoshi UTXO) and two outputs, paying Bob 85,000 and sending 15,000 satoshi back to herself as change.

[[tx_twoin_twoout]]
.Alice uses two 50k inputs to pay 85k sat to Bob and 50k sat to herself as change
image::images/tx4.png["Alice uses two 50k inputs to pay 85k sat to Bob and 50k sat to herself as change"]

The illustrations and examples above show how a Bitcoin transaction combines (spends) one or more inputs and creates one or more outputs. A transaction can have hundreds or even thousands of inputs and outputs.

==== Transaction chains

Every output can be spent, as an input in a subsequent transaction. So for example, if Bob decided to spend 10,000 satoshi in a transaction paying Chan, and Chan spend 4,000 satoshi to pay Dina:

[[tx_chain]]
.Alice pays Bob who pays Chan who pays Dina
image::images/tx5.png["Alice pays Bob who pays Chan who pays Dina"]

An output is considered "spent" if it is referenced as an input in another transaction that is recorded on the blockchain. An output is considered "unspent" (and available for spending) if no recorded transaction references it.

The only type of transaction that doesn't have "inputs" is a special transaction created by Bitcoin miners called the _coinbase transaction_. The coinbase transaction has only outputs and no inputs because it creates new bitcoin from mining. Every other transaction spends one or more previously recorded outputs as its inputs.

Since transactions are chained, if you pick a transaction at random, you can follow any one of its inputs backwards to the previous transaction that created it. If you keep doing that you will eventually reach a coinbase transaction where the bitcoin was first mined.

==== Transaction identifiers

Every transaction in the Bitcoin system is identified by a unique identifier, called the _transaction ID_. To produce a unique identifier, we use the SHA-256 cryptoraphic hash function to produce a hash of the transaction's data. This "fingerprint" serves as a universal identifier. Once a transaction is recorded on the Bitcoin blockchain, it can be referenced by the transaction ID and every node in the Bitcoin network knows that this transaction is valid.

For example, a transaction ID might look like this:

.A transaction ID produced from hashing the transaction data
----
e31e4e214c3f436937c74b8663b3ca58f7ad5b3fce7783eb84fd9a5ee5b9a54c
----

This is a real transaction (created as an example for the "Mastering Bitcoin" book) that can be found on the Bitcoin blockchain.

Try to find it by entering this ID into a block explorer:

https://blockstream.info/tx/e31e4e214c3f436937c74b8663b3ca58f7ad5b3fce7783eb84fd9a5ee5b9a54c

or use the short link (case sensitive):

http://bit.ly/AliceTx


==== Output identifiers

Similarly, since every transaction has a unique ID, we can also identify a transaction output uniquely by reference to the transaction ID and the output index number. The first output in a transaction is output index 0, the second output is output index 1 and so on. By convention we write an output identifier as the transaction ID, a colon and the output index number:

.Identifying an output by transaction ID and index number
----
7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18:0
----

Output identifiers are the mechanism that links transactions together in a chain. Every transaction input is a reference to a specific output of a previous transaction. That reference is an output identifier: a transaction ID and output index number. So a transaction "spends" a specific output (by index number) from a specific transaction (by transaction ID) to create new outputs that themselves can be spent by reference to the transaction ID and index number.

Here's the chain of transactions from Alice to Bob to Chan to Dina, this time with output identifiers in each of the inputs:

[[tx_chain_vout]]
.Transaction inputs are output identifiers forming a chain
image::images/tx6.png["Transaction inputs are output identifiers forming a chain"]

The input in Bob's transaction references Alice's transaction (by transaction ID) and the 0 indexed output.

The input in Chan's transaction references Bob's transaction by ID and the 1st indexed output, because Bob's