--> optimization problem (tend to be hard in general)
## What are good features
* fees should be cheap
* success probabilities
** Why is liquidity a problem? Why do channel balances have to be private and what is the problem with that?
*** if balance was not private nodes would have to tell everyone else about every channel update (that happens with payments) --> Scales as poorly as bitcoin
*** it is also nice for privacy
** How to obtain them?
** without quantifying the uncertainty this is like being blind without help and trying to walk around.
* other features
** CLTV to minimize locked capital in bad cases
** provenance scores
## Generating candidate paths by solving / estimating the optimization problem
* Dijkstra for single paths
* Backward computation because of fees
## Trial and Error Loop
* the loop
** Generate candidate paths
** try to send
** update knowledge of the `uncertainty network`
* question: how long to remember the knowledge
* predict expected number of attempts
* issues: success rates drop with larger amounts --> Expected number of attempts rises
## Can we do better than single paths
* Pros and cons
++ smaller payments
-- more channels involved
* math says it is still an optimization problem or a min cost flow problem.
* depending on the goal algorithms do exist.
* Well known analogy from logistics: How do you transport x goods from A to B with least cost of transport
## Perspective of Routing nodes
* provide a service to earn a fee
* how to improve the service?
** increase liquidity / provide more liquidity
*** more channels --> better liquidity
*** larger channels ---> higher reliability
*** above goals (more channels, larger channels) are in conflict for a fixed amount of liquidity, how to solve? ---> Depends on goals.
** increase uptime
** increase reliability via rebalancing
*** proactively
*** lazy (JIT - Routing)
** offer outsourcing of routing for light clients via trampoline payments
If we knew the exact channel balances of every channel, we could easily compute one or more payment paths using any of the standard path finding algorithms taught in good computer science programs.
Actually, when we consider multipath payments, it is rather a flow problem than a path finding problem.
If exact information about channel balances were available, we could solve those problems in a way as to minimize the fees that would have to be paid by the payer to the nodes forwarding the payment.
However, as discussed, the balance information of all channels cannot be available to all participants of the network.
Thus, we need to have one or more innovative path finding strategies.
With only partial information about the network topology available this is a real challenge and active research is still being conducted into optimizing this part of the Lightning Network implementations.
The path finding strategy currently implemented in Lightning nodes is to probe paths until one is found that has enough liquidity to forward the payment.
_Source-based routing_ is a method of path-finding where the sender, i.e. the source, plans the path from itself, through the intermediary nodes, to the final destination.
Once a path has been found and selected, the sender sends the payment to the first intermediary node, who sends it to the second intermediary node, and so on until it reaches the destination.
While a payment is traveling along a path, the path typically does not get changed by any of the intermediary nodes, even if a shorter path or a cheaper path (in terms of routing fees) exists.
One of the reasons the Lightning Network uses source-based routing is to protect user privacy.
As discussed in the chapter on _Onion Routing_, the intermediary nodes transmitting the payment are not aware of the full path of the payment. They only know the node they received it from and the node they are sending it to.
Even if the recipient specifies a path in the invoice, that path may no longer be viable by the time the invoice is paid, which could be several minutes or several days later.
On the other hand, source-based routing comes with some inherent drawbacks.
The sender chooses the path based on its current understanding of the topological map of the Lightning network.
As discussed in previous chapters, this map is necessarily incomplete. The sender cannot be aware of all the channels. And even if it is aware of them, it will not always know their latest balances.
The balances of channels change with every payment. Consequently, any topological knowledge becomes obsolete in a short space of time.
The standard path finding mechanism in source-based onion-routing that is implemented in all Lightning Network implementations is the following:
. Given the limited local topological knowledge the sender tries to find one or more routing paths.
. Select an arbitrary path of payment channels which satisfies 3 conditions:
* path connects sender and receiver of the payment,
* all channels on path have a presumed capacity of at least the payment amount,
* all channels on path accept HTLCs of the payment amount.
. Construct the "onion" from destination to sender according to the meta data of the channels (base fee, fee rate, CLTV delta).
. Send out the "onion" and expect one of two possible results returned:
* Preimages are returned by nodes if the payment settles successfully
* Error is returned if the payment fails.
. If the payment settles, the sender updates its topological knowledge based on this new information for future payments. The algorithm terminates.
. If the payment fails, the sender updates its topological knowledge based on this new information. It then selects a different path and starts the process again from the beginning.
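The loop described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Channel` and `Path` types, the `send_onion` callback and the `max_attempts` limit are hypothetical names introduced here, and the only knowledge the sketch keeps is the set of channels that reported a failure.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

Channel = Tuple[str, str]  # hypothetical representation: (from_node, to_node)
Path = List[Channel]


def pay_by_trial_and_error(
    candidate_paths: Iterable[Path],
    send_onion: Callable[[Path], Optional[int]],
    max_attempts: int = 10,
) -> Optional[Path]:
    """Try candidate paths one by one until a payment settles.

    send_onion stands in for the real network call (an assumption of this
    sketch): it returns None on success, or the index of the failing hop.
    """
    failed_channels: Set[Channel] = set()
    attempts = 0
    for path in candidate_paths:
        if attempts >= max_attempts:
            break
        # Skip paths that use a channel we have already seen fail.
        if any(channel in failed_channels for channel in path):
            continue
        attempts += 1
        failed_hop = send_onion(path)
        if failed_hop is None:
            return path  # preimage returned: the payment settled
        failed_channels.add(path[failed_hop])  # update local knowledge
    return None
```

A real node would also learn from successful attempts and would let old knowledge expire; both are left out to keep the sketch short.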
Even with such primitive heuristics in place it could still be considered a random process or a random walk through the channel graph.
There can be several reasons why a payment may fail along the way.
Reasons for failure include: a routing node became unreachable, a routing channel no longer has the required balance, a routing node doesn't accept new HTLCs, the owner of a channel increased the channel fees, or the channel was closed in the interim.
Furthermore, there is no guarantee that the route chosen was the cheapest in terms of fees or the shortest in terms of channels involved.
At the time of writing this book, this is a design trade-off made to protect user privacy.
Thus he would like to receive an incoming amount of `103020` satoshi (102000 + 1%), which is 20 satoshi more than our uninformed Alice actually sent him.
According to Bob's fee schedule Bob will reject this onion.
In the last step she would have applied Bob's fee (1%) to 102k to derive 102k + 1020 satoshi.
That makes a total of 103,020 satoshi that she needs to send to Bob.
As the routing fees can increase the amount that is being forwarded even beyond the capacity of small channels, it makes sense to start the construction of the onion and the path finding at the destination and work from the destination back towards the sender.
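The backward computation can be illustrated with a small Python sketch. The function name and the tuple layout are assumptions made for this example, and amounts are kept in whole satoshi for readability, whereas the protocol itself accounts in millisatoshi. Bob's 1% fee is taken from the example above; the 2% fee of the hop after Bob is a hypothetical value chosen so that Bob has to forward 102,000 satoshi.

```python
from typing import List, Tuple


def amounts_backwards(final_amount_sat: int, hop_fees: List[Tuple[int, int]]) -> List[int]:
    """Return the amount each hop must receive, ordered from sender to destination.

    hop_fees holds (base_fee_sat, fee_rate_ppm) for every forwarding node,
    ordered from the sender's side towards the destination.
    """
    amounts = [final_amount_sat]
    # Walk from the destination back to the sender, adding each hop's fee.
    for base_fee_sat, fee_rate_ppm in reversed(hop_fees):
        outgoing = amounts[0]
        fee = base_fee_sat + outgoing * fee_rate_ppm // 1_000_000
        amounts.insert(0, outgoing + fee)
    return amounts


# Bob charges 1% (10,000 ppm); the 2% of the next hop is hypothetical.
print(amounts_backwards(100_000, [(0, 10_000), (0, 20_000)]))
# prints [103020, 102000, 100000]
```

Starting at the destination means every fee is applied to the amount that the node actually forwards, which is exactly the mistake Alice made when she computed the fees in the forward direction.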
Developers mainly choose breadth-first search if the edges are all of equal weight.
In cases where the edges are not of equal weight the Dijkstra Algorithm is used.
In our case the weights of the edges could represent the routing fees.
Only edges with a capacity larger than the amount to be sent will be included in the search.
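A minimal sketch of such a search, assuming a hypothetical adjacency-list graph in which every edge carries a capacity and a flat fee in satoshi, could look as follows. Real implementations use richer weights and run the search backwards from the destination; this version only shows the Dijkstra core and the capacity filter.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical graph format: node -> list of (neighbor, capacity_sat, fee_sat)
Graph = Dict[str, List[Tuple[str, int, int]]]


def cheapest_path(graph: Graph, source: str, target: str, amount_sat: int) -> Optional[List[str]]:
    """Dijkstra's algorithm with fees as edge weights and a capacity filter."""
    best = {source: 0}
    previous: Dict[str, str] = {}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            # Reconstruct the path by following the predecessor links.
            path = [node]
            while node != source:
                node = previous[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, capacity_sat, fee_sat in graph.get(node, []):
            if capacity_sat <= amount_sat:
                continue  # keep only edges with capacity larger than the amount
            new_cost = cost + fee_sat
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return None
```

With larger amounts more edges are filtered out, so the cheapest path found for a small payment may not exist at all for a large one.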
In this basic form pathfinding in the Lightning network is very simple and straightforward.
However, as we have already discussed in the introduction, channel balances cannot be shared with every participant every time a payment takes place as this would prevent scaling the network.
This turns our easy theoretical computer science problem into a rather complex real-world problem.
We now have to solve a pathfinding problem with only partial knowledge.
For example, we suspect which edges might be able to forward a payment because their capacity seems big enough.
But we can't be certain unless we try it out or ask the channel owners directly.
Even if we were able to ask the channel owners directly, their balance might change by the time we have asked others, computed a path, constructed an onion and sent it along.
Not only do we have limited information, but the information we have is highly dynamic and might change at any point in time without our knowledge.
One general observation that everyone can easily make is that if every node along a path is able to forward a certain amount of satoshis, these nodes will also be able to forward a lower amount of satoshis.
This is why many people intuitively believe that multipath payments might be a good strategy.
Instead of finding one path where every node has a large amount of liquidity the task is split into smaller ones.
Another reason is of course that the sender of a payment might just not have the amount they wish to send available in one single channel but distributed over several of their channels.
We simply note that multipath payments are equivalent to finding a flow between the source and the destination.
Finding flows in a static graph with full knowledge is computationally marginally more expensive than computing a shortest path.
On the other hand, given the dynamic reality of the Lightning Network and the fact that we do not need to compute a maximum flow, it is currently not known if the flow problem is more or less difficult than finding a path.
Both problems seem to have about the same difficulty and the problems are partially related as we will see in the following sections.
One fitting algorithm is the breadth-first search traversal.
The graph algorithm will usually be constrained to channels whose capacity exceeds the payment amount.
In practice, due to channel reserve and the assumption that the capacity in the channel will not be sitting completely on one side, it is smarter to prefer larger channels.
Channel closings are not announced via the gossip protocol.
However, as the funding transaction is encoded by the short channel id of the channel and as it will be spent on closing the channel, nodes can use this on-chain information to update their knowledge about the network of channels.
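The short channel id packs the location of the funding output into 8 bytes: 3 bytes for the block height, 3 bytes for the transaction index within that block and 2 bytes for the output index. The following sketch decodes such an id; the type and function names are chosen for this example only.

```python
from typing import NamedTuple


class ShortChannelId(NamedTuple):
    block_height: int
    tx_index: int
    output_index: int


def decode_short_channel_id(scid: int) -> ShortChannelId:
    """Split the 8-byte short channel id into its three components."""
    return ShortChannelId(
        block_height=(scid >> 40) & 0xFFFFFF,  # most significant 3 bytes
        tx_index=(scid >> 16) & 0xFFFFFF,      # next 3 bytes
        output_index=scid & 0xFFFF,            # least significant 2 bytes
    )
```

With these three values a node can look up the funding output on-chain and notice when it has been spent, which indicates that the channel was closed.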
Knowing for example that the third hop along a path returns an error of _insufficient balance_ means that the first two channels had enough balance and that the third channel did not have enough balance.
In general, edges with errors can be removed from the set of edges similarly to the edges with insufficient capacity.
Nodes can accumulate knowledge and update their knowledge with every failed or successful payment attempt.
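One way to keep track of this knowledge is to store a lower and an upper bound for the balance of every channel. The sketch below is an assumption about how a node might do this, not a description of any particular implementation. It ignores routing fees, so the same amount is used for every hop, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

# channel id -> (lowest possible balance, highest possible balance) in satoshi
Bounds = Dict[str, Tuple[int, int]]


def update_from_failure(
    bounds: Bounds,
    path: List[str],
    capacities: Dict[str, int],
    failed_index: int,
    amount_sat: int,
) -> None:
    """Record what an 'insufficient balance' error at path[failed_index] implies."""
    # Every channel before the failing one was able to forward the amount.
    for channel in path[:failed_index]:
        low, high = bounds.get(channel, (0, capacities[channel]))
        bounds[channel] = (max(low, amount_sat), high)
    # The failing channel had less than the amount available.
    failed = path[failed_index]
    low, high = bounds.get(failed, (0, capacities[failed]))
    bounds[failed] = (low, min(high, amount_sat - 1))
```

Channels after the failing hop were never reached, so nothing is learned about them.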
It is important that nodes are careful with this data.
While the capacity information of channels from the gossip protocol and the blockchain data is verifiably correct, the data returned in failed onions can be incorrect.
Nodes might simply send an error back because they do not want to reveal balance information.
Besides, channel data continuously changes over time as the Lightning Network is very dynamic.
This implies that nodes should only use such data if it is not too old or use it only with limited confidence.
As time advances this information becomes stale and outdated and the confidence in this data diminishes.
The fourth source of information that the node can use are the routing hints in the BOLT 11 invoices.
Remember that a regular payment process starts with the person who wants to receive money producing a random secret and hashing it to derive the payment hash.
Invoices typically contain some meta data including some routing hints.
This is imperative if the person who wants to be paid does not have announced channels. In that case some unannounced channels will be specified within the invoice.
Otherwise the payer would not even be able to find a path to the "hidden" destination node.
Routing hints might also be used by the receiving node to indicate which public channels have enough inbound capacity to forward the payment.
In general, the longer a payment path is, the more likely it becomes that a channel with insufficient balance is selected.
Thus, receiving hints from the receiver indicating on which channels it wishes to receive funds is definitely helpful for the sender.
The probing-based approach that is used in the Lightning Network has several shortcomings.
Sending out an onion takes a certain amount of time.
The time depends on how many hops the onion is supposed to be forwarded, on the speed of nodes processing the onion, and on the topology on the network.
In the following diagram you can see how the round-trip time for onions in general increases with the number of hops that the onion has encoded.
.Research shows that the onion round-trip time depends on the distance (CC-BY-SA Tikhomirov, Sergei & Pickhardt, Rene & Biryukov, Alex & Nowostawski, Mariusz. (2020). Probing Channel Balances in the Lightning Network.)
This diagram is just a snapshot from an experiment in early 2020 and results might change.
We learn from the diagram that payments can take several seconds while the node probes several paths.
This is due to the fact that a single onion can easily take a few seconds to return and a sender might have to send several onions sequentially while probing for a successful path.
In comparison, this will still be much faster than waiting for confirmations on a Bitcoin block; but it is not performant enough in an environment where payments need to settle fast.
People standing in a line at the grocery store cash register prefer not to wait several seconds.
Thus, Lightning developers have come up with and implemented the following improvements to the probing algorithms.
We are also hopeful that additional improvements and optimizations can be discovered in the future.
Nodes ordinarily probe the network when making a payment. But nothing prevents them from probing the network periodically.
Instead of making a real payment, nodes could send out one or multiple _fake_ payments.
A fake payment is nothing but an onion with a random payment hash.
Given the properties of the hash function, it is safe to assume that nobody knows the preimage.
If the payment amount is small enough, a fake payment will fail at the destination and this allows the sending node to learn about the balances on the path.
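Creating such a fake payment hash is trivial. The sketch below, with function names chosen for this example, draws 32 random bytes and also shows the check a recipient would perform on a real preimage, assuming SHA-256 as the hash function for payment hashes.

```python
import hashlib
import os


def random_payment_hash() -> bytes:
    """Return 32 random bytes; nobody is expected to know a matching preimage."""
    return os.urandom(32)


def is_preimage_for(preimage: bytes, payment_hash: bytes) -> bool:
    """Check whether SHA-256 of the preimage equals the payment hash."""
    return hashlib.sha256(preimage).digest() == payment_hash
```

Because the hash was not derived from any secret, the final node cannot settle the HTLC and has to answer with an error.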
There are clear downsides to this approach.
It produces spam and heavy network load and therefore this behaviour is discouraged.
However, participants cannot easily be stopped from doing this.
Channel partners can detect this type of abuse by observing frequent payments that always fail.
As punishment, channel partners can decide to produce errors right away without providing balance information.
While a lot of information is not easily accessible, every time a path is probed the node learns something about the state of the network at that point in time.
Please note that one should never send two onions at the same time with the same payment hash for which the recipient knows the preimage.
As long as the onion is being processed and routed the payment is out of control of the sender.
In case two onions are sent at the same time, the recipient could very well release the preimage twice and get paid twice.
This is the reason why arbitrary probing should be conducted with a fake, i.e. purely random, payment hash.
With fake payment hashes the sender can probe concurrently as long as the sender has enough funds to pay for all the HTLCs.
Successful probing does not guarantee a following successful payment.
Assume a fake onion returns indicating that the payment hash was unknown to the recipient but that the path was otherwise viable.
The sender now uses the same path to send the payment with the correct payment hash.
In the interim, the balance of a channel along the path changes rendering the path unworkable.
In this case the sender has to start all over again.
Admittedly the risk for this to happen is rather small but the possibility exists.
A potential improvement has been outlined by a suggested mechanism labelled as _stuckless payments_.
The proposal of _stuckless payments_ received positive feedback from developers.
It is unlikely that the mechanism is implemented before the Lightning Network switches from _Hashed Timelock Contracts_ (HTLCs) to _Point Timelock Contracts_ (PTLCs). PTLCs in turn will only be implemented after _Schnorr Signatures_ are activated on the Bitcoin Network.
Stuckless payments give control back to the sender of an onion.
We don't explain the details here, but stuckless payments empower the sender to cancel an onion.
The second advantage is that the path is locked, i.e. reserved, after it is found until it is settled.
This means that the sender can either cancel the onion or bring the onion to a successful conclusion.
In particular, the probed path once locked cannot change or be used by other routing requests in the interim between probing and setting up the HTLCs that are used to fulfill the request. The found path remains reserved until cancelled or the payment is successfully completed.
Using stuckless payments the time for a successful payment will reduce drastically.
The disadvantage is that the sender has to lock more bitcoin during the pathfinding process.
Due to timeouts these bitcoin can remain locked for several days before being released again, although this should not happen too frequently.
Another drawback is that the execution of this mechanism utilizes more resources of routing nodes.
A receiving node will see an incoming HTLC for a certain payment hash.
If the onion signals that the node is the final recipient and if the amount of the HTLC is less than the one specified in the invoice, the node would normally not accept the HTLC and send back an error notification.
However, using the _Type-Length-Value_ (TLV) format of onions, a sender can specify a total amount of the payment which is bigger than the HTLC.
In the TLV case, the recipient can safely accept the HTLC and wait for more HTLCs to arrive.
All parts of the payment will use the same payment hash.
The recipient will only release the preimage if the sum of all incoming HTLCs is at least the specified payment amount.
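The receiver's side of this rule can be sketched as follows. The class and method names are hypothetical, timeouts are ignored, and amounts are given in satoshi for readability.

```python
from collections import defaultdict
from typing import DefaultDict, List


class MultipartReceiver:
    """Collect partial HTLCs per payment hash until the invoice amount is covered."""

    def __init__(self) -> None:
        self.pending: DefaultDict[bytes, List[int]] = defaultdict(list)

    def on_htlc(self, payment_hash: bytes, htlc_amount_sat: int, invoice_amount_sat: int) -> bool:
        """Record an incoming HTLC; return True once the preimage may be released."""
        self.pending[payment_hash].append(htlc_amount_sat)
        return sum(self.pending[payment_hash]) >= invoice_amount_sat
```

Until `on_htlc` returns `True` the receiver simply holds the HTLCs it has accepted so far and waits for further parts.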
**Multipath or multipart payments?** You might have noticed that we named the chapter "multipath" payments but mentioned in the last paragraph that such a payment consists of several parts.
The protocol specification uses the abbreviation _MPP_ for _multipart payments_.
Multipath is just a special case of multipart.
Multipart covers all the cases of multipath plus the unusual case where multiple parts use the same path.
For simplicity we take the liberty to also abbreviate multipath payments with MPP.
It is important to recognize that a node that forwards HTLCs does not have to distinguish a single full payment from a partial multipart payment.
Only the receiving node needs to distinguish the two cases. Only the receiver needs to be ready to accept multipart payments.
In the BOLT 11 invoice specification there is a field for _feature bits_.
If a node wishes to accept multipart payments it must signal this by setting the corresponding feature bit (bit 16 or 17).
If a node wishes to send a multipart payment it can do so if the receiving node has signaled its willingness to accept such payments.
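A sender could check this signal as in the sketch below. It assumes that the feature bits have already been parsed into a Python integer in which bit `i` corresponds to `1 << i`; the constant and function names are chosen for this example. By convention the even bit of a pair marks a feature as required and the odd bit marks it as optional.

```python
MPP_REQUIRED_BIT = 16  # even bit: the feature is required
MPP_OPTIONAL_BIT = 17  # odd bit: the feature is optional


def supports_multipart(feature_bits: int) -> bool:
    """Return True if either multipart feature bit is set."""
    mask = (1 << MPP_REQUIRED_BIT) | (1 << MPP_OPTIONAL_BIT)
    return (feature_bits & mask) != 0
```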
Currently there is no mechanism for routing nodes to split the payment amount and onion into several parts or merge several incoming HTLCs into a single onion.
Besides the potentially better chances to find smaller routes the sender might want to use a multipart payment because it does not have enough balance in a single payment channel.
If the channel had enough capacity this could be resolved with a circular rebalancing - which we will discuss in the next section.
However if the payment amount is bigger than the largest capacity of a channel that the sender has the sender can only pay the invoice if the recipient allows and supports multipart payments.
Similarly a recipient might not be able to receive a single payment of the requested amount and would have the interest of signaling multi part payments.
Luckily nodes will do this automatically and practically always signal the support for multi part payments if the implementation supports this feature.
The standard Lightning Network implementations which follow BOLT 1.1 all support this feature.
Multipart payments will almost always be more expensive than a single payment.
You will remember that the fees that routing nodes charge consist of a fee rate and of a base fee.
The proportional part of the fees of a multipart payment stays roughly the same as for a single payment.
However, the base fee is charged for every part, independent of the amount, which makes multipart payments more expensive in most cases.
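A small calculation makes the effect visible. The sketch below is a simplification: it applies every hop's fee to the same amount instead of compounding the fees along the path, it uses satoshi instead of millisatoshi, and the fee values are hypothetical.

```python
from typing import List, Tuple


def route_fee_sat(amount_sat: int, hops: List[Tuple[int, int]]) -> int:
    """Sum base fee plus proportional fee over all hops (no compounding)."""
    return sum(base_sat + amount_sat * rate_ppm // 1_000_000 for base_sat, rate_ppm in hops)


# Three hops, each charging a base fee of 1 satoshi and a rate of 1,000 ppm.
hops = [(1, 1_000)] * 3
single = route_fee_sat(1_000_000, hops)    # one payment of 1,000,000 satoshi
split = 4 * route_fee_sat(250_000, hops)   # four parts over equally priced paths
print(single, split)  # prints 3003 3012
```

The proportional part is identical in both cases (3,000 satoshi); the difference of 9 satoshi is the base fee of three hops paid for three additional parts.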
As the sender pays the fees the sender will not necessarily have the interest of splitting the payment in too many parts.
Thus implementations usually integrate multi part payments into the probing based approach.
For example, after a single payment does not go through, the node might split the amount into two payments and try a multipart payment with smaller amounts.
Those multipart payments could again be split if they are not successful along a route.
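The splitting strategy can be written as a short recursive function. This is only a sketch of the idea: `try_send` is a hypothetical callback that reports whether a part of the given size could be routed, and the sketch does not model that all parts have to be in flight together before the recipient settles them.

```python
from typing import Callable, List, Optional


def split_until_sendable(
    amount_sat: int,
    try_send: Callable[[int], bool],
    min_part_sat: int = 1,
) -> Optional[List[int]]:
    """Halve the amount until every part can be sent; return the part sizes."""
    if try_send(amount_sat):
        return [amount_sat]
    half = amount_sat // 2
    if half < min_part_sat:
        return None  # cannot split any further
    first = split_until_sendable(half, try_send, min_part_sat)
    if first is None:
        return None
    second = split_until_sendable(amount_sat - half, try_send, min_part_sat)
    if second is None:
        return None
    return first + second
```

For example, if only parts of at most 300 satoshi can be routed, a payment of 1,000 satoshi ends up as four parts of 250 satoshi each.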
The advantages of multi part payments are quite obvious:
. bigger payment sizes
. higher success rates
On the other side we have a couple of downsides:
. Higher fees
. More HTLCs locked / more load on the network
. Potentially longer times. If only a single part gets stuck all the other HTLCs in flight have to wait locking liquidity of many nodes for a potentially longer time
. Leaks more information as the network is practically probed more heavily.
In this chapter you have already learnt that the path finding problem on the lightning network is actually rather a problem of finding a flow - which consists of several paths.
Very early research about pathfinding in payment channel networks suggests \footnote{FIND LINK} that rebalancing channels does not change the flow properties between nodes.
With rebalancing we mean shifting liquidity from one channel to another channel for example via a circular payment.
There is also the notion of offchain / onchain swaps with swapping services.
This form of rebalancing certainly also changes topological properties of the network, such as the flow.
As rebalancing via circular self payments would not change the overall amount that an arbitrary node can send to any other node people thought that rebalancing is not very useful.
However in practice a node hardly wants to find the perfect flow or multipath to be able to send the absolute maximum amount to another node.
Nodes are rather interested in quickly finding a sufficiently large flow so that they can make a reasonable payment.
Research conducted by Rene Pickhardt (one of the authors of this book) indicated that circular rebalancing operations improve the overall success rate in the network for arbitrary payments.
It turns out that there are various ways in which rebalancing can be used and in some form it even resembles the functionality of a multi path payment.
Thus we decided to devote a section here on basics about rebalancing and how it can be used to improve the pathfinding abilities of the network.
In our experience, most people call their payment channel balanced if they own the same amount of bitcoin in that channel as their channel partner.
While this seems intuitive we want to show that this intuition does not seem to be the best intuition for our goals.
In order to see this let us assume the Lightning Network at some point in time looks exactly like that.
All channels split the capacity 50 - 50 dividing it into half between the channel partners.
Sending back the money would be quite expensive and does not seem to be a realistic option.
Using an onchain swapping service after every payment to rebalance channels seems also problematic.
The entire idea of creating the Lightning Network was to have less on chain transaction and be able to send money between people without the necessity to do on chain transactions.
Thus there is only the last option, which means that Chan could move the money from the Bob-Chan channel via the Bob-Erica channel to his Erica-Chan channel.
While the math theory tells us that rebalancing channels does not change the max flow between two nodes we see that it has changed the selected path of a payment.
Due to the onion routing and the privacy goals that are implemented in it we have a source based routing and thus assume the sender always has to select and thus find the path.
However this is not true!
When rebalancing comes into place we can use the local knowledge of the distribution of balances that nodes might have to help with selection of paths and finding a total payment path / multi path or flow.
We will explore this idea a little bit more in the upcoming section about JIT routing.
Remember in our example after Bob has paid Chan Bob had a total amount of 4 million satoshi, Chan had a total of 6 million satoshi and Erica still had 5 million satoshi as before.
Of course it would be possible to have payment channels between these three people with that distribution of funds so that everyone has 50% of the capacity on their side of the payment channel.
While the above picture shows that it is possible to have 50/50 channels after the payment, this could only be achieved if the capacities had been changed.
Changing the capacity of channels is only possible by closing and opening the channel or with the help of a technique called splicing.
The latter is not widely deployed yet and would also depend on onchain transactions.
. Off-chain rebalancing does not change how much money can flow from sender to receiver.
. Making payments changes how much money sender and receiver can send or receive. This is similar to the physical world where you also can only spend the cash that you have received first.
. The goal to have channels in a 50/50 state is not possible for all the nodes all the time and thus probably not a good one.
. Rebalancing in combination with payments changes the way money flows from the sender to the recipient. In particular, it can shift the responsibility for finding a path from the sender to several nodes on the network, even if they don't know which path they are trying to find.
. Thus rebalancing can be a nice tool to support path finding.
The main problem with Lightning network channels from a routing and pathfinding perspective is that the liquidity is not known.
From that perspective the 50/50 approach which is not achievable makes sense.
If nodes could assume that other nodes always have a certain amount of the capacity on their side they could use that fraction of the capacity to make path finding decisions.
Initially all the channel balance of newly opened channels is on one side.
Thus if there is a new node which has opened some channels and received some channels all the channels are unbalanced and routing is always only possible in one direction.
Nodes and node operators could look at the channel balance coefficient which is defined as the ratio between the balance they hold on that channel divided by the capacity of that channel.
Researchers demonstrated that the overall likelihood to find a path increases if nodes aim to rebalance their channels in a way that their local channel balance coefficients all take the same value.
This target value can easily be computed as the total amount of funds that a node owns on the network divided by the sum of the capacities of all channels that the node maintains.
We call this target value the node balance coefficient \nu.
Nodes can check which channels have a channel balance coefficient that is bigger than \nu and which have a channel balance coefficient that is smaller than \nu.
After identifying such channels it makes sense to make circular self payments from the channels with too much liquidity to the channels with too little liquidity.
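The two coefficients and the resulting classification of channels can be computed as in the following sketch. The data layout, a mapping from a channel id to a pair of local balance and capacity, and the function names are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: channel id -> (local_balance_sat, capacity_sat)
Channels = Dict[str, Tuple[int, int]]


def node_balance_coefficient(channels: Channels) -> float:
    """Total funds owned by the node divided by the sum of all its channel capacities."""
    total_balance = sum(balance for balance, _ in channels.values())
    total_capacity = sum(capacity for _, capacity in channels.values())
    return total_balance / total_capacity


def rebalancing_candidates(channels: Channels) -> Tuple[List[str], List[str]]:
    """Return (channels with too much liquidity, channels with too little liquidity)."""
    nu = node_balance_coefficient(channels)
    too_much = [cid for cid, (bal, cap) in channels.items() if bal / cap > nu]
    too_little = [cid for cid, (bal, cap) in channels.items() if bal / cap < nu]
    return too_much, too_little
```

A circular self payment would then start on a channel from the first list and return on a channel from the second list.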
This approach has an economical drawback.
Doing a circular self payment is not for free.
The nodes along the circular path will charge routing fees which always have to be paid by the initiator of the payment.
This would be your node if you wanted to rebalance your channels.
It might be justified for you to pay those fees upfront because you might earn them back with the routing fees that you charge if you can successfully forward payments.
However you do not really know in which direction you will have to route payments later.
In the worst case you moved liquidity away from a channel which you could have used perfectly to fulfill routing requests along that edge in this direction.
Not only would you have paid routing fees for a rebalancing operation, you would also have depleted your channel more quickly and might face the need to rebalance again.
We hope that you are not discouraged at this moment.
Rebalancing is still a viable thing.
While proactive rebalancing increases the reliability of the network, it is currently not economically viable.
However you could rebalance reactively or Just in Time at the moment when necessary.
Imagine you have an incoming HTLC and the onion says you are supposed to forward the payment along a channel where you lack sufficient balance.
The standard case of the protocol would be to return the onion with an error and remove the incoming HTLC.
However, no one stops your node from briefly interrupting the routing process and conducting a rebalancing operation to provide yourself with sufficient liquidity on the channel in question.
This method is called JIT-Routing as it helps nodes to reactively provide themselves with enough liquidity just in time.
The just in time Routing scheme has 2 major advantages over source based routing.
. It increases the privacy of channels. If nodes that do not have sufficient liquidity return the onions an attacker can use that behavior to probe for the channel balance. However if nodes rebalance their channels they will always be able to forward the payment and protect themselves from probing attacks.
. More importantly, it resembles multipart payments in which the splitting of the payment is not decided by the sender, who would not know how balances are distributed remotely, but by the routing node that knows its local topology.
Let us elaborate on the second point and take the example in which Bob was supposed to forward the onion from Alice to Chan but does not have enough liquidity on the channel with Chan.
If Bob now does a rebalancing operation through Erica and is afterwards able to forward the payment along to Chan, he has effectively split the payment at his node to flow along two paths.
It is obvious that splitting a payment at the node that can't forward the entire payment is much more reliable and effective than letting the sender decide how to split a payment and into which amounts.
We thus can see that with the help of JIT-Routing rebalancing and multipart payments are actually not so different concepts and ideas.
There is another way in which multipart payments and rebalancing can be combined.
Let us recall that nodes should always aim to have similar channel balance coefficients.
So if a node wants to make a multipart payment it could split the payment in such a way that it rebalances its channels.
Meaning it would only pay from channels on which it currently has too much liquidity.
Also it would use larger parts for the channels that have way too much liquidity and smaller amount for the channels that have just a little bit too much liquidity.
The optimal amounts can easily be computed with the following formulas.
* goals for rebalancing (low Gini coefficient and not 50 / 50)
* optimization problem / game theory
* JIT Routing
==== Optimizations for Multi path payments
The rebalancing goal with local channel balance coefficients could actually be integrated into multi path payments.
Thus if a node decides to send a payment along several paths it could very well use this opportunity to split the payment in a way that it improves the imbalance of its own channels.
So instead of splitting payments by 2 in a divide-and-conquer strategy the node could use the following formula ...