lnbook/path-finding.asciidoc

Chapter overview:
  * How path finding works in the network

Relevant questions to answer:
  * What is packet switching? What is circuit switching? Which one does LN use today?
  * In the abstract what is path finding?
  * What is dijkstra's? What modifications need to be made to apply it to this domain?
  * Why must path finding happen backwards (receiver to sender)?
  * How is the information contained in a channel update used in path finding?
  * How can errors sent during payment routing help the sender to narrow their search space?
  * What is payment splitting? How does it work? What alternatives exist?
  * What information can be sent to intermediate and the final node aside from the critical routing data?
  * What are multi-hop locks? What addition privacy and security guarantees to they offer?
  * How can the flexible onion space be used to enabled packet switching in the network?

==== Finding a path

Payments on the Lightning Network are forwarded along a path of channels from one participant to another.
Thus, a path of payment channels has to be selected.
If we knew the exact channel balances of every channel we could easily compute a payment path using any of the standard path finding algorithms taught in any computer science program.
Actually when we consider multipath payments it is rather a flow problem than a path finding problem.
Since flows consist of several paths we conveniently talk about path finding.
With exact information about channel balances available we could solve those problems in a way to optimize the fees that would have to be paid by the payer to the nodes that kindly forward the payment.
However, as discussed the balance information of all channels is and cannot be available to all participants of the network.
Thus, we need to have one or more innovative path finding strategy.
These strategies must relate closely to the routing algorithm that is used.
As we will see in the next section, the Lightning Network uses a source based onion routing protocol for routing payments.
This means in particular that the sender of the payment has to find a path through the network.
With only partial information about the network topology available this is a real challenge and active research is still being conducted into optimizing this part of the Lightning Network implementations.
The fact that the path finding problem is not fully solved for the case of the Lightning Network is a major point of criticism towards the technology.
The path finding strategy currently implemented in Lightning nodes is to probe paths until one is found that has enough liquidity to forward the payment.
While this is not optimal and certainly can be improved, it should be noted that even this simplistic strategy works well.
This probing is done by the Lightning node or wallet and is not directly seen by the user of the software.
The user might only realize that probing is taking place if the payment is not going through instantly.
The algorithm currently also does not necessarily result in the path with the lowest fees.


=== What is "Source-Based" routing and why does the Lightning Network use it?

Source-based routing is a method of path-finding where the sender (i.e. the source) plans the path from itself, through the intermediary nodes, and finally to the destination.
Once a path has been selected, the sender sends the payment to the first intermediary node, who sends it to the second intermediary node and so on, until it reaches the destination.
While a payment is in-flight along a path, the path typically does not get changed by any of the intermediary nodes, even if a shorter path or a cheaper path (in terms of routing fees) exists.

The Lightning Network uses source-based routing at the time of writing in order to protect user privacy.
As discussed in the chapter on Onion Routing, the intermediary nodes transmitting the payment are not aware of the full path of the payment; they only know the node they received it from and the node they are sending it to.

We also cannot expect the destination to find a path.
Even if it specifies a path in the invoice, that path may no longer be viable by the time the invoice is paid, which could be several minutes or several days later.
The recipient can, however, specify "routing hints" in the invoice that may assist the sender in finding a possible path.

Furthermore, source-based routing comes with some inherent drawbacks.
The sender chooses the path based on their current understanding of the topological map of the Lightning network.
As discussed in previous chapters, this map is necessarily incomplete; the sender may not be aware of all the channels, and even if they are they will almost certainly not know the latest balances in each of the channels.
And even if the sender did have a complete topological map at one point in time, the balances of channels change with every payment, and so in a short space of time the map would become obsolete.
The standard path finding mechanism with source based onion routing that is implemented in all Lightning Network implementations is the following:

. Select an arbitrary path of payment channels which connects sender and receiver of the payment and for which all channels have a capacity of at least the payment amount and accept HTLCs of this amount.
. Construct the onion from destination to sender according to the meta data (basefee, feerate, CLTV delta) of the channels.
. Send out the onion and see if the payment settles by nodes sending back preimages or if the payment fails.
. If the payment fails use this knowledge to select a different path by starting at step 1.

This means that with every attempted payment nodes actually probe the network and will also learn some information about how balances are distributed.
Implementations will usually prioritise cheaper paths or exclude channels which recently have failed.
In that sense the selection is not completely arbitrary.
Still even with such heuristics in place it could still be considered as a random process or random walk through the channel graph.
There can be several reasons why the payment may fail along the way.
For example one of the nodes becomes unreachable, doesn't have the channel balance, can't accept new htlcs, has updated its fees to a higher amount, or the channel is closed in the interim.
Furthermore, there is no guarantee that the route chosen was the cheapest in terms of fees, or if a shorter path existed.
As at the time of writing, this is a design trade-off made to protect user privacy.

=== Paths are constructed from Destination to Source
Let us assume our standard example in which Alice wants to send a payment of 100k satoshi on a path via Bob and Wei to Gloria.
The Path obviously looks like (Alice)-->(Bob)-->(Wei)-->Gloria.
However Bob and Wei will charge routing fees to forward the onion.
As you already know nodes can charge two types of fees.
First the base fee that will be charged for any successfull forwarding and settlement of and htlc.
This fee is constant and does not depend on the amount that the node is supposed to forward.
Additionally nodes might charge a freerate.
The name rate alredy indicates that this fee depends on the amount that a node is supposed to forward.
Let us for the simplicity of assume that the feerate of Bob and Wei is very expensive with 1% for Bob and 2% for Wei.
However Bob and Wei will not take a base fee to keep things simple in our example.
If Alice constructs the Onion she has to include the routing fees as the differnce of the incoming htlc and the outgoing htlc.
Let us assume she falsely computes the following to construct the onions with the routing fees.
Alice knows that 1% of 100k satoshi is 1k satoshi which she belives she should include in Bob's onion.
Similarly she knows that 1% of 100k satoshi is 2k satoshi which she belives she should include in Wei's onion.
She would find out that Bob would not forward the onion, if she believed that she would have to pay a total of 3k satoshi and construct an onion that looks like this:

----
"route": [
      {
         "id": "Bob",
         "channel": "357",
         "direction": 1,
         "satoshi": 103000,
         "dealy": 187,
      },
      {
         "id": "Wei",
         "channel": "74",
         "direction": 1,
         "satoshi": 102000,
         "dealy": 183,
      },
      {
         "id": "Gloria",
         "channel": "452",
         "direction": 0,
         "satoshi": 100000,
         "dealy": 153,
      }
   ]
}
----

The reason for Bob to not forward the onion is that he expects the incoming amount to be 1% larger then the amount he is supposed to forward.
Thus he would like to have an incoming ammount of `103020` satoshi which is 20 satoshi more than Alice sent him.
According to his fee schedual Bob will have to reject the Onion.
If Alice constructed the onion from she would have computed 1% of the to forward amount correctly as 1% from 102k satoshi which is 1020 sat.
Adding 1020 to the 102000 satoshi that Wei needs to have on his incoming channel will result in the right value of 103020 satoshi that Bob requires.
As the routing fees can increase the amount that is being forwarded even beyond the capacity of small channels it makes sense to start the construction of the onion and the pathfinding from the destination to the sender.

[NOTE]
====
While onions are also constructed from inside to outside and thus start with the destination this is not the reason why pathfinding has to start with the destination node.
====

=== Fundamentals about path finding
Finding a path through a graph is a problem modern computers can solve rather efficiently.
Mostly developers choose bredth first search (if the edges are all of equal weight) or the Dijkstra Algorithm in cases where the edges are not of equal weight.
In our case the weights of the edges could be the routing fees and only edges will be included where the capacity is larger than the amount that is to be sent.
In this basic form pathfinding on the lightning network is very simple and straight forward.
However as we have already discussed in the introduction channel balances cannot be shared with every participant every time a payment takes place as the system would not scale.
Thus our easy computer science problem for which we know a solution turns into a rather complex problem.
We now have to solve a path finding problem with only partial knowledge.
For example we know which edges might be able to forward a payment because their capacity seems big enough but we can't know it for sure
unless we try it out or ask the channel owners directly.
Even if we where able to ask the channel owners directly their balance might change by the time we have asked others, computed a path, constructed an onion and send it along.
Thus we do not only have partial information but the information we have is highly dynamic and might change before we can use it.

One general observation that everyone can easily make is that if every node along a path is able to forward a certain amount of satoshis these nodes will also be able to forward a lower amount of satoshis.
This is why many people intuitively belive that multipath payments might be a good strategy.
Instead of finding one path where every node owns a large amount of liquidity the task is split into smaller ones.
Another reason is of course that the sender of a payment might just not have the amount they wish to send in one single channel but split over several channels.
We leave it to later sections of this chapter to discuss the strengths and weaknesses of multipath payments.
However we note that multi path payments are equivalent to finding a flow between the source and the destination.
While finding flows in a graph with full knowledge that is static is computationally a little bit more heavy than computing a shortest path it still seams to be a feasable problem.
Given the reality of the Lightning Network and the fact that we do not need to compute a max flow we currently do not know if the problem is more or less difficult as finding a path.
It seems to be about equally difficult and the problems are somewhat connect as we will see in the following sections.

=== Probing based path finding algorithm on the Lightning Network
As discussed in order to reliably find a path nodes would need to know the balance of remote payment channels and the balances would have to be static.
As both is not given nodes currently use a probing based algorithm.
In its most basic form the algorithm works as follows:

. Select a random path to the destination node
. Construct and send the onion
. wait for the response of the onion
. If response == preimage -> success
. If response == feilure -> start over from step one.

Nodes will use various sources of information to improve the selection of a random path.
The main source of information is the gossip protocol.
From the gossip protocol a node learns which other nodes exist and which channels have been opened.
This will basically provide a network view that can be used to run graph algorithms that generate plausible paths.
For example a breadth first seach traversal.
The graph algorithm will usually be constrained to channels that have at least the capacity of the amount to be sent.
In practice due to channel reserve and the assumption that the capacity in the channel will not be sitting completely on one side it is saver to prefer larger channels.

The second source of information is the blockchain itself.
If channels are closed this is not announced via the gossip protocol.
However as the funding transaction is encoded by the short channel id of the channel and as it will be spent if the channel is closed nodes have to use this information to update their knowledge about the network of channels.

Another source of information are the past payments themselves.
Onions can return with errors.
Knowing for example that the third hop along a path returns an error means that the first two channels had enough balance and that the third channel - depending on the error - did not have enough balance.
Such edges can be removed from the set of edges similarly to the edges that do not have enough capacity.
Similarly nodes could use such information from previous payment attempts.
It is important that nodes are carefull with this data.
As the capacity information of channels from the gossip protocol and blockchain data is verifiably correct the data from our third source of information can be incorrect.
Nodes might just send an error back because they do not want to reveal balance information.
Also the data might just change over time as the balances in the lightning network are not static and changing with every payment attempt that is being made.
Thus nodes should only use such data if it is not to far in the past or use it only with a certain confidence.

The fourth source of information that the node will use are the routing hints in the BOLT 11 invoice.
Remember that a regular payment process starts with the person who wants to receive money coming up with a random secret and hashing it as the payment hash.
This hash is usually transported to the sender via an invoice.
Invoices usually contain some meta data and in particular routing hints.
This is necessary if the person who wants to be paid does not have announced channels. In that case it will speciefie some unannounced channels within the invoice.
Otherwise the payer would not even be able to find a path to the "hidden" node.
Routing hints might also be used by the receiving node to indicate which public channels have enough inbound capacity for the payment and thus the ability to receive funds.
In general the further away from the originating peer the payment goes the more likely it becomes to select a channel with insufficient balance.
Thus indicating on which channels a node wishes to receive funds would actually be quite nice for the sender.


=== Improvements on Source based onion routing

The probing based approach that is used in the Lightning Network has several flaws.
Sending out an onion usually takes a certain amount of time.
The time depends on how many hops the onion is supposed to be forwarded and of course on the speed of nodes processing the onion and the topology on the web.
In the following diagram you can see how the time for onions to return in general increase with the amount of hops that the onion has encoded.

[[pathfinding-probing]]
.Research showing the times that onions take to return depending on the distance (CC-BY-SA Tikhomirov, Sergei & Pickhardt, Rene & Biryukov, Alex & Nowostawski, Mariusz. (2020). Probing Channel Balances in the Lightning Network.)
image:images/probingtimes.ppm[]
Of course this diagram was just a snapshot from an experiment in early 2020 and things might change.
We can learn from the Diagram that payments can take several seconds while the node tries to probe several paths.
This is due to the fact that the fact that single onions can easily take a few seconds to return and a sender might have to send several onions in a row while probing for a sucessfull path.
in generall this will still be much faster than waiting for confirmations on a bitcoin block but it is not sufficient in an environment where payments need to settle fast.
If people stand in a line at the cash register for their groceries this would be such a setting.
Thus lightnign developers


==== Probing based improvements
The last source of information that nodes could use is to probe the network themselves.
Instead of making the actual payment nodes could send out many fake payments which are onions to a random payment hash.
Given the properties of the hashfunction it is save to assume that noone would know the preimage.
In that sense the payment will only fail at the destination and nodes can learn a lot about the balances.
Of course this produces spam and heavy load on the network and it is not recommended that nodes do this.
However participants cannot really be stopped from doing this.
unless channel partners see a lot of traffic coming on a channel which always fails and never settles.
In this case channel partners could decide to close the channel.

[Note]
====
We want you to understand that Lightning Network by design does not have perfect privacy.
While a lot of information is not easily accessible every time a path is probed the node learns something about the state of the network at that point in time.
====

We note that one should not send two onions at the same time with the same payment hash for which the recipient knows the preimage.
As long as the onion is being processed and routed the payment is out of controll of the sender.
In case two onions are sent at the same time the recipient could very well release the preimage twice and get paid twice.
That was the reason why probing should be conducted with a fake payment hash.
in that case the sender can probe concurrently as long as the sender has enough funds to pay for all the HTLCs.
However there is a problem.
Assume an onion returns indicating that the payment hash was unknown to the recipient but otherwise the path would have been possible.
The sender would now use this exact path to send the payment with the corrent payment hash.
Meanwhile the balances of some channels along the path might have changed and the path does not exist anymore.
In this case the sender would have to start from the beginning all over again.
Admittedly the risk for this to happen is rather small but there is a chance.

A better way and potential improvement for the future of the Lightning network are stuckless payments.
There is a proposal for a system called stuckless payments that receives high appriciation by developers.
This proposal will probably not be implemented before the lightning network switches from Hashed Timelocked Contracts to Point time locked contracts which won't come before Schnorr Signatures are activated on the Bitcoin Network.
What stuckless payments can do is to give controll back to the sender of an Onion.
Without explaining the details here we just say that the sender can now cancle an onion.
This is great for redundant and concurrent pathfinding.
The sender could send out several real onions.
The first ones that arrives at the recipient will be settled.
All others will be cancled.
This increases the usuability of the Lightning Network on several levels.
One advantage is that the sender can try several paths at the same time.
The second advantage is that the path is locked after it is found and until it is settled.
This means that the sender can either cancle the onion or help to release the preimage (as senders have to do with the stuckless payment construction)
In particular the probed path cannot change or used by other routing requests between probing and setting up the htlcs that are used to fullfill the request.
The time for a a successfull payment will reduce drastically.
The distadvante is that the sender has to lock more bitcoin during the path finding process.
Due to timeouts these bitcoin can be locked for a couple of days before they can be used again.
This should not happen too often.
Also it utilizes more resources of other nodes.


==== Multipath
Multipath payments seem to be a natural way to improve the path finding chances of lightning nodes and success rates of payments.

* multipath and split payments

==== Rebalancing
* (proactive / reactive) Rebalancing