- redoing link_manager functions again to implement previously ignored review comments on several PRs
- conceptually merging "whitelist_routers" and the new "known_{rids,rcs}" sets, so that we can completely eliminate the white/red/gray/green/etc lists in favor of something that isn't dumb
- Once we have our set of returned RCs and accepted RIDs (the ones that were found locally), the remainder are placed in an "unconfirmed" state
- Once there, an entry has five subsequent successful fetches in which to be found in a request response; each time it is found, its verification counter is incremented and its attempt counter is reset
- If it appears three times, it is "promoted" and moved to our "known_{rids,rcs}" list (see the first sketch after this list)
- reachability testing can be disabled with a config option; this is required on testnet
- the reachability testing pipeline through link_manager executes pings similar to the storage server's: the connection-established hook reports successful reachability, while the connection-closed callback (with a non-default error code) reports unsuccessful testing
- bootstrap cooldown implemented with a 1-minute timer in case all bootstraps fail
- set comparison implemented for non-initial, non-bootstrap RC fetching; set comparison for RID fetching is done on every fetch
- nodedb get_random functions refactored into conditional/non-conditional methods; the conditional search implements reservoir sampling for one-pass accumulation of n random RCs (see the sampling sketch after this list)
- greedy evaluation of returned RIDs, simplifying the post-processing logic to a simple per-RID frequency comparison against a constant threshold
- tidied up link_manager request/response handling
- TODO:
- review and decide thresholds
- evaluate the necessity and potential implementation of RC comparison
- When fulfilling a request to fetch RouterIDs, the remote endpoint stores them in an unordered set; when the request caller receives that payload, it is loaded into a vector in the same order. Instead, we should load it directly into an ordered set, which enforces both a sort order and that none appear twice (see the set-loading sketch after this list)
- The trust model will have to operate efficiently on multiple large lists of RouterIDs and RCs, and maintaining a sort order ensures the values are workable immediately after deserialization
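
A minimal sketch of the unconfirmed-to-known promotion flow described
above. Every name here is hypothetical, and the five-attempt /
three-sighting thresholds just mirror the description above (the TODO
notes they are still to be decided):

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

using RouterID = std::string;  // stand-in; the real type is a fixed-size key

struct Unconfirmed
{
    int verifications = 0;  // times this RID has appeared in a fetch response
    int attempts_left = 5;  // fetches remaining in which it may still appear
};

struct TrustState
{
    std::unordered_set<RouterID> known_rids;
    std::unordered_map<RouterID, Unconfirmed> unconfirmed_rids;

    // Called once per successful fetch with the set of RIDs in the response.
    void process_fetch(const std::unordered_set<RouterID>& fetched)
    {
        for (auto it = unconfirmed_rids.begin(); it != unconfirmed_rids.end();)
        {
            auto& [rid, entry] = *it;
            if (fetched.count(rid))
            {
                ++entry.verifications;
                entry.attempts_left = 5;  // sighting resets the attempt counter
                if (entry.verifications >= 3)  // promotion threshold (TBD)
                {
                    known_rids.insert(rid);
                    it = unconfirmed_rids.erase(it);
                    continue;
                }
            }
            else if (--entry.attempts_left <= 0)
            {
                it = unconfirmed_rids.erase(it);  // never confirmed; drop it
                continue;
            }
            ++it;
        }
    }
};
```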
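
The conditional get_random refactor names reservoir sampling; below is a
generic one-pass sketch of that technique (standard Algorithm R), not
the actual NodeDB method. The RC type and predicate signature are
placeholders:

```cpp
#include <functional>
#include <random>
#include <vector>

// One-pass selection of up to n uniformly random matching elements.
template <typename RC>
std::vector<RC> sample_conditional(
    const std::vector<RC>& all, size_t n, const std::function<bool(const RC&)>& pred)
{
    std::vector<RC> reservoir;
    reservoir.reserve(n);
    std::mt19937_64 rng{std::random_device{}()};
    size_t seen = 0;  // matching elements encountered so far

    for (const auto& rc : all)
    {
        if (!pred(rc))
            continue;
        ++seen;
        if (reservoir.size() < n)
            reservoir.push_back(rc);
        else
        {
            // replace a random slot with probability n/seen; this keeps every
            // matching element equally likely to end up in the result
            std::uniform_int_distribution<size_t> dist{0, seen - 1};
            if (size_t j = dist(rng); j < n)
                reservoir[j] = rc;
        }
    }
    return reservoir;
}
```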
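
And for the deserialization point above, loading straight into an
ordered set gives both guarantees in a single step; a trivial sketch
with a stand-in RouterID type:

```cpp
#include <set>
#include <string>
#include <vector>

using RouterID = std::string;  // stand-in; the real type is a fixed-size key

// A std::set enforces sorted order and uniqueness on construction, so the
// values are workable immediately after deserialization.
std::set<RouterID> load_rids(const std::vector<RouterID>& payload)
{
    return {payload.begin(), payload.end()};
}
```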
Periodically, clients will fetch the set of RouterIDs for all relays on
the network. A client will request this list from a number of relays
(12, currently), but as we are likely to be requesting from more relays
than we want to have edge connections to, each request is itself relayed
to the target source via one of our edges. As we can't trust our edge to
relay this honestly, the responses are signed by the source relay.
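
Since the edge only forwards the request, the client must check the
source relay's signature over the returned payload itself. A minimal
sketch of that check, assuming Ed25519 detached signatures via libsodium
(the actual message layout and key handling are not shown):

```cpp
#include <sodium.h>

#include <string>

// Illustrative only: verifies that the bt-encoded RouterID payload was
// signed by the source relay, so a dishonest edge cannot tamper with it.
bool verify_rid_payload(
    const std::string& payload,
    const unsigned char sig[crypto_sign_BYTES],
    const unsigned char source_pubkey[crypto_sign_PUBLICKEYBYTES])
{
    return crypto_sign_verify_detached(
               sig,
               reinterpret_cast<const unsigned char*>(payload.data()),
               payload.size(),
               source_pubkey)
        == 0;
}
```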
TODO: the responses from all (12) relays are collected, then processed
together. The reconciliation of their responses is not yet implemented.
TODO: the source selection for this method obviously requires sources to
begin with, but this is the method by which we learn of
those...bootstrapping is still a bit in-progress, and will need to be
finished for this.
TODO: make Router call this periodically, as with RC fetching.
This command will be called periodically by clients to maintain a list
of RCs of active relay nodes. It will require another command (future
commit) to fetch the RouterIDs from many nodes and reconcile those, so
we have some notion of the goodness of the RCs we're getting; if we get
what seems to be a bad set of RCs (a concept not yet implemented), we
will choose a different relay to fetch RCs from. These are left as TODOs
for now.
Relays will now re-sign and gossip their RCs every 6 hours (minus a
couple of random minutes) using the new gossip_rc message.
Removes the old RCGossiper concept.
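
A sketch of how such a jittered period might be computed; the helper
name and the exact jitter range (up to five minutes here) are
assumptions rather than the actual lokinet timer code:

```cpp
#include <chrono>
#include <random>

// Hypothetical: 6 hours minus a couple of random minutes, so relays do
// not all re-sign and gossip in lockstep.
std::chrono::seconds next_gossip_interval()
{
    static std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> jitter_sec{0, 5 * 60};
    return std::chrono::hours{6} - std::chrono::seconds{jitter_sec(rng)};
}
```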
We will want some notion of "when did we receive it" for RCs (or
RouterIDs, details tbd), but that will be per-source as a means to form
some metric of consensus/trust on which relays are *actually* on the
network. Clients don't have a blockchain daemon to pull this from, so
they have to ask many relays for the full list of relays and form a
trust model on that (bootstrapping problem notwithstanding).
We're removing the notion of find/lookup a singular RC, so this gets rid
of all functions which did that and replaces their usages with something
sensible.
RC "lookup" is being replaced with "gimme all recently updated RCs". As
such, doing a lookup on a specific RC is going away, as is network
exploration, so a lot of what RCLookupHandler was doing will no longer
be relevant. Functionality from it which was kept has moved to NodeDB,
as it makes sense for that functionality to live where the RCs live.
path build frames should be onioned at each hop to avoid a bad actor
controlling two nodes in a path being able to know with certainty
(temporal correlation is hard to avoid) that they're hops on the same
path. This is desirable because, in the worst case, someone could be
both your edge hop and your terminal hop on a path, and then the
terminal hop knows your IP, making the path basically pointless.
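
A sketch of the layering idea, assuming XChaCha20 via libsodium with
one shared key per hop; key/nonce derivation and framing are elided, and
none of these names are real lokinet identifiers. The client adds one
layer per hop and each relay strips exactly one, so no relay sees
anything that links it to the other hops:

```cpp
#include <sodium.h>

#include <array>
#include <string>
#include <vector>

using HopKey = std::array<unsigned char, crypto_stream_xchacha20_KEYBYTES>;
using Nonce = std::array<unsigned char, crypto_stream_xchacha20_NONCEBYTES>;

// Wrap the frame once per hop, starting with the terminal hop, so that
// each relay on the path can remove exactly one layer.
std::string onion_wrap(
    std::string frame, const std::vector<HopKey>& hop_keys, const Nonce& nonce)
{
    for (auto it = hop_keys.rbegin(); it != hop_keys.rend(); ++it)
        crypto_stream_xchacha20_xor(
            reinterpret_cast<unsigned char*>(frame.data()),
            reinterpret_cast<const unsigned char*>(frame.data()),
            frame.size(),
            nonce.data(),
            it->data());
    return frame;
}
```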
It's unnecessary abstraction that barely simplifies anything, and is now
only used in one place anyway, which is easily replaced with an
(unabstracted) lambda.
Lots of code was using 32-byte nonces for XChaCha20 symmetric
encryption, but this just wastes 8 extra bytes per packet, as chacha
only uses the first 24 bytes of that nonce anyway.
Changing this caused a lot of dead/dying code to break, so this
commit also removes a lot of that (and marks a couple of places with
TODO instead).
Also fixed "nounce" -> "nonce" where it came up.
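
For reference, libsodium pins the XChaCha20 nonce length at 24 bytes, so
a fixed-size alias like this (hypothetical) one makes the extra 8 bytes
impossible by construction:

```cpp
#include <sodium.h>

#include <array>

static_assert(crypto_stream_xchacha20_NONCEBYTES == 24);

// Illustrative alias; anything past 24 bytes was dead weight on the wire.
using SymmNonce = std::array<unsigned char, crypto_stream_xchacha20_NONCEBYTES>;
```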
change the path control message's inner message response to take just a
string, which will be a bt-encoded response with an early key for
status. If there is a timeout, we pass a bt dict that has only that as
the status; otherwise, the response we de-onioned should have either an
OK status or some other error.
change messages to use new status key
correctly call Path::EnterState on path build response
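
A sketch of what those responses could look like, assuming oxenc for the
bt-encoding; the "status" key name is illustrative (bt dict keys
serialize in sorted order, so a real implementation would pick a key
that sorts early):

```cpp
#include <oxenc/bt_serialize.h>
#include <oxenc/bt_value.h>

#include <string>

// Timeout: a dict carrying nothing but the status.
std::string timeout_response()
{
    return oxenc::bt_serialize(oxenc::bt_dict{{"status", "timeout"}});
}

// Success: attach an OK status to whatever the de-onioned response holds.
std::string ok_response(oxenc::bt_dict response)
{
    response["status"] = "OK";
    return oxenc::bt_serialize(response);
}
```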
It seems the RC refactor will obviate the need for a "get individual RC"
method, so this comments out some usages of it to sidestep build errors,
rather than correcting them in a way that would just be wasted.
- control messages can be sent along a path
- the path owner onion-encrypts the "inner" message for each hop in the
path
- relays on the path will onion the payload in both directions, such
that the terminal relay will get the plaintext "inner" message and the
client will get the plaintext "response" to that.
- control messages have (mostly, see below) been changed to be invokable
either over a path or directly to a relay, as appropriate.
TODO:
- exit messages need to be looked at, so they have not yet been changed
for this
- path transfer messages (traffic from client to client over 2 paths
with a shared "pivot") are not yet implemented
- .snodes don't need to support SRV records, so remove that
- untangle the mess of captured lambdas capturing other lambdas
capturing other lambdas; we still need a chain of nested lambdas
because we have a chain of callback-driven events, but hiding the
nesting by capturing them in other lambdas didn't improve anything.