The iterator here to skip an obsolete bootstrap wasn't properly
reassigning the iterator, so "didn't work" (though why it was hanging
for me is entirely non-obvious).
Also refactored it to simplify/clarify it a bit.
- Move logging initialization to early in Configure rather than at the
end of FromConfig so that we can add debug logging inside
Configure/FromConfig/etc.
- add said debug logging to Configure/FromConfig/etc.
Without this, old config (with now-irrelevant settings) won't work in
newer lokinet, making lokinet fatal error on startup if one of the
no-longer-used options is still present.
when read/writing a .loki privkey file we dont rewind a llarp_buffer_t
after use. this is an argument in favor of just removing that type
from the code entirely.
fixes by using 2 distinct locally scoped llarp_buffer_t, one for read,
one for write.
Currently (from a recent PR) we aren't pinging oxend if not active, but
that behaviour ended up being quite wrong because lokinet needs to ping
even when decommissioned or deregistered (when decommissioned we need
the ping to get commissioned again, and if not registered we need the
ping to get past the "lokinet isn't pinging" nag screen to prepare a
registration).
This considerably revises the pinging behaviour:
- We ping oxend *unless* there is a specific error with our connections
(i.e. we *should* be establishing peer connections but don't have any)
- If we do have such an error, we send a new oxend "error" ping to
report the error to oxend and get oxend to hold off on sending uptime
proofs.
Along the way this also changes how we handle the current node state:
instead of just tracking deregistered/decommissioned, we now track three
states:
- LooksRegistered -- which means the SN is known to the network (but not
necessarily active or fully staked)
- LooksFunded -- which means it is known *and* is fully funded, but not
necessarily active
- LooksDecommissioned -- which means it is known, funded, and not
currently active (which implies decommissioned).
The funded (or more precisely, unfunded) state is now tracked in
rc_lookup_handler in a "greenlist" -- i.e. new SNs that are so new (i.e.
"green") that they aren't even fully staked or active yet.
This aligns service node updating logic a bit closer to what happens in
storage server, and should make it a bit more resilient, hopefully
tracking down the (off-Github) reported issue where lokinet sometimes
doesn't see itself as active.
- Initiate a service node list update in the 30s timer lokinet ping
timer (in case we miss a block notify for some reason); although this
is expensive, the next point mitigates it:
- Retrieve the block hash with the SN state update, and feed it back
into the next get_service_nodes call (as "poll_block_hash") so that
oxend just sends back a mostly-empty response when the block hasn't
changed, allowing both oxend and lokinet to skip nearly all of the
work of a service node list update when the block hasn't changed since
the last poll. (This was already partially implemenated--we were
already looking for "unchanged"--but without a block hash to get from
and pass back to oxend we'd never actually get an "unchanged" result).
- Tighten up the service node list handling by moving the "unchanged"
handling into the get_service_nodes response handler: this way the
HandleNewServiceNodeList function is only handling the list but not
the logic as to whether there actually is a new list or not.
Lots and lots of places in the code had broken < operators because they
are returning something like:
foo < other.foo or bar < other.bar;
but this breaks both the strict weak ordering requirements that are
required for the "Compare" requirement for things like
std::map/set/priority_queue.
For example:
a = {.foo=1, .bar=3}
b = {.foo=3, .bar=1}
does not have an ordering over a and b (both `a < b` and `b < a` are
satisfied at the same time).
This needs to be instead something like:
foo < other.foo or (foo == other.foo and bar < other.bar)
but that's a bit clunkier, and it is easier to use std::tie for tuple's
built-in < comparison which does the right thing:
std::tie(foo, bar) < std::tie(other.foo, other.bar)
(Initially I noticed this in SockAddr/sockaddr_in6, but upon further
investigation this extends to the major of multi-field `operator<`'s.)
This fixes it by using std::tie (or something similar) everywhere we are
doing multi-field inequalities.
If we get back an IPv6 address as the first gateway then we won't have
the expected IPv4 gateway that the route poker needs to operate.
This iterates through them separately so that we treat the IPv4 and IPv6
sides of an address as separate interfaces which should allow the route
poker to find the one it wants (and just skip the IPv6 one).
DRY a chunk of repeated code for finding a free private range.
Also fix it so that it will consider 10.255.0.1/16 and 192.168.255.1/24
(previously it would only check up to octet 254).
If running as a service node, we ping core on a regular interval to
inform it we're running and in a good state. If we're an active
(not decommissioned or deregistered) service node and have too few
peers and thus we're not actually connected to lokinet, we should skip
that ping so core doesn't think we're ok.
Adds a fallback bootstrap file path parameter to CMake, specify
-DBOOTSTRAP_SYSTEM_PATH="/path/to/file" to use.
Adds a list of (currently 1) obsolete bootstrap RouterIDs to check
bootstrap RCs against. Will not use bootstrap RCs if they're on that
list.
Log an error periodically if we appear to be an active service node but
have fewer than a set number (5) known peers.
Bumps oxen-logging version for literal _format.
No more llarp_buffer_t here!
(I was tracking down a segfault which led me in here and it was easier
to rewrite this to use bt_dict_{consumer,producer} than to decipher all
the cursed llarp_buffer_t and bencode callback nest).
We have basically this same bit of code in tons of places; consolidate
it into llarp::util::slurp_file/llarp::util::dump_file.
Also renames all the extra junk that crept into llarp/util/fs.hpp out of
there into llarp/util/file.hpp instead.
- Accept empty string or `null` for token to mean "no token."
- Accept `null` for range to mean "default range."
- Don't use a default range (::0/0) in lokinet-vpn because this will
fail if IPv6 ranges aren't supported on the platform (e.g. on
Windows), and isn't necessary: if we omit it then the rpc code already
uses ::0/0 or 0.0.0.0/0 by default, as needed.
- ReconfigureDNS wasn't returning the old servers; made it void instead
(the Apple code can just store a copy of the original upstream
servers instead).
- Reconfiguring DNS reset the unbound context but didn't replace it, so
a Down()/Up() would crash.
- Simplify Resolver() destructor to just call Down(), and make it final
just so that no one tries to inherit from us (so that calling a
virtual function from the destructor is safe).
- Rename CancelPendingQueries() to Down(); the former cancelled but also
shut down the object, so the name seemed a bit misleading.
- Rename SetInternalState in Resolver_Base to ResetResolver, so that we
aren't conflicting with ResetInternalState from Endpoint (which was a
problem because TunEndpoint inherited from both; it could be resolved
through the different argument type if we removed the default, but
that seems gross).
- Make Resolver use a bare unbound context pointer rather than a
shared_ptr; since Resolver (now) entirely manages it already we don't
need an extra management layer, and it saves a bunch of `.get()`s.