If wintun fails it seems to take about 15s, so extend the startup
timeout so that it can fail gracefully (and let us clean up before
exiting).
Also refactors the timeouts to chrono constants.
Fixes windows shutdown crashes:
- windivert wasn't handling an ERROR_NO_DATA, which it gets when
finished handling everything after a shutdown.
- wintun ReadPacket still gets invoked after end_session is called, but
shouldn't be. This adds an atomic<bool> to early return.
- fixes up some settings we send for windows service manager notify
- win32_platform.cpp is dead
- win32_platform.hpp is useless
Style changes from clang-tidy warnings:
- remove `virtual` from some definitions that already have `override`
- remove virtual destructor from NetworkInterface because it already has
a virtual destructor via the base type (and clang-tiny warns about it)
the win32 and sd_notify components provided a disjointed set of
similar high level functionality so we consolidate these duplicate
code paths into one that has the same lifecycle regardless of platform
to reduce complexity of this feature.
this new component is responsible for reporting state changes to the
system layer and optionally propagating state change to lokinet
requested by the system layer (used by windows service).
We're defining formats for std::chrono types, which feels wrong (because
fmt itself also has these), so just replace them with functions:
short_time_from_now(...) gives a short "in 14m12s" or "5.123s ago" time
span relative to now, given a time point. Precision gets reduced for
larger deviations from now (e.g. "4h12m ago").
ToString(Duration_t) gives a string such as "-3h22m02.123s" for a
duration.
The time_delta<T> was using the wrong duration type when formatting, so
was outputting millisecond precision in the systemd status string which
is pointless (and unintended).
The iterator here to skip an obsolete bootstrap wasn't properly
reassigning the iterator, so "didn't work" (though why it was hanging
for me is entirely non-obvious).
Also refactored it to simplify/clarify it a bit.
- Move logging initialization to early in Configure rather than at the
end of FromConfig so that we can add debug logging inside
Configure/FromConfig/etc.
- add said debug logging to Configure/FromConfig/etc.
Without this, old config (with now-irrelevant settings) won't work in
newer lokinet, making lokinet fatal error on startup if one of the
no-longer-used options is still present.
when read/writing a .loki privkey file we dont rewind a llarp_buffer_t
after use. this is an argument in favor of just removing that type
from the code entirely.
fixes by using 2 distinct locally scoped llarp_buffer_t, one for read,
one for write.
Currently (from a recent PR) we aren't pinging oxend if not active, but
that behaviour ended up being quite wrong because lokinet needs to ping
even when decommissioned or deregistered (when decommissioned we need
the ping to get commissioned again, and if not registered we need the
ping to get past the "lokinet isn't pinging" nag screen to prepare a
registration).
This considerably revises the pinging behaviour:
- We ping oxend *unless* there is a specific error with our connections
(i.e. we *should* be establishing peer connections but don't have any)
- If we do have such an error, we send a new oxend "error" ping to
report the error to oxend and get oxend to hold off on sending uptime
proofs.
Along the way this also changes how we handle the current node state:
instead of just tracking deregistered/decommissioned, we now track three
states:
- LooksRegistered -- which means the SN is known to the network (but not
necessarily active or fully staked)
- LooksFunded -- which means it is known *and* is fully funded, but not
necessarily active
- LooksDecommissioned -- which means it is known, funded, and not
currently active (which implies decommissioned).
The funded (or more precisely, unfunded) state is now tracked in
rc_lookup_handler in a "greenlist" -- i.e. new SNs that are so new (i.e.
"green") that they aren't even fully staked or active yet.
This aligns service node updating logic a bit closer to what happens in
storage server, and should make it a bit more resilient, hopefully
tracking down the (off-Github) reported issue where lokinet sometimes
doesn't see itself as active.
- Initiate a service node list update in the 30s timer lokinet ping
timer (in case we miss a block notify for some reason); although this
is expensive, the next point mitigates it:
- Retrieve the block hash with the SN state update, and feed it back
into the next get_service_nodes call (as "poll_block_hash") so that
oxend just sends back a mostly-empty response when the block hasn't
changed, allowing both oxend and lokinet to skip nearly all of the
work of a service node list update when the block hasn't changed since
the last poll. (This was already partially implemenated--we were
already looking for "unchanged"--but without a block hash to get from
and pass back to oxend we'd never actually get an "unchanged" result).
- Tighten up the service node list handling by moving the "unchanged"
handling into the get_service_nodes response handler: this way the
HandleNewServiceNodeList function is only handling the list but not
the logic as to whether there actually is a new list or not.
Lots and lots of places in the code had broken < operators because they
are returning something like:
foo < other.foo or bar < other.bar;
but this breaks both the strict weak ordering requirements that are
required for the "Compare" requirement for things like
std::map/set/priority_queue.
For example:
a = {.foo=1, .bar=3}
b = {.foo=3, .bar=1}
does not have an ordering over a and b (both `a < b` and `b < a` are
satisfied at the same time).
This needs to be instead something like:
foo < other.foo or (foo == other.foo and bar < other.bar)
but that's a bit clunkier, and it is easier to use std::tie for tuple's
built-in < comparison which does the right thing:
std::tie(foo, bar) < std::tie(other.foo, other.bar)
(Initially I noticed this in SockAddr/sockaddr_in6, but upon further
investigation this extends to the major of multi-field `operator<`'s.)
This fixes it by using std::tie (or something similar) everywhere we are
doing multi-field inequalities.
If we get back an IPv6 address as the first gateway then we won't have
the expected IPv4 gateway that the route poker needs to operate.
This iterates through them separately so that we treat the IPv4 and IPv6
sides of an address as separate interfaces which should allow the route
poker to find the one it wants (and just skip the IPv6 one).
DRY a chunk of repeated code for finding a free private range.
Also fix it so that it will consider 10.255.0.1/16 and 192.168.255.1/24
(previously it would only check up to octet 254).
If running as a service node, we ping core on a regular interval to
inform it we're running and in a good state. If we're an active
(not decommissioned or deregistered) service node and have too few
peers and thus we're not actually connected to lokinet, we should skip
that ping so core doesn't think we're ok.
Adds a fallback bootstrap file path parameter to CMake, specify
-DBOOTSTRAP_SYSTEM_PATH="/path/to/file" to use.
Adds a list of (currently 1) obsolete bootstrap RouterIDs to check
bootstrap RCs against. Will not use bootstrap RCs if they're on that
list.
Log an error periodically if we appear to be an active service node but
have fewer than a set number (5) known peers.
Bumps oxen-logging version for literal _format.