Currently (from a recent PR) we aren't pinging oxend if not active, but
that behaviour ended up being quite wrong because lokinet needs to ping
even when decommissioned or deregistered (when decommissioned we need
the ping to get commissioned again, and if not registered we need the
ping to get past the "lokinet isn't pinging" nag screen to prepare a
registration).
This considerably revises the pinging behaviour:
- We ping oxend *unless* there is a specific error with our connections
(i.e. we *should* be establishing peer connections but don't have any)
- If we do have such an error, we send a new oxend "error" ping to
report the error to oxend and get oxend to hold off on sending uptime
proofs.
Along the way this also changes how we handle the current node state:
instead of just tracking deregistered/decommissioned, we now track three
states:
- LooksRegistered -- which means the SN is known to the network (but not
necessarily active or fully staked)
- LooksFunded -- which means it is known *and* is fully funded, but not
necessarily active
- LooksDecommissioned -- which means it is known, funded, and not
currently active (which implies decommissioned).
The funded (or more precisely, unfunded) state is now tracked in
rc_lookup_handler in a "greenlist" -- i.e. new SNs that are so new (i.e.
"green") that they aren't even fully staked or active yet.
This aligns service node updating logic a bit closer to what happens in
storage server, and should make it a bit more resilient, hopefully
tracking down the (off-Github) reported issue where lokinet sometimes
doesn't see itself as active.
- Initiate a service node list update in the 30s timer lokinet ping
timer (in case we miss a block notify for some reason); although this
is expensive, the next point mitigates it:
- Retrieve the block hash with the SN state update, and feed it back
into the next get_service_nodes call (as "poll_block_hash") so that
oxend just sends back a mostly-empty response when the block hasn't
changed, allowing both oxend and lokinet to skip nearly all of the
work of a service node list update when the block hasn't changed since
the last poll. (This was already partially implemenated--we were
already looking for "unchanged"--but without a block hash to get from
and pass back to oxend we'd never actually get an "unchanged" result).
- Tighten up the service node list handling by moving the "unchanged"
handling into the get_service_nodes response handler: this way the
HandleNewServiceNodeList function is only handling the list but not
the logic as to whether there actually is a new list or not.
Lots and lots of places in the code had broken < operators because they
are returning something like:
foo < other.foo or bar < other.bar;
but this breaks both the strict weak ordering requirements that are
required for the "Compare" requirement for things like
std::map/set/priority_queue.
For example:
a = {.foo=1, .bar=3}
b = {.foo=3, .bar=1}
does not have an ordering over a and b (both `a < b` and `b < a` are
satisfied at the same time).
This needs to be instead something like:
foo < other.foo or (foo == other.foo and bar < other.bar)
but that's a bit clunkier, and it is easier to use std::tie for tuple's
built-in < comparison which does the right thing:
std::tie(foo, bar) < std::tie(other.foo, other.bar)
(Initially I noticed this in SockAddr/sockaddr_in6, but upon further
investigation this extends to the major of multi-field `operator<`'s.)
This fixes it by using std::tie (or something similar) everywhere we are
doing multi-field inequalities.
If we get back an IPv6 address as the first gateway then we won't have
the expected IPv4 gateway that the route poker needs to operate.
This iterates through them separately so that we treat the IPv4 and IPv6
sides of an address as separate interfaces which should allow the route
poker to find the one it wants (and just skip the IPv6 one).
DRY a chunk of repeated code for finding a free private range.
Also fix it so that it will consider 10.255.0.1/16 and 192.168.255.1/24
(previously it would only check up to octet 254).
If running as a service node, we ping core on a regular interval to
inform it we're running and in a good state. If we're an active
(not decommissioned or deregistered) service node and have too few
peers and thus we're not actually connected to lokinet, we should skip
that ping so core doesn't think we're ok.
Adds a fallback bootstrap file path parameter to CMake, specify
-DBOOTSTRAP_SYSTEM_PATH="/path/to/file" to use.
Adds a list of (currently 1) obsolete bootstrap RouterIDs to check
bootstrap RCs against. Will not use bootstrap RCs if they're on that
list.
Log an error periodically if we appear to be an active service node but
have fewer than a set number (5) known peers.
Bumps oxen-logging version for literal _format.
No more llarp_buffer_t here!
(I was tracking down a segfault which led me in here and it was easier
to rewrite this to use bt_dict_{consumer,producer} than to decipher all
the cursed llarp_buffer_t and bencode callback nest).
We have basically this same bit of code in tons of places; consolidate
it into llarp::util::slurp_file/llarp::util::dump_file.
Also renames all the extra junk that crept into llarp/util/fs.hpp out of
there into llarp/util/file.hpp instead.
- Accept empty string or `null` for token to mean "no token."
- Accept `null` for range to mean "default range."
- Don't use a default range (::0/0) in lokinet-vpn because this will
fail if IPv6 ranges aren't supported on the platform (e.g. on
Windows), and isn't necessary: if we omit it then the rpc code already
uses ::0/0 or 0.0.0.0/0 by default, as needed.
- ReconfigureDNS wasn't returning the old servers; made it void instead
(the Apple code can just store a copy of the original upstream
servers instead).
- Reconfiguring DNS reset the unbound context but didn't replace it, so
a Down()/Up() would crash.
- Simplify Resolver() destructor to just call Down(), and make it final
just so that no one tries to inherit from us (so that calling a
virtual function from the destructor is safe).
- Rename CancelPendingQueries() to Down(); the former cancelled but also
shut down the object, so the name seemed a bit misleading.
- Rename SetInternalState in Resolver_Base to ResetResolver, so that we
aren't conflicting with ResetInternalState from Endpoint (which was a
problem because TunEndpoint inherited from both; it could be resolved
through the different argument type if we removed the default, but
that seems gross).
- Make Resolver use a bare unbound context pointer rather than a
shared_ptr; since Resolver (now) entirely manages it already we don't
need an extra management layer, and it saves a bunch of `.get()`s.
On Apple, the network extension is outside the tunnel routing, so we
cannot have libunbound talk directly to upstream (it would leak DNS when
exit mode is enabled). Instead unbound *always* talks to a localhost
port where we have a "dns trampoline" that takes UDP packets and shoves
them through the tunnel.
We were doing that already, but recent changes here were overwriting the
libunbound settings with.
This also moves the upstream DNS configuration part of `Up()` into its
own method.
We don't have a resolver on macos, so we were running through this loop
with fails == 0 == m_Impls.size() and throwing, crashing the process.
Early return to avoid the failure and fix macos crash.
Apple supports anything here that Clang supports and should have them
set the same as everywhere else.
Most importantly this gives apple the -Wno-deprecated-declarations flag
which has been driving me nuts on macos.
This also version-gates the -Wno-deprecated-declarations so that it
will turn on again when we bump the version beyond .10.
We were requiring `->Next` be true, which means we skipped the last (and
often only) entry of the linked lists and so never properly found the
gateway.
- We need to pass a flag to get Windows to include gateway info.
- Refactor it to use microsoft's recommended magic default 15000 buffer
size and repeat in a loop a few times until it works. Developers,
developers, developers, developers!
- a `static` is less verbose and otherwise identical to an empty
namespace for a single declaration like this.
- operator== on two optionals already does exactly what the `is_equal`
lambda here is doing.
- formatting
- windivert was being set up *before* DNS is set up, so the DNS port was
nullopt and thus we couldn't properly identify upstream DNS traffic.
- close() doesn't close a socket on Windows, so the socket-bind-close
approach to get a free UDP port wasn't actually closing, and thus
unbound upstream constrained to the given port were completely
failing.
- The unbound thread was accessing the same shared_ptr instance as the
outer code, which isn't thread-safe; changed it to copy a weak_ptr
into the lambda instead.
- Exclude upstream DNS traffic in the filter rather than capturing and
reinjecting it.
The inner lambda here wasn't keeping the `Query` (`this`) alive, so
`src` wasn't valid anymore. This changes it to copy the `src`
shared_ptr into the lambda instead of capturing `this`, and fixes it.
The current code isn't working and gives a 0 (which then fails unbound
initialization). This replaces it by doing a socket+bind to find a free
port then immediately closes (but passes the port we got into unbound).
- Replaces RAII handling of DLLs with global function pointers. (We
don't unload the dll this way, but that seems unnecessary anyway).
- Simplifies code by just needing to call an init function, but not
needing to pass around an object holding the function pointers.
- Adds a templated dll loader that takes the dll and a list of
name/pointer pairs to load the dll and set the pointers in one shot.
ip_header wasn't 20 bytes on windows compilations for some unholy
reason. This restructures it to avoid the template and just use two
different structs for le/be with a condition_t for the ifdef, which
resolves it (and *also* apparently avoids the need for the pack).
Also add a static_assert to check the size.
Also do the same for ipv6.
Cast via an ordinary function pointer rather than a function pointer
reference to avoid the warning.
Also make the pointer in `Func_t` explicit rather than implicit (deduced
into the `Func_t` type) to make it clearer what is going on here.
Lots of tools struggle with non-default DNS port, so keep a listener on
127.3.2.1:53 (by default).
This required various changes to the config handling to hold a vector
(instead of an optional) of defaults and values, and now allows passing
in an array of defaults instead of just a single default.
It didn't do equality, it did "does the remaining space start with the
argument" (and so the replacement in the previous commit was broken).
This renames it to avoid the confusion and restores to what it was doing
on dev.
errno is only set if read returns < 0 and won't be set to 0 if read
succeeds, so we were bailing here frequently on successful reads
(whenever errno happened to be non-0).
This class is cursed, but also broken under gcc-12. Apply some lipstick
to get it moving again (but we really need to refactor this because it
is a mess).
* wintun vpn platform for windows
* bundle config snippets into nsis installer for exit node, keyfile persisting, reduced hops mode.
* use wintun for vpn platform
* isolate all windows platform specific code into their own compilation units and libraries
* split up internal libraries into more specific components
* rename liblokinet.a target to liblokinet-amalgum.a to elimiate ambiguity with liblokinet.so
* DNS platform for win32
* rename llarp/ev/ev_libuv.{c,h}pp to llarp/ev/libuv.{c,h}pp as the old name was idiotic
* split up net platform into win32 and posix specific compilation units
* rename lokinet_init.c to easter_eggs.cpp as that is what they are for and it does not need to be a c compilation target
* add cmake option STRIP_SYMBOLS for seperating out debug symbols for windows builds
* intercept dns traffic on all interfaces on windows using windivert and feed it into lokinet
we want to be able to have multiple locally bound dns sockets in lokinet so
i restructured most of the dns subsystem in order to make this easier.
specifically, we have a new structure to dns subsystem:
* dns::QueryJob_Base
base type for holding a dns query and response with virtual methods
in charge of sending a reply to whoever requested.
* dns::PacketSource_Base
base type for reading and writing dns messages to and from wherever they came from
* dns::Resolver_Base
base type for filtering and handling of dns messages asynchronously.
* dns::Server
contextualized per endpoint dns object, responsible for all dns related isms.
this change hides all impelementation details of all of the dns components.
adds some more helper functions for parsing dns and dealing with OwnedBuffer.
overall dns becomes less of a pain with this new structure. probably.
* make socket bind errors have a distinct message reported when caught using their own exception type
* omit printing banner in setup when we run from the lokinet executable (but not the liblokinet.so entry point)
Adds support for building Lokinet as a system extension, and fixes
various problems in the macos implementation found during development of
the system extension support.
outbound=:1234
outbound=0.0.0.0:1234
outbound=
outbound=0.0.0.0
now all default to use the inbound= IP. (If multiple inbound= IPs are
given, we raise an exception to abort startup).
Only applies to routers (since clients don't have inbound IPs), and
eliminates potential weird edge cases with local system IP and routing
shenanigans.
The general section comments contained all the descriptions for the
inbound/outbound settings, while inbound/outbound had no comment at all.
This moves the comments around to the individual settings, plus updates
some of the wording in the section.
We were failing when using `inbound=:1234`, rather than looking for a
default IP. This fixes it to still use the default IP, and change only
the default port.
Outbound behaviour should remain unchanged: i.e. `outbound=:2345` means
bind-to-wildcard-IP with a specific port.
Fixes:
- tighten reserved name detection to not match fooloki.loki, but instead
only match "foo.loki.loki" and "loki.loki" (and similar for reserved
name "snode.loki").
- IPv6 PTR parsing was completely broken.
- Added tests for the above two issues.
Cleanups:
- Eliminate llarp::dns::Name_t typedef for std::string
- Use optional return instead of bool + output param
- Use string_views; we were doing a *lot* of string substr's during
parsing, each of which allocates a new string.
- Use fmt instead of stringstream
- Simplify IPv4 PTR parsing
Using constructor inheritance DRYs the code, but unfortunately confuses
GCC as to where the proper "required from here" location is, which makes
debugging formatting errors very difficult. Avoid it (and update
oxen-logging to avoid it there as well).
Replaces custom logging system with spdlog-based oxen logging. This
commit mainly replaces the backend logging with the spdlog-based system,
but doesn't (yet) convert all the existing LogWarn, etc. to use the new
format-based logging.
New logging statements will look like:
llarp::log::warning(cat, "blah: {}", val);
where `cat` should be set up in each .cpp or cluster of .cpp files, as
described in the oxen-logging README.
As part of spdlog we get fmt, which gives us nice format strings, where
are applied generously in this commit.
Making types printable now requires two steps:
- add a ToString() method
- add this specialization:
template <>
constexpr inline bool llarp::IsToStringFormattable<llarp::Whatever> = true;
This will then allow the type to be printed as a "{}" value in a
fmt::format string. This is applied to all our printable types here,
and all of the `operator<<` are removed.
This commit also:
- replaces various uses of `operator<<` to ToString()
- replaces various uses of std::stringstream with either fmt::format or
plain std::string
- Rename some to_string and toString() methods to ToString() for
consistency (and to work with fmt)
- Replace `stringify(...)` and `make_exception` usage with fmt::format
(and remove stringify/make_exception from util/str.hpp).
Having it there (even defaulted, like this) means endpoint.hpp doesn't
have to include endpoint_state.hpp (which it otherwise would need,
because of the std::unique_ptr<EndpointState> default deleter
requirements).
We shouldn't be compiling these .cpp files at all on other platforms,
rather than compiling empty .cpp files (which later results in "... has
no symbols" warnings).
clean up version cmake stuff
clean up generated cpp version stuff
make all the windows rc stuff get generated by cmake
bump release motto message
properly inject release motto into version
if a pending inbound session did not complete a handshake after an unclean close from a previous session the
remote udp endpoint would remain stuck mapped as authed and thus any further attempts from the remote would
be silently dropped as it entered a stuck state in the state machine. this was happening as a small part
of the state machine was hidden in the implementation details of iwp, but instead should be in the super type
as it is logic exclusively outside the details which every dialect would have regardless of its details.
this commit will unmap the udp endpoint every time it needs to in the link layer state machine, independat of
the implementation details of the diact.
only send close packet once, before we were sending a close after we got a close causing excess log spam.
include handshake phase when checking for connection timeouts.
when we change our rc make sure to put it into nodedb too when we are a service node to prevent weirdness in dht lookups.