Match is a better term than ignore, if the other actions are not
returned but there is a matching filter rule. Otherwise, it is up to the
caller to ignore a match or not. Plus, we can implement Match filtering
rule too, e.g. for content logging as in OpenBSD/pf.
If the value for the Divert option is not yes|no, it is assumed to be a
Divert filtering rule. So the parser for filtering rules should issue
any errors.
Differentiate filter action for site match from no site match. The
search should stop if a match is found, even if the action does not
change anything in effect (divert/split action in divert/split mode,
respectively) or the action is ignored (pass action in passthrough
mode).
The Divert option is not equivalent to the command line -n option.
Also, move the global static split var to tmp struct removed after
config is finished.
- SSL and Dst Host filters can take all of the actions.
- HTTP filter can only take block action, not divert, split, or pass.
Because, we cannot tear a conn down and reconnect its src, after the
processing of HTTP request header is complete, e.g. SSLproxy line has
already been added to its dst buffer. Also, any change in child conns
would affect listening programs too.
- The precedence of filters is as Dst Host > SSL > HTTP.
- The precedence of actions is as Divert > Split > Pass > Block. This is
only for the same type of filter.
- The precedence of match sites is as sni > cn for ssl filter and host >
uri for http filter.
For example, pass action of dst host filter is taken before split action
of ssl filter, due to the precedence order of filters.
For example, pass action of sni site is taken before split action of cn,
due to the precedence order of sites.
We now create src ssl before enabling src to be able to take divert or
split actions of SSL filter. Otherwise, we wouldn't be able to switch
between divert and split while enabling src, only pass or block action
could be taken at that stage.
Also, refactor and clean up.
(Divert|Split|Pass|Block)
([from (
user (username|*) [desc keyword]|
ip (clientaddr|*)|
*)]
[to (
sni (servername[*]|*)|
cn (commonname[*]|*)|
host (host[*]|*)|
uri (uri[*]|*)|
ip (serveraddr|*)|
*)]
|*)
Also, fix a couple of issues with filter rule handling
Clean up
Now we don't go over all of the passsite rules in a linked list trying
to apply passsite to the sni or common names of a conn. Instead, we now
have user+keyword, keyword, ip, and all lists. For example, if we find
the conn user in the user+keyword list and a passsite in that list
matches, we don't look into other lists.
This change is expected to improve the performance of passsite
processing considerably, because in the earlier implementation we had to
go over all of the passsite rules trying to match passsite.
And this solution uses a correct data structure, even if not the best.
For example, each user or keyword in passsite rules is strdup()'ed only
once.
Note that a better solution could use, say, a hash table for users,
instead of a linked list. But hash tables are not suitable for keywords
or sites, because we search for substring matches with them, not exact
matches.
Also, this fixes passsite rules without any filters defined, i.e. to be
applied to all connections.
Also, now e2e tests error exit if WITHOUT_USERAUTH is enabled. E2e tests
require UserAuth enabled.
Now the site field in PassSite option can have an '*' suffix to search
for a match anywhere in sni or common names. Note that this is not a
regex or wildcard search.
Previously, we only supported exact matches in sni and between slashes
in common names. This change makes it possible to cover multiple sites
in one PassSite option. In fact, without this change, certain sites
could not be added as passsite, because it was impossible to know their
subdomain names beforehand, for example *.fbcdn.net, which may have many
subdomain names in place of asterisk.
So to use substring match, append an '*' to a site name in PassSite
option (the asterisk is removed before substring search). For example,
use ".fbcdn.net*" to match all subdomains of fbcdn.net, notice the
asterisk at the end.
We also add a warning log starting with "Closing on ssl error without
passsite match" to report sites that can be added as passsite, which is
expected to help in writing PassSite rules.
Also, we now set dstaddr_str earlier in conn handling, so we can print
it in debug logs. This also helps in IDLE and EXPIRED conn logs.
If more than one thread enters protossl_pass_site() with the same
proxyspec, they all use spec->opts->passsites->site. Since
protossl_pass_site() modifies the site arg, spec->opts->passsites->site
may be broken. For example, /example.org/ may become /example.or//,
which really happened.
We should identify conn user before setting dst up in split mode.
Because in split mode dst setup also sets src up too, which tries to
apply passsite rules and switch to passthrough mode. But since user
identification has not run yet, we don't know the user owner of the
conn, which fails passsite rules.
This prevents unnecessary malloc and memmove calls in non-debug mode.
This change is for correctness not for speed, because it improves
conn handling only of the first packet and for non-http protos.
- Fix segfault introduced in previous commit to prevent extra eof event.
We should NULL srvdst.bev after terminating child dst xferred from
srvdst of parent, so that we don't try to access srvdst.bev. This
happens if child conn with dst xferred from parent srvdst is terminated
before parent conn.
- Fix autossl crash trying to engage passthrough mode. We cannot engage
passthrough mode in autossl, because src is already enabled. But we
shouldn't crash either. These changes are expected to fix other possible
segfaults if passthrough is engaged on eventcb of a child conn.
We reuse srvdst as dst or child dst, so srvdst == dst or child_dst.
But if we don't NULL the callbacks of srvdst in split mode,
we randomly but rarely get a second eof event for srvdst during conn
termination (especially on arm64), which crashes us with signal 11 or
10, because the first eof event for dst frees the ctx.
Note that we don't free anything here, but just disable callbacks and
events.
This does not seem to happen with srvdst_xferred, but just to be safe we
do the same for it too.
This seems to be an issue with libevent.
TODO: Why does libevent raise the same event again for an already
disabled and freed conn end? Note again that srvdst == dst or child_dst
here.
Currently, an autossl conn writes 3x CONN lines in connection logs,
when:
1. tcp srvdst connects (before ssl upgrade)
2. ssl dst connects (src ssl is not complete yet)
3. ssl src connects
This causes misleading connection statistics, as in UTMFW.
TODO: We should write CONN logs at conn termination times, for all
protos not just autossl, not tcp or ssl connect times, so that we don't
write multiple CONN logs for a single conn.
The -n command line option enables split mode for all proxyspecs,
effectively making sslproxy behave like sslsplit.
Divert option can be set/unset globally and per-proxyspec.
Add e2e tests for split mode, and update make file for tests
accordingly.
Update documentation accordingly.
Improve code reuse, remove duplicate functions.
This change deserves a release of its own, hence v0.8.4.
Allow omitting the -T option, indicating the target is irrelevant.
The use case is an IDS sensor listening on a dummy interface for the
packets sslsplit produces. The IDS will listen in promisc mode, so the
target is irrelevant.
Copied from sslsplit.
Otherwise, we cannot classify user if we need to issue identify_user
events, in case database is busy or locked. We should call classify_user
callback right after the user is identified.
So we introduce classify_user callback to achieve that, which fixes the
classify_user behavior for autssl proto too.
Return void in pxy_userauth
Fix typo in clasify
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:10: warning:
‘__builtin_strncpy’ output truncated before terminating nul copying as
many bytes from a string as its length [-Wstringop-truncation]
logpkt.c:351:3: warning: converting a packed ‘ip4_hdr_t’ {aka ‘struct
<anonymous>’} pointer (alignment 1) to a ‘uint16_t’ {aka ‘short unsigned
int’} pointer (alignment 2) may result in an unaligned pointer value
[-Waddress-of-packed-member]
The global opts strings in this new tmp struct are used while cloning
global opts into proxyspec opts. A var of this type is passed around as
a flag to indicate if these opts are global (if non-NULL), so should be
stored in that struct and used as such, or proxyspec specific (if NULL),
so should not be used as global. This var is temporary, hence freed
immediately after configuration is complete.
Also improve and clean up.
These vars are used while configuring proxyspecs, and freed right after
they are used. So they should not be in proxyspec struct.
Refactor accordingly.
Add testproxy e2e tests for POP3 and SMTP protocol validation.
We have detected that POP3 and SMTP protocol validation was broken
thanks to these new testproxy e2e tests. This is yet another example why
e2e tests are important.
We don't need parent or child ids unless debugging. IDLE and EXPIRED
conn logs do not need to report ids either. Ids are useful only in
detailed debug logs.
This is necessary to uniquely identify child conns. The src fd of child
conns was possibly not unique. We use this id in debug logs only.
Also relocate the update code related with this id.
Do not pass pxy_thr_print_children() or bufferevent_getfd() to MAX() or
util_max() macro functions as params, or else they are called twice.
Since MAX() macro call duplicates params, do not call it nested either,
or else we get very long macro expansions.
ce5f409dbe
("Zero all bytes when passing file descriptors over AF_UNIX sockets",
2018-11-12)
Also, bufferevent_getfd() returns -1 if no file descriptor is associated
with the bufferevent.
Free vars.
Finalize sqlite3 statements.
Close sqlite3 db.
Init memory.
Do not close fd -1.
Some of these may be harmless, but we fix them anyway. Now valgrind
reports 0 "lost" memory, but some "still reachable", both for sslproxy
and lp.
We don't need a privsep call to open a socket for child listener,
because listener port of child conns are assigned by the system, hence
are from non-privileged range above 1024.
So the open privsep socket is used only to update user atime now.
We have carried almost all conn init tasks from thrmgr to conn handling
thread. So we immediately add the conn to the conn list of its thr,
which renders both pending ssl conns list and in_thr_conns flag useless.
The only time we go over the linked list is to check idle or expired
connections, or to print debug info. Otherwise, mostly what we need is
to add and remove list nodes. Removing a list node becomes a very simple
task if we keep track of the previous node too. So now we also keep
record of prev node, and update prev node as we add and remove nodes.
All three linked lists we use benefit from this data structure
improvement, making it very fast to remove a list node.
Another benefit of this change is that we don't need to identify conns
with their id numbers or child conns with their src fds. So now we
directly delete them, without needing to check their ids or fds.
Do we need a thr mutex? This mutex is for thread-safe access to
thr.load. But thrmgr read-accesses thr.load, and write-accesses are by
thr only. So can we really live without it?
So now we do a couple of expensive tasks on conn handling threads, not
on thrmgr: Add the conn to its thread conn list, check fd usage, nat
lookup dst, and make string src addr.
This prevents possible multithreading issues between thrmgr and conn
handling threads. So we can remove and clean up the code and comments
related with such possible issues now. For example, we can add the conn
to its thread list earlier, and we can handle errors immediately, thanks
to this early switch to conn handling threads. This also helps achieve
cleaner code.
Enable dst r/w events before socket connect.
Improve verbose debug logs using common header fields to better identify
connections.
Create function macros for fine* debug logs.
This happens if there was no autossl handshake prior to ClientHello,
e.g. no STARTTLS message. This is perhaps due to the SSL handshake of a
direct SSL connection, i.e. invalid protocol.
We should not crash upon protocol errors, hence the need for fuzzing
tests.
We don't do anything in srvdst writecb except for passhtrough mode.
We handle srvdst and dst connect tasks in connectcb for them by
arranging connect events correctly, so we don't need any extra flags.
Correct connect ordering helps us remove code checking if bev exists.
There were a lot of unnecessary code in autossl. Tcp and ssl code are
decoupled now.
Because we directly relay the packets from the server to the client
until we receive the first packet from the client, at which time we xfer
srvdst to the first child conn and effectively disable this readcb,
hence start diverting packets to the listening program.
Improve documentation.
Otherwise, we cannot properly shutdown the src conn end of an autossl
conn, and when the next conn uses the same fd of that src, the callback
functions (e.g. the writecb) do not fire, which effectively stalls the
conn. This fixes a longtime issue with autossl support.
So remove pxysslshut.c/h files, not used anymore
Otherwise, if we assume that the port is always 5 chars, we leave a NULL
char between the sslproxy header and CRLF, which confuses
pxy_insert_sslproxy_header() and pxy_try_remove_sslproxy_header(), and
we cannot remove the sslproxy header.
Create function macros for fine* debug logs
Fix a few memory leaks when DEBUG_PROXY enabled
Add main.mk to MKFS list
Put a few function params within DEBUG_PROXY directives
Check retval of a snprintf() call
Fix segfault with -w/-W options if no ssl proxyspec specified, also fixed in sslsplit develop: https://github.com/droe/sslsplit/issues/271
Various clean-up