Fix possible segfault if name has leading white space
Pass the name param to get_name_value() as char *, so it cannot be
modified ever
Improve unit tests for get_name_value and proxyspec_parse
And use all those return values.
Since we support include files now, we should be able to report in which
include file the error has occured. This is not possible if functions
just bail out calling exit(), because the user has to scroll back stderr
lines to find which include file has failed loading (a line starting
with 'Conf: ').
Plus, calling exit() on errors reduces unit testability of functions.
Also, handle all possible out of memory conditions in opts.c.
The new Define option can be used for defining macros to be used in
filtering rules. Macro names must begin with a '$' char. Macro values
must be separated with spaces.
Macros are expanded by rewriting the rule with the values of macro.
PassSite rules do not support macros (the PassSite option will be
deprecated in favor of filtering rules in the future).
Filtering rules can enable/disable or don't change logging. If a rule
does not mention a log action, its logging should not change. So, binary
log action fields were not enough to represent those 3 possibilities,
hence we have increased the size of those fields to 2-bits.
We should obey the order of rules as they are written in the conf file,
because latter rules should be able to override the log actions of
earlier rules. So, keep the order.
We should defer pass and/or block actions as long as possible, because a
higher precedence rule in SSL filter should be able to override (cancel)
deferred pass and block actions taken by a lower precedence rule in Dst
Host filter. And in HTTP filter the same applies to deferred block
actions taken by Dst Host and SSL filters.
Also, thanks to this new deferred actions, now HTTP filter can keep
enabled divert and split modes. In other words, a higher precedence HTTP
filter rule can cancel a deferred block action set by a lower precedence
rule earlier, which was not possible before without deferred actions and
rule precedence.
And other improvements.
Now filtering rules can disable log actions too. This is possible thanks
to the newly added precedence field of rules. Log actions of filtering
rules at higher precedence can modify logging now. In other words, more
specific rules can change the log actions of more general rules.
HTTP filtering rules can only disable logging.
Now we assign precedence to each filtering rule. More specific rules
have higher precedence. So, filtering rules at lower precedence cannot
override the actions applied to a conn by filtering rules at higher
precedence.
The other precedence rules still apply.
Log actions specified in HTTP filter rules can never enable disabled
logging, because their loggers would not be initialized.
Perhaps we should initialize them in the log submit function, if they
are initialized yet.
- Match action is added to be used with log actions only, the other
filter actions can specify log actions too
- Log actions do not configure any loggers. Global loggers for
respective log actions should have been configured for those log actions
to have any effect.
- If no filter rules are defined for a proxyspec, all log actions are
enabled. Otherwise, all log actions are disabled, and filtering rules
should enable them specifically.
- Fix max number of tokens in proxyspec and filter parsers
- Fix issues with rejecting unknown args in filter rule parser
- Do not use filter_rules field of proxyspec after config finished, it
is used for filter configuration and freed afterwards
Match is a better term than ignore, if the other actions are not
returned but there is a matching filter rule. Otherwise, it is up to the
caller to ignore a match or not. Plus, we can implement Match filtering
rule too, e.g. for content logging as in OpenBSD/pf.
If the value for the Divert option is not yes|no, it is assumed to be a
Divert filtering rule. So the parser for filtering rules should issue
any errors.
Differentiate filter action for site match from no site match. The
search should stop if a match is found, even if the action does not
change anything in effect (divert/split action in divert/split mode,
respectively) or the action is ignored (pass action in passthrough
mode).
The Divert option is not equivalent to the command line -n option.
Also, move the global static split var to tmp struct removed after
config is finished.
- SSL and Dst Host filters can take all of the actions.
- HTTP filter can only take block action, not divert, split, or pass.
Because, we cannot tear a conn down and reconnect its src, after the
processing of HTTP request header is complete, e.g. SSLproxy line has
already been added to its dst buffer. Also, any change in child conns
would affect listening programs too.
- The precedence of filters is as Dst Host > SSL > HTTP.
- The precedence of actions is as Divert > Split > Pass > Block. This is
only for the same type of filter.
- The precedence of match sites is as sni > cn for ssl filter and host >
uri for http filter.
For example, pass action of dst host filter is taken before split action
of ssl filter, due to the precedence order of filters.
For example, pass action of sni site is taken before split action of cn,
due to the precedence order of sites.
We now create src ssl before enabling src to be able to take divert or
split actions of SSL filter. Otherwise, we wouldn't be able to switch
between divert and split while enabling src, only pass or block action
could be taken at that stage.
Also, refactor and clean up.
(Divert|Split|Pass|Block)
([from (
user (username|*) [desc keyword]|
ip (clientaddr|*)|
*)]
[to (
sni (servername[*]|*)|
cn (commonname[*]|*)|
host (host[*]|*)|
uri (uri[*]|*)|
ip (serveraddr|*)|
*)]
|*)
Also, fix a couple of issues with filter rule handling
Clean up
Now we don't go over all of the passsite rules in a linked list trying
to apply passsite to the sni or common names of a conn. Instead, we now
have user+keyword, keyword, ip, and all lists. For example, if we find
the conn user in the user+keyword list and a passsite in that list
matches, we don't look into other lists.
This change is expected to improve the performance of passsite
processing considerably, because in the earlier implementation we had to
go over all of the passsite rules trying to match passsite.
And this solution uses a correct data structure, even if not the best.
For example, each user or keyword in passsite rules is strdup()'ed only
once.
Note that a better solution could use, say, a hash table for users,
instead of a linked list. But hash tables are not suitable for keywords
or sites, because we search for substring matches with them, not exact
matches.
Also, this fixes passsite rules without any filters defined, i.e. to be
applied to all connections.
Also, now e2e tests error exit if WITHOUT_USERAUTH is enabled. E2e tests
require UserAuth enabled.
Now the site field in PassSite option can have an '*' suffix to search
for a match anywhere in sni or common names. Note that this is not a
regex or wildcard search.
Previously, we only supported exact matches in sni and between slashes
in common names. This change makes it possible to cover multiple sites
in one PassSite option. In fact, without this change, certain sites
could not be added as passsite, because it was impossible to know their
subdomain names beforehand, for example *.fbcdn.net, which may have many
subdomain names in place of asterisk.
So to use substring match, append an '*' to a site name in PassSite
option (the asterisk is removed before substring search). For example,
use ".fbcdn.net*" to match all subdomains of fbcdn.net, notice the
asterisk at the end.
We also add a warning log starting with "Closing on ssl error without
passsite match" to report sites that can be added as passsite, which is
expected to help in writing PassSite rules.
Also, we now set dstaddr_str earlier in conn handling, so we can print
it in debug logs. This also helps in IDLE and EXPIRED conn logs.
If more than one thread enters protossl_pass_site() with the same
proxyspec, they all use spec->opts->passsites->site. Since
protossl_pass_site() modifies the site arg, spec->opts->passsites->site
may be broken. For example, /example.org/ may become /example.or//,
which really happened.
We should identify conn user before setting dst up in split mode.
Because in split mode dst setup also sets src up too, which tries to
apply passsite rules and switch to passthrough mode. But since user
identification has not run yet, we don't know the user owner of the
conn, which fails passsite rules.
This prevents unnecessary malloc and memmove calls in non-debug mode.
This change is for correctness not for speed, because it improves
conn handling only of the first packet and for non-http protos.
- Fix segfault introduced in previous commit to prevent extra eof event.
We should NULL srvdst.bev after terminating child dst xferred from
srvdst of parent, so that we don't try to access srvdst.bev. This
happens if child conn with dst xferred from parent srvdst is terminated
before parent conn.
- Fix autossl crash trying to engage passthrough mode. We cannot engage
passthrough mode in autossl, because src is already enabled. But we
shouldn't crash either. These changes are expected to fix other possible
segfaults if passthrough is engaged on eventcb of a child conn.
We reuse srvdst as dst or child dst, so srvdst == dst or child_dst.
But if we don't NULL the callbacks of srvdst in split mode,
we randomly but rarely get a second eof event for srvdst during conn
termination (especially on arm64), which crashes us with signal 11 or
10, because the first eof event for dst frees the ctx.
Note that we don't free anything here, but just disable callbacks and
events.
This does not seem to happen with srvdst_xferred, but just to be safe we
do the same for it too.
This seems to be an issue with libevent.
TODO: Why does libevent raise the same event again for an already
disabled and freed conn end? Note again that srvdst == dst or child_dst
here.
Currently, an autossl conn writes 3x CONN lines in connection logs,
when:
1. tcp srvdst connects (before ssl upgrade)
2. ssl dst connects (src ssl is not complete yet)
3. ssl src connects
This causes misleading connection statistics, as in UTMFW.
TODO: We should write CONN logs at conn termination times, for all
protos not just autossl, not tcp or ssl connect times, so that we don't
write multiple CONN logs for a single conn.
The -n command line option enables split mode for all proxyspecs,
effectively making sslproxy behave like sslsplit.
Divert option can be set/unset globally and per-proxyspec.
Add e2e tests for split mode, and update make file for tests
accordingly.
Update documentation accordingly.
Improve code reuse, remove duplicate functions.
This change deserves a release of its own, hence v0.8.4.
readcb fires before connect eventcb, so we enable it in readcb now. But
perhaps lp should behave like sslproxy and not enable readcb until after
connect eventcb.
Note that there is no problem with sslproxy, it's just lp.