all_sites and all_ports rules should be at the end of their lists, they
should be searched last, because they are the least specific rules in
their lists, hence have lower precedences.
Also, obey the order of rules in conf files by adding sites, ports, and
macro values to their lists in the same order they are in conf files.
Update the unit and e2e tests accordingly, and improve.
Now the target IP address filters can use port specs too.
Refactor for code reuse, create filter_action struct used by rules,
sites, and ports.
Also, improve code and documentation.
End-to-end tests now require testproxy v0.0.4, which supports the new
Reconnect command for the Pass filtering rule.
Split mode with the -n option also supports filtering rules, so the
Divert rule can enable the divert mode even with the -n option. This is
because the purpose of the -n option is to convert sslproxy into an
sslsplit, and we want to support filtering rules in sslsplit-like
sslproxy too.
Do not call protopassthrough_engage() twice. If we call it to apply the
deferred pass action and also set the ctx->pass flag, it will be called
again in protossl_setup_src_ssl(). So, just set the flag and leave the
rest to protossl_setup_src_ssl().
Return value of 1 is for ending macro expansion. Otherwise, it coincided
with closing brace retval, which was wrongly breaking out of the while
loop in load_proxyspec_struct().
TODO: Use enums for return values.
It is better to error out earlier than to pass the option down to be
rejected by its handler (the handler may fail to reject, as was the case
with a couple of options).
Fix possible segfault if name has leading white space
Pass the name param to get_name_value() as char *, so it cannot be
modified ever
Improve unit tests for get_name_value and proxyspec_parse
And use all those return values.
Since we support include files now, we should be able to report in which
include file the error has occured. This is not possible if functions
just bail out calling exit(), because the user has to scroll back stderr
lines to find which include file has failed loading (a line starting
with 'Conf: ').
Plus, calling exit() on errors reduces unit testability of functions.
Also, handle all possible out of memory conditions in opts.c.
The new Define option can be used for defining macros to be used in
filtering rules. Macro names must begin with a '$' char. Macro values
must be separated with spaces.
Macros are expanded by rewriting the rule with the values of macro.
PassSite rules do not support macros (the PassSite option will be
deprecated in favor of filtering rules in the future).
Filtering rules can enable/disable or don't change logging. If a rule
does not mention a log action, its logging should not change. So, binary
log action fields were not enough to represent those 3 possibilities,
hence we have increased the size of those fields to 2-bits.
We should obey the order of rules as they are written in the conf file,
because latter rules should be able to override the log actions of
earlier rules. So, keep the order.
We should defer pass and/or block actions as long as possible, because a
higher precedence rule in SSL filter should be able to override (cancel)
deferred pass and block actions taken by a lower precedence rule in Dst
Host filter. And in HTTP filter the same applies to deferred block
actions taken by Dst Host and SSL filters.
Also, thanks to this new deferred actions, now HTTP filter can keep
enabled divert and split modes. In other words, a higher precedence HTTP
filter rule can cancel a deferred block action set by a lower precedence
rule earlier, which was not possible before without deferred actions and
rule precedence.
And other improvements.
Now filtering rules can disable log actions too. This is possible thanks
to the newly added precedence field of rules. Log actions of filtering
rules at higher precedence can modify logging now. In other words, more
specific rules can change the log actions of more general rules.
HTTP filtering rules can only disable logging.
Now we assign precedence to each filtering rule. More specific rules
have higher precedence. So, filtering rules at lower precedence cannot
override the actions applied to a conn by filtering rules at higher
precedence.
The other precedence rules still apply.
Log actions specified in HTTP filter rules can never enable disabled
logging, because their loggers would not be initialized.
Perhaps we should initialize them in the log submit function, if they
are initialized yet.
- Match action is added to be used with log actions only, the other
filter actions can specify log actions too
- Log actions do not configure any loggers. Global loggers for
respective log actions should have been configured for those log actions
to have any effect.
- If no filter rules are defined for a proxyspec, all log actions are
enabled. Otherwise, all log actions are disabled, and filtering rules
should enable them specifically.
- Fix max number of tokens in proxyspec and filter parsers
- Fix issues with rejecting unknown args in filter rule parser
- Do not use filter_rules field of proxyspec after config finished, it
is used for filter configuration and freed afterwards
Match is a better term than ignore, if the other actions are not
returned but there is a matching filter rule. Otherwise, it is up to the
caller to ignore a match or not. Plus, we can implement Match filtering
rule too, e.g. for content logging as in OpenBSD/pf.
If the value for the Divert option is not yes|no, it is assumed to be a
Divert filtering rule. So the parser for filtering rules should issue
any errors.
Differentiate filter action for site match from no site match. The
search should stop if a match is found, even if the action does not
change anything in effect (divert/split action in divert/split mode,
respectively) or the action is ignored (pass action in passthrough
mode).
The Divert option is not equivalent to the command line -n option.
Also, move the global static split var to tmp struct removed after
config is finished.
- SSL and Dst Host filters can take all of the actions.
- HTTP filter can only take block action, not divert, split, or pass.
Because, we cannot tear a conn down and reconnect its src, after the
processing of HTTP request header is complete, e.g. SSLproxy line has
already been added to its dst buffer. Also, any change in child conns
would affect listening programs too.
- The precedence of filters is as Dst Host > SSL > HTTP.
- The precedence of actions is as Divert > Split > Pass > Block. This is
only for the same type of filter.
- The precedence of match sites is as sni > cn for ssl filter and host >
uri for http filter.
For example, pass action of dst host filter is taken before split action
of ssl filter, due to the precedence order of filters.
For example, pass action of sni site is taken before split action of cn,
due to the precedence order of sites.
We now create src ssl before enabling src to be able to take divert or
split actions of SSL filter. Otherwise, we wouldn't be able to switch
between divert and split while enabling src, only pass or block action
could be taken at that stage.
Also, refactor and clean up.
(Divert|Split|Pass|Block)
([from (
user (username|*) [desc keyword]|
ip (clientaddr|*)|
*)]
[to (
sni (servername[*]|*)|
cn (commonname[*]|*)|
host (host[*]|*)|
uri (uri[*]|*)|
ip (serveraddr|*)|
*)]
|*)
Also, fix a couple of issues with filter rule handling
Clean up
Now we don't go over all of the passsite rules in a linked list trying
to apply passsite to the sni or common names of a conn. Instead, we now
have user+keyword, keyword, ip, and all lists. For example, if we find
the conn user in the user+keyword list and a passsite in that list
matches, we don't look into other lists.
This change is expected to improve the performance of passsite
processing considerably, because in the earlier implementation we had to
go over all of the passsite rules trying to match passsite.
And this solution uses a correct data structure, even if not the best.
For example, each user or keyword in passsite rules is strdup()'ed only
once.
Note that a better solution could use, say, a hash table for users,
instead of a linked list. But hash tables are not suitable for keywords
or sites, because we search for substring matches with them, not exact
matches.
Also, this fixes passsite rules without any filters defined, i.e. to be
applied to all connections.
Also, now e2e tests error exit if WITHOUT_USERAUTH is enabled. E2e tests
require UserAuth enabled.
Now the site field in PassSite option can have an '*' suffix to search
for a match anywhere in sni or common names. Note that this is not a
regex or wildcard search.
Previously, we only supported exact matches in sni and between slashes
in common names. This change makes it possible to cover multiple sites
in one PassSite option. In fact, without this change, certain sites
could not be added as passsite, because it was impossible to know their
subdomain names beforehand, for example *.fbcdn.net, which may have many
subdomain names in place of asterisk.
So to use substring match, append an '*' to a site name in PassSite
option (the asterisk is removed before substring search). For example,
use ".fbcdn.net*" to match all subdomains of fbcdn.net, notice the
asterisk at the end.
We also add a warning log starting with "Closing on ssl error without
passsite match" to report sites that can be added as passsite, which is
expected to help in writing PassSite rules.
Also, we now set dstaddr_str earlier in conn handling, so we can print
it in debug logs. This also helps in IDLE and EXPIRED conn logs.
If more than one thread enters protossl_pass_site() with the same
proxyspec, they all use spec->opts->passsites->site. Since
protossl_pass_site() modifies the site arg, spec->opts->passsites->site
may be broken. For example, /example.org/ may become /example.or//,
which really happened.
We should identify conn user before setting dst up in split mode.
Because in split mode dst setup also sets src up too, which tries to
apply passsite rules and switch to passthrough mode. But since user
identification has not run yet, we don't know the user owner of the
conn, which fails passsite rules.
This prevents unnecessary malloc and memmove calls in non-debug mode.
This change is for correctness not for speed, because it improves
conn handling only of the first packet and for non-http protos.