Now, all of the 'to' site fields in filtering rules can specify a port,
not just the dstip sites.
Fix the precedence of sites in the same type of rules. For example, if
we find a match with an sni site, we should not stop searching for a
match in cn, because a matching cn site may have a higher precedence
than the matching sni site. We should apply the action of the cn site,
although sni rules have precedence over cn. The same applies to http
host and uri rules too.
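A minimal C sketch of the fixed lookup, for illustration only. The
site_rule struct, find_site(), and best_ssl_match() are hypothetical
names, not sslproxy's actual code, and plain arrays stand in for the
real site lists. Keeping the sni match on a precedence tie is an
assumption based on sni rules coming before cn rules:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified site rule (not sslproxy's actual struct). */
struct site_rule {
    const char *site;        /* exact site name to match */
    unsigned int precedence; /* higher value wins */
    int action;              /* opaque action id */
};

/* Return the matching rule in one list, or NULL if none matches. */
const struct site_rule *
find_site(const struct site_rule *rules, size_t n, const char *name)
{
    assert(rules != NULL || n == 0);
    if (!name)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (strcmp(rules[i].site, name) == 0)
            return &rules[i];
    return NULL;
}

/*
 * Search both the sni and cn lists instead of stopping at the first
 * hit, then keep the match with the higher precedence. On a tie the
 * sni match is kept (assumption: sni rules come before cn rules).
 */
const struct site_rule *
best_ssl_match(const struct site_rule *sni_rules, size_t nsni,
               const char *sni,
               const struct site_rule *cn_rules, size_t ncn,
               const char *cn)
{
    const struct site_rule *s = find_site(sni_rules, nsni, sni);
    const struct site_rule *c = find_site(cn_rules, ncn, cn);

    if (!s)
        return c;
    if (!c)
        return s;
    return (c->precedence > s->precedence) ? c : s;
}
```

The same pattern would apply to the http host and uri lists.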
Fix the precedence of dstip rules.
Improve and update unit and e2e tests accordingly.
Now, the filter uses B-trees for exact string matching and Aho-Corasick
machines for substring matching. B-trees and AC machines are exported to
linked lists for debug logging only.
Also,
- Separate all_sites and all_ports filters from substring filters. They
are not actually related to substring filters, and ACM keywords cannot
be empty strings anyway, so they are now handled separately.
- Improve debug logging of filtering rules.
- Update and improve unit tests accordingly.
- Fix pxyconn_filter(): keep searching for a match in substring filters
if the exact match does not have a matching site rule.
- Increase common names max len and tokens. weather.gov has 73 tokens.
- Rename keyword to desc.
- Update documentation.
- Clean up.
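A rough C sketch of the lookup order described in the list above:
exact match first, then substring match, then the separate all_sites
rule. Linear scans over arrays stand in for the B-tree and the
Aho-Corasick machine here, and site_filter, site_entry, filter_site(),
and NO_MATCH are hypothetical names, not the actual pxyconn_filter()
code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_MATCH (-1)

/* Hypothetical entry: a site pattern and an opaque action id. */
struct site_entry {
    const char *pattern;
    int action;
};

struct site_filter {
    const struct site_entry *exact;  /* stands in for the B-tree */
    size_t nexact;
    const struct site_entry *substr; /* stands in for the AC machine */
    size_t nsubstr;
    int all_sites_action;            /* NO_MATCH if no all_sites rule */
};

int
filter_site(const struct site_filter *f, const char *site)
{
    assert(f != NULL && site != NULL);

    /* 1. Exact string match. */
    for (size_t i = 0; i < f->nexact; i++)
        if (strcmp(f->exact[i].pattern, site) == 0)
            return f->exact[i].action;

    /* 2. No exact match: keep searching in the substring filters.
     * Patterns are assumed non-empty, like ACM keywords. */
    for (size_t i = 0; i < f->nsubstr; i++)
        if (strstr(site, f->substr[i].pattern) != NULL)
            return f->substr[i].action;

    /* 3. The all_sites rule is kept apart from the substring
     * filters and is consulted last. */
    return f->all_sites_action;
}
```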
Add to the end of linked lists for correct list ordering, but btrees
cannot obey this ordering.
Also, update the unit tests accordingly.
And fix compile with WITHOUT_USERAUTH.
So, for 'to' fields too, we use two separate data structures: binary
search trees (BST) for exact match and linked lists for substring match.
Now all 'from' and 'to' fields in filtering rules use these two data
structures.
To repeat, filtering rules should be written with exact matches instead
of substring matches as much as possible, because BST search should be
much faster than substring search over linked lists.
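To illustrate the difference in cost, here is a small C sketch. It is
not the kbtree-based code: a sorted array searched with the standard
bsearch() stands in for the tree, and exact_match(), substring_match(),
and sort_sites() are hypothetical helper names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements of type const char *. */
static int
cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort site names once, at configuration time. */
void
sort_sites(const char **sites, size_t n)
{
    assert(sites != NULL || n == 0);
    qsort(sites, n, sizeof(*sites), cmp_str);
}

/* Exact match: O(log n) string comparisons over the sorted array. */
int
exact_match(const char **sorted, size_t n, const char *key)
{
    return bsearch(&key, sorted, n, sizeof(*sorted), cmp_str) != NULL;
}

/* Substring match: each pattern has to be tried in turn, so the
 * cost grows linearly with the number of patterns. */
int
substring_match(const char **patterns, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strstr(key, patterns[i]) != NULL)
            return 1;
    return 0;
}
```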
To repeat, we have modified kbtree to support complex data structures in
from fields.
Also, update the unit tests accordingly.
all_sites and all_ports rules should be at the end of their lists so
that they are searched last, because they are the least specific rules
in their lists and hence have the lowest precedence.
Also, obey the order of rules in conf files by adding sites, ports, and
macro values to their lists in the same order they are in conf files.
Update the unit and e2e tests accordingly, and improve.
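A minimal C sketch of such an ordered insert, assuming a singly linked
list and "*" as the all_sites marker. The site_node struct and the
site_list_add() and site_list_free() helpers are hypothetical, not the
actual sslproxy functions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct site_node {
    const char *site;       /* owned by the caller in this sketch */
    int all_sites;          /* 1 if site is "*" */
    struct site_node *next;
};

/*
 * Add a site in conf file order, but never after an all_sites node,
 * so that the least specific rule is always searched last.
 * Returns 0 on success, -1 on out of memory.
 */
int
site_list_add(struct site_node **head, const char *site)
{
    assert(head != NULL && site != NULL);

    struct site_node *n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->site = site;
    n->all_sites = strcmp(site, "*") == 0;

    /* Walk to the end; a specific site stops before all_sites. */
    struct site_node **pp = head;
    while (*pp && !(!n->all_sites && (*pp)->all_sites))
        pp = &(*pp)->next;

    n->next = *pp;
    *pp = n;
    return 0;
}

void
site_list_free(struct site_node *head)
{
    while (head) {
        struct site_node *next = head->next;
        free(head);
        head = next;
    }
}
```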
Now the target IP address filters can use port specs too.
Refactor for code reuse, create filter_action struct used by rules,
sites, and ports.
Also, improve code and documentation.
End-to-end tests now require testproxy v0.0.4, which supports the new
Reconnect command for the Pass filtering rule.
Split mode with the -n option also supports filtering rules, so the
Divert rule can enable the divert mode even with the -n option. This is
because the purpose of the -n option is to convert sslproxy into an
sslsplit, and we want to support filtering rules in sslsplit-like
sslproxy too.
Fix possible segfault if name has leading white space
Pass the name param to get_name_value() as char *, so it cannot be
modified ever
Improve unit tests for get_name_value and proxyspec_parse
And use all those return values.
Since we support include files now, we should be able to report in which
include file the error has occurred. This is not possible if functions
just bail out by calling exit(), because the user then has to scroll
back through stderr lines to find which include file has failed to load
(a line starting with 'Conf: ').
Plus, calling exit() on errors reduces unit testability of functions.
Also, handle all possible out of memory conditions in opts.c.
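A small C sketch of the idea, with hypothetical names throughout:
set_option() and load_conf_lines() are not the functions in opts.c, the
error messages are made up, and an array of name/value pairs stands in
for a parsed include file. The point is only that the callee returns -1
and the caller, which knows the file name, reports it:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct conf_line {
    const char *name;
    const char *value;
};

/* Return 0 on success, -1 on error, instead of calling exit(). */
int
set_option(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    if (strcmp(name, "Divert") == 0 &&
        (strcmp(value, "yes") == 0 || strcmp(value, "no") == 0))
        return 0;
    fprintf(stderr, "Unknown option '%s %s'\n", name, value);
    return -1;
}

/* The caller knows which file it is loading, so it can name the
 * file and the line when a callee reports an error. */
int
load_conf_lines(const char *filename, const struct conf_line *lines,
                size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (set_option(lines[i].name, lines[i].value) == -1) {
            fprintf(stderr, "Error in conf file '%s' on line %zu\n",
                    filename, i + 1);
            return -1;
        }
    }
    return 0;
}
```

Returning the error also lets a unit test assert on the failure path,
which is not possible if the function exits the process.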
Filtering rules can enable logging, disable it, or leave it unchanged.
If a rule does not mention a log action, its logging should not change.
So, 1-bit log action fields were not enough to represent those 3
possibilities, hence we have increased the size of those fields to
2 bits.
We should obey the order of rules as they are written in the conf file,
because later rules should be able to override the log actions of
earlier rules. So, keep the order.
Now we assign precedence to each filtering rule. More specific rules
have higher precedence. So, filtering rules at lower precedence cannot
override the actions applied to a conn by filtering rules at higher
precedence.
The other precedence rules still apply.
- Add the Match action, to be used with log actions only. The other
filter actions can specify log actions too.
- Log actions do not configure any loggers. The global loggers for the
respective log actions must be configured for those log actions to have
any effect.
- If no filter rules are defined for a proxyspec, all log actions are
enabled. Otherwise, all log actions are disabled, and filtering rules
should enable them specifically.
- Fix max number of tokens in proxyspec and filter parsers
- Fix issues with rejecting unknown args in filter rule parser
- Do not use filter_rules field of proxyspec after config finished, it
is used for filter configuration and freed afterwards
If the value for the Divert option is not yes|no, it is assumed to be a
Divert filtering rule. So the parser for filtering rules should issue
any errors.
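A minimal C sketch of that decision. The enum and the
classify_divert_value() helper are hypothetical, and the case-sensitive
comparison is an assumption:

```c
#include <assert.h>
#include <string.h>

enum divert_value {
    DIVERT_YES,
    DIVERT_NO,
    DIVERT_RULE /* anything else is handed to the rule parser */
};

/* Decide how the value of the Divert option should be handled. */
enum divert_value
classify_divert_value(const char *value)
{
    assert(value != NULL);
    if (strcmp(value, "yes") == 0)
        return DIVERT_YES;
    if (strcmp(value, "no") == 0)
        return DIVERT_NO;
    /* Not validated here: the filtering rule parser reports any
     * syntax errors in the rest of the line. */
    return DIVERT_RULE;
}
```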
(Divert|Split|Pass|Block)
  ([from (
      user (username|*) [desc keyword]|
      ip (clientaddr|*)|
      *)]
   [to (
      sni (servername[*]|*)|
      cn (commonname[*]|*)|
      host (host[*]|*)|
      uri (uri[*]|*)|
      ip (serveraddr|*)|
      *)]
  |*)
Also, fix a couple of issues with filter rule handling
Clean up
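A small C sketch of how a 'to' site field from the rule syntax above
could be classified. It assumes, as the optional [*] suffix suggests,
that a trailing '*' requests a substring match, that a lone '*' means
all sites, and that anything else is an exact match. The enum and
parse_site_spec() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum site_match {
    SITE_ALL,       /* "*" */
    SITE_SUBSTRING, /* "example*" */
    SITE_EXACT      /* "example.com" */
};

/*
 * Classify spec and copy the site name, without a trailing '*',
 * into buf. Returns 0 on success, or -1 if spec is empty or the
 * name does not fit into buf.
 */
int
parse_site_spec(const char *spec, char *buf, size_t bufsize,
                enum site_match *type)
{
    assert(spec != NULL && buf != NULL && type != NULL);

    size_t len = strlen(spec);
    if (len == 0)
        return -1;

    if (strcmp(spec, "*") == 0) {
        *type = SITE_ALL;
        len = 0;
    } else if (spec[len - 1] == '*') {
        *type = SITE_SUBSTRING;
        len--;
    } else {
        *type = SITE_EXACT;
    }

    if (len >= bufsize)
        return -1;
    memcpy(buf, spec, len);
    buf[len] = '\0';
    return 0;
}
```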
Now we don't go over all of the passsite rules in a linked list trying
to apply passsite to the sni or common names of a conn. Instead, we now
have user+keyword, keyword, ip, and all lists. For example, if we find
the conn user in the user+keyword list and a passsite in that list
matches, we don't look into other lists.
This change is expected to improve the performance of passsite
processing considerably, because in the earlier implementation we had to
go over all of the passsite rules trying to match passsite.
And this solution uses a correct data structure, even if not the best.
For example, each user or keyword in passsite rules is strdup()'ed only
once.
Note that a better solution could use, say, a hash table for users,
instead of a linked list. But hash tables are not suitable for keywords
or sites, because we search for substring matches with them, not exact
matches.
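A simplified C sketch of the list-based lookup, for illustration only.
Arrays stand in for the linked lists, all struct and function names are
hypothetical, and the ip comparison is assumed to be an exact string
match. Lists are searched from the most specific to the least specific,
and the search stops at the first matching passsite:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum list_kind { BY_USER_KEYWORD, BY_KEYWORD, BY_IP, ALL_CONNS, NUM_LISTS };

struct passsite {
    const char *user;    /* used by the user+keyword list only */
    const char *keyword; /* substring of the conn's keyword */
    const char *ip;      /* used by the ip list only */
    const char *site;    /* substring of the sni or common name */
};

struct passsite_list {
    const struct passsite *items;
    size_t n;
};

struct conn_info {
    const char *user;    /* NULL if unknown */
    const char *keyword; /* NULL if none */
    const char *ip;
    const char *sni;
};

/* Does an entry of the given list apply to this conn at all? */
int
entry_applies(enum list_kind kind, const struct passsite *p,
              const struct conn_info *c)
{
    switch (kind) {
    case BY_USER_KEYWORD:
        return c->user && c->keyword &&
               strcmp(p->user, c->user) == 0 &&
               strstr(c->keyword, p->keyword) != NULL;
    case BY_KEYWORD:
        return c->keyword && strstr(c->keyword, p->keyword) != NULL;
    case BY_IP:
        return c->ip && strcmp(p->ip, c->ip) == 0;
    case ALL_CONNS:
        return 1;
    default:
        return 0;
    }
}

/* Return 1 as soon as a passsite matches, without looking into the
 * remaining lists; return 0 if no list has a match. */
int
passsite_match(const struct passsite_list lists[NUM_LISTS],
               const struct conn_info *c)
{
    assert(lists != NULL && c != NULL && c->sni != NULL);
    for (int k = 0; k < NUM_LISTS; k++) {
        for (size_t i = 0; i < lists[k].n; i++) {
            const struct passsite *p = &lists[k].items[i];
            if (entry_applies((enum list_kind)k, p, c) &&
                strstr(c->sni, p->site) != NULL)
                return 1;
        }
    }
    return 0;
}
```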
Also, this fixes passsite rules without any filters defined, i.e. to be
applied to all connections.
Also, e2e tests now exit with an error if WITHOUT_USERAUTH is enabled.
E2e tests require UserAuth to be enabled.
The -n command line option enables split mode for all proxyspecs,
effectively making sslproxy behave like sslsplit.
Divert option can be set/unset globally and per-proxyspec.
Add e2e tests for split mode, and update make file for tests
accordingly.
Update documentation accordingly.
Improve code reuse, remove duplicate functions.
This change deserves a release of its own, hence v0.8.4.
readcb fires before connect eventcb, so we enable it in readcb now. But
perhaps lp should behave like sslproxy and not enable readcb until after
connect eventcb.
Note that there is no problem with sslproxy, it's just lp.
The global opts strings in this new tmp struct are used while cloning
global opts into proxyspec opts. A var of this type is passed around as
a flag to indicate if these opts are global (if non-NULL), so should be
stored in that struct and used as such, or proxyspec specific (if NULL),
so should not be used as global. This var is temporary, hence freed
immediately after configuration is complete.
Also improve and clean up.
Add testproxy e2e tests for POP3 and SMTP protocol validation.
Thanks to these new testproxy e2e tests, we have detected that POP3 and
SMTP protocol validation was broken. This is yet another example of why
e2e tests are important.