Internals
WAF Engine
Waf is the main interface used to store settings and rules and to create transactions. Most directives set variables on a Waf instance. A Coraza implementation may have any number of Waf instances, and each Waf may process any number of transactions.
Transactions
A transaction is an instance of a URL call for a Waf instance. Transactions are created with wafinstance.NewTransaction(). They hold collections and configurations that may be updated using rules.
Macro Expansion
Macro expansion is a function available for transactions. It compiles a string and fills in variable data from the current context. Macro expansion is performed by running a regular expression that finds macros such as %{request_headers.test} and replaces them with the matching value, so the two values below are equal:
v1 := tx.GetCollection(variables.RequestHeaders).GetFirstString("test")
v2 := tx.MacroExpansion("%{request_headers.test}")
v1 == v2
// true
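The regular-expression approach described above can be sketched in a few lines. This is a minimal, self-contained illustration and not Coraza's implementation: the `collections` type, the `expand` function and the exact pattern are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// macroRe matches macros of the form %{collection.key}.
// The exact pattern is an assumption for this sketch.
var macroRe = regexp.MustCompile(`%\{([^.}]+)\.([^}]+)\}`)

// collections is a hypothetical stand-in for a transaction's collections:
// collection name -> key -> values.
type collections map[string]map[string][]string

// expand replaces every %{collection.key} with the first value stored under
// that key, or with an empty string when nothing is found.
func expand(cols collections, input string) string {
	return macroRe.ReplaceAllStringFunc(input, func(m string) string {
		parts := macroRe.FindStringSubmatch(m)
		col := strings.ToLower(parts[1])
		key := strings.ToLower(parts[2])
		if vals := cols[col][key]; len(vals) > 0 {
			return vals[0]
		}
		return ""
	})
}

func main() {
	cols := collections{
		"request_headers": {"test": {"some value"}},
	}
	fmt.Println(expand(cols, "header is %{request_headers.test}"))
}
```

Text outside a macro is copied through unchanged, and an unknown collection or key expands to an empty string in this sketch.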
Rules
Rules are triggered by RuleGroup.Evaluate(phase) based on the phase number. Rules with phase 0 or rule.AlwaysMatch always run. Rules that always run are SecMarkers or SecActions, that is, rules without operators.
Rules marked with a SecMarker are used to control execution flow and tell the transaction to stop skipping rules from skipAfter.
Unlike in ModSecurity, each rule in Coraza is a unique struct shared between all transactions of the same Waf instance. This means a transaction should never update any field of a Rule; all variable fields must be stored within the transaction instead.
Once a rule is triggered, it follows this flow:
- Skip if this rule is removed for the current transaction
- Fill the RULE variable data, which contains fields from the current rule
- Apply removed targets for this transaction
- Compile each variable: normal, counters, negations and “always match”
- Apply transformations for each variable, match or multi-match
- Execute the current operator for each variable
- Continue if there was any match
- Evaluate all non-disruptive actions
- Evaluate chains recursively
- Log data if requested
- Evaluate disruptive and flow rules
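The first half of this flow (removal check, variable compilation, transformations, operator) can be sketched as follows. Every type and function here (`Rule`, `Tx`, `evaluate`, `exampleRule`) is hypothetical and heavily simplified; actions, chains and logging are left out. The point is only that the rule is read-only while all per-request state lives in the transaction.

```go
package main

import (
	"fmt"
	"strings"
)

// MatchData records where a match happened.
type MatchData struct {
	Variable, Key, Value string
}

// Rule is a hypothetical read-only rule shared by all transactions.
type Rule struct {
	ID              int
	Variables       []string // collection names to inspect
	Transformations []func(string) string
	Operator        func(value string) bool // nil means "always match"
}

// Tx is a hypothetical transaction holding all mutable state.
type Tx struct {
	Removed     map[int]bool // rule IDs removed for this transaction
	Collections map[string]map[string][]string
}

// evaluate never writes to r; it only reads the rule and the transaction.
func evaluate(r *Rule, tx *Tx) []MatchData {
	if tx.Removed[r.ID] {
		return nil // rule removed for this transaction
	}
	if r.Operator == nil {
		return []MatchData{{}} // placeholder for rules without operators
	}
	var matches []MatchData
	for _, name := range r.Variables {
		for key, values := range tx.Collections[name] {
			for _, v := range values {
				for _, t := range r.Transformations {
					v = t(v)
				}
				if r.Operator(v) {
					matches = append(matches, MatchData{Variable: name, Key: key, Value: v})
				}
			}
		}
	}
	return matches
}

// exampleRule lowercases ARGS values and looks for the word "select".
func exampleRule() *Rule {
	return &Rule{
		ID:              1,
		Variables:       []string{"ARGS"},
		Transformations: []func(string) string{strings.ToLower},
		Operator:        func(v string) bool { return strings.Contains(v, "select") },
	}
}

func main() {
	tx := &Tx{
		Removed:     map[int]bool{},
		Collections: map[string]map[string][]string{"ARGS": {"q": {"SELECT 1"}}},
	}
	fmt.Printf("%+v\n", evaluate(exampleRule(), tx))
}
```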
The return value of this function contains each MatchData, which tells the transaction exactly where the data was matched: Variable, Key and Value. Whether the match came from a negation may be added in the future. SecActions and SecMarkers return a placeholder.
Important: Rules may update a Transaction's behaviour but not a Waf instance.
Operators
Operators are stored in github.com/corazawaf/coraza/tree/v3/dev/internal/operators and contain an initializer and an evaluation function. Initializers are used to apply arguments during compilation, for example, "@rx /\d+/" will run op.Init("/\\d+/"). op.Evaluate(tx, "args") is applied to each compiled variable and returns whether the condition matches. Operators use the Transaction to create logs, capture fields and access additional variables from the transaction.
Note: Operators must be concurrent-friendly
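To illustrate the initializer/evaluation split and the concurrency requirement, here is a small sketch. The `Operator` interface, the `rx` type and `countMatches` are hypothetical; unlike the real operators, `Evaluate` takes no transaction here, and the pattern is passed without surrounding slashes. All state is written once in `Init` and only read afterwards, which is what makes concurrent evaluation safe.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// Operator is a hypothetical interface mirroring the initializer/evaluation split.
type Operator interface {
	Init(args string) error
	Evaluate(value string) bool
}

// rx compiles its pattern once, at rule compilation time.
type rx struct{ re *regexp.Regexp }

func (o *rx) Init(args string) error {
	re, err := regexp.Compile(args)
	if err != nil {
		return err
	}
	o.re = re
	return nil
}

// Evaluate only reads o.re; a compiled *regexp.Regexp is safe for concurrent use.
func (o *rx) Evaluate(value string) bool { return o.re.MatchString(value) }

// countMatches evaluates op against every value, each from its own goroutine.
func countMatches(op Operator, values []string) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		n  int
	)
	for _, v := range values {
		wg.Add(1)
		go func(v string) {
			defer wg.Done()
			if op.Evaluate(v) {
				mu.Lock()
				n++
				mu.Unlock()
			}
		}(v)
	}
	wg.Wait()
	return n
}

func main() {
	op := &rx{}
	if err := op.Init(`\d+`); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println(countMatches(op, []string{"abc", "a1", "42"})) // prints 2
}
```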
Actions
Actions are stored in github.com/coraza-waf/coraza/v2/actions and contain an initializer and an evaluation function. Initializers are evaluated during compilation, for example, id:4 will run act.Init("4"). Depending on its Type(), each action runs at a different point of rule processing:
- Non-disruptive actions: Do something, but that something does not and cannot affect the rule processing flow. Setting a variable or changing its value is an example of a non-disruptive action. Non-disruptive actions can appear in any rule, including each rule belonging to a chain. They are evaluated after the rule matches some data.
- Flow actions: These actions affect the rule flow (for example skip or skipAfter). Flow actions are evaluated after the rule has matched successfully and only run for the parent rule of a chain.
- Meta-data actions: Meta-data actions are used to provide more information about rules. Examples include id, rev, severity and msg. Meta-data actions are only initialized; they are not evaluated, so act.Evaluate(...) is never called.
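The dispatch by action type can be sketched as below. `ActionType`, `Action` and `runnable` are hypothetical names, and the sketch only covers the three types listed above; it returns the names of the actions that would be evaluated once a rule has matched.

```go
package main

import "fmt"

// ActionType is a hypothetical enumeration of the action types listed above.
type ActionType int

const (
	NonDisruptive ActionType = iota
	Flow
	Metadata
)

// Action is a hypothetical, already-initialized action.
type Action struct {
	Name string
	Type ActionType
}

// runnable returns the names of the actions that would be evaluated after a
// match. isChainParent reports whether the rule is the parent rule of a chain.
func runnable(actions []Action, isChainParent bool) []string {
	var out []string
	for _, a := range actions {
		switch a.Type {
		case NonDisruptive:
			out = append(out, a.Name)
		case Flow:
			if isChainParent {
				out = append(out, a.Name)
			}
		case Metadata:
			// Initialized at compile time only, never evaluated.
		}
	}
	return out
}

func main() {
	actions := []Action{
		{Name: "setvar", Type: NonDisruptive},
		{Name: "skipAfter", Type: Flow},
		{Name: "id", Type: Metadata},
	}
	fmt.Println(runnable(actions, true))  // prints [setvar skipAfter]
	fmt.Println(runnable(actions, false)) // prints [setvar]
}
```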
Transformations
Transformations are simple functions that transform one string into another string. There is a special struct called transactions.Tools that contains useful “tools” required for some transformations: UnicodeMapping for utf8ToUnicode and waf.Logger to debug transformations. More fields may be added in the future.
Note: Transformations are evaluated thousands of times per transaction and they must be SUPER FAST.
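As an illustration of the "string in, string out" shape, the sketch below defines a hypothetical `Transformation` type and applies a chain of them in order. The names `apply`, `lowercase` and `removeWhitespace` are made up for this example, and the Tools struct is omitted. Each function makes a single pass over the input, which matters given how often transformations run.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Transformation is a hypothetical type for a string-to-string function.
type Transformation func(string) string

// lowercase wraps strings.ToLower.
func lowercase(s string) string { return strings.ToLower(s) }

// removeWhitespace drops every Unicode whitespace rune in a single pass.
func removeWhitespace(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsSpace(r) {
			return -1 // a negative value removes the rune
		}
		return r
	}, s)
}

// apply runs the transformations in order, feeding each result to the next.
func apply(value string, ts ...Transformation) string {
	for _, t := range ts {
		value = t(value)
	}
	return value
}

func main() {
	fmt.Println(apply(" SeLeCT  1 ", lowercase, removeWhitespace)) // prints select1
}
```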
Rule Groups
A Rule Group is like ModSecurity's Rules: it is just a container for rules that returns the list of rules in a concurrent-safe way and evaluates rules based on the requested phase.
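A container with these two properties might look like the sketch below. The `Rule` and `RuleGroup` types are hypothetical and reduced to the essentials: a read-write mutex guards the slice, `Rules` hands out a copy, and `Evaluate` selects rules by phase (phase 0 meaning "always run", as described in the Rules section).

```go
package main

import (
	"fmt"
	"sync"
)

// Rule is a hypothetical rule reduced to what phase filtering needs.
type Rule struct {
	ID    int
	Phase int // 0 means "always run"
}

// RuleGroup is a hypothetical concurrent-safe container for rules.
type RuleGroup struct {
	mu    sync.RWMutex
	rules []Rule
}

// Add appends a rule under the write lock.
func (g *RuleGroup) Add(r Rule) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.rules = append(g.rules, r)
}

// Rules returns a copy so callers never share the underlying slice.
func (g *RuleGroup) Rules() []Rule {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return append([]Rule(nil), g.rules...)
}

// Evaluate returns the IDs of the rules that would run for the given phase,
// in insertion order.
func (g *RuleGroup) Evaluate(phase int) []int {
	var ids []int
	for _, r := range g.Rules() {
		if r.Phase == 0 || r.Phase == phase {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	g := &RuleGroup{}
	g.Add(Rule{ID: 1, Phase: 1})
	g.Add(Rule{ID: 2, Phase: 2})
	g.Add(Rule{ID: 3, Phase: 0})
	fmt.Println(g.Evaluate(2)) // prints [2 3]
}
```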
Collections
Collections are used by Coraza to store Variables. All Variables are treated as the same type, whether they hold maps, single values or arrays.
Collections are stored as a slice []*Collection, and each index is assigned based on its constant name provided by variables.go. For example, if you want to get a collection you might use tx.GetCollection(variables.Files). If you want to transform a named variable into its constant you may use:
b, _ := variables.ParseVariable("FILES")
tx.GetCollection(b)
The following example shows the output of tx.GetCollection(variables.RequestHeaders).Data():
{
  "user-agent": [
    "some user agent string"
  ]
}
Some helpers may be used for these cases, like tx.GetCollection(variables.RequestHeaders).GetFirstString("").
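The storage layout described in this section (a slice of collections indexed by a constant, plus a name-to-constant lookup) can be sketched as follows. This is not the Coraza source: the types are simplified stand-ins that reuse the names from the text, only two variables are defined, and the signatures are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Variable is a hypothetical constant identifying a collection.
type Variable int

const (
	RequestHeaders Variable = iota
	Files
	variableCount // number of variables, used to size the slice
)

var names = map[string]Variable{
	"REQUEST_HEADERS": RequestHeaders,
	"FILES":           Files,
}

// ParseVariable maps a variable name to its constant.
func ParseVariable(name string) (Variable, error) {
	v, ok := names[strings.ToUpper(name)]
	if !ok {
		return 0, errors.New("unknown variable: " + name)
	}
	return v, nil
}

// Collection stores every variable the same way: key -> list of values.
type Collection struct{ data map[string][]string }

func (c *Collection) Add(key, value string) {
	c.data[key] = append(c.data[key], value)
}

// GetFirstString returns the first value for key, or "" when there is none.
func (c *Collection) GetFirstString(key string) string {
	if vals := c.data[key]; len(vals) > 0 {
		return vals[0]
	}
	return ""
}

// Tx holds one collection per variable, indexed by the constant.
type Tx struct{ collections []*Collection }

func NewTx() *Tx {
	tx := &Tx{collections: make([]*Collection, variableCount)}
	for i := range tx.collections {
		tx.collections[i] = &Collection{data: map[string][]string{}}
	}
	return tx
}

func (tx *Tx) GetCollection(v Variable) *Collection { return tx.collections[v] }

func main() {
	tx := NewTx()
	tx.GetCollection(RequestHeaders).Add("user-agent", "some user agent string")
	v, _ := ParseVariable("REQUEST_HEADERS")
	fmt.Println(tx.GetCollection(v).GetFirstString("user-agent"))
}
```

Indexing a slice by a small integer constant avoids a map lookup on every collection access, which is a plausible reason for this layout.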
Variables are compiled at runtime, in order to support Regex (precompiled) and XML, by the function tx.GetField(variable). Using RuleVariable.Exceptions and []exceptions might seem redundant, but they are different: the list of exceptions is complemented from the rule. In the case of Regex, GetField will use RuleVariable.Regex to match data instead of RuleVariable.Key.
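One possible reading of this lookup is sketched below. The field names come from the text, but the `getField` signature, the `regexVariable` helper and the exact exception semantics are assumptions: keys are selected by Regex when it is set and by Key otherwise, and a key listed in either exception list is skipped.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// RuleVariable is a hypothetical, simplified version of the struct named above.
type RuleVariable struct {
	Key        string
	Regex      *regexp.Regexp // precompiled; takes priority over Key when set
	Exceptions []string
}

// regexVariable is a helper for this sketch that builds a regex-keyed variable.
func regexVariable(pattern string, exceptions ...string) RuleVariable {
	return RuleVariable{Regex: regexp.MustCompile(pattern), Exceptions: exceptions}
}

// getField returns the values selected by rv from data, skipping keys listed
// in rv.Exceptions or in the extra exceptions list.
func getField(data map[string][]string, rv RuleVariable, exceptions []string) []string {
	skip := map[string]bool{}
	for _, k := range rv.Exceptions {
		skip[k] = true
	}
	for _, k := range exceptions {
		skip[k] = true
	}
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order for the example
	var out []string
	for _, k := range keys {
		if skip[k] {
			continue
		}
		if rv.Regex != nil {
			if !rv.Regex.MatchString(k) {
				continue
			}
		} else if rv.Key != "" && rv.Key != k {
			continue
		}
		out = append(out, data[k]...)
	}
	return out
}

func main() {
	data := map[string][]string{"user-agent": {"ua"}, "x-a": {"1"}, "x-b": {"2"}}
	fmt.Println(getField(data, regexVariable("^x-"), []string{"x-b"})) // prints [1]
}
```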
Note: Collections are not concurrent-safe; don’t share transactions between goroutines.
Phases
Phases are used by RuleGroup to filter between execution phases on HTTP/1.1 and HTTP/1.0.
Phase 1: Request Headers
This phase theoretically consists of three steps:
- Connection (tx.ProcessConnection()): Request address and port
- Request line (tx.ProcessURI()): Request URL, does not include GET arguments
- Request headers (tx.ProcessRequestHeaders()): Will evaluate phase 1
Phase 2: Request Body
This phase only runs when RequestBodyAccess is On; otherwise we skip to phase 3. This phase will do one of the following:
- Reject the transaction if the request body is too long and RequestBodyLimitAction is set to Reject
- If URLENCODED: set POST arguments and request_Body
- If MULTIPART: Parse files and set FILES variables
- If JSON: Not implemented yet
- If none of the above was met and ForceRequestBodyVariable is set to true, URLENCODED will be forced
See Body Handling for more info.
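The decision list for phase 2 can be written as a single function. This is a sketch under stated assumptions: `Config` and `bodyProcessor` are hypothetical, the mapping from Content-Type prefixes to URLENCODED, MULTIPART and JSON is assumed, and a body that exceeds the limit without the Reject action simply continues, since the text does not say what happens in that case.

```go
package main

import (
	"fmt"
	"strings"
)

// Config is a hypothetical subset of the settings mentioned above.
type Config struct {
	RequestBodyAccess        bool
	RequestBodyLimit         int64
	RejectOnLimit            bool // RequestBodyLimitAction is set to Reject
	ForceRequestBodyVariable bool
}

// bodyProcessor returns which branch of the phase 2 list applies.
func bodyProcessor(cfg Config, contentType string, bodyLen int64) string {
	if !cfg.RequestBodyAccess {
		return "skip" // go straight to phase 3
	}
	if bodyLen > cfg.RequestBodyLimit && cfg.RejectOnLimit {
		return "reject"
	}
	ct := strings.ToLower(contentType)
	switch {
	case strings.HasPrefix(ct, "application/x-www-form-urlencoded"):
		return "urlencoded"
	case strings.HasPrefix(ct, "multipart/form-data"):
		return "multipart"
	case strings.HasPrefix(ct, "application/json"):
		return "json" // not implemented yet according to the list above
	case cfg.ForceRequestBodyVariable:
		return "urlencoded" // forced
	}
	return "none"
}

func main() {
	cfg := Config{RequestBodyAccess: true, RequestBodyLimit: 1024, RejectOnLimit: true}
	fmt.Println(bodyProcessor(cfg, "multipart/form-data; boundary=x", 100)) // prints multipart
	fmt.Println(bodyProcessor(cfg, "text/plain", 4096))                     // prints reject
}
```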
Phase 3: Response Headers
Phase 4: Response Body
Phase 5: Logging
This is a special phase: it always runs, but it must be handled by the client. For example, if Coraza reports any error, the client must at least implement a defer tx.ProcessLogging(). This phase closes handlers, saves persistent collections and writes to the audit loggers. For the audit loggers to be written, the following conditions must be met:
- The transaction was marked with the auditlog action
- There must be at least one audit logger (SecAuditLog)
- AuditEngine must be On or RelevantOnly
- If AuditEngine was set to RelevantOnly, the response status must match AuditLogRelevantStatus
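These four conditions combine into a simple predicate. In the sketch below, `AuditState` and `shouldAudit` are hypothetical, and treating AuditLogRelevantStatus as a regular expression matched against the decimal status code is an assumption. The pattern is compiled on every call to keep the example short; real code would precompile it.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// AuditEngine is a hypothetical enumeration of the engine modes.
type AuditEngine int

const (
	Off AuditEngine = iota
	On
	RelevantOnly
)

// AuditState is a hypothetical snapshot of what phase 5 needs to know.
type AuditState struct {
	MarkedAuditLog bool   // the auditlog action was triggered
	Loggers        int    // number of configured audit loggers
	Engine         AuditEngine
	RelevantStatus string // assumed to be a regular expression
	ResponseStatus int
}

// shouldAudit reports whether all audit logging conditions are met.
func shouldAudit(s AuditState) bool {
	if !s.MarkedAuditLog || s.Loggers == 0 {
		return false
	}
	switch s.Engine {
	case On:
		return true
	case RelevantOnly:
		ok, err := regexp.MatchString(s.RelevantStatus, strconv.Itoa(s.ResponseStatus))
		return err == nil && ok
	}
	return false
}

func main() {
	s := AuditState{
		MarkedAuditLog: true,
		Loggers:        1,
		Engine:         RelevantOnly,
		RelevantStatus: "^5",
		ResponseStatus: 503,
	}
	fmt.Println(shouldAudit(s)) // prints true
}
```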
Body handling
BodyBuffer is a struct that manages the request or response buffer and stores the data in temporary files if required. BodyBuffer applies a few rules to decide whether to buffer the data in memory or write it to a temporary file. It also returns a Reader to the memory buffer or to the temporary file created. Temporary files must be deleted by tx.ProcessLogging.
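A buffer that spills from memory to a temporary file can be sketched as follows. This reuses the BodyBuffer name but is not the real struct: the single rule here is a memory limit, while the constructor, the `OnDisk`, `Contents` and `Close` methods are assumptions made for the example. `Reader` should only be called once writing has finished.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// BodyBuffer is a hypothetical buffer that keeps data in memory up to
// memLimit bytes and moves everything to a temporary file beyond that.
type BodyBuffer struct {
	memLimit int
	mem      bytes.Buffer
	file     *os.File
}

func NewBodyBuffer(memLimit int) *BodyBuffer { return &BodyBuffer{memLimit: memLimit} }

// Write appends p, switching to a temporary file when the limit is exceeded.
func (b *BodyBuffer) Write(p []byte) (int, error) {
	if b.file == nil && b.mem.Len()+len(p) > b.memLimit {
		f, err := os.CreateTemp("", "body-*")
		if err != nil {
			return 0, err
		}
		if _, err := f.Write(b.mem.Bytes()); err != nil {
			f.Close()
			os.Remove(f.Name())
			return 0, err
		}
		b.mem.Reset()
		b.file = f
	}
	if b.file != nil {
		return b.file.Write(p)
	}
	return b.mem.Write(p)
}

// OnDisk reports whether the data has been moved to a temporary file.
func (b *BodyBuffer) OnDisk() bool { return b.file != nil }

// Reader returns a reader over the memory buffer or the temporary file.
func (b *BodyBuffer) Reader() (io.Reader, error) {
	if b.file == nil {
		return bytes.NewReader(b.mem.Bytes()), nil
	}
	if _, err := b.file.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}
	return b.file, nil
}

// Contents reads the whole buffer back as a string.
func (b *BodyBuffer) Contents() (string, error) {
	r, err := b.Reader()
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(r)
	return string(data), err
}

// Close deletes the temporary file, if one was created.
func (b *BodyBuffer) Close() error {
	if b.file == nil {
		return nil
	}
	name := b.file.Name()
	b.file.Close()
	b.file = nil
	return os.Remove(name)
}

func main() {
	b := NewBodyBuffer(4)
	defer b.Close()
	b.Write([]byte("ab"))
	b.Write([]byte("cdef"))
	s, err := b.Contents()
	fmt.Println(b.OnDisk(), s, err)
}
```

The deferred `Close` plays the role that the logging phase plays in the text: whoever owns the buffer is responsible for removing the temporary file.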
Persistent Collections
Not working yet.