Processing Pipelines

This documentation page describes the concepts and classes of pySigma that can be used for transformation of Sigma rules.

Sigma rules are transformed to account for differences between the Sigma rule and the target data model. Examples are differences in field naming schemes or value representation.

A processing pipeline has three stages:

  1. Rule pre-processing: transformations that are applied to the rule. Example: field name mapping, adding conditions.

  2. Query post-processing: transformations that are applied to the generated query. In this stage the transformations have access to the query generated by the backend and the rule that was the source of the conversion. Example: embedding query and rule parts in a template to define custom output formats.

  3. Output finalization: finalizers operate on all post-processed queries to generate the final output. Example: merge all queries and add a header to the output.

Resolvers

Pipeline resolvers resolve identifiers and file names into a consolidated processing pipeline and take care of the appropriate ordering via the priority property that should be contained in a processing pipeline.

A processing pipeline resolver is a sigma.processing.resolver.ProcessingPipelineResolver object. It is initialized with a mapping between identifiers and sigma.processing.pipeline.ProcessingPipeline objects or callables that return such objects.

The method sigma.processing.resolver.ProcessingPipelineResolver.resolve_pipeline() returns a ProcessingPipeline object corresponding to the given identifier or contained in the specified YAML file. sigma.processing.resolver.ProcessingPipelineResolver.resolve() returns a consolidated pipeline with the appropriate ordering as specified by the priority property of the specified pipelines.

class sigma.processing.resolver.ProcessingPipelineResolver(pipelines: ~typing.Dict[str, ~sigma.processing.pipeline.ProcessingPipeline | ~typing.Callable[[], ~sigma.processing.pipeline.ProcessingPipeline]] = <factory>)

A processing pipeline resolver resolves a list of pipeline specifiers into one summarized processing pipeline. It takes care of sorting by priority and resolution of filenames as well as pipeline name identifiers.

add_pipeline_class(pipeline: ProcessingPipeline) None

Add a named processing pipeline object to the resolver. This pipeline can then be resolved by its name.

classmethod from_pipeline_list(pipelines: Iterable[ProcessingPipeline]) ProcessingPipelineResolver

Instantiate processing pipeline resolver from list of pipeline objects.

list_pipelines() Iterable[Tuple[str, ProcessingPipeline]]

List identifier/processing pipeline tuples.

resolve(pipeline_specs: List[str], target: str | None = None) ProcessingPipeline

Resolve a list of

  • processing pipeline names from pipelines added to the resolver or

  • file paths containing processing pipeline YAML definitions

into a consolidated processing pipeline.

If target is specified this is passed in each resolve_pipeline call to perform a compatibility check for the usage of the specified backend with the pipeline.

resolve_pipeline(spec: str, target: str | None = None) ProcessingPipeline

Resolve single processing pipeline. It first tries to find a pipeline with this identifier in the registered pipelines. If this fails, spec is treated as file name. If this fails too, a SigmaPipelineNotFoundError is raised.

If target is specified, an additional check of the compatibility of the specified backend to the resolved pipeline is conducted. A SigmaPipelineNotAllowedForBackendError is raised if this check fails.

Processing Pipeline

Classes

class sigma.processing.pipeline.ProcessingPipeline(items: ~typing.List[~sigma.processing.pipeline.ProcessingItem] = <factory>, postprocessing_items: ~typing.List[~sigma.processing.postprocessing.QueryPostprocessingTransformation] = <factory>, finalizers: ~typing.List[~sigma.processing.finalization.Finalizer] = <factory>, vars: ~typing.Dict[str, ~typing.Any] = <factory>, priority: int = 0, name: str | None = None, allowed_backends: ~typing.FrozenSet[str] = <factory>)

A processing pipeline is configured with the transformation steps that are applied to Sigma rules. These are provided by:

  • a backend, to apply a set of base preprocessing steps to Sigma rules (e.g. renaming of fields).

  • the user, in one or multiple configurations, to conduct further rule transformations that adapt the rule to the environment.

A processing pipeline is instantiated once for a rule collection. Rules are processed in order of their appearance in a rule file or include order. Further, processing pipelines can be chained and contain variables that can be used from processing items.

apply(rule: SigmaRule | SigmaCorrelationRule) SigmaRule | SigmaCorrelationRule

Apply processing pipeline on Sigma rule.

field_was_processed_by(field: str | None, processing_item_id: str) bool

Check if field name was processed by a particular processing item.

classmethod from_dict(d: dict) ProcessingPipeline

Instantiate processing pipeline from a parsed processing item description.

classmethod from_yaml(processing_pipeline: str) ProcessingPipeline

Convert YAML input string into processing pipeline.

postprocess_query(rule: SigmaRule | SigmaCorrelationRule, query: Any) Any

Post-process queries with postprocessing_items.

track_field_processing_items(src_field: str, dest_field: List[str], processing_item_id: str | None) None

Track processing items that were applied to field names. This adds the processing_item_id to the set of applied processing items of src_field and assigns a copy of this set as tracking set to all fields in dest_field.

class sigma.processing.pipeline.ProcessingItem(transformation: ~sigma.processing.transformations.Transformation, rule_condition_linking: ~typing.Callable[[~typing.Iterable[bool]], bool] = <built-in function all>, rule_condition_negation: bool = False, rule_conditions: ~typing.List[~sigma.processing.conditions.RuleProcessingCondition] = <factory>, identifier: str | None = None, detection_item_condition_linking: ~typing.Callable[[~typing.Iterable[bool]], bool] = <built-in function all>, detection_item_condition_negation: bool = False, detection_item_conditions: ~typing.List[~sigma.processing.conditions.DetectionItemProcessingCondition] = <factory>, field_name_condition_linking: ~typing.Callable[[~typing.Iterable[bool]], bool] = <built-in function all>, field_name_condition_negation: bool = False, field_name_conditions: ~typing.List[~sigma.processing.conditions.FieldNameProcessingCondition] = <factory>)

A processing item consists of an optional condition and a transformation. The transformation is applied if the condition evaluates to true against the given Sigma rule, or if no condition is present.

Processing items are instantiated by the processing pipeline for a whole collection that is about to be converted by a backend.

apply(pipeline: ProcessingPipeline, rule: SigmaRule | SigmaCorrelationRule) bool

Matches the conditions against the rule and performs the transformation if the conditions are true or not present. Returns True if the transformation was applied.

detection_item_condition_linking()

Return True if bool(x) is True for all values x in the iterable.

If the iterable is empty, return True.

field_name_condition_linking()

Return True if bool(x) is True for all values x in the iterable.

If the iterable is empty, return True.

classmethod from_dict(d: dict)

Instantiate processing item from parsed definition and variables.

match_detection_item(pipeline: ProcessingPipeline, detection_item: SigmaDetectionItem) bool

Evaluates the detection item and field name conditions of the processing item against the detection item and returns the result.

match_field_in_value(pipeline: ProcessingPipeline, value: SigmaType) bool

Evaluate field name conditions in field reference values and return result.

match_field_name(pipeline: ProcessingPipeline, field: str | None) bool

Evaluate field name conditions on field names and return result.

Specifying Processing Pipelines as YAML

A processing pipeline can be specified as YAML file that can be loaded with ProcessingPipeline.from_yaml(yaml) or by specifying a filename to ProcessingPipelineResolver.resolve() or ProcessingPipelineResolver.resolve_pipeline().

The following items are expected on the root level of the YAML file:

  • name: the name of the pipeline.

  • priority: specifies the ordering of the pipeline in case multiple pipelines are concatenated. Lower priorities are used first.

  • transformations: contains a list of transformation items for the rule pre-processing stage.

  • postprocessing: contains a list of transformation items for the query post-processing stage.

  • finalizers: contains a list of transformation items for the output finalization stage.

Some conventions used for processing pipeline priorities are:

  • 10: Log source pipelines, e.g. for Sysmon.

  • 20: Pipelines provided by backend packages that should be run before the backend pipeline.

  • 50: Backend pipelines that are integrated in the backend and applied automatically.

  • 60: Backend output format pipelines that are integrated in the backend and applied automatically for the associated output format.

Pipelines with the same priority are applied in the order they were provided. Pipelines without a priority are assumed to have the priority 0.

Transformation items are defined as a map as follows:

  • id: the identifier of the item. This is also tracked at detection item or condition level and can be used in future conditions.

  • type: the type of the transformation, as specified in the identifier-to-class mappings in the Transformations section below.

  • Arbitrary transformation parameters are specified at the same level.

  • rule_conditions, detection_item_conditions, field_name_conditions: conditions of the type corresponding to the name.

Conditions are specified as follows:

  • type: defines the condition type. It must be one of the identifiers that are defined in Conditions

  • rule_cond_op, detection_item_cond_op, field_name_cond_op: boolean operator for the condition result. Must be one of "or" or "and". Defaults to "and".

  • rule_cond_not, detection_item_cond_not, field_name_cond_not: if set to True, the condition result is negated.

  • Arbitrary condition parameters are specified on the same level.

Example:

name: Custom Sysmon field naming
priority: 100
transformations:
- id: field_mapping
  type: field_name_mapping
  mapping:
    CommandLine: command_line
  rule_conditions:
  - type: logsource
    service: sysmon

Conditions

New in version 0.8.0: Field name conditions.

There are three types of conditions:

  • Rule conditions are evaluated to the whole rule. They are defined in the rule_conditions attribute of a ProcessingItem. These can be applied in the rule pre-processing stage and the query post-processing stage.

  • Detection item conditions are evaluated for each detection item. They are defined in the detection_item_conditions attribute of a ProcessingItem. These can only be applied in the rule pre-processing stage.

  • Field name conditions are evaluated for field names that can be located in detection items or in the field name list of a Sigma rule. They are defined in the field_name_conditions attribute of a ProcessingItem. These can only be applied in the rule pre-processing stage.

In addition to the *_conditions attributes of ProcessingItem objects, there are two further attributes that control the condition matching behavior:

  • rule_condition_linking, detection_item_condition_linking and field_name_condition_linking: one of any or all functions. Controls if one or all of the conditions from the list must match to result in an overall match.

  • rule_condition_negation, detection_item_condition_negation and field_name_condition_negation: if set to True, the condition result is negated.

The results of the evaluation of different condition types are and-linked, e.g. if a processing item contains rule and field name conditions, both must evaluate to True to get an overall result of True.
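The linking and negation attributes map to the *_cond_op and *_cond_not keys in YAML pipeline definitions. A hypothetical sketch (the mapping and the identifier earlier_mapping are illustrative assumptions):

```yaml
transformations:
- id: windows_mapping
  type: field_name_mapping
  mapping:
    Image: process_path
  # or-linked: either log source condition suffices
  rule_cond_op: or
  rule_conditions:
  - type: logsource
    product: windows
  - type: logsource
    service: sysmon
  # negated: only applied to fields NOT yet touched by earlier_mapping
  field_name_cond_not: true
  field_name_conditions:
  - type: processing_item_applied
    processing_item_id: earlier_mapping
```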

Rule Conditions

  • logsource: LogsourceCondition

  • contains_detection_item: RuleContainsDetectionItemCondition

  • processing_item_applied: RuleProcessingItemAppliedCondition

class sigma.processing.conditions.LogsourceCondition(category: str | None = None, product: str | None = None, service: str | None = None)

Matches the log source of the rule. Log source fields that are not specified are ignored.

class sigma.processing.conditions.RuleContainsDetectionItemCondition(field: str | None, value: str | int | float | bool)

Returns True if rule contains a detection item that matches the given field name and value.

class sigma.processing.conditions.RuleProcessingItemAppliedCondition(processing_item_id: str)

Checks if processing item was applied to rule.

Detection Item Conditions

  • match_string: MatchStringCondition

  • processing_item_applied: DetectionItemProcessingItemAppliedCondition

class sigma.processing.conditions.MatchStringCondition(cond: Literal['any', 'all'], pattern: str, negate: bool = False)

Match string values with a regular expression ‘pattern’. The parameter ‘cond’ determines for detection items with multiple values if any or all strings must match. Generally, values which aren’t strings are skipped in any mode or result in a false result in all match mode.

class sigma.processing.conditions.DetectionItemProcessingItemAppliedCondition(processing_item_id: str)

Checks if processing item was applied to detection item.

Field Name Conditions

  • include_fields: IncludeFieldCondition

  • exclude_fields: ExcludeFieldCondition

  • processing_item_applied: FieldNameProcessingItemAppliedCondition

class sigma.processing.conditions.IncludeFieldCondition(fields: List[str], type: Literal['plain', 're'] = 'plain')

Matches on field name if it is contained in fields list. The parameter ‘type’ determines if field names are matched as plain string (“plain”) or regular expressions (“re”).

class sigma.processing.conditions.ExcludeFieldCondition(fields: List[str], type: Literal['plain', 're'] = 'plain')

Matches on field name if it is not contained in fields list.

class sigma.processing.conditions.FieldNameProcessingItemAppliedCondition(processing_item_id: str)

Checks if processing item was applied to a field name.

Base Classes

Base classes must be overridden to implement new conditions that can be used in processing pipelines. In addition, the new class should be mapped to an identifier. This allows using the condition from processing pipelines defined in YAML files. The mapping is done in the dicts rule_conditions or detection_item_conditions in the sigma.processing.conditions package for the respective condition types. This is not necessary for conditions that are used privately and not distributed via the main pySigma distribution.

class sigma.processing.conditions.RuleProcessingCondition

Base for Sigma rule processing condition classes used in processing pipelines.

class sigma.processing.conditions.DetectionItemProcessingCondition

Base for Sigma detection item processing condition classes used in processing pipelines.

class sigma.processing.conditions.FieldNameProcessingCondition

Base class for conditions on field names in detection items, Sigma rule field lists and other use cases that require matching on field names without detection item context.

class sigma.processing.conditions.ValueProcessingCondition(cond: Literal['any', 'all'])

Base class for conditions on values in detection items. The ‘cond’ parameter determines if any or all values of a multivalued detection item must match to result in an overall match.

The method match_value is called for each value and must return a bool result. It should reject values which are incompatible with the condition with a False return value.

Transformations

Rule Pre-Processing Transformations

The following transformations with their corresponding identifiers for usage in YAML-based pipeline definitions are available:

  • field_name_mapping: FieldMappingTransformation

  • field_name_prefix_mapping: FieldPrefixMappingTransformation

  • field_name_suffix: AddFieldnameSuffixTransformation

  • field_name_prefix: AddFieldnamePrefixTransformation

  • drop_detection_item: DropDetectionItemTransformation

  • wildcard_placeholders: WildcardPlaceholderTransformation

  • value_placeholders: ValueListPlaceholderTransformation

  • query_expression_placeholders: QueryExpressionPlaceholderTransformation

  • add_condition: AddConditionTransformation

  • change_logsource: ChangeLogsourceTransformation

  • replace_string: ReplaceStringTransformation

  • map_string: MapStringTransformation

  • set_state: SetStateTransformation

  • rule_failure: RuleFailureTransformation

  • detection_item_failure: DetectionItemFailureTransformation

class sigma.processing.transformations.FieldMappingTransformation(mapping: Dict[str, str | List[str]])

Map a field name to one or multiple different field names.

YAML example:

transformations:
- type: field_name_mapping
  mapping:
    EventID: EventCode
    CommandLine:
    - command_line
    - cmdline

This shows how to map the field name EventID to EventCode and CommandLine to command_line and cmdline. For the latter, OR-conditions will be generated to match the value on both fields. This is useful if different data models are used in the same system.

class sigma.processing.transformations.FieldPrefixMappingTransformation(mapping: Dict[str, str | List[str]])

Map a field name prefix to one or multiple different prefixes.

class sigma.processing.transformations.AddFieldnameSuffixTransformation(suffix: str)

Add field name suffix.

class sigma.processing.transformations.AddFieldnamePrefixTransformation(prefix: str)

Add field name prefix.

class sigma.processing.transformations.DropDetectionItemTransformation

Deletes detection items. This should only be used in combination with a detection item condition.

class sigma.processing.transformations.WildcardPlaceholderTransformation(include: List[str] | None = None, exclude: List[str] | None = None)

Replaces placeholders with wildcards. This transformation is useful if remaining placeholders should be replaced with something meaningful to make conversion of rules possible without defining the placeholders content.

class sigma.processing.transformations.ValueListPlaceholderTransformation(include: List[str] | None = None, exclude: List[str] | None = None)

Replaces placeholders with values contained in variables defined in the configuration.

class sigma.processing.transformations.QueryExpressionPlaceholderTransformation(include: ~typing.List[str] | None = None, exclude: ~typing.List[str] | None = None, expression: str = '', mapping: ~typing.Dict[str, str] = <factory>)

Replaces a placeholder with a plain query containing the placeholder or an identifier mapped from the placeholder name. The main purpose is the generation of arbitrary list lookup expressions which are passed to the resulting query.

Parameters:

  • expression: string containing the query expression with {field} and {id} placeholders, into which the field name and the placeholder identifier (or a mapped identifier) are inserted.

  • mapping: mapping between placeholder names and identifiers that should be used in the expression. If no mapping is provided, the placeholder name is used.

class sigma.processing.transformations.AddConditionTransformation(conditions: ~typing.Dict[str, str | ~typing.List[str]] = <factory>, name: str | None = None, template: bool = False)

Add an and-linked condition expression to the rule conditions.

If template is set to True the condition values are interpreted as string templates and the following placeholders are replaced:

  • $category, $product and $service: with the corresponding values of the Sigma rule log source.

class sigma.processing.transformations.ChangeLogsourceTransformation(category: str | None = None, product: str | None = None, service: str | None = None)

Replace log source as defined in transformation parameters.

class sigma.processing.transformations.ReplaceStringTransformation(regex: str, replacement: str)

Replace the string part matched by the regular expression with a replacement string that can reference capture groups. It operates on the plain string representation of the SigmaString value.

This is basically an interface to re.sub() and can use all features available there.
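Because the transformation delegates to re.sub(), its behavior on a plain string can be sketched with the standard library alone (the path value and the pattern are illustrative):

```python
import re

# ReplaceStringTransformation applies re.sub(regex, replacement) to the
# plain string representation of each SigmaString value. The replacement
# string can reference capture groups, e.g. \1.
value = "C:/Windows/System32/cmd.exe"

# Keep only the file name by capturing everything after the last slash:
result = re.sub(r"^.*/([^/]+)$", r"\1", value)
```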

class sigma.processing.transformations.MapStringTransformation(mapping: Dict[str, str | List[str]])

Map static string value to one or multiple other strings.

YAML example:

transformations:
- type: map_string
  mapping:
    value1: mapped1
    value2:
    - mapped2A
    - mapped2B

class sigma.processing.transformations.SetStateTransformation(key: str, val: Any)

Set pipeline state key to value.

class sigma.processing.transformations.RuleFailureTransformation(message: str)

Raise a SigmaTransformationError with the provided message. This enables transformation pipelines to signal that a certain situation can't be handled, e.g. that only a subset of values is allowed because the target data model doesn't offer all possibilities.

class sigma.processing.transformations.DetectionItemFailureTransformation(message: str)

Raise a SigmaTransformationError with the provided message. This enables transformation pipelines to signal that a certain situation can't be handled, e.g. that only a subset of values is allowed because the target data model doesn't offer all possibilities.

Query Post-Processing Transformations

New in version 0.10.0.

  • embed: EmbedQueryTransformation

  • simple_template: QuerySimpleTemplateTransformation

  • template: QueryTemplateTransformation

  • json: EmbedQueryInJSONTransformation

  • replace: ReplaceQueryTransformation

class sigma.processing.postprocessing.EmbedQueryTransformation(prefix: str | None = None, suffix: str | None = None)

Embeds a query between a given prefix and suffix. Only applicable to string queries.

class sigma.processing.postprocessing.QuerySimpleTemplateTransformation(template: str)

Replace the query with a template that can refer to the following placeholders:

  • query: the postprocessed query.

  • rule: the Sigma rule including all its attributes like rule.title.

  • pipeline: the Sigma processing pipeline where this transformation is applied, including all current state information in pipeline.state.

The Python format string syntax (str.format()) is used.
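A hypothetical YAML usage of this transformation (the template content is illustrative):

```yaml
postprocessing:
- type: simple_template
  template: |
    title: {rule.title}
    query: {query}
```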

class sigma.processing.postprocessing.QueryTemplateTransformation(template: str, path: str | None = None, autoescape: bool = False)

Apply Jinja2 template provided as template object variable to a query. The following variables are available in the context:

  • query: the postprocessed query.

  • rule: the Sigma rule including all its attributes like rule.title.

  • pipeline: the Sigma processing pipeline where this transformation is applied including all current state information in pipeline.state.

If path is given, template is considered as a relative path to a template file below the specified path. If it is not provided, the template is specified as a plain string. autoescape controls the Jinja2 HTML/XML auto-escaping.

class sigma.processing.postprocessing.EmbedQueryInJSONTransformation(json_template: str)

Embeds a query into a JSON structure defined as string. The placeholder value %QUERY% is replaced with the query.
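A hypothetical YAML usage (the JSON structure is illustrative):

```yaml
postprocessing:
- type: json
  json_template: '{"query": "%QUERY%"}'
```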

class sigma.processing.postprocessing.ReplaceQueryTransformation(pattern: str, replacement: str)

Replace query part specified by regular expression with a given string.

Output Finalization Transformations

New in version 0.10.0.

Output Finalization Transformations

  • concat: ConcatenateQueriesFinalizer

  • template: TemplateFinalizer

  • json: JSONFinalizer

  • yaml: YAMLFinalizer

class sigma.processing.finalization.ConcatenateQueriesFinalizer(separator: str = '\n', prefix: str = '', suffix: str = '')

Concatenate queries with a given separator and embed result within a prefix or suffix string.
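A hypothetical YAML usage that joins all queries with blank lines under an illustrative header:

```yaml
finalizers:
- type: concat
  separator: "\n\n"
  prefix: "# Generated queries\n"
```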

class sigma.processing.finalization.TemplateFinalizer(template: str, path: str | None = None, autoescape: bool = False)

Apply Jinja2 template provided as template object variable to the queries. The following variables are available in the context:

  • queries: all post-processed queries generated by the backend.

  • pipeline: the Sigma processing pipeline where this transformation is applied including all current state information in pipeline.state.

If path is given, template is considered as a relative path to a template file below the specified path. If it is not provided, the template is specified as a plain string. autoescape controls the Jinja2 HTML/XML auto-escaping.

class sigma.processing.finalization.JSONFinalizer(indent: int | None = None)

class sigma.processing.finalization.YAMLFinalizer(indent: int | None = None)

Base Classes

There are four transformation base classes that can be derived to implement transformations on particular parts of a Sigma rule or the whole Sigma rule:

class sigma.processing.transformations.Transformation

Base class for processing steps used in pipelines. Override apply with transformation that is applied to the whole rule.

class sigma.processing.transformations.DetectionItemTransformation

Iterates over all detection items of a Sigma rule and calls the apply_detection_item method for each of them if the detection item condition associated with the processing item evaluates to true. It also takes care of recursing into nested detections.

The apply_detection_item method can directly change the detection or return a replacement object, which can be a SigmaDetection or a SigmaDetectionItem.

The processing item is automatically added to the applied items of the detection items if a replacement value was returned. Otherwise, the apply_detection_item method must take care of this itself so that conditional decisions in the processing pipeline keep working. This can be done with the detection_item_applied() method.

A detection item transformation also marks the item as unconvertible to plain data types.

class sigma.processing.transformations.ValueTransformation

Iterates over all values in all detection items of a Sigma rule and calls the apply_value method for each of them. The apply_value method can return a single value or a list of values, which are inserted into the value list, or None if the original value should be passed through. An empty list should be returned by apply_value to drop the value from the transformed results.

class sigma.processing.transformations.ConditionTransformation

Iterates over all rule conditions and calls the apply_condition method for each condition. Automatically takes care of marking condition as applied by processing item.

Transformation Tracking

tbd