Showing 23 changed files with 1,512 additions and 6 deletions.

---
title: Collect metrics
---

Tenzir keeps track of metrics about node resource usage, pipeline state, and
runtime performance.

Metrics are stored as internal events in the node's storage engine, allowing you
to work with metrics just like regular data. Use the
[`metrics`](../tql2/operators/metrics.md) input operator to access the metrics.
The operator documentation lists [all available
metrics](../tql2/operators/metrics#schemas) in detail.

The `metrics` operator provides a *copy* of existing metrics. You can use it
multiple times to reference the same metrics feed.
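
For instance, the following minimal sketch peeks at a few events from a single
feed. The `"cpu"` metrics name is an assumption here; see the schema list
linked above for the feeds your node actually exposes.

```tql
metrics "cpu"
head 5
```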

## Write metrics to a file

Export metrics continuously to a file via `metrics live=true`:

```tql
metrics live=true
write_ndjson
save_file "metrics.json", append=true
```

This attaches to the incoming metrics feed, renders the events as NDJSON, and
then writes the output to a file. Without the `live` option, the `metrics`
operator returns a snapshot of all historical metrics.
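
For instance, to capture a one-off snapshot instead of following the live feed,
drop the `live` option (a minimal sketch; the output file name is arbitrary):

```tql
metrics
write_ndjson
save_file "metrics-snapshot.json"
```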

## Summarize metrics

You can [shape](../usage/shape-data/README.md) metrics like ordinary data,
e.g., write aggregations over metrics to compute runtime statistics suitable for
reporting or dashboarding:

```tql
metrics "operator"
where sink == true
summarize runtime=sum(duration), pipeline_id
sort -runtime
```

The above example computes the total runtime per pipeline, grouped by its
unique ID.
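
Building on the same fields, a slight variation keeps only the five
longest-running pipelines, which is handy for a quick report (a sketch that
reuses only the operators shown above):

```tql
metrics "operator"
where sink == true
summarize runtime=sum(duration), pipeline_id
sort -runtime
head 5
```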

src/content/docs/guides/usage/basics/manage-a-pipeline.mdx (44 additions, 0 deletions)

---
title: Manage a pipeline
---

A pipeline can be in one of the following **states** after you [run
it](../run-pipelines):

- **Created**: the pipeline has just been deployed.
- **Running**: the pipeline is actively processing data.
- **Completed**: there is no more data to process.
- **Failed**: an error occurred.
- **Paused**: the user interrupted execution, keeping in-memory state.
- **Stopped**: the user interrupted execution, resetting all in-memory state.

Both the [app](https://app.tenzir.com/) and the [API](/api) let you manage the
pipeline lifecycle.

## Change the state of a pipeline

In the [app](https://app.tenzir.com/overview), an icon visualizes the current
pipeline state. Change a state as follows:

1. Click the checkbox on the left next to the pipeline, or the checkbox in the
   column header to select all pipelines.
2. Click the button corresponding to the desired action, i.e., *Start*, *Pause*,
   *Stop*, or *Delete*.
3. Confirm your selection.

For the [API](/api), use the following endpoints based on the desired action
(see the sketch after this list):

- Start, pause, and stop:
  [`/pipeline/update`](/api#/paths/~1pipeline~1update/post)
- Delete: [`/pipeline/delete`](/api#/paths/~1pipeline~1delete/post)
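
As a rough sketch, starting a pipeline via the API could look like the
following. The base URL, authentication header, and request body fields are
assumptions for illustration; consult the [API](/api) reference for the exact
schema:

```sh
# Hypothetical sketch: start the pipeline with the given ID via /pipeline/update.
# Base URL, auth header, and body fields are assumed; check the API reference.
curl -X POST "https://<api-base-url>/pipeline/update" \
  -H "Authorization: Bearer $TENZIR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id": "<pipeline-id>", "action": "start"}'
```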

## Understand pipeline state transitions

The diagram below illustrates the various states, where circles correspond to
states and arrows to state transitions:

![Pipeline state transitions](manage-a-pipeline/pipeline-states.svg)

The grey buttons indicate the actions you, as a user, can take to transition
into a different state. The orange arrows are transitions that take place
automatically based on system events.

src/content/docs/guides/usage/basics/manage-a-pipeline/pipeline-states.svg (2 additions, 0 deletions)

---
title: Run pipelines
---

You can run a [pipeline](../../../explanations/architecture/pipeline) in the
app, on the command line using the `tenzir` binary, or configure it to run as
code.

## In the app

Run a pipeline by typing it in the editor and hitting the *Run* button.

The following invariants apply:

1. You must start with an input operator.
2. The browser is always the output operator.

The diagram below illustrates these mechanics:

![Pipeline in the browser](run-pipelines/pipeline-browser.svg)

For example, write [`version`](../../tql2/operators/version.md) and click *Run*
to see a single event arrive.

## On the command line

On the command line, run `tenzir <pipeline>` where `<pipeline>` is the
definition of the pipeline.

If the pipeline expects events as its input, an implicit `load_stdin |
read_json` will be prepended. If it expects bytes instead, only `load_stdin` is
prepended. Likewise, if the pipeline outputs events, an implicit `write_json |
save_stdout` will be appended. If it outputs bytes instead, only `save_stdout`
is appended.
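
For example, a pipeline that begins with a transformation expects events, so
you can feed it JSON via stdin and the implicit `load_stdin | read_json` takes
care of parsing (a minimal sketch with made-up input):

```sh
echo '{"x": 1, "y": 2}' | tenzir 'drop y'
```

The implicit `write_json | save_stdout` then prints the remaining event back to
the terminal.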

The diagram below illustrates these mechanics:

![Pipeline on the command line](run-pipelines/pipeline-cli.svg)

For example, run [`tenzir 'version | drop
dependencies'`](../../tql2/operators/version.md) to see a single event in the
terminal:

```tql
{
  version: "4.22.1+g324214e6de",
  tag: "g324214e6de",
  major: 4,
  minor: 22,
  patch: 1,
  features: [],
  build: {
    type: "Release",
    tree_hash: "c4c37acb5f9dc1ce3806f40bbde17a08",
    assertions: false,
    sanitizers: {
      address: false,
      undefined_behavior: false,
    },
  },
}
```

You could also render the output differently by choosing a different format:

```sh
tenzir 'version | drop dependencies | write_csv'
tenzir 'version | drop dependencies | write_ssv'
tenzir 'version | drop dependencies | write_parquet | save_file "version.parquet"'
```

Instead of passing the pipeline description to the `tenzir` executable, you can
also load the definition from a file via `-f`:

```sh
tenzir -f pipeline.tql
```

This will interpret the file contents as a pipeline and run it.
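
For instance, a `pipeline.tql` file could hold the example from above, written
one operator per line (a minimal sketch):

```tql
// Show the Tenzir version, without the lengthy dependency list.
version
drop dependencies
```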

## As Code

In addition to running pipelines interactively, you can also deploy *pipelines as
code (PaC)*. This infrastructure-as-code-like method differs from the app-based
deployment in two ways:

1. Pipelines deployed as code always start with the Tenzir node, ensuring
   continuous operation.
2. To safeguard them, deletion via the user interface is disallowed.

Here's an example of deploying a pipeline through your configuration:

```yaml title="<prefix>/etc/tenzir/tenzir.yaml"
tenzir:
  pipelines:
    # A unique identifier for the pipeline that's used for metrics, diagnostics,
    # and API calls interacting with the pipeline.
    suricata-over-tcp:
      # An optional user-facing name for the pipeline. Defaults to the id.
      name: Onboard Suricata from TCP
      # An optional user-facing description of the pipeline.
      description: |
        Onboards Suricata EVE JSON from TCP port 34343.
      # The definition of the pipeline. Configured pipelines that fail to start
      # cause the node to fail to start.
      definition: |
        load_tcp "0.0.0.0:34343"
        read_suricata
        publish "suricata"
      # Pipelines that encounter an error stop running and show an error state.
      # This option causes pipelines to automatically restart when they
      # encounter an error instead. The first restart happens immediately, and
      # subsequent restarts after the configured delay, defaulting to 1 minute.
      # The following values are valid for this option:
      # - Omit the option, or set it to null or false to disable.
      # - Set the option to true to enable with the default delay of 1 minute.
      # - Set the option to a valid duration to enable with a custom delay.
      restart-on-error: 1 minute
      # Add a list of labels that are shown in the pipeline overview page at
      # app.tenzir.com.
      labels:
        - Suricata
        - Onboarding
      # Disable the pipeline.
      disabled: false
      # Pipelines that are unstoppable will run automatically and indefinitely.
      # They are not able to pause or stop.
      # If they do complete, they will end up in a failed state.
      # If `restart-on-error` is enabled, they will restart after the specified
      # duration.
      unstoppable: true
```
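
The configured pipeline publishes onboarded events to the `suricata` topic. As
a follow-up sketch (not part of the configuration above), a second pipeline
could subscribe to that topic and store the events at the node:

```tql
subscribe "suricata"
import
```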

src/content/docs/guides/usage/basics/run-pipelines/pipeline-browser.svg (10 additions, 0 deletions)

src/content/docs/guides/usage/basics/run-pipelines/pipeline-cli.svg (12 additions, 0 deletions)

---
title: Deduplicate events
---

The [`deduplicate`](../tql2/operators/deduplicate.md) operator provides a
powerful mechanism to remove duplicate events in a pipeline.

There are numerous use cases for deduplication, such as reducing noise,
optimizing costs, and making threat detection and response more efficient. Read
our [blog post](/archive/reduce-cost-and-noise-with-deduplication) for a
high-level discussion.

## Analyze unique host pairs

Let's say you're investigating an incident and would like to get a better
picture of which entities are involved in the communication. To this end, you
would like to extract all unique host pairs to identify who communicated with
whom.

Here's how this looks with Zeek data:

```tql
export
where @schema == "zeek.conn"
deduplicate {orig_h: id.orig_h, resp_h: id.resp_h}
```

Providing `id.orig_h` and `id.resp_h` to the operator restricts the output to
all unique host pairs. Note that flipped connections occur twice here, i.e., A →
B as well as B → A are present.

## Remove duplicate alerts

Are you overloaded with alerts, like every analyst? Let's remove some noise
from our alerts.

First, let's check what our alert dataset looks like:

```tql
export
where @schema == "suricata.alert"
top alert.signature
head 5
```

```tql
{
  alert.signature: "ET MALWARE Cobalt Strike Beacon Observed",
  count: 117369,
}
{
  alert.signature: "SURICATA STREAM ESTABLISHED packet out of window",
  count: 103198,
}
{
  alert.signature: "SURICATA STREAM Packet with invalid ack",
  count: 21960,
}
{
  alert.signature: "SURICATA STREAM ESTABLISHED invalid ack",
  count: 21920,
}
{
  alert.signature: "ET JA3 Hash - [Abuse.ch] Possible Dridex",
  count: 16870,
}
```

Hundreds of thousands of alerts! Maybe I'm just interested in one per hour per
affected host pair? Here's the pipeline for this:

```tql
from "/tmp/eve.json", follow=true
where @schema == "suricata.alert"
deduplicate {src: src_ip, dst: dest_ip, sig: alert.signature}, timeout=1h
import
```
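
Before wiring the deduplicated stream into `import`, you can gauge the effect
by counting the deduplicated alerts per signature over the existing file. The
sketch below reuses the path and fields from above and assumes a `count()`
aggregation function is available:

```tql
from "/tmp/eve.json"
where @schema == "suricata.alert"
deduplicate {src: src_ip, dst: dest_ip, sig: alert.signature}, timeout=1h
summarize alerts=count(), alert.signature
sort -alerts
```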