[feature] tag parse_groks with the pattern that matched #771

Open
ipsi opened this issue Mar 27, 2024 · 1 comment
Labels
vrl: stdlib Changes to the standard library

Comments

ipsi commented Mar 27, 2024

Per the discussion on the Vector repo, I would like VRL to tag the matching pattern, so that it's possible to, for example, route to different transforms or sinks based on which pattern matched.

This is particularly relevant when hosting third-party software on Docker, where you might get application logs mixed in with Apache access and error logs, and need to use something like Grok to detect the type and then send each one to a different destination or post-process it.

I'm not sure what form this would take. For example, would a configuration like this:

parse_groks!(
	"2020-10-02T23:22:12.223222Z info Hello world",
	patterns: [
		"%{common_prefix} %{_status} %{_message}",
		"%{common_prefix} %{_message}",
	],
	aliases: {
		"common_prefix": "%{_timestamp} %{_loglevel}",
		"_timestamp": "%{TIMESTAMP_ISO8601:timestamp}",
		"_loglevel": "%{LOGLEVEL:level}",
		"_status": "%{POSINT:status}",
		"_message": "%{GREEDYDATA:message}"
	},
	tag_field: "grok_pattern"
)

add the field grok_pattern=%{common_prefix} %{_status} %{_message} to the output, or grok_pattern=0? Or would it be better to allow patterns to accept an object of named patterns, like so:

parse_groks!(
	"2020-10-02T23:22:12.223222Z info Hello world",
	patterns: {
		"status_pattern": "%{common_prefix} %{_status} %{_message}",
		"simple_pattern": "%{common_prefix} %{_message}",
	},
	aliases: {
		"common_prefix": "%{_timestamp} %{_loglevel}",
		"_timestamp": "%{TIMESTAMP_ISO8601:timestamp}",
		"_loglevel": "%{LOGLEVEL:level}",
		"_status": "%{POSINT:status}",
		"_message": "%{GREEDYDATA:message}"
	}
)

Then (assuming VRL supports accepting either an array or an object for that argument), the output would automatically include `grok_pattern=status_pattern` whenever patterns is an object.
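
For comparison, the same effect can be approximated today by trying each pattern separately with parse_grok and recording a name by hand. This is only a rough sketch, not a proposed implementation: it assumes the raw line is in .message, expands the aliases inline, writes the result to a hypothetical .parsed field, and reuses the names status_pattern and simple_pattern from the example above.

```vrl
# Try the more specific pattern first; err is null when it matches.
parsed, err = parse_grok(.message, "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{POSINT:status} %{GREEDYDATA:message}")
if err == null {
	# Hand-written tag; the name is an assumption taken from the proposal.
	parsed.grok_pattern = "status_pattern"
} else {
	# Fall back to the simpler pattern.
	parsed, err = parse_grok(.message, "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}")
	if err == null {
		parsed.grok_pattern = "simple_pattern"
	}
}
# Only store the result if one of the patterns matched.
if err == null {
	.parsed = parsed
}
```

This needs one branch per pattern and cannot share aliases between the calls, which is the main reason a built-in tag would be nicer.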

jszwedko added the vrl: stdlib (Changes to the standard library) label Mar 27, 2024
@kjetilho

This would be welcome for me as well. I have a use case where I am parsing sshd logs, but not all messages contain anything sensible to capture that could be used to classify them, e.g.,

"fatal: userauth_pubkey: parse request failed: incomplete message \\[preauth\\]",

If I could get a return value such as grok_pattern=preauth_fail, that would be very useful for the rest of the parsing and tagging of messages.
