Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert fields at grok match #109

Closed
wants to merge 1 commit into from
Closed

Convert fields at grok match #109

wants to merge 1 commit into from

Conversation

z0rc
Copy link

@z0rc z0rc commented Jun 27, 2017

Instead of doing dedicated mutate convert action, we can convert known fields to integer or float right in grok match.

Ref: https://www.elastic.co/guide/en/logstash/5.4/plugins-filters-grok.html#_grok_basics

Optionally you can add a data type conversion to your grok pattern. By default all semantics are saved as strings. If you wish to convert a semantic’s data type, for example change a string to an integer then suffix it with the target data type. For example %{NUMBER:num:int} which converts the num semantic from a string to an integer. Currently the only supported conversions are int and float.

Instead of doing dedicated mutate convert action, we can convert known
fields to integer or float right in grok match.
@z0rc
Copy link
Author

z0rc commented Jun 27, 2017

Interesting, I'll check what's the case with test failures.

@whyscream
Copy link
Owner

You got what you asked for :)

  1) Failure:
TestGrokPatterns#test_postscreen_0013 [test/test.rb:56]:
Missing expected data in field 'postfix_postscreen_violation_time'.
Expected [0.0] to include "0".

The list of results, which is a single float 0.0, does not contain the string 0.

@z0rc
Copy link
Author

z0rc commented Jun 27, 2017

Well, yeah. I need to update expected values in fixtures, as grok does conversion in place with this change.

@z0rc
Copy link
Author

z0rc commented Jun 27, 2017

@whyscream test runner explicitly converts expected data to string here. I'm not sure how to address this without updating the ton of other fixtures, if to_s conversion is removed.

I'm leaving this for your judgement.

@whyscream
Copy link
Owner

The conversion in the test was probably added to work around this issue (reversed) with upstream patterns that are producing integers or floats. You changed quite a lot of result fields, so I'd expect that quite some tests will need updates too. I have no problems with changing the test suite in a way that causes the tests to detect proper data types too.

The explicit conversion you're removing takes some extra cycles I assume, but does it cause that much additional load that your optimization actually helps? Just curious on how you got the idea for the proposed change.

@z0rc
Copy link
Author

z0rc commented Jun 27, 2017

The explicit conversion you're removing takes some extra cycles I assume, but does it cause that much additional load that your optimization actually helps? Just curious on how you got the idea for the proposed change.

It's habit I developed over time writing custom grok patterns. It's easier and less error prone this way. For example, the removed mutate filter references fields that aren't present in grok template anymore, like postfix_uid.

As of test suite, I'll see what can be done with type conventions. Seems it's worth to refactor it to allow proper type matching.

@z0rc
Copy link
Author

z0rc commented Jul 4, 2017

I tried to remove to_s conversion for expected data and work through failing tests. This removal fixes couple of test cases, but introduces a bunch of new failures because of jls-grok bug.

The issue is jordansissel/ruby-grok#29.

Again, I'm not sure how to proceed from here as testing suite is FUBAR. Maybe switching to tests using logstash way is better here, ref: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/spec.

@whyscream
Copy link
Owner

Thanks for trying!
I looked into it myself too, and am coming to the same conclusion. I guess that the patterns with type coercion appended won't work reliable inside logstash either, so without changes to thet grok library this will never work.

For now I'm leaving the type coercion as-is, when the grok library is fixed, we can revisit this issue, which looks like a nice improvement overall.

@z0rc
Copy link
Author

z0rc commented Jul 5, 2017

Okay, I'm closing this for now and will track issue with jls-grok.

Will create a new PR when mentioned issue resolved.

@z0rc z0rc closed this Jul 5, 2017
@z0rc z0rc deleted the grok-match-convert branch July 5, 2017 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants