Make SDATA cumulative and improve performance
Each `SDATA` extractor will add to the SDATA hash, instead of having the
"last" collector win.  This shouldn't affect much, but would allow for
logs containing JSON and lazy K/V pairs to expose both via SDATA.

The order of the SDATA collectors determines precedence. The highest
priority is RFC5424 Structured Data, followed by JSON, followed by
"lazy" K/V data.
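
As a rough illustration of the precedence rule described above, here is a minimal sketch in plain Perl. The `merge_sdata` helper and the sample hashes are hypothetical and are not taken from Parse::Syslog::Line; the sketch assumes a shallow, key-by-key merge, which may differ from what the library actually does.

```perl
use v5.16;
use warnings;

# Hypothetical helper (not part of Parse::Syslog::Line): combine the
# results of several collectors so that every collector contributes,
# and higher-priority collectors win when keys collide.
sub merge_sdata {
    my @by_priority = @_;    # highest priority first
    my %sdata;
    # Walk from lowest to highest priority so that later assignments
    # (higher priority) overwrite earlier ones.
    for my $found ( reverse @by_priority ) {
        @sdata{ keys %$found } = values %$found;
    }
    return \%sdata;
}

# Made-up collector output for a single log line.
my $rfc5424 = { user => 'alice' };
my $json    = { user => 'bob', status => 200 };
my $kv      = { status => 500, elapsed => '12ms' };

my $sdata = merge_sdata( $rfc5424, $json, $kv );
say join ', ', map { "$_=$sdata->{$_}" } sort keys %$sdata;
```

Under these assumptions all three sources contribute a key, and conflicts on `user` and `status` resolve toward the higher-priority source.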

Also:
* Bump everything to `use v5.16` as that's the minimum Perl version
* Add KV and JSON detection to the benchmark script for flame graphs on those parts
* Fix cpanfile generation
* Fix GitHub Actions
* Add `normalize_test_result()` function for handling elements in the
  results that might change based on the environment
* Remove named captures. Named captures do make things more readable,
  but they are dramatically slower. Removing these from the library has
  increased the parse speed significantly.
* Add `psl_enable_sdata()` function to enable full structured
  data parse mode.
* Don't create the empty hash as the copy takes more time than not
  reallocating it. Adjust tests to remove the empty keys.
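
To make the named-capture item above concrete, here is a small sketch of the same match written both ways. The regular expression and the sample line are illustrative assumptions, not the patterns used in the library; the sketch only shows that the two forms extract the same values and makes no claim about the size of the speed difference.

```perl
use v5.16;
use warnings;

# Illustrative input, not taken from the test suite.
my $line = 'sshd[1234]: Accepted password';

# Named captures: self-documenting, but results are read back via %+.
my %named;
if ( $line =~ /^(?<program>[^\[\s:]+)(?:\[(?<pid>\d+)\])?:/ ) {
    %named = ( program => $+{program}, pid => $+{pid} );
}

# Positional captures: the same pattern, with the groups assigned
# directly from the list returned by the match.
my %positional;
if ( my ( $program, $pid ) = $line =~ /^([^\[\s:]+)(?:\[(\d+)\])?:/ ) {
    %positional = ( program => $program, pid => $pid );
}

say "$named{program} $named{pid}";
say "$positional{program} $positional{pid}";
```

The core `Benchmark` module's `cmpthese` is one way to time the two variants on a given Perl build.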
reyjrar committed Jan 12, 2025
1 parent 43f3f79 commit 81cdc83
Showing 52 changed files with 237 additions and 235 deletions.
7 changes: 2 additions & 5 deletions .github/workflows/test.yml
@@ -1,12 +1,9 @@
# This is a basic workflow to help you get started with Actions

name: Test Matrix

# Controls when the workflow will run
on:
# Triggers the workflow on push or pull request events but only for the "master" branch
push:
branches: [ "master" ]
branches: [ "master", "develop" ]
pull_request:
branches: [ "master" ]

@@ -20,7 +17,7 @@ jobs:
strategy:
matrix:
os: ['ubuntu-latest']
perl: [ '5.40', '5.38', '5.36', '5.34', '5.32', '5.30', '5.28', '5.26', '5.24', '5.22', '5.20', '5.18', '5.16', '5.14' ]
perl: [ '5.40', '5.38', '5.36', '5.34', '5.32', '5.30', '5.28', '5.26', '5.24', '5.22', '5.20', '5.18', '5.16' ]
name: Perl ${{ matrix.perl }} on ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
1 change: 1 addition & 0 deletions .gitignore
@@ -12,3 +12,4 @@ Makefile.old
.cvsignore
pm_to_blib
MYMETA.*
nytprof*
14 changes: 5 additions & 9 deletions README.mkdn
@@ -4,7 +4,7 @@ Parse::Syslog::Line - Simple syslog line parser

# VERSION

version 6.0
version 6.1

# SYNOPSIS

@@ -251,14 +251,6 @@ The changes to the API and fields returned are as follows:
- `NormalizeToUTC` is **deprecated**, every log now returns `datetime_utc`
- `OutputTimeZone` is **deprecated**, use `TimeMomentFormatString`
- **Field Changes**
- `date_raw`

**Removed** from the fields returned, use `datetime_raw`.

- `datetime_obj`

**Removed** from the fields returned as we dropped support for [DateTime](https://metacpan.org/pod/DateTime).

- `datetime_utc`

Present in every document, use this for portability.
@@ -455,6 +447,10 @@ whitespace will be assumed to be a continuation of the previous line.

It is not exported by default.

## psl\_enable\_sdata

Call this to turn on all the Structured Data Parsing Options

## preamble\_priority

Takes the Integer portion of the syslog messsage and returns
5 changes: 3 additions & 2 deletions benchmarks/00-defaults.pl
@@ -1,10 +1,11 @@
#!perl

use v5.14;
use v5.16;
use warnings;
use Dumbbench;
use Const::Fast;
use Parse::Syslog::Line;
psl_enable_sdata();

use FindBin;
use lib "$FindBin::Bin/../t/lib";
@@ -21,7 +22,7 @@
my ($test) = @_;
@copy = @msgs unless @copy and $last ne $test;
$last=$test;
parse_syslog_line(shift @copy);
my $doc = parse_syslog_line(shift @copy);
};

my $bench = Dumbbench->new(
2 changes: 1 addition & 1 deletion benchmarks/01-parse.pl
@@ -1,6 +1,6 @@
#!perl

use v5.14;
use v5.16;
use warnings;
use Benchmark qw/timethese cmpthese/;
use Const::Fast;
2 changes: 1 addition & 1 deletion benchmarks/02-benchsmarter.pl
@@ -1,6 +1,6 @@
#!perl

use v5.14;
use v5.16;
use warnings;
use Const::Fast;
use Dumbbench;
2 changes: 1 addition & 1 deletion benchmarks/xt-date-parsing-iso.pl
@@ -1,6 +1,6 @@
#!/usr/bin/perl
#
use v5.14;
use v5.16;
use warnings;

use Benchmark qw(cmpthese);
2 changes: 1 addition & 1 deletion benchmarks/xt-date-parsing-legacy.pl
@@ -1,6 +1,6 @@
#!/usr/bin/perl
#
use v5.14;
use v5.16;
use warnings;

use Benchmark qw(cmpthese);
6 changes: 3 additions & 3 deletions bin/parse-syslog-line.pl
@@ -1,14 +1,14 @@
#!perl
# PODNAME: parse-syslog-line.pl
# ABSTRACT: Parse a syslog message and display the structured data
use v5.14;
use v5.16;
use warnings;

use Data::Printer;
use Getopt::Long::Descriptive;
use Pod::Usage;
use JSON::MaybeXS;
use Parse::Syslog::Line qw( parse_syslog_line );
use Parse::Syslog::Line;
use YAML::XS;

my $enc;
@@ -94,9 +94,9 @@ =head1 EXAMPLES
=cut

# Configure
psl_enable_sdata() if $opt->sdata;
$Parse::Syslog::Line::PruneEmpty = !$opt->empty;
$Parse::Syslog::Line::PruneRaw = !$opt->raw;
$Parse::Syslog::Line::AutoDetectJSON = $Parse::Syslog::Line::AutoDetectKeyValues = $opt->sdata;
$Parse::Syslog::Line::RFC5424StructuredData = !$opt->disable_rfc_sdata;
$Parse::Syslog::Line::RFC5424StructuredDataStrict = $opt->strict_rfc_sdata;

8 changes: 4 additions & 4 deletions cpanfile
@@ -7,6 +7,7 @@ requires "Data::Printer" => "0";
requires "English" => "0";
requires "Exporter" => "0";
requires "Getopt::Long::Descriptive" => "0";
requires "Hash::Merge::Simple" => "0";
requires "JSON::MaybeXS" => "0";
requires "Module::Load" => "0";
requires "Module::Loaded" => "0";
@@ -15,7 +16,7 @@ requires "Pod::Usage" => "0";
requires "Ref::Util" => "0";
requires "Time::Moment" => "0";
requires "YAML::XS" => "0";
requires "perl" => "5.014";
requires "perl" => "v5.16.0";
requires "warnings" => "0";
recommends "Cpanel::JSON::XS" => "0";

@@ -34,16 +35,15 @@ on 'test' => sub {
requires "Test::Deep" => "0";
requires "Test::MockTime" => "0";
requires "Test::More" => "0";
requires "YAML" => "0";
requires "bignum" => "0";
requires "lib" => "0";
requires "perl" => "5.014";
requires "perl" => "v5.16.0";
requires "strict" => "0";
};

on 'configure' => sub {
requires "ExtUtils::MakeMaker" => "0";
requires "perl" => "v5.14.0";
requires "perl" => "v5.16.0";
};

on 'develop' => sub {
1 change: 1 addition & 0 deletions dist.ini
@@ -12,6 +12,7 @@ exclude_filename = cpanfile
exclude_filename = dist.ini
exclude_filename = weaver.ini
exclude_filename = weaver-ci.ini
exclude_filename = cpanfile
[ExecDir]
dir = bin
[PruneCruft]