According to the documentation of the
split function, you can limit the number of fields into which a string is split. From
http://perldoc.perl.org/functions/split.html :
Quote:
split /PATTERN/,EXPR,LIMIT
[snip]
If LIMIT is specified and positive, it represents the maximum number of fields into which the EXPR may be split; in other words, LIMIT is one greater than the maximum number of times EXPR may be split.
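To make the effect of LIMIT concrete, here is a minimal sketch. The input string 'a b c d e' is made up for illustration and is not pflog output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input only, not taken from a real log.
my $line = 'a b c d e';

# No LIMIT: every single space separates a field.
my @all = split / /, $line;

# LIMIT of 3: at most three fields; the third keeps the unsplit remainder.
my @limited = split / /, $line, 3;

print scalar(@all), " fields: ", join('|', @all), "\n";
print scalar(@limited), " fields: ", join('|', @limited), "\n";
```

With a LIMIT of 3 the string is split only twice, so the last field still contains its spaces. That is the property that keeps the free-form header info in one piece.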
Looking at a sample line:
Code:
# date(1)    time(2)         ty source(4)     destination(5) header info (6)
# 2014-01-31 16:14:30.938665 IP 80.25.124.114 > 1.2.22.222: ICMP echo request, id 0, seq 0, length 64
  1          2               3  4             5 6           7
So in this case we need a limit of 7. By throwing away field nr 5, the '>', we get a one-stage
split approach:
Code:
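sub read_simple {
    my ($date, $time, $type, $source, $direction, $dest, $info);
    while (<DATA>) {
        chomp;    # drop the trailing newline so it does not end up in $info
        # LIMIT 7: at most six splits, the rest of the line stays in $info
        ($date, $time, $type, $source, $direction, $dest, $info) = split( / /, $_, 7);
        # $direction holds the '>' and is not passed on
        show_raw($date, $time, $type, $source, $dest, $info);
        #export( $date, $time, $type, $source, $dest, $info);
    }
}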
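The discarded '>' does not strictly need a variable of its own. As a sketch of an alternative, Perl accepts undef as a placeholder in a list assignment. The helper name parse_line is hypothetical and only used here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_line is a hypothetical helper name used only for this sketch.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    # undef in the fifth position silently drops the '>' separator.
    my ($date, $time, $type, $source, undef, $dest, $info) = split / /, $line, 7;
    return ($date, $time, $type, $source, $dest, $info);
}

my @fields = parse_line(
    "2014-01-31 16:14:30.938665 IP 80.25.124.114 > 1.2.22.222: ICMP echo request, id 0, seq 0, length 64\n"
);
print join("\n", @fields), "\n";
```

This returns the same six values that show_raw expects, without a $direction variable that is never read.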
A complete version:
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $NULL = '\N'; # MySQL null value
# date(1) time(2) ty source(4) destination(5) header info (6)
# 2014-01-31 16:14:30.938665 IP 80.25.124.114 > 1.2.22.222: ICMP echo request, id 0, seq 0, length 64
# 2014-01-31 16:16:35.262293 IP 180.188.194.9.55459 > 1.2.22.222.25: Flags [S], seq 1118489574, win 14600, options [mss 1460,sackOK,TS[|tcp]>
sub show_raw {
my ($date, $time, $type, $source, $dest, $info) = @_;
print STDERR <<END;
Date : $date
Time : $time
Type : $type
Source IP : $source
Destination IP : $dest
Info : $info
=============================
END
}
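sub read_simple {
    my ($date, $time, $type, $source, $direction, $dest, $info);
    while (<DATA>) {
        chomp;    # drop the trailing newline so it does not end up in $info
        # LIMIT 7: at most six splits, the rest of the line stays in $info
        ($date, $time, $type, $source, $direction, $dest, $info) = split( / /, $_, 7);
        # $direction holds the '>' and is not passed on
        show_raw($date, $time, $type, $source, $dest, $info);
        #export( $date, $time, $type, $source, $dest, $info);
    }
}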
read_simple();
__END__
2014-01-31 16:14:30.938665 IP 80.25.124.114 > 1.2.22.222: ICMP echo request, id 0, seq 0, length 64
2014-01-31 16:16:35.262293 IP 180.188.194.9.55459 > 1.2.22.222.25: Flags [S], seq 1118489574, win 14600, options [mss 1460,sackOK,TS[|tcp]>
2014-01-31 16:16:38.260924 IP 180.188.194.9.55459 > 1.2.22.222.25: Flags [S], seq 1118489574, win 14600, options [mss 1460,sackOK,TS[|tcp]>
The output:
Code:
$ simpler-short.pl 2>&1| less
Date : 2014-01-31
Time : 16:14:30.938665
Type : IP
Source IP : 80.25.124.114
Destination IP : 1.2.22.222:
Info : ICMP echo request, id 0, seq 0, length 64
=============================
Date : 2014-01-31
Time : 16:16:35.262293
Type : IP
Source IP : 180.188.194.9.55459
Destination IP : 1.2.22.222.25:
Info : Flags [S], seq 1118489574, win 14600, options [mss 1460,sackOK,TS[|tcp]>
=============================
Conclusion: a regular expression approach to
pflog file parsing is not needed at all, at least not for what I was trying to accomplish.
Why did I think I had to go that way?