How To Use Apache Scalp Log Analyzer to Catch Website Attacks

Requires Python!
Scalp Home
https://code.google.com/p/apache-scalp/
Download Scalp:
https://code.google.com/p/apache-scalp/downloads/detail?name=scalp-0.4.py
Backup Link:
http://www.wyzaerd.com/scalp/scalp-0.4.py
Original (Broken) XML Rules File:
https://dev.itratos.de/projects/php-ids/repository/raw/trunk/lib/IDS/default_filter.xml
Fixed XML Rules File:
http://www.wyzaerd.com/scalp/default_filter.xml
To Fix the XML file:
Replace:
(?:all|distinct|[(!@]*)? with (?:all|distinct|[(!@]+)?
and:
(?i:(\%SYSTEMROOT\%)) with (?:(\%[sS][yY][sS][tT][eE][mM][rR][oO][oO][tT]\%))
Examples:
./scalp-0.4.py -f ./default_filter.xml -o ./scalp-output -l /var/log/httpd_log --html
./scalp-0.4.py -f ./default_filter.xml -o . -l /var/www/cust1/logs/access.log.1440892800
Current options:
exhaustive: Does not stop at the first pattern matched, but tests all the patterns
tough: Decodes part of each potential attack vector (this makes better use of the PHP-IDS regexps and lowers the false-negative rate)
period: Restricts analysis to a specified time frame; everything outside it is ignored
sample: Randomly samples a given percentage of the log lines; useful when you don't want a full scan of the whole log, but just want to spot-check it for problems
attack: Specifies which classes of vulnerabilities the tool will look for (e.g., only XSS, only SQL injection, etc.)
shell# ./scalp-0.4.py --help
Scalp the apache log!
by Romain Gaucher - http://rgaucher.info
usage:  ./scalp.py [--log|-l log_file] [--filters|-f filter_file]
        [--period time-frame] [OPTIONS] [--attack a1,a2,..,an] [--sample|-s 4.2]
  --log        |-l:  the apache log file './access_log' by default
  --filters    |-f:  the filter file './default_filter.xml' by default
  --exhaustive |-e:  will report all type of attacks detected and not stop
                     at the first found
  --tough      |-u:  try to decode the potential attack vectors (may increase
                     the examination time)
  --period     |-p:  the period must be specified in the same format as in
                     the Apache logs using * as wild-card
                     ex: 04/Apr/2008:15:45;*/Mai/2008
                     if not specified at the end, the max or min are taken
  --html       |-h:  generate an HTML output
  --xml        |-x:  generate an XML output
  --text       |-t:  generate a simple text output (default)
  --except     |-c:  generate a file that contains the non examined logs due to
                     the main regular expression; ill-formed Apache log etc.
  --attack     |-a:  specify the list of attacks to look for
                     list: xss, sqli, csrf, dos, dt, spam, id, ref, lfi
                     the list of attacks should not contains spaces and comma separated
                     ex: xss,sqli,lfi,ref
  --output     |-o:  specifying the output directory; by default, scalp will try
                     to write in the same directory as the log file
  --sample     |-s:  use a random sample of the lines, the number (float in [0,100])
                     is the percentage, ex: --sample 0.1 for 1/1000
Automation
Here is a small Perl script as a wrapper around Scalp for when you have multiple VirtualHost entries on your web server, each with different log files (YMMV):
#!/usr/bin/perl
#
# Script name: ids

use strict;

my $webdir = '/var/www';
my $bindir = '/root/scalp';

my @out = `find $webdir -mtime -2 -name "*access*"`;
chomp(@out);

foreach my $path (@out) {
    print "$path\n";
    my @elements = split(/\//, $path);
    my $outdir = $elements[3];
    print `mkdir -p $bindir/$outdir` unless -d "$bindir/$outdir";
    print qx#$bindir/scalp-0.4.py -f $bindir/default_filter.xml -o "$bindir/$outdir" -l $path#;
}
Place scalp-0.4.py, default_filter.xml and the ids script into the $bindir directory you specified in the ids script.
root@myHost:/root/scalp # ./ids
root@myHost:/root/scalp # for i in `find . -name "*txt"`; do vim $i; done
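If you want the wrapper to run unattended instead of invoking ./ids by hand, a cron entry along these lines would work (the path assumes the /root/scalp $bindir from the script above; the nightly 03:15 schedule is an arbitrary example):

```
# m h dom mon dow  command — run the ids wrapper nightly and log its output
15 3 * * * /root/scalp/ids >> /root/scalp/ids-cron.log 2>&1
```

You can then review the per-vhost reports in the $bindir subdirectories whenever you like.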
Here is a very crude but effective non-Scalp hit counter I whipped up, which simply gets the raw count of requests per IP address. Remember that each page render can consist of many calls, so be sure to baseline a normal page load quantity before panicking. As always, YMMV…
If you provide an optional space-delimited list of one or more IP addresses to look for on the command line, the script will output a summary of counts per log file for those IP addresses.
#!/usr/bin/perl
#
# Script name: ipcnt

use strict;

our $days   = 2;
our $quiet  = 0;
our $webdir = '/volumes/data/www';

while (@ARGV[0] =~ /^-/) {
    $_ = shift;
    $quiet  = 1,     next if /^-q$/;
    $days   = shift, next if /^-d$/;
    $webdir = shift, next if /^-w$/;
}

die "Webdir $webdir not found" unless -d "$webdir";
$days = 2 unless $days;

my %data;
my @out = `find $webdir -mtime -$days -name "*access*"`;
chomp(@out);
my $c = scalar @out;
print "Processing $c files...\n";

foreach my $path (@out) {
    #print "Processing $path...\n";
    open(IN, "$path") or die;
    while (<IN>) {
        my $ip = (split)[0];
        $data{$ip}{'pathcounts'}{$path}++;
        $data{$ip}{'count'}++;
    }
}

unless ($quiet) {
    print "\n\nIP: Count\n\n";
    foreach my $ip (sort { $data{$a}{'count'} <=> $data{$b}{'count'} } keys %data) {
        print qq|$ip: $data{$ip}{count}\n|;
    }
}

if (@ARGV) {
    foreach my $findme (@ARGV) {
        print "\n\nSearch results for $findme\n\n";
        foreach my $path (sort keys %{ $data{$findme}{'pathcounts'} }) {
            print qq|$path: $data{$findme}{'pathcounts'}{$path}\n|;
        }
    }
}
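For a quick sanity check without Perl, the same raw per-IP count can be produced with a standard awk/sort/uniq pipeline. A sketch on a fabricated three-line sample log (point the pipeline at your real access log instead):

```shell
# Fabricated combined-format log sample for illustration
cat > sample_access.log <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /style.css HTTP/1.1" 200 128
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /app.js HTTP/1.1" 200 256
EOF

# Field 1 of a combined-format log line is the client IP:
# count occurrences per IP, then sort ascending like ipcnt does
awk '{print $1}' sample_access.log | sort | uniq -c | sort -n
```

The last line of the output is the busiest client (here 10.0.0.1 with two hits); the same baseline caveat about multiple calls per page render applies.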