Tuesday, July 10, 2012

Generating Reports from Web Logs with AWStats

When you want to analyze web access patterns from access logs, AWStats (http://awstats.sourceforge.net) is a handy solution. In my case, I needed to collect summary data from Tomcat access log files in order to build realistic sample data for load testing.
Here's how to generate reports with AWStats from an access log file:

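1. Prerequisites

AWStats is written in Perl, so Perl must be installed; every command below is run through `perl'. HTMLDOC is optional and only needed if you want the PDF report in step 5.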

2. Install AWStats

After you extract the compressed AWStats distribution file, you will find the `awstats_configure.pl' script under the `tools' directory. Run that script to start the setup, as in the following example.
 
$ perl ./awstats_configure.pl

<SNIP>

Do you want to continue setup from this NON standard directory [yN] ? y

<SNIP>

-----> Need to create a new config file ?
Do you want me to build a new AWStats config/profile file (required if first install) [y/N] ? y

-----> Define config file name to create
What is the name of your web site or profile analysis ?
Example: www.mysite.com
Example: demo
Your web site, virtual server or profile name:
> demo


<SNIP>

Press ENTER to continue... 

<SNIP>


Press ENTER to finish...


In the above example, for simplicity, I set up AWStats only to generate reports offline from access log files, without installing it onto an Apache Web Server.
At the profile name prompt, I typed 'demo' for a demo analysis task.
The script then writes the configuration for the demo profile to the `../wwwroot/cgi-bin/awstats.demo.conf' file.

3. Setting the configuration file

Let's open and edit the configuration file for the 'demo' analysis task.
Assume you're going to analyze a Tomcat access log file, which is in Apache Common Log Format.
Here is the minimum you need to edit in the configuration file (e.g., `../wwwroot/cgi-bin/awstats.demo.conf'):

# <SNIP>

# Set the access log file path here
LogFile="/var/log/tomcat/access.log"

# <SNIP>

# Examples for Apache combined logs (following two examples are equivalent):
# LogFormat = 1
# <SNIP>
# For Apache Common Log Format (e.g., Tomcat access log), set it to 4.
LogFormat=4

# <SNIP>

# Set the data directory where AWStats internal data files are stored.
DirData="/var/log/data"

# <SNIP>


With the above configuration (named 'demo', as shown earlier), AWStats analyzes the log file set by the 'LogFile' directive and stores its internal data in the directory set by the 'DirData' directive. The 'DirData' directory must already exist and be writable by the user running AWStats.
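Because 'LogFormat=4' assumes Common Log Format, it can help to check the log file before handing it to AWStats. Here is a minimal sketch of such a check. The `count_non_clf_lines' function and its regular expression are hypothetical helpers written for this post, not part of AWStats, and the pattern is only a rough approximation of the format (host, ident, user, timestamp, request, status, bytes).

```shell
#!/bin/sh
# Count the lines of a log file that do NOT look like Common Log Format:
#   host ident user [timestamp] "request" status bytes
# Hypothetical helper: the pattern is an approximation for illustration,
# not the parser that AWStats itself uses.
count_non_clf_lines() {
    log_file=$1
    [ -r "$log_file" ] || { echo "cannot read: $log_file" >&2; return 2; }
    # grep -c exits with status 1 when the count is 0, so ignore that status.
    grep -c -v -E '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-)$' "$log_file" || true
}

# Usage, with the LogFile value from the configuration above:
#   count_non_clf_lines /var/log/tomcat/access.log
```

A result of 0 means every line matched the approximate pattern. A large number suggests the log is in a different format (for example, combined), in which case 'LogFormat' probably needs another value.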

4. Update Log Data

Now, you can run AWStats. Go to the `../wwwroot/cgi-bin/' directory and run the following command to update the data from the configured log file:

$ cd ../wwwroot/cgi-bin/
$ perl awstats.pl -config=demo -update

Create/Update database for config "./awstats.demo.conf" by AWStats version 7.0 (build 1.971)
From data in log file "/var/log/tomcat/access.log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 44217
 Found 0 dropped records,
 Found 0 comments,
 Found 0 blank records,
 Found 1 corrupted records,
 Found 0 old records,
 Found 44216 new qualified records.
 

With the above command, AWStats reads the new records from the configured log file and updates its internal data files.
If you want to discard the data and rebuild it from the log file, simply delete all the `*.txt' files in the data directory (configured by the 'DirData' directive above) and run `perl awstats.pl -config=demo -update` again.
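The reset described above can be wrapped in a small shell function. This is only a sketch: `reset_awstats_data' is a hypothetical helper name, not an AWStats tool, and it assumes that the `*.txt' files in the data directory are all AWStats data files that you are willing to lose.

```shell
#!/bin/sh
# Remove the AWStats data files (*.txt) from the given data directory.
# Hypothetical helper for illustration; it is not shipped with AWStats.
reset_awstats_data() {
    data_dir=$1
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    # Only *.txt files directly inside the directory are removed.
    rm -f -- "$data_dir"/*.txt
}

# Usage, with the DirData value from the configuration above:
#   reset_awstats_data /var/log/data
#   perl awstats.pl -config=demo -update
```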

5. Generate Reports

Finally, you can generate a report from the updated data with one of the following commands:

#
# First copy the awstats_buildstaticpages.pl script from the tools directory
# if it does not exist here.
#
$ cp ../../tools/awstats_buildstaticpages.pl ./

$ perl awstats_buildstaticpages.pl -config=demo -month=all -year=2012 -dir=/tmp -awstatsprog=./awstats.pl -buildpdf=/usr/bin/htmldoc

or

$ perl awstats_buildstaticpages.pl -config=demo -month=all -year=2012 -dir=/tmp -awstatsprog=./awstats.pl

Main HTML page is 'awstats.demo.html'.
PDF file is 'awstats.demo.pdf'.

$



Now the report is generated in /tmp: as HTML pages (the main page is `awstats.demo.html') and, if you used the `-buildpdf' option, also as /tmp/awstats.demo.pdf!

You can skip the `-buildpdf ...' option if you do not have HTMLDOC installed.
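If you script this step, the choice between the two commands above can be made automatically. The following is a rough sketch under that assumption: `report_command' is a hypothetical helper that prints the command line and adds `-buildpdf' only when the path passed to it is an executable file. The other options are copied from the commands above.

```shell
#!/bin/sh
# Print the awstats_buildstaticpages.pl command line for the 'demo' profile.
# Hypothetical helper: -buildpdf is appended only if the given HTMLDOC path
# points to an executable file.
report_command() {
    htmldoc_path=$1
    cmd="perl awstats_buildstaticpages.pl -config=demo -month=all -year=2012 -dir=/tmp -awstatsprog=./awstats.pl"
    if [ -n "$htmldoc_path" ] && [ -x "$htmldoc_path" ]; then
        cmd="$cmd -buildpdf=$htmldoc_path"
    fi
    printf '%s\n' "$cmd"
}

# Usage: print the command that fits this machine, then run it by hand.
#   report_command "$(command -v htmldoc)"
```

Printing the command instead of running it keeps the sketch safe to try; you can inspect the line first and then paste it into the shell.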

Open the PDF file or the main HTML page now. It contains nice reports!
