Good Morning,
The discussion about "RayStedman.org Bandwidth" inspired me to write a script that reports the largest bandwidth consumers by IP address and host name. The report looks like this:
  3,867,534,553   66.159.202.142 adsl-66-159-202-142.dslextreme.com.
  3,847,010,060    190.82.182.19 190-82-182-19.adsl.cust.tie.cl.
  1,410,308,739  130.160.110.250
  1,051,088,947    216.57.200.57
I'm sure this kind of thing has been done many times in the past by other tools. I thought I would post the script I created just in case it might be helpful to others on this forum.
Thanks again for your feedback on this topic.
Greg
#!/bin/bash
# big_bw -- written by Greg Sims 05/01/08
# this script takes as input apache httpd log files access_log and
# access_log.processed. a report is generated that contains one line
# per ip address with the following fields: bandwidth consumed,
# the ip address and the host name associated with the ip address.
#
# it is important to use mod_logio in the creation of the log files
# to ensure the proper number of bytes are recorded in each log
# entry. please see http://www.devside.net/guides/config/bytes-sent
# for how to accomplish this.
# directory where access_log and access_log.processed are located
#
basedir="/var/www/vhosts/raystedman.net/statistics/logs/"
# create bw.raw containing the ip address and bandwidth for each record;
# sort the resulting file by ip address
#
cd /tmp
cat $basedir"access_log" >bw.log
cat $basedir"access_log.processed" >>bw.log
cat bw.log | cut -d' ' --field=1,10 | sort >bw.raw
# read through bw.raw and create bw.sum which contains one line per
# ip address. each line in bw.sum contains the amount of bandwidth
# consumed and the ip address that used the bandwidth
#
thisip=""
rm -f bw.sum
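while read inputline; do
    ip=$(echo "$inputline" | cut -d " " -f 1)
    bw=$(echo "$inputline" | cut -d " " -f 2)
    # a "-" byte count means nothing was sent; count it as zero
    if [ "$bw" = "-" ]; then
        bw=0
    fi

    if [ "$thisip" != "$ip" ]; then
        # write the finished total, skipping the empty first pass
        if [ -n "$thisip" ]; then
            echo $thisipbw $thisip >>bw.sum
        fi
        thisip=$ip
        thisipbw=$bw
    else
        thisipbw=$(( $thisipbw + $bw ))
    fi
done < "bw.raw"

# write the total for the last ip address in bw.raw
if [ -n "$thisip" ]; then
    echo $thisipbw $thisip >>bw.sum
fi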
# sort bw.sum so the largest amount of bandwidth used is at the top.
# create bw.sum.sort which is the largest 35 consumers of bandwidth.
# write a report to stdout doing some formatting in the process.
#
sort -nr bw.sum | head -n 35 >bw.sum.sort
while read inputline; do
    bw=$(echo "$inputline" | cut -d " " -f 1)
    # insert thousands separators into the byte count
    bw=$(echo "$bw" | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')
    ip=$(echo "$inputline" | cut -d " " -f 2)

    # right-justify the bandwidth and ip address columns
    echo -n $bw | sed -e :a -e 's/^.\{1,14\}$/ &/;ta'
    echo -n " "
    echo -n $ip | sed -e :a -e 's/^.\{1,15\}$/ &/;ta'
    echo -n " "
    # reduce the output of host to just the host name (empty if none)
    host_name=$(host $ip | sed 's/^.*pointer //' | sed 's/.*DOMAIN)//')
    host_name=$(echo "$host_name" | sed 's/.*alias for //')
    echo $host_name
done <"bw.sum.sort"
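
For anyone adapting this: the cut, sort and summing steps above could probably be collapsed into a single awk pass. The following is only a rough sketch, not what big_bw does. It assumes the same log layout as the script (client address in field 1, byte count in field 10), and the function name sum_bw is made up for illustration.

```shell
#!/bin/bash
# Sketch only: sum_bw is a hypothetical helper, not part of big_bw.
# Reads access log lines on stdin and prints one "bytes ip" line per
# address, largest total first. Assumes field 1 is the client address
# and field 10 is the byte count, as in the script above.
sum_bw() {
    awk '
        # a "-" byte count means nothing was sent; count it as zero
        { bytes = ($10 == "-") ? 0 : $10; total[$1] += bytes }
        # %.0f keeps totals above 2^31 intact in awks with 32-bit %d
        END { for (ip in total) printf "%.0f %s\n", total[ip], ip }
    ' | sort -k1,1nr -k2,2
}
```

It could then be fed both log files, for example: cat "$basedir"access_log "$basedir"access_log.processed | sum_bw | head -n 35. The output has the same "bytes ip" layout as bw.sum.sort, so the formatting loop should not need to change.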