nate wrote:
There are several tools that will collect interface traffic data via SNMP and record it so you can graph it, show high/low/average values over a time span, etc. Cacti (in the epel repo) is probably the easiest to set up, and OpenNMS is probably the most complete. These could also get their data from a port on a managed switch or router if that makes it easier to show the connections you need to split out.
One thing to note for billing: bandwidth is often billed at the 95th percentile, and cacti is not good for that if you want accuracy.
Yes, sometimes you pay for a fixed pipe and sometimes you have burstable capacity where you pay for what you use. Cacti and OpenNMS both use a data storage format where the samples are stored at their full resolution for some time interval, then averaged into longer aggregates as they age to keep the file size down while still keeping a long history. If you use the rrd or jrobin tools to compute min/max/average/percent values over a time range, I believe they normalize the samples to the coarsest resolution that applies to any part of the range. That is, if you want samples as collected, you must restrict the range of the request to the time span before aggregation happens. The default in OpenNMS is 2 weeks - I'm not sure about cacti.
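To illustrate why the aggregation matters for billing, here is a minimal Python sketch of averaging consolidation. It is only a simplification of what the rrd/jrobin average consolidation does, not their actual code; the function name consolidate, the factor of 6, and the sample values are made up for the example.

```python
def consolidate(samples, factor):
    """Average each run of `factor` consecutive samples into one value.

    A simplified stand-in for average-style consolidation; real rrd
    files also handle unknown values and other consolidation functions.
    """
    result = []
    for i in range(0, len(samples), factor):
        chunk = samples[i:i + factor]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical 5-minute samples in Mbit/s with one short burst.
raw = [10, 10, 10, 10, 10, 100, 10, 10, 10, 10, 10, 10]

# Average every 6 samples (30 minutes) into one stored value.
aggregated = consolidate(raw, 6)

print(max(raw))         # peak as collected
print(max(aggregated))  # peak after averaging, the burst is flattened
```

The burst that was visible in the raw samples is smoothed away once the values are averaged, which is why a percentile computed over aggregated data can come out lower than what the ISP measures.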
We use RTG to measure our main pipes for billing comparison purposes (in my research last year, RTG seemed to be the tool most frequently mentioned as best for this purpose). Its numbers match what the ISPs say much more closely, and cacti is quite a bit off. I wouldn't rely on RTG for normal network monitoring (the UI isn't that good, etc.), but for links where billing information is important, at least for 95th percentile, don't rely on cacti alone.
Note RTG is not the same as MRTG, though I think I recall seeing RTG was inspired by MRTG.
Not sure how OpenNMS handles that sort of thing.
Pretty much the same, but it defaults to using a Java re-implementation of the rrd tools called jrobin. I don't think there is a way to show percentiles in the stock OpenNMS graphs, but the jrobin class has the low-level methods and the web site shows a way to do it in a Groovy (that's the language, not my impression) script.
Not to knock cacti, I use it extensively, currently have a server collecting more than 20 million points of data a day.
ISPs would likely use either a 30-day or actual month of 5-minute samples, computing a 95th percentile by discarding the highest 5 percent of the samples and picking the highest remaining value. To match that, you'd either have to adjust the rrd/jrobin storage formats to retain full-resolution samples that long or extract the individual sample values before they are aggregated and process them some other way.
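Once you have the individual samples extracted, the calculation itself is small. A minimal Python sketch, assuming the samples are already in a plain list and that the count to discard is rounded down (the function name percentile_95 is just for illustration, and a given ISP may round differently or bill on the higher of inbound and outbound):

```python
def percentile_95(samples):
    """Discard the highest 5 percent of samples, return the highest remaining."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Number of top samples to throw away, rounded down (an assumption).
    discard = len(ordered) * 5 // 100
    return ordered[len(ordered) - discard - 1]


# 30 days of 5-minute samples is 30 * 288 = 8640 values, so the top 432
# are discarded. Here a small made-up series stands in for real data.
example = list(range(1, 101))   # 100 hypothetical samples: 1..100
print(percentile_95(example))   # top 5 dropped, highest remaining is 95
```

This only gives the ISP's answer if the input is the full-resolution samples for the whole billing period, not the aggregated values.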
If anyone has an easier way, please let me know - I need to do this myself for a few connections.