We have a httpd instance that uses Virtual Hosts to serve content on various domains. What we are lacking is some sort of (near) real-time tool for showing the shape of our traffic.
We can see the output of the server-status page, but I'd like a little more than that:
- Traffic numbers by virtual host, to see which ones are busy.
- Traffic numbers by client IP, to detect and allow us to block basic DoS / over-enthusiastic crawlers on occasion.
- Persist and graph this data, so that we can observe trends.
So there are at least two requirements there: a planning / projection aspect, and a dashboard 'WTF is happening at the moment?' view.
I haven't been able to find anything that does this out of the box, but I can't believe that I'm the first person to want this sort of thing.
-
I often use munin for stuff like this, and there is an apache plugin. However, it will not break down the traffic per virtual host. I've seen solutions that use apache mod_watch, but that package is pretty old and doesn't seem to be well maintained.
From rorr -
I'd recommend shipping your logs off to a splunk instance for analysis. It isn't real time, but I believe it can be pretty darn close. The free version will analyze up to 500MB of logfiles each day, which is enough for a pretty busy website.
From Tim Howland -
I use chartbeat.com to see realtime stats such as number of visitors, etc. I'm a customer, I don't work for them. You drop in their javascript code similar to how you drop in google analytics.
jabley : Unfortunately I can't - this is mobile and Javascript doesn't work / isn't enabled a lot of the time.
From dar -
Webalizer is a very good analysis tool that works on the apache logs - it will give you a post-mortem per virtual host with client IPs and a lot of other useful information. It is very much not real time though - you're supposed to run it daily on your logs (using cron or something).
As the real time logs can be very useful for the stuff that you need, you can pipe them to a database or some real time log analyzer and do the analysis yourself - but I'm not familiar with a specific software solution that does this and writing such a thing would take some serious development.
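As a rough idea of what "doing the analysis yourself" could look like, here is a minimal Python sketch that tallies requests per virtual host and per client IP from log lines. It assumes a log format where the virtual host and client IP are the first two fields (for example a LogFormat starting with `%v %h`); that format, and the names `tally` and `top`, are illustrative assumptions and not something from the posts above.

```python
from collections import Counter


def tally(lines):
    """Count requests per virtual host and per client IP.

    Assumes each line starts with "<vhost> <client-ip> ...", e.g. a
    LogFormat beginning with "%v %h" (an assumption for this sketch).
    """
    by_vhost, by_ip = Counter(), Counter()
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        vhost, ip = fields[0], fields[1]
        by_vhost[vhost] += 1
        by_ip[ip] += 1
    return by_vhost, by_ip


def top(counter, n=10):
    """Return the n busiest keys as (key, count) pairs, busiest first."""
    return counter.most_common(n)
```

Something like this could be fed from a tailed log file or from apache's piped logging, with the counters flushed to whatever graphing tool is in use every minute or so. It only covers the counting; persistence and graphing are left out.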
As external solutions go, I recommend using ntop, which is a realtime network traffic analyzer. It has tons of features so it can take some time to figure out how to work this thing, but it does do full HTTP protocol analysis so it can show you which virtual hosts people are using to hit your site - both in (near) real time and with history.
From Guss -
I think apachetop may be something you can use to satisfy the first two points:
http://www.webta.org/projects/apachetop/
Personally, instead of using that, I wrote something that just scrapes the apache status page (you'd have to enable mod_status), something that's easily replicable with an hour or three of scripting. The last point is likely best done through log analysis, rather than through polling the apache status page repeatedly.
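For illustration, a scraper along these lines might look like the Python sketch below. It is not the script described above; it assumes mod_status is enabled and reachable at `http://localhost/server-status` (a hypothetical URL for this example) and reads the machine-readable `?auto` variant, which returns plain `Key: value` lines.

```python
from urllib.request import urlopen


def parse_status(text):
    """Turn the "Key: value" lines of mod_status ?auto output into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # ignore lines without a "Key: value" shape
            stats[key.strip()] = value.strip()
    return stats


def fetch_status(url="http://localhost/server-status?auto", timeout=5):
    """Fetch and parse the status page.

    The default URL is an assumption for this sketch. urlopen raises
    urllib.error.HTTPError on an error response such as a 503, so the
    caller should expect that when the server is struggling.
    """
    with urlopen(url, timeout=timeout) as resp:
        return parse_status(resp.read().decode("ascii", "replace"))
```

Note that the `?auto` output carries server-wide totals only. A per-virtual-host breakdown would mean parsing the full HTML status page with ExtendedStatus on, which is more fragile.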
jabley : Funnily enough I did write something like that already. The downside with that is that when things start to fail, the apache status page tends to return a 503. My thing integrates with ganglia, to give us graphs. But maybe SNMP would be a better choice? Thanks for the apachetop suggestion - that almost seems like Good Enough.
From epic9x