Stats Collector

 

Stats Collector is a set of tools for gathering statistical values from remote nodes. It covers standard server metrics as well as any other numerical value that helps describe the state of a node (such as the number of running processes or users). It follows a distributed agent-manager model. RRDtool is used as the data backend, which also provides visualization capabilities.

Internals

The core is a UDP client/server written in Perl. The statistical values are generated on the client side using any available mechanism (usually shell and Perl scripts), and are stored and plotted on the server side.

Data collection is performed on the client by periodically executing a set of sampler scripts. The measured values are sent to the manager node by means of a simple protocol. The software installed on the manager side includes specific functions for each kind of metric and is in charge of storing the values in the proper RRD.
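The client-to-manager flow can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the project's Perl code: the handler names, the sample node name, the loopback address and the layout of the data fields are all hypothetical, and the real manager stores values in RRD files instead of returning dictionaries.

```python
import socket

# Hypothetical per-metric handlers. The real manager writes to an RRD;
# here each handler only returns the decoded values.
def handle_avgload(node, metric, data):
    return {"node": node, "metric": metric, "values": [float(v) for v in data]}

def handle_default(node, metric, data):
    # Unknown metrics keep their raw string fields.
    return {"node": node, "metric": metric, "values": data}

HANDLERS = {"avgload": handle_avgload}

def dispatch(message):
    """Split a pipe-delimited message and route it to the metric's handler."""
    node, metric, _time, _date, *data = message.split("|")
    handler = HANDLERS.get(metric, handle_default)
    return handler(node, metric, data)

def send_sample(message, address):
    """Client side: send one sample as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), address)

def receive_one(sock):
    """Manager side: read one datagram and hand it to the dispatcher."""
    payload, _peer = sock.recvfrom(4096)
    return dispatch(payload.decode("ascii"))

def demo():
    """Send one made-up sample over the loopback interface and decode it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.settimeout(2.0)
        send_sample("web01|avgload|05:30:14|21/03/2005|0.15|0.10|0.05",
                    server.getsockname())
        return receive_one(server)
```

The dispatch table keeps the per-metric logic in one place: supporting a new metric would mean adding one handler and one entry in `HANDLERS`.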

Stats Collector does not create graphs or web pages. That is an independent task, which can be performed with any capable software. In any case, the use of rrdUtils is highly recommended because the two tools are tightly integrated.

Available metrics

The available metrics are not very elaborate. They are closely tied to the usual Unix commands, whose output differs slightly among Unix flavours. Main development is undertaken on Solaris and Linux servers, so these are the best supported platforms. However, most of the sampler scripts are usable on a broader set of platforms, and they have been partially tested on FreeBSD, Digital Unix and IRIX systems.
It is also possible to collect more specialized metrics using any available command. For example, the pure-ftpwho command, which is part of the PureFTPd package, produces a text summary of the server activity that is easily integrated into a sampler script and sent to the collector station.

The sampler scripts from version 1.5 collect the values listed below:
Metric         OS     Command       Description
avgload        *      uptime        System load averages
disks          *      df -k         Hard disk usage
vmstat         1,2    vmstat        Virtual memory system utilization
netstat        1,2    netstat -an   Summary of TCP socket utilization
iostat         1,2    iostat        I/O statistics
process        1,2    top           Parameters of running processes (configurable)
netstat_-i     1,2    netstat -i    Network interface usage
ip, tcp, udp   1,2    netstat -s    SNMP-like parameters about network utilization

1 Solaris
2 Linux
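As an illustration of what a sampler for the avgload metric might do, the sketch below extracts the three load averages from the text printed by uptime. It is written in Python for brevity, while the project's own samplers are shell and Perl scripts; the function name and the sample output line are hypothetical, and the pattern assumes the common "load average:" (or "load averages:") wording.

```python
import re

# Matches both "load average:" and "load averages:", with or without
# commas between the three numbers.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")

def parse_load_averages(uptime_output):
    """Return the 1, 5 and 15 minute load averages as a tuple of floats."""
    match = LOAD_RE.search(uptime_output)
    if match is None:
        raise ValueError("no load averages found in uptime output")
    return tuple(float(group) for group in match.groups())

# Made-up sample line in the style of Linux uptime output.
sample = " 14:30:05 up 12 days,  3:02,  2 users,  load average: 0.15, 0.10, 0.05"
loads = parse_load_averages(sample)
```

Because the exact output of these commands differs between Unix flavours, a real sampler would likely need a pattern per platform.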

The communication protocol

The protocol used for the messages sent to the collector station is just a simple formatting of the message text. The fields are delimited with pipe characters ('|'), and the header requires four fields: the node name, the metric name, the time and the date (formatted as SS:MM:HH and DD/MM/YYYY, following Perl). The remaining fields carry the data in a free format, because each metric's value string is read on the server side by its own specific function.
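A message in this format could be built and read back along the following lines. This Python sketch makes some assumptions: the header order (node, metric, time, date) is taken from the order in which the text lists the fields, the data values are placed one per pipe-delimited field, and the function names and sample values are hypothetical.

```python
from datetime import datetime

def format_message(node, metric, values, when):
    """Build a pipe-delimited message with the four header fields first."""
    time_field = when.strftime("%S:%M:%H")   # SS:MM:HH, as described above
    date_field = when.strftime("%d/%m/%Y")   # DD/MM/YYYY
    fields = [node, metric, time_field, date_field] + [str(v) for v in values]
    return "|".join(fields)

def parse_message(message):
    """Split a message into its header and its free-format data fields."""
    fields = message.split("|")
    if len(fields) < 4:
        raise ValueError("a message needs at least four header fields")
    node, metric, time_field, date_field = fields[:4]
    when = datetime.strptime(f"{time_field} {date_field}", "%S:%M:%H %d/%m/%Y")
    # Data fields stay as strings: their meaning depends on the metric.
    return {"node": node, "metric": metric, "timestamp": when, "data": fields[4:]}

# Made-up sample: load averages measured at 14:30:05 on 21 March 2005.
message = format_message("web01", "avgload", [0.15, 0.1, 0.05],
                         datetime(2005, 3, 21, 14, 30, 5))
parsed = parse_message(message)
```

Leaving the data fields as raw strings mirrors the free-format design: only the metric-specific reader on the server side knows how to interpret them.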

Additional information
