statzlogger logs statz

statzlogger was inspired by Google’s Sawzall log processing language and a blog post. Sawzall is a great idea, but it implements a lot of features already available in a higher-level, dynamic language like Python. The logging.statistics proposal caused me to connect the dots: as is fairly evident if you read the language description, Sawzall is both a log processing and a log generating language. While its input processing capabilities are redundant from the Python perspective, Sawzall’s output aggregators would make it easy to track statistics during an application run. These aggregators are closely analogous to logging.Handler instances.

So, this is statzlogger: a set of custom handler implementations that let you aggregate statistics while your application runs. statzlogger takes advantage of logging’s thread management and propagation and borrows its interface. You can call logging.debug() with a few extra parameters in its extra keyword argument and statzlogger takes care of the rest. Use it to track the number of requests your webapp serves, to count the number of unique words in a file, or to track the slowest queries to your database.

Why ‘statzlogger’?

Because I wanted the name to shorten to ‘szl’ in homage to the source of statzlogger’s handler design.

Installing statzlogger

You can install the latest stable version of statzlogger using pip:

$ pip install statzlogger

Public repositories for the project are hosted at GitHub and Bitbucket, so you can use either git or Mercurial to get a copy of the project’s code and history:

$ hg clone http://bitbucket.org/wcmaier/statzlogger
$ git clone git://github.com/wcmaier/statzlogger.git

If you notice a problem with statzlogger, please report it using the GitHub issue tracker (or, if you have a fix, send a pull request).

A note about versions

statzlogger is developed along two branches. The first, ‘default’ (or ‘master’ in git) contains new features and possible bugs – this branch is the active development branch. The second, ‘stable’, contains releases both major and minor as well as bugfixes. If you’d like to help improve statzlogger, take a look at default/master. Otherwise, stick with stable.

Basic Usage

statzlogger is implemented as a set of handlers that are compatible with the standard logging module, so you can simply plug the desired handler into your application’s existing logging configuration. statzlogger doesn’t require a real logging configuration, though, so feel free to skip it.

To use statzlogger, create any number of regular logging.Logger instances:

import logging
import time
import statzlogger as szl

reqs = logging.getLogger("stats.requests")
reqs.setLevel(logging.DEBUG)  # without this, debug() calls are filtered out by the default WARNING level
reqs.addHandler(szl.Collection())

Note: it’s a good idea to cluster all of your statzlogger loggers in a single namespace. Then you can control their output via a single parent logger, turning them on or off as necessary.

To track requests, log a message on the reqs logger each time your application serves a request for a URL:

now = time.time()
hour = now - (now % 3600)
reqs.debug("/my/app?user=foo", extra=dict(index=hour))
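
The index arithmetic above rounds a timestamp down to the start of its hour, so every request served within the same hour lands under the same index. A small sketch with fixed timestamps makes this visible (the helper name hour_bucket is made up for this illustration and is not part of statzlogger):

```python
def hour_bucket(timestamp):
    """Round a Unix timestamp down to the start of its hour."""
    return timestamp - (timestamp % 3600)

# Two requests 20 minutes apart within the same hour share a bucket.
first = hour_bucket(7200 + 60)     # one minute past the second hour
second = hour_bucket(7200 + 1260)  # twenty-one minutes past the second hour
# A request five seconds into the next hour gets a new bucket.
third = hour_bucket(10800 + 5)
```

Here first and second are equal, while third differs, which is why the hour makes a convenient index for grouping requests.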

When your application is complete, you’ll find the requested URLs indexed by hour in the indices dictionary of the handler attached to the reqs logger:

import operator

# indices maps each hour to the values logged under that hour
data = reqs.handlers[0].indices.items()
for hour, requests in sorted(data, key=operator.itemgetter(0)):
    print("%d: %d requests" % (hour, len(requests)))
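
The earlier note about clustering loggers in one namespace can be sketched with the standard library alone. The example below assumes a parent logger named "stats" and uses a made-up CountingHandler in place of a real statzlogger handler; only the logging calls themselves are standard:

```python
import logging

class CountingHandler(logging.Handler):
    """Hypothetical stand-in for a statzlogger handler: counts records."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.count = 0

    def emit(self, record):
        self.count += 1

# The parent logger controls every logger below it in the namespace.
stats = logging.getLogger("stats")
stats.setLevel(logging.DEBUG)
stats.propagate = False  # keep statistics out of the root logger's output

reqs = logging.getLogger("stats.requests")
counter = CountingHandler()
reqs.addHandler(counter)

reqs.debug("/my/app?user=foo")    # counted: the inherited level is DEBUG
stats.setLevel(logging.CRITICAL)  # switch the whole namespace off
reqs.debug("/my/app?user=bar")    # dropped: the inherited level is now CRITICAL
```

Because reqs has no level of its own, it inherits the level of stats, so a single setLevel() call on the parent turns every statistics logger on or off.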

statzlogger provides a number of handlers to aggregate your data; see below for more information.

API

class statzlogger.StatzHandler(level=0)

Bases: logging.Handler

A basic handler to receive statistics in the form of LogRecords.

The StatzHandler knows how to index and aggregate LogRecords. Various subclasses may aggregate records differently, but they all maintain an indices attribute. indices is a dictionary with keys accumulated during a logging run; its values are the logged data aggregated by index.

These handlers rely on extra information supplied when a LogRecord is created (see getvalue()). Instantiation of a StatzHandler is no different from that of a normal Handler.
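
To make the description above concrete, here is a minimal sketch of a handler that groups values by index using only the standard logging module. It is not statzlogger’s implementation: the class name IndexedHandler and the choice to store values in plain lists are assumptions made for this illustration.

```python
import logging

class IndexedHandler(logging.Handler):
    """Hypothetical stand-in, not statzlogger's code: group values by index."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.indices = {}

    def emit(self, record):
        # Prefer an explicit value; fall back to the log message.
        value = getattr(record, "value", record.msg)
        # Records without an index are filed under None.
        index = getattr(record, "index", None)
        self.indices.setdefault(index, []).append(value)

log = logging.getLogger("sketch.indexed")
log.setLevel(logging.DEBUG)
log.propagate = False
handler = IndexedHandler()
log.addHandler(handler)

# Attributes passed via `extra` end up on the LogRecord.
log.debug("/a", extra={"index": 3600})
log.debug("/b", extra={"index": 3600})
log.debug("ignored text", extra={"index": 7200, "value": "/c"})
```

After these calls, handler.indices holds the two URLs logged under 3600 and the explicit value logged under 7200.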

emit(record)

Emit the record.

Typically, this means aggregating it in one of the handler’s indices (under indices).

emitvalue(value, index)

Emit a value for a single index.

getindices(record)

Return a list of indices for a given record.

The list of indices will either contain the record’s index attribute or a list generated from its iterable indices attribute. If both attributes are present, index will be added to indices. If no indices are defined, the resulting list will be [None].
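
The rules above can be sketched as a standalone function. This is a hypothetical reimplementation for illustration only; in particular, the order in which index is combined with indices is an assumption (here it is appended at the end).

```python
import logging

def sketch_getindices(record):
    """Hypothetical reimplementation of the rules described above."""
    indices = list(getattr(record, "indices", []))
    if hasattr(record, "index"):
        # Assumption: a lone `index` is appended after any `indices`.
        indices.append(record.index)
    # With neither attribute set, fall back to the single index None.
    return indices or [None]

# logging.makeLogRecord builds a LogRecord from a dict of attributes.
only_index = logging.makeLogRecord({"msg": "x", "index": "a"})
only_indices = logging.makeLogRecord({"msg": "x", "indices": ("a", "b")})
both = logging.makeLogRecord({"msg": "x", "index": "c", "indices": ["a", "b"]})
neither = logging.makeLogRecord({"msg": "x"})
```

Calling sketch_getindices() on each record covers the four cases: a single index, an iterable of indices, both together, and neither.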

getvalue(record)

Return the value of a LogRecord instance.

If record has a value attribute, use that. Otherwise, use its msg attribute. Note: LogRecords can be given (nearly) arbitrary attributes at creation time by passing the extra keyword argument to the logging method. For example:

>>> logging.debug("a message", extra={"value": "the real value"})

indices

A dictionary of indices.

emit() stores new record values after determining the appropriate index for a record (see getindices()).

class statzlogger.Sum(level=0, default=0, op=<built-in function add>)

Bases: statzlogger.StatzHandler

The arithmetic sum of the value of each record.

This doesn’t make sense for, e.g., string values, but the implementation won’t complain. Parameters:

  • default: starting value
  • op: operator used to add values together

class statzlogger.Collection(level=0, default=[], op=<built-in function add>)

Bases: statzlogger.Sum

A collection of record values.
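
Judging by the signatures above, Sum and Collection share one mechanism: each new value is combined with the running result through op, starting from default. The sketch below illustrates that fold with the standard library only. FoldSketch is a made-up name, and passing values as one-element lists in the second half is an assumption made so that list concatenation applies; statzlogger itself may prepare values differently.

```python
import logging
import operator

class FoldSketch(logging.Handler):
    """Hypothetical stand-in: fold each value into a per-index result."""

    def __init__(self, level=logging.NOTSET, default=0, op=operator.add):
        logging.Handler.__init__(self, level)
        self.default = default
        self.op = op
        self.indices = {}

    def emit(self, record):
        value = getattr(record, "value", record.msg)
        index = getattr(record, "index", None)
        # Start from `default` the first time an index is seen.
        current = self.indices.get(index, self.default)
        self.indices[index] = self.op(current, value)

log = logging.getLogger("sketch.fold")
log.setLevel(logging.DEBUG)
log.propagate = False

# Sum-like behaviour: default=0 and addition of numbers.
totals = FoldSketch()
log.addHandler(totals)
log.debug("served", extra={"index": "GET", "value": 512})
log.debug("served", extra={"index": "GET", "value": 1024})
log.debug("served", extra={"index": "POST", "value": 64})
log.removeHandler(totals)

# Collection-like behaviour: default=[] and list concatenation.
# Assumption: values arrive as one-element lists.
collected = FoldSketch(default=[])
log.addHandler(collected)
log.debug("served", extra={"index": "GET", "value": ["/a"]})
log.debug("served", extra={"index": "GET", "value": ["/b"]})
```

The same class yields per-index totals in the first case and per-index lists in the second; only default and the type of the values differ.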

class statzlogger.Maximum(level=0, size=None, weight=1, reverse=True)

Bases: statzlogger.Collection

Keep only the values with the highest weight.

In addition to the usual msg or value attributes, a LogRecord may set a weight attribute to influence the record’s place in the sorted collection. Parameters:

  • size: maximum size of each index
  • weight: default record weight
  • reverse: direction in which to sort the collection

class statzlogger.Minimum(level=0, size=None, weight=1, reverse=False)

Bases: statzlogger.Maximum

Keep only the values with the lowest weight.
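
Maximum and Minimum differ only in sort direction, which the following sketch illustrates with the standard library. It is an assumption-laden illustration rather than statzlogger’s code: WeightedSketch is a made-up name, and storing (weight, value) pairs and re-sorting on every record are choices made for brevity.

```python
import logging

class WeightedSketch(logging.Handler):
    """Hypothetical stand-in: keep `size` values per index, ordered by weight."""

    def __init__(self, level=logging.NOTSET, size=None, weight=1, reverse=True):
        logging.Handler.__init__(self, level)
        self.size = size
        self.weight = weight
        self.reverse = reverse
        self.indices = {}

    def emit(self, record):
        value = getattr(record, "value", record.msg)
        # Records without a weight attribute get the default weight.
        weight = getattr(record, "weight", self.weight)
        index = getattr(record, "index", None)
        kept = self.indices.setdefault(index, [])
        kept.append((weight, value))
        # Sort by weight only, then trim to the configured size.
        kept.sort(key=lambda pair: pair[0], reverse=self.reverse)
        if self.size is not None:
            del kept[self.size:]

log = logging.getLogger("sketch.weighted")
log.setLevel(logging.DEBUG)
log.propagate = False
slowest = WeightedSketch(size=2)                 # Maximum-like
fastest = WeightedSketch(size=1, reverse=False)  # Minimum-like
log.addHandler(slowest)
log.addHandler(fastest)

# The weight here is a query duration in seconds.
log.debug("SELECT a", extra={"weight": 0.2})
log.debug("SELECT b", extra={"weight": 1.5})
log.debug("SELECT c", extra={"weight": 0.9})
```

With these three records, slowest keeps the two longest queries and fastest keeps the single shortest one, all under the index None because no index was supplied.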

class statzlogger.Set(level=0, default=set([]), size=None, op=<method 'union' of 'set' objects>)

Bases: statzlogger.Collection

A collection of unique items.

If any index grows beyond size members, the entire index is removed.
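
The size rule can be sketched as follows, again with the standard library only. SetSketch is a made-up name, and the reading that “removed” means the key is deleted outright (so a later record could start the index afresh) is an assumption.

```python
import logging

class SetSketch(logging.Handler):
    """Hypothetical stand-in: unique values per index with a size cap."""

    def __init__(self, level=logging.NOTSET, size=None):
        logging.Handler.__init__(self, level)
        self.size = size
        self.indices = {}

    def emit(self, record):
        value = getattr(record, "value", record.msg)
        index = getattr(record, "index", None)
        members = self.indices.setdefault(index, set())
        members.add(value)
        if self.size is not None and len(members) > self.size:
            # Assumption: "removed" means the key is dropped outright.
            del self.indices[index]

log = logging.getLogger("sketch.set")
log.setLevel(logging.DEBUG)
log.propagate = False
words = SetSketch(size=3)
log.addHandler(words)

# Three unique words: the index stays within the cap.
for word in "the cat saw the cat".split():
    log.debug(word, extra={"index": "small"})

# A fourth unique word pushes this index past the cap, so it is dropped.
for word in "one two three four".split():
    log.debug(word, extra={"index": "big"})
```

Only the "small" index survives, holding its three unique words; the "big" index disappears as soon as its fourth member arrives.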
