
Less Ziggy, more BlueOx
rhettg committed Apr 10, 2012
1 parent 584a834 commit f1b87bf
Showing 23 changed files with 190 additions and 190 deletions.
8 changes: 7 additions & 1 deletion CHANGES
@@ -1,5 +1,11 @@
blueox (0.2.0)

* Rename so as to have a unique pypi name.

-- Rhett Garber <[email protected]> Tue, 10 Apr 2012 14:10:00 -0700

ziggy (0.1.0)

* Initial Release

-- Rhett Garber <[email protected]> Thu, 22 Aug 2012 15:53:00 -0700
-- Rhett Garber <[email protected]> Thu, 22 Mar 2012 15:53:00 -0700
2 changes: 1 addition & 1 deletion Makefile
@@ -1,6 +1,6 @@
.PHONY: all pep8 pyflakes clean dev

PYTHON=/usr/bin/python
PYTHON=python
GITIGNORES=$(shell cat .gitignore |tr "\\n" ",")

all: pep8
10 changes: 5 additions & 5 deletions README
@@ -1,22 +1,22 @@
Ziggy - A python application logging framework
BlueOx - A python application logging framework

Ziggy is a python based logging and data collection framework. The problem it
BlueOx is a python based logging and data collection framework. The problem it
attempts to solve is one where you have multiple python processes across
multiple hosts processing some sort of requests. You generally want to collect:

* Performance data (counters, timers, etc)
* User activity
* Errors (and debugging data)

Use ziggy to record that data, aggregate it to a central logging server where
Use BlueOx to record that data, aggregate it to a central logging server where
it can be written to disk.

In addition, it's often useful to be able to plug reporting scripts into the
logging server so as to generate live stats and reports or do ad hoc analysis.

Ziggy's collection functionality is fairly advanced, allowing hierarchies of
BlueOx's collection functionality is fairly advanced, allowing hierarchies of
collectors and queuing in the event of failure. For example, it's recommended to
run an instance of `ziggyd` on each host, and then configure each of those
run an instance of `oxd` on each host, and then configure each of those
collectors to forward log events to a central collector on a dedicated log
machine.

108 changes: 55 additions & 53 deletions README.md
@@ -1,58 +1,60 @@
Ziggy - A python application logging framework
BlueOx - A python application logging framework
=========================

Ziggy is a python based logging and data collection framework. The problem it
BlueOx is a python based logging and data collection framework. The problem it
attempts to solve is one where you have multiple python processes across
multiple hosts processing some sort of requests. You generally want to collect:

* Performance data (counters, timers, etc)
* User activity
* Errors (and debugging data)

Use ziggy to record that data, aggregate it to a central logging server where
Use BlueOx to record that data, aggregate it to a central logging server where
it can be written to disk.

In addition, it's often useful to be able to plug reporting scripts into the
logging server so as to generate live stats and reports or do ad hoc analysis.

Ziggy's collection functionality is fairly advanced, allowing hierarchies of
BlueOx's collection functionality is fairly advanced, allowing hierarchies of
collectors and queuing in the event of failure. For example, it's recommended to
run an instance of `ziggyd` on each host, and then configure each of those
run an instance of `oxd` on each host, and then configure each of those
collectors to forward log events to a central collector on a dedicated log
machine.

BlueOx is named after Paul Bunyan's Blue Ox "Babe". A great help for giant logging problems.

Installation
----------------

Ziggy requires Python 2.7, ZeroMQ and BSON.
BlueOx requires Python 2.7, ZeroMQ and BSON.

The full Python library requirements are given in `requirements.txt` and are designed to be used with virtualenv.

Tornado is not required for operation of ziggy, but development and running tests will likely require it.
Tornado is not required for operation of BlueOx, but development and running tests will likely require it.

I expect Debian packaging will be developed soon.

Application Integration
-----------------

Applications emit ziggy events by using a context manager and globally accessible ziggy functions.
Applications emit BlueOx events by using a context manager and globally accessible BlueOx functions.

Events have a type, which indicates what will be ultimately logged together.

Events also have an id that can be used to tie them together with other related events.

For example, in a web application, an application might choose to use ziggy as follows:
For example, in a web application, an application might choose to use BlueOx as follows:


def handle(request):
with ziggy.Context('request'):
ziggy.set('user_agent', self.headers['UserAgent'])
with blueox.Context('request'):
blueox.set('user_agent', self.headers['UserAgent'])

with ziggy.timeit('request_time'):
with blueox.timeit('request_time'):
do_stuff()

ziggy.set('response.status', self.response.status)
ziggy.set('response.size', len(self.response.body))
blueox.set('response.status', self.response.status)
blueox.set('response.size', len(self.response.body))


The above sample would generate one event that contains all the details about a
@@ -65,24 +67,24 @@ Indicate you want this behavior for your context by naming it with a '.' prefix.
For example, inside some application code (in `do_stuff()` above), you might execute some sql queries.

def execute(cursor, query, args):
with ziggy.Context('.sql'):
ziggy.set('query', query)
with ziggy.timeit('query_time'):
with blueox.Context('.sql'):
blueox.set('query', query)
with blueox.timeit('query_time'):
res = cursor.execute(query, args)
ziggy.set('row_count', len(res))
blueox.set('row_count', len(res))
return res

Each SQL query would then be logged as a separate event. However, each event
will have the unique id provided by the parent `request` context. The name of the context will become `request.sql`.
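As a toy illustration (assumed semantics for this sketch, not BlueOx's actual internals), the '.'-prefix naming could be resolved like this:

```python
# A hypothetical sketch of how a '.'-prefixed context name is joined
# to the enclosing context's name, as described above.
def resolve_name(parent_name, name):
    """A name starting with '.' is appended to the parent's name."""
    if name.startswith('.') and parent_name:
        return parent_name + name
    return name

print(resolve_name('request', '.sql'))  # request.sql
print(resolve_name(None, 'request'))    # request
```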

You can provide your own id or allow ziggy to autogenerate one for the top-level context.
You can provide your own id or allow BlueOx to autogenerate one for the top-level context.

Ziggy also provides the ability to do sampling. This means only a set
BlueOx also provides the ability to do sampling. This means only a set
percentage of generated events will actually be logged. You can choose sampling
based on any level of the context:

with ziggy.Context('.memcache', sample=('..', 0.25)):
ziggy.set('key', key)
with blueox.Context('.memcache', sample=('..', 0.25)):
blueox.set('key', key)
client.set(key, value)

In the above example, only 25% of requests will include the memcache data. If
@@ -91,20 +93,20 @@ events would be logged.
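A quick sketch of what the sampling rate means in practice (a toy model for illustration, not the library's actual mechanism):

```python
import random

# A toy model of sample=('..', 0.25): each request independently keeps
# the sampled context's events with probability 0.25.
random.seed(0)  # deterministic for the demo

def should_sample(rate):
    return random.random() < rate

kept = sum(should_sample(0.25) for _ in range(10000))
print(kept)  # close to 2500 of the 10000 trials
```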

### Configuration

If ziggy has not been explicitly configured, all the calls to ziggy will essentially be no-ops. This is
If BlueOx has not been explicitly configured, all the calls to BlueOx will essentially be no-ops. This is
rather useful in testing contexts so as to not generate a bunch of bogus data.

For production use, you'll need to set the collection host and port:

ziggy.configure("127.0.0.1", 3514)
blueox.configure("127.0.0.1", 3514)

### Logging module integration

Ziggy comes with a log handler that can be added to your `logging` module setup for easy integration into existing logging setups.
BlueOx comes with a log handler that can be added to your `logging` module setup for easy integration into existing logging setups.

For example:

handler = ziggy.LogHandler()
handler = blueox.LogHandler()
handler.setLevel(logging.INFO)
logging.getLogger('').addHandler(handler)

@@ -113,29 +115,29 @@ configured by passing a type_name to the `LogHandler`

### Tornado Integration

Ziggy comes out of the box with support for the Tornado web server. This is
particularly challenging since one of the goals for ziggy is to, like the
BlueOx comes out of the box with support for the Tornado web server. This is
particularly challenging since one of the goals for BlueOx is to, like the
logging module, have globally accessible contexts so you don't have to pass
anything around to have access to all the hierarchical goodness.

Since you'll likely want to have a context per web request, it's difficult to
work around tornado's async machinery to make that work well.
Fear not, batteries included: `ziggy.tornado_utils`
Fear not, batteries included: `blueox.tornado_utils`

The most straightforward way to integrate ziggy into a tornado application requires two things:
The most straightforward way to integrate BlueOx into a tornado application requires two things:

1. Allow ziggy to monkey patch async tooling (tornado.gen primarily)
1. Use or re-implement the provided base request handler `ziggy.tornado_utils.SampleRequestHandler`
1. Allow BlueOx to monkey patch async tooling (tornado.gen primarily)
1. Use or re-implement the provided base request handler `blueox.tornado_utils.SampleRequestHandler`

To install the monkey patching, add the line:

ziggy.tornado_utils.install()
blueox.tornado_utils.install()

This must be executed BEFORE any of your RequestHandlers are imported.

This is required if you are using `@web.asynchronous` and `@gen.engine`. If you are
manually managing callbacks (which you probably shouldn't be), you'll need to
manually recall the ziggy context with `self.ziggy.start()`
manually recall the BlueOx context with `self.blueox.start()`

See `tests/tornado_app.py` for an example of all this.

@@ -148,71 +150,71 @@ you to name it whatever you want.
Event Collection
-----------------

Events are collected by a ziggy daemon (`ziggyd`) and can be configured in a variety of topologies.
Events are collected by a BlueOx daemon (`oxd`) and can be configured in a variety of topologies.

It's recommended that you run a ziggy daemon on each host, and then a master ziggy daemon that collects
It's recommended that you run a BlueOx daemon on each host, and then a master BlueOx daemon that collects
all the streams together for logging. In this configuration, failure of the centralized collector would not
result in any data loss as the local instances would just queue up their events.
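The queue-on-failure behavior can be sketched as a toy model (assumed behavior for illustration, not the daemon's actual code):

```python
from collections import deque

# Hypothetical sketch of a local daemon's forwarding queue: events
# accumulate while the central collector is unreachable and are
# delivered once it comes back, so nothing is lost in the meantime.
class ForwardingQueue:
    def __init__(self):
        self.pending = deque()

    def enqueue(self, event):
        self.pending.append(event)

    def flush(self, collector_up):
        """Deliver queued events only when the collector is reachable."""
        delivered = []
        if collector_up:
            while self.pending:
                delivered.append(self.pending.popleft())
        return delivered
```

During an outage, `flush(False)` delivers nothing and the backlog simply grows; the first `flush(True)` drains it in order.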

So on your local machine, you'd run:

ziggyd --forward=master:3514
oxd --forward=master:3514

And on the master collection machine, you'd run:

ziggyd --collect="*:3514" --log-path=/var/log/ziggy/
oxd --collect="*:3514" --log-path=/var/log/blueox/

Logs are stored in BSON format, so you'll need some tooling for doing log
analysis. This is easily done with the tool `ziggyview`.
analysis. This is easily done with the tool `oxview`.

For example:

cat /var/log/ziggy/request.120310.bson | ziggyview
cat /var/log/blueox/request.120310.bson | oxview

ziggyview --log-path=/var/log/ziggy --type-name="request" --start-date=20120313 --end-date=20120315
oxview --log-path=/var/log/blueox --type-name="request" --start-date=20120313 --end-date=20120315

Where `request` is the channel you want to examine.
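Because each BSON document begins with its total length as a little-endian int32, a `.bson` log can be split into documents even without a full BSON parser. A small sketch of that framing (for illustration only; `oxview` itself uses the `bson` package):

```python
import struct

# Split a stream of concatenated BSON documents: each document's first
# four bytes are its total byte length, little-endian.
def split_bson_stream(data):
    docs, offset = [], 0
    while offset < len(data):
        (length,) = struct.unpack_from('<i', data, offset)
        docs.append(data[offset:offset + length])
        offset += length
    return docs
```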

You can also connect to `ziggyd` and get a live streaming of log data:
You can also connect to `oxd` and get a live streaming of log data:

ziggyview -H localhost:3513 --type-name="request*"
oxview -H localhost:3513 --type-name="request*"

Note the use of '*' to indicate a prefix query for the type filter. This will
return all events with a type that begins with 'request'.
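The prefix match can be pictured with a small sketch (assumed semantics for the `--type-name` filter):

```python
# Hypothetical sketch of the '*' prefix filter: a trailing '*' matches
# any type name that begins with the given prefix; otherwise the match
# must be exact.
def type_matches(pattern, type_name):
    if pattern.endswith('*'):
        return type_name.startswith(pattern[:-1])
    return type_name == pattern

print(type_matches('request*', 'request.sql'))  # True
print(type_matches('request', 'request.sql'))   # False
```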

### A Note About Ports

There are several types of network ports in use with Ziggy:
There are several types of network ports in use with BlueOx:

1. Control Port (default 127.0.0.1:3513)
1. Collection Port (default 127.0.0.1:3514)
1. Streaming Port (no default, randomly assigned)

Both the Control and Collection ports are configurable from the command line.

When configuring forwarding between ziggyd instances, you'll want to always use
When configuring forwarding between oxd instances, you'll want to always use
the collection port.

When configuring an application to send data to a ziggyd instance, you'll want
When configuring an application to send data to an oxd instance, you'll want
to use the collection port as well.

For administrative (and `ziggyview`) work, you'll use the control port. The
control port (and the ziggy administrative interface) can be used to discover all
For administrative (and `oxview`) work, you'll use the control port. The
control port (and the BlueOx administrative interface) can be used to discover all
the other ports. The reason the collection port must be configured explicitly
for actual logging purposes is to better handle reconnects and failures.


Administration
---------------
Use the `ziggyctl` tool to collect useful stats or make other adjustments to a running ziggyd instance.
Use the `oxctl` tool to collect useful stats or make other adjustments to a running oxd instance.

For example:

ziggyctl status
oxctl status

or

ziggyctl shutdown
oxctl shutdown


Development
Expand All @@ -233,9 +235,9 @@ Or if you are running individual tests, use `testify` directly:

TODO List
----------------
* Failure of the central collector is only recoverable if the `ziggyd`
* Failure of the central collector is only recoverable if the `oxd`
instance comes back on the same ip address. It would be nice
to be able to live reconfigure through `ziggyctl` to point to a backup collector.
to be able to live reconfigure through `oxctl` to point to a backup collector.
* Debian packaging would probably be convenient.
* Need more real-world data on what becomes a bottleneck first: CPU or
network. Adding options for compression would be pretty easy.
2 changes: 1 addition & 1 deletion bin/ziggyctl → bin/oxctl
@@ -9,7 +9,7 @@ import pprint
import zmq
import bson

log = logging.getLogger('ziggyview')
log = logging.getLogger('blueox.ctl')

def setup_logging(options):
if len(options.verbose) > 1:
6 changes: 3 additions & 3 deletions bin/ziggyd → bin/oxd
@@ -3,10 +3,10 @@
# -*- coding: utf-8 -*-

"""
ziggyd
oxd
~~~~~~~~
Main logging daemon for ziggy events.
Main logging daemon for BlueOx events.
:copyright: (c) 2012 by Rhett Garber
:license: ISC, see LICENSE for more details.
@@ -34,7 +34,7 @@ FILE_POLL_INTERVAL = 1.0
# How long before we close an open idle file
FILE_IDLE_TIMEOUT = 60.0

log = logging.getLogger("ziggy.d")
log = logging.getLogger("blueox.d")

def setup_logging(options):
if len(options.verbose) > 1:
12 changes: 6 additions & 6 deletions bin/ziggyview → bin/oxview
@@ -9,9 +9,9 @@ import pprint
import bson
import zmq

import ziggy.client
import blueox.client

log = logging.getLogger('ziggyview')
log = logging.getLogger('blueox.view')

def setup_logging(options):
if len(options.verbose) > 1:
@@ -38,20 +38,20 @@ def main():

setup_logging(options)
if sys.stdin.isatty():
log.info("Loading stream from ziggyd")
log.info("Loading stream from oxd")

out_stream = ziggy.client.subscribe_stream(options.host, options.type_name)
out_stream = blueox.client.subscribe_stream(options.host, options.type_name)
else:
if options.type_name is not None:
parser.error("Can't specify a name from stdin")
sys.exit(1)

log.info("Loading stream from stdin")
stdin = io.open(sys.stdin.fileno(), mode='rb', closefd=False)
out_stream = ziggy.client.decode_stream(stdin)
out_stream = blueox.client.decode_stream(stdin)

if options.group:
out_stream = ziggy.client.Grouper(out_stream)
out_stream = blueox.client.Grouper(out_stream)

for line in out_stream:
if options.pretty: