This repository has been archived by the owner on Nov 29, 2022. It is now read-only.

Piecewise

John Tigue edited this page Jun 5, 2015 · 3 revisions

M-Lab IT-VM status

  • M-Lab is completing work on an easily deployable virtual machine that provides a stable platform for ingesting data from BigQuery into a Postgres data store and serves a web API to back visualizations and content.
  • This VM will become part of an "M-Lab Integration Toolkit", and thus we will refer to this sub-project as the M-Lab Integration Toolkit VM (M-Lab IT-VM).
  • The M-Lab IT-VM is intended as a portable, configurable, Postgres-backed VM that lets M-Lab partners easily deploy a web portal showing M-Lab data and provide their audiences with test integrations and data visualizations.
  • M-Lab is tracking this sub-project here: https://github.com/dwins/piecewise

A basic feature list includes:

  • Setup scripts based on an Ansible "playbook" that prepares a Postgres database and Python web server on a single machine. (completed)
  • An accompanying Vagrant configuration file that automates the creation of a local virtual machine using the playbook. (completed)
  • Data collection driven by a configuration file, which specifies the region where data should be collected, the Google API account, etc.
  • A Python module called 'piecewise' that does the heavy lifting of data collection, aggregation, and statistics.

Status

  • Completed configurability work with example configuration
  • Building on the configurability infrastructure, refactored the web API for extensibility: no new web code should be needed for statistics that are added in the future.
  • Mostly finished median calculation
  • Began work on MaxMind ISP identification
  • VM created to publish a test instance and coordinate integration of SEAnet work

Piecewise

Piecewise defines an "Aggregate" concept: a statistic whose values computed over subpopulations can be combined to yield the same statistic over the combined population. For example, if we have the minimum RTT aggregated by minute, we can find the hourly minimum by considering only the per-minute values. Piecewise uses these aggregate definitions to fetch values from BigQuery (via an API account), insert the aggregated results into Postgres, and then combine them on the fly when querying.

  • Piecewise has been tested with minimum RTT and average RTT statistics.
  • Piecewise has two executable modules (piecewise.ingest and piecewise.query), which have been used for ad-hoc testing.
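
The combinable-statistic idea can be illustrated with a minimal Python sketch. This is not Piecewise's actual code: the class names MinRTT and AvgRTT, the combine method, the summarize helper, and the sample values are all hypothetical, chosen only to show how per-minute results for the two tested statistics could be merged into an hourly result without revisiting the raw measurements.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class MinRTT:
    """Minimum RTT over one bucket (hypothetical illustration class)."""
    value: float

    def combine(self, other: "MinRTT") -> "MinRTT":
        # The minimum of two buckets is the smaller of their minima.
        return MinRTT(min(self.value, other.value))


@dataclass(frozen=True)
class AvgRTT:
    """Average RTT stored as (total, count) so buckets merge exactly."""
    total: float
    count: int

    def combine(self, other: "AvgRTT") -> "AvgRTT":
        # Sums and counts add; the average is derived only when read.
        return AvgRTT(self.total + other.total, self.count + other.count)

    @property
    def value(self) -> float:
        return self.total / self.count


def summarize(samples):
    """Reduce one bucket of raw RTT samples to its stored aggregates."""
    return MinRTT(min(samples)), AvgRTT(sum(samples), len(samples))


# Made-up RTT samples in milliseconds, one inner list per minute.
minutes = [[42.0, 40.0, 55.0], [38.0, 61.0], [47.0]]
per_minute = [summarize(m) for m in minutes]

# Combine the stored per-minute aggregates into coarser ones.
hourly_min = reduce(MinRTT.combine, (mn for mn, _ in per_minute))
hourly_avg = reduce(AvgRTT.combine, (av for _, av in per_minute))

# Reference values computed directly from the raw samples.
all_samples = [s for m in minutes for s in m]
print(hourly_min.value, min(all_samples))
print(hourly_avg.value, sum(all_samples) / len(all_samples))
```

Note that the average is kept as a sum and a count rather than as a single number: averaging the per-minute averages directly would give the wrong answer whenever the minutes contain different numbers of samples.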

Open questions

[Just notes from JFT]

  • Will Turf still be needed server-side for operations such as testing whether a given point lies inside a given GeoJSON polygon?
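
For reference while weighing that question, the following is a rough sketch of what a point-in-polygon check could look like in plain Python on the server. It does not answer the open question and is not part of Piecewise: the function names are hypothetical, the test is a simple planar ray-casting check on a GeoJSON Polygon geometry, and points exactly on an edge are not handled specially.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside a closed ring of [lon, lat] pairs?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def point_in_polygon(lon, lat, polygon):
    """Test a point against a GeoJSON Polygon geometry (planar approximation)."""
    exterior, *holes = polygon["coordinates"]
    if not point_in_ring(lon, lat, exterior):
        return False
    # A point inside any interior ring (hole) is outside the polygon.
    return not any(point_in_ring(lon, lat, hole) for hole in holes)


# Made-up square with a square hole, in GeoJSON coordinate order [lon, lat].
square = {
    "type": "Polygon",
    "coordinates": [
        [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
        [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],
    ],
}

print(point_in_polygon(2, 2, square))   # in the square, outside the hole
print(point_in_polygon(5, 5, square))   # inside the hole
print(point_in_polygon(12, 5, square))  # outside the square
```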