Commit

update dependencies
sanity committed Sep 21, 2024
1 parent 3b94ac6 commit 4b2ec26
Showing 3 changed files with 54 additions and 27 deletions.
4 changes: 2 additions & 2 deletions .aider.conf.yml
@@ -1,3 +1,3 @@
 dark-mode: true
-test-cmd: RUSTFLAGS="-D warnings" cargo check
-auto-test: true
+test-cmd: RUSTFLAGS="-D warnings" cargo test
+auto-test: true
6 changes: 3 additions & 3 deletions Cargo.toml
@@ -3,7 +3,7 @@ name = "pav_regression"
description = "The pair adjacent violators algorithm for isotonic regression"
homepage = "https://github.com/sanity/pav.rs"
repository = "https://github.com/sanity/pav.rs"
-version = "0.5.1"
+version = "0.5.2"
authors = ["Ian Clarke <[email protected]>"]
edition = "2018"
license = "LGPL-3.0-or-later"
@@ -16,8 +16,8 @@ name = "pav_regression"
path = "src/lib.rs"

[dependencies]
-ordered-float = "3.6.0"
-serde = { version = "1.0.159", features = ["derive"] }
+ordered-float = "3.9.2"
+serde = { version = "1.0.210", features = ["derive"] }
thiserror = "1.0"

[dev-dependencies]
71 changes: 49 additions & 22 deletions README.md
@@ -1,53 +1,76 @@
# Pair Adjacent Violators for Rust

[![Rust](https://github.com/sanity/pav.rs/actions/workflows/rust.yml/badge.svg)](https://github.com/sanity/pav.rs/actions/workflows/rust.yml) [![crates.io](https://img.shields.io/crates/v/pav_regression.svg)](https://crates.io/crates/pav_regression)

[![Rust](https://github.com/sanity/pav.rs/actions/workflows/rust.yml/badge.svg)](https://github.com/sanity/pav.rs/actions/workflows/rust.yml)
[![crates.io](https://img.shields.io/crates/v/pav_regression.svg)](https://crates.io/crates/pav_regression)

## Overview

An implementation of the [Pair Adjacent Violators](https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781118763155.app3) algorithm for [isotonic regression](https://en.wikipedia.org/wiki/Isotonic_regression). Note this algorithm is also known as "Pool Adjacent Violators".
An implementation of the
[Pair Adjacent Violators](https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781118763155.app3)
algorithm for [isotonic regression](https://en.wikipedia.org/wiki/Isotonic_regression). Note this
algorithm is also known as "Pool Adjacent Violators".

### What is "Isotonic Regression" and why should I care?

Imagine you have two variables, _x_ and _y_, and you don't know the relationship between them, but you know that if _x_ increases then _y_ will increase, and if _x_ decreases then _y_ will decrease. Alternatively it may be the opposite, if _x_ increases then _y_ decreases, and if _x_ decreases then _y_ increases.
Imagine you have two variables, _x_ and _y_, and you don't know the relationship between them, but
you know that if _x_ increases then _y_ will increase, and if _x_ decreases then _y_ will decrease.
Alternatively it may be the opposite, if _x_ increases then _y_ decreases, and if _x_ decreases then
_y_ increases.

Examples of such isotonic or monotonic relationships include:

* _x_ is the pressure applied to the accelerator in a car, _y_ is the acceleration of the car (acceleration increases as more pressure is applied)
* _x_ is the rate at which a web server is receiving HTTP requests, _y_ is the CPU usage of the web server (server CPU usage will increase as the request rate increases)
* _x_ is the price of an item, and _y_ is the probability that someone will buy it (this would be a decreasing relationship, as _x_ increases _y_ decreases)
- _x_ is the pressure applied to the accelerator in a car, _y_ is the acceleration of the car
(acceleration increases as more pressure is applied)
- _x_ is the rate at which a web server is receiving HTTP requests, _y_ is the CPU usage of the web
server (server CPU usage will increase as the request rate increases)
- _x_ is the price of an item, and _y_ is the probability that someone will buy it (this would be a
decreasing relationship, as _x_ increases _y_ decreases)

These are all examples of an isotonic relationship between two variables, where the relationship is likely to be more complex than linear.
These are all examples of an isotonic relationship between two variables, where the relationship is
likely to be more complex than linear.

So we know the relationship between _x_ and _y_ is isotonic, and let's also say that we've been able to collect data about actual _x_ and _y_ values that occur in practice.
So we know the relationship between _x_ and _y_ is isotonic, and let's also say that we've been able
to collect data about actual _x_ and _y_ values that occur in practice.

What we'd really like to be able to do is estimate, for any given _x_, what _y_ will be, or alternatively for any given _y_, what _x_ would be required.
What we'd really like to be able to do is estimate, for any given _x_, what _y_ will be, or
alternatively for any given _y_, what _x_ would be required.

But of course real-world data is noisy, and is unlikely to be strictly isotonic, so we want something that allows us to feed in this raw noisy data, figure out the actual relationship between _x_ and _y_, and then use this to allow us to predict _y_ given _x_, or to predict what value of _x_ will give us a particular value of _y_. This is the purpose of the pair-adjacent-violators algorithm.
But of course real-world data is noisy, and is unlikely to be strictly isotonic, so we want
something that allows us to feed in this raw noisy data, figure out the actual relationship between
_x_ and _y_, and then use this to allow us to predict _y_ given _x_, or to predict what value of _x_
will give us a particular value of _y_. This is the purpose of the pair-adjacent-violators
algorithm.
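
To make the pooling idea concrete, here is a minimal sketch in Rust. It is not the `pav_regression` crate's implementation or API: the names `Block` and `isotonic_fit` are hypothetical, and the sketch assumes an increasing relationship, equal weights, and _y_ values already sorted by _x_. Whenever a block of pooled points has a higher mean than the block that follows it, the two are merged and replaced by their combined mean.

```rust
/// A run of adjacent observations that have been pooled together.
/// (Hypothetical helper type for this sketch, not part of the crate.)
#[derive(Debug, Clone, Copy)]
struct Block {
    sum: f64,
    count: usize,
}

impl Block {
    fn mean(&self) -> f64 {
        self.sum / self.count as f64
    }
}

/// Returns the non-decreasing sequence closest (in squared error) to `ys`.
/// Assumes `ys` is ordered by increasing x and that all points have equal weight.
fn isotonic_fit(ys: &[f64]) -> Vec<f64> {
    let mut blocks: Vec<Block> = Vec::with_capacity(ys.len());

    for &y in ys {
        let mut current = Block { sum: y, count: 1 };

        // While the previous block's mean is larger than the current one,
        // the pair violates monotonicity: pool them into a single block.
        while let Some(&prev) = blocks.last() {
            if prev.mean() <= current.mean() {
                break;
            }
            current = Block {
                sum: prev.sum + current.sum,
                count: prev.count + current.count,
            };
            blocks.pop();
        }
        blocks.push(current);
    }

    // Expand each block back into one fitted value per original observation.
    let mut fitted = Vec::with_capacity(ys.len());
    for block in &blocks {
        fitted.extend(std::iter::repeat(block.mean()).take(block.count));
    }
    fitted
}
```

For the input `[1.0, 3.0, 2.0, 4.0]`, the out-of-order pair `3.0, 2.0` is pooled into its mean, giving `[1.0, 2.5, 2.5, 4.0]`. A decreasing relationship can be handled the same way by negating _y_ before and after the fit.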

#### ...and why should I care?

Using the examples I provide above:

* A self-driving car could use it to learn how much pressure to apply to the accelerator to give a desired amount of acceleration
* An autoscaling system could use it to help predict how many web servers they need to handle a given amount of web traffic
* A retailer could use it to choose a price for an item that maximizes their profit (aka "yield optimization")
- A self-driving car could use it to learn how much pressure to apply to the accelerator to give a
desired amount of acceleration
- An autoscaling system could use it to help predict how many web servers they need to handle a
given amount of web traffic
- A retailer could use it to choose a price for an item that maximizes their profit (aka "yield
optimization")

#### Isotonic regression in online advertising

If you have an hour to spare, and are interested in learning more about how online advertising works - you should check out [this lecture](https://vimeo.com/137999578) that I gave in 2015 where I explain how we were able to use pair adjacent violators to solve some fun problems.
If you have an hour to spare, and are interested in learning more about how online advertising
works - you should check out [this lecture](https://vimeo.com/137999578) that I gave in 2015 where I
explain how we were able to use pair adjacent violators to solve some fun problems.

#### A picture is worth a thousand words

Here is the relationship that PAV extracts from some very noisy input data where there is an increasing relationship between _x_ and _y_:
Here is the relationship that PAV extracts from some very noisy input data where there is an
increasing relationship between _x_ and _y_:

![PAV in action](https://sanity.github.io/pairAdjacentViolators/pav-example.png)

## Features

* Smart linear interpolation between points and extrapolation outside the training data domain
* Fairly efficient implementation without compromizing code readability
* Will intelligently extrapolate to compute _y_ for values of _x_ greater or less than those used to build the PAV model
- Smart linear interpolation between points and extrapolation outside the training data domain
- Fairly efficient implementation without compromizing code readability
- Will intelligently extrapolate to compute _y_ for values of _x_ greater or less than those used to
build the PAV model
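
The interpolation and extrapolation features can be pictured with a small sketch. This is an illustration only, not the crate's code: the function name `interpolate` is hypothetical, and it assumes the fitted points are sorted by strictly increasing _x_ and that extrapolation simply extends the nearest end segment, which may differ from what `pav_regression` actually does.

```rust
/// Estimates y at `x` from fitted `(x, y)` points.
/// Assumes `points` is sorted by strictly increasing x (an assumption of this
/// sketch). Inside the data range this interpolates linearly between the two
/// neighbouring points; outside it, the nearest end segment is extended.
fn interpolate(points: &[(f64, f64)], x: f64) -> Option<f64> {
    match points.len() {
        0 => None,
        1 => Some(points[0].1),
        n => {
            // Index of the first point whose x is not below the query,
            // clamped so that there is always a point on each side.
            let right = points
                .partition_point(|&(px, _)| px < x)
                .clamp(1, n - 1);
            let (x0, y0) = points[right - 1];
            let (x1, y1) = points[right];
            Some(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
        }
    }
}
```

With the points `(0, 0)`, `(1, 2)` and `(3, 4)`, a query at `x = 2.0` falls on the second segment and yields `3.0`, while a query at `x = 5.0` extends that same segment and yields `6.0`.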

## Usage example

@@ -68,11 +91,15 @@ Here is the relationship that PAV extracts from some very noisy input data where
);
```

For more examples please see the [unit tests](https://github.com/sanity/pav.rs/blob/master/src/pav.rs#L170).
For more examples please see the
[unit tests](https://github.com/sanity/pav.rs/blob/master/src/pav.rs#L170).

## License
Released under the [LGPL](https://en.wikipedia.org/wiki/GNU_Lesser_General_Public_License) version 3 by [Ian Clarke](http://blog.locut.us/).

Released under the [LGPL](https://en.wikipedia.org/wiki/GNU_Lesser_General_Public_License) version 3
by [Ian Clarke](http://blog.locut.us/).

## See also

* An earlier implementation of PAV for Kotlin/JVM by the same author: https://github.com/sanity/pairAdjacentViolators
- An earlier implementation of PAV for Kotlin/JVM by the same author:
https://github.com/sanity/pairAdjacentViolators
