
Resampling Time Series Data with Pandas

Daniel Goldfarb edited this page Aug 19, 2020 · 13 revisions

There are two ways to resample OHLC data using pandas:


1. Single time-series value to OHLC data:

In this method, you take a single value (for example "Close") and use that to generate Open, High, Low, and Close for the resample period.

If you also want Volume, you have to resample the volume separately. Then, to use the resampled data in mpf.plot(), you need to combine the two resamples (ohlc and volume) into a single DataFrame and ensure the column names are what mplfinance expects (or, alternatively, use the columns= kwarg to specify your own custom column names). This method uses the ohlc() aggregation function of the pandas Resampler object, which is what Series.resample() and DataFrame.resample() return.
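A minimal sketch of this first method follows. The one-minute data is synthetic and the names ticks, ohlc, volume, and resampled are made up for illustration; only the resample(), ohlc(), sum(), join(), and rename() calls are pandas API.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute data, invented purely for illustration.
idx = pd.date_range("2020-01-02 09:30", periods=60, freq="min")
ticks = pd.DataFrame(
    {"Close": np.arange(100.0, 160.0), "Volume": np.full(60, 10)},
    index=idx,
)

# ohlc() on the single "Close" series produces the columns
# open, high, low, close for each 15-minute period.
ohlc = ticks["Close"].resample("15min").ohlc()

# Volume is resampled separately: total volume within each period.
volume = ticks["Volume"].resample("15min").sum()

# Combine the two resamples and rename the columns to the names
# mplfinance looks for by default.
resampled = ohlc.join(volume).rename(
    columns={"open": "Open", "high": "High", "low": "Low", "close": "Close"}
)
print(resampled)
```

With mplfinance installed, a frame shaped like resampled could then be passed to something like mpf.plot(resampled, type="candle", volume=True).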


2. Existing OHLCV data to new OHLC data:

In this method, you already have Open, High, Low, Close, and Volume columns, and you want to use all of these columns for the resample.

In this case you must provide a mapping to the agg() function of the Resampler object; the mapping specifies how to aggregate each column individually. For example, the new High is the maximum of all the Highs in the period and the new Low is the minimum of all the Lows, whereas the new Open is the first Open in the period and the new Close is the last Close. Volume is typically summed.
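The mapping described above can be sketched as follows. The input frame is synthetic, and the names ohlcv, aggregation, and resampled_ohlcv are hypothetical; summing Volume is an assumption that suits traded volume.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute OHLCV data, invented purely for illustration.
idx = pd.date_range("2020-01-02 09:30", periods=60, freq="min")
base = np.arange(100.0, 160.0)
ohlcv = pd.DataFrame(
    {
        "Open": base,
        "High": base + 2.0,
        "Low": base - 2.0,
        "Close": base + 1.0,
        "Volume": np.full(60, 10),
    },
    index=idx,
)

# One aggregation rule per column.
aggregation = {
    "Open": "first",   # first Open in the period
    "High": "max",     # highest High in the period
    "Low": "min",      # lowest Low in the period
    "Close": "last",   # last Close in the period
    "Volume": "sum",   # total Volume in the period
}

resampled_ohlcv = ohlcv.resample("15min").agg(aggregation)
print(resampled_ohlcv)
```

Because the column names are unchanged, the result keeps the Open, High, Low, Close, Volume layout of the input.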




SEE ALSO:


Resampling options

  • pandas comes with many built-in options for resampling, and you can even define your own aggregation functions.
  • The following table lists common frequency aliases (time period options) that can be passed as the rule when resampling a time series:
Alias     Description
B         Business day
D         Calendar day
W         Weekly
M         Month end
Q         Quarter end
A         Year end
BA        Business year end
AS        Year start
H         Hourly frequency
T, min    Minutely frequency
S         Secondly frequency
L, ms     Millisecond frequency
U, us     Microsecond frequency
N, ns     Nanosecond frequency
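As a small illustration of how these aliases are used, the sketch below passes "W" and a multiple, "2D", as the resampling rule. The series daily is synthetic and its name is made up for this example.

```python
import numpy as np
import pandas as pd

# Fourteen synthetic daily values, starting on Monday 2020-01-06.
daily = pd.Series(
    np.arange(14.0),
    index=pd.date_range("2020-01-06", periods=14, freq="D"),
)

# "W": weekly bins. By default each bin is labelled with the Sunday
# that ends the week.
weekly = daily.resample("W").sum()

# An alias can be prefixed with an integer multiple: "2D" is two calendar days.
two_day = daily.resample("2D").sum()

print(weekly)
print(two_day)
```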

These are some of the common methods you might use for resampling:

Method    Description
bfill     Backward fill
count     Count of values
ffill     Forward fill
first     First valid data value
last      Last valid data value
max       Maximum data value
mean      Mean of values in time range
median    Median of values in time range
min       Minimum data value
nunique   Number of unique values
ohlc      Opening value, highest value, lowest value, closing value
pad       Same as forward fill
std       Standard deviation of values
sum       Sum of values
var       Variance of values
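The sketch below shows a few of these methods in use, along with a user-defined aggregation. The data is synthetic, and the names prices, price_range, summary, ranges, sparse, and filled are hypothetical.

```python
import pandas as pd

# Six synthetic daily prices, invented purely for illustration.
prices = pd.Series(
    [10.0, 12.0, 11.0, 15.0, 14.0, 18.0],
    index=pd.date_range("2020-01-01", periods=6, freq="D"),
)

# Downsampling: several built-in aggregations at once, one column each.
summary = prices.resample("3D").agg(["first", "last", "min", "max", "mean"])

# A user-defined aggregation: the high-low range within each period.
def price_range(values):
    return values.max() - values.min()

ranges = prices.resample("3D").apply(price_range)

# Upsampling: going from 3-day back to daily frequency creates gaps,
# which ffill() fills with the most recent earlier value.
sparse = prices.resample("3D").last()
filled = sparse.resample("D").ffill()

print(summary)
print(ranges)
print(filled)
```

Aggregations such as mean or sum reduce many values to one per period (downsampling), while ffill and bfill are the usual choices when the new frequency is finer than the original (upsampling).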