
Commit

removed resample (#93)
* removed resample

* updated version number

* updated changelog
tailaiw authored Mar 9, 2020
1 parent de098cf commit 33395fa
Showing 8 changed files with 8 additions and 280 deletions.
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -70,7 +70,7 @@
# The short X.Y version.
version = "0.6"
# The full version, including alpha/beta/rc tags.
release = "0.6.0-dev.26+pr.92"
release = "0.6.0-dev.27+pr.93"

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
132 changes: 4 additions & 128 deletions docs/notebooks/demo.ipynb
@@ -3282,130 +3282,6 @@
"precision(expanded_true_anomaly_list, detected_anomaly_list)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Resample a time series with new spacing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function `resample` samples time series with given new spacing. Resampled values are determined by time-weighted linear interpolation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In following example, we resample the given time series with spacing of 1 day."
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [],
"source": [
"s = pd.Series([1, 2.5, 3, 4.25, 6, 9],\n",
" index=pd.DatetimeIndex(['2017-1-1 00:00:00', '2017-1-2 12:00:00', '2017-1-3 00:00:00',\n",
" '2017-1-4 06:00:00', '2017-1-6 00:00:00', '2017-1-9 00:00:00']))"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2017-01-01 1.0\n",
"2017-01-02 2.0\n",
"2017-01-03 3.0\n",
"2017-01-04 4.0\n",
"2017-01-05 5.0\n",
"2017-01-06 6.0\n",
"2017-01-07 7.0\n",
"2017-01-08 8.0\n",
"2017-01-09 9.0\n",
"Freq: D, dtype: float64"
]
},
"execution_count": 99,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from adtk.data import resample\n",
"resample(s, dT=\"1 day\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If no spacing is specified, the greatest common divider of original time steps will be used, which makes the refinement of time line a minimal refinement subject to keeping all original time points still included in the resampled time series. Please note that this may dramatically increase the size of time series and memory usage.\n",
"\n",
"For the previous example, such a refinement yields constant spacing of 6 hours. All original time points are kept in the refined series."
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2017-01-01 00:00:00 1.00\n",
"2017-01-01 06:00:00 1.25\n",
"2017-01-01 12:00:00 1.50\n",
"2017-01-01 18:00:00 1.75\n",
"2017-01-02 00:00:00 2.00\n",
"2017-01-02 06:00:00 2.25\n",
"2017-01-02 12:00:00 2.50\n",
"2017-01-02 18:00:00 2.75\n",
"2017-01-03 00:00:00 3.00\n",
"2017-01-03 06:00:00 3.25\n",
"2017-01-03 12:00:00 3.50\n",
"2017-01-03 18:00:00 3.75\n",
"2017-01-04 00:00:00 4.00\n",
"2017-01-04 06:00:00 4.25\n",
"2017-01-04 12:00:00 4.50\n",
"2017-01-04 18:00:00 4.75\n",
"2017-01-05 00:00:00 5.00\n",
"2017-01-05 06:00:00 5.25\n",
"2017-01-05 12:00:00 5.50\n",
"2017-01-05 18:00:00 5.75\n",
"2017-01-06 00:00:00 6.00\n",
"2017-01-06 06:00:00 6.25\n",
"2017-01-06 12:00:00 6.50\n",
"2017-01-06 18:00:00 6.75\n",
"2017-01-07 00:00:00 7.00\n",
"2017-01-07 06:00:00 7.25\n",
"2017-01-07 12:00:00 7.50\n",
"2017-01-07 18:00:00 7.75\n",
"2017-01-08 00:00:00 8.00\n",
"2017-01-08 06:00:00 8.25\n",
"2017-01-08 12:00:00 8.50\n",
"2017-01-08 18:00:00 8.75\n",
"2017-01-09 00:00:00 9.00\n",
"Freq: 6H, dtype: float64"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"resample(s)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -3432,7 +3308,7 @@
},
{
"cell_type": "code",
"execution_count": 101,
"execution_count": 98,
"metadata": {},
"outputs": [
{
@@ -3463,7 +3339,7 @@
},
{
"cell_type": "code",
"execution_count": 102,
"execution_count": 99,
"metadata": {},
"outputs": [
{
@@ -3491,7 +3367,7 @@
},
{
"cell_type": "code",
"execution_count": 103,
"execution_count": 100,
"metadata": {},
"outputs": [
{
@@ -3519,7 +3395,7 @@
},
{
"cell_type": "code",
"execution_count": 104,
"execution_count": 101,
"metadata": {},
"outputs": [
{
1 change: 1 addition & 0 deletions docs/releasehistory.rst
@@ -103,6 +103,7 @@ Version 0.6.0-dev
- Added consistency check between training and testing inputs in multivariate models (0.6.0-dev.23+pr.89)
- Improved time index check in time-dependent models (0.6.0-dev.24+pr.90, 0.6.0-dev.25+pr.91)
- Changed the output type of `adtk.data.split_train_test` from a 2-tuple of lists to a list of 2-tuples (0.6.0-dev.26+pr.92)
- Removed `adtk.data.resample` because its functionality largely overlaps with the pandas resampler module (0.6.0-dev.27+pr.93)

Version 0.5.5 (Feb 24, 2020)
===================================
2 changes: 1 addition & 1 deletion setup.cfg
@@ -1,6 +1,6 @@
[metadata]
name = adtk
version = 0.6.0-dev.26+pr.92
version = 0.6.0-dev.27+pr.93
author = Arundo Analytics, Inc.
maintainer = Tailai Wen
maintainer_email = [email protected]
2 changes: 1 addition & 1 deletion src/adtk/__init__.py
@@ -20,4 +20,4 @@
"""

__version__ = "0.6.0-dev.26+pr.92"
__version__ = "0.6.0-dev.27+pr.93"
2 changes: 0 additions & 2 deletions src/adtk/data/__init__.py
@@ -2,7 +2,6 @@

from ._data import (
expand_events,
resample,
split_train_test,
to_events,
to_labels,
@@ -16,6 +15,5 @@
"to_labels",
"expand_events",
"validate_events",
"resample",
"split_train_test",
]
67 changes: 0 additions & 67 deletions src/adtk/data/_data.py
@@ -537,73 +537,6 @@ def expand_events(
raise TypeError("Arugment `lists` must be a list or a dict of lists.")


def resample(
ts: Union[pd.Series, pd.DataFrame], dT: pd.Timedelta = None
) -> Union[pd.Series, pd.DataFrame]:
"""Resample the time points of a time series with given constant spacing.
The values at new time points are calcuated by time-weighted linear
interpolation.
Parameters
----------
ts: pandas Series or DataFrame
Time series to resample.
dT: pandas Timedelta, str, or int, optional
The new constant time step.
- If str, it must be able to be converted into a pandas Timedelta
object.
- If int, it must be in nanosecond.
If not given, the greatest common divider of original time steps will
be used, which makes the refinement a minimal refinement subject to
keeping all original time points still included in the resampled time
series. Please note that this may dramatically increase the size of
time series and memory usage.
Default: None.
Returns
-------
pandas Series or DataFrame
Resampled time series.
"""

def gcd_of_array(arr: np.array) -> int:
"""Get the GCD of an array of integer"""
return reduce(gcd, arr)

ts = validate_series(ts)

isSeries = False
if isinstance(ts, pd.Series):
isSeries = True
seriesName = ts.name
ts = ts.to_frame()

if dT is None:
dT = pd.Timedelta(
np.timedelta64(gcd_of_array([int(dt) for dt in np.diff(ts.index)]))
)
elif not isinstance(dT, pd.Timedelta):
dT = pd.Timedelta(dT)

rdf = pd.DataFrame(index=pd.date_range(ts.index[0], ts.index[-1], freq=dT))

rdf = rdf.join(ts, how="outer")
rdf = rdf.interpolate("index")

rdf = rdf.reindex(pd.date_range(ts.index[0], ts.index[-1], freq=dT))

if isSeries:
rdf = rdf[rdf.columns[0]]
rdf.name = seriesName

return rdf


def split_train_test(
ts: Union[pd.Series, pd.DataFrame],
mode: int = 1,
80 changes: 0 additions & 80 deletions tests/test_resample.py

This file was deleted.
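
Migration note: code that still calls `adtk.data.resample` can get similar behaviour from pandas alone. The sketch below is not part of this commit. `resample_linear` is a hypothetical helper name, and the body is one possible reading of the removed function (time-weighted linear interpolation onto a constant grid, defaulting to the greatest common divisor of the original time steps). It assumes a sorted, duplicate-free `DatetimeIndex` and skips the `validate_series` call that the removed function made.

```python
from functools import reduce
from math import gcd

import pandas as pd


def resample_linear(ts, dT=None):
    """Hypothetical stand-in for the removed `adtk.data.resample`.

    Assumes `ts` is a pandas Series or DataFrame with a sorted,
    duplicate-free DatetimeIndex.
    """
    if dT is None:
        # Default spacing: GCD of the original time steps, in nanoseconds.
        steps = ts.index[1:] - ts.index[:-1]
        dT = pd.Timedelta(reduce(gcd, (int(td.value) for td in steps)), unit="ns")
    else:
        dT = pd.Timedelta(dT)

    # Constant-spacing grid covering the original time range.
    grid = pd.date_range(ts.index[0], ts.index[-1], freq=dT)

    # Add the grid points to the original index, interpolate linearly
    # weighted by time, then keep only the grid points.
    combined = ts.reindex(ts.index.union(grid))
    return combined.interpolate(method="time").reindex(grid)


# Same data as the removed notebook example.
s = pd.Series(
    [1, 2.5, 3, 4.25, 6, 9],
    index=pd.DatetimeIndex(
        [
            "2017-01-01 00:00:00",
            "2017-01-02 12:00:00",
            "2017-01-03 00:00:00",
            "2017-01-04 06:00:00",
            "2017-01-06 00:00:00",
            "2017-01-09 00:00:00",
        ]
    ),
)

daily = resample_linear(s, dT="1 day")  # explicit spacing
fine = resample_linear(s)               # GCD spacing (6 hours for this data)
```

For this series the removed notebook cells reported daily values 1.0 through 9.0 and a 6-hour grid for the default case, which gives a quick way to check the sketch against the old behaviour. As the removed docstring warned, the GCD default can inflate the series considerably when the original time steps share only a small common divisor.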
