Rolling Means behaviour #705
-
Hi friends! I have a doubt with the following code:

```python
from river import compose, stats
from river import feature_extraction as fx
import numbers

data = [
    {"store": 1, "x": 1, "y": 1},
    {"store": 2, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 0},
    {"store": 1, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 1},
]

features_pipeline = compose.TransformerUnion(
    fx.Agg(
        on='y',
        by=['store', 'x'],
        how=stats.RollingMean(window_size=3),
    ),
)
numerical_features = compose.Pipeline(
    compose.SelectType(numbers.Number),
)
model = compose.Pipeline(features_pipeline, numerical_features)

for x in data:
    model = model.learn_one(x)

test = [
    {"store": 1, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 0},
    {"store": 1, "x": 1, "y": 0},
    {"store": 1, "x": 1, "y": 0},
]

for t in test:
    print(model.transform_one(t))
```

What I would expect (and I could be wrong) is that the rolling mean is not updated when I call `transform_one`, since the model has already been trained. This is the output of the prints:
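For reference, here is a plain-Python sketch of what `fx.Agg(on='y', by=['store', 'x'], how=stats.RollingMean(window_size=3))` computes: one rolling window of `y` per `(store, x)` group. The helper name and the output feature name are my own, not river's API:

```python
from collections import defaultdict, deque

# One rolling window of the last 3 'y' values per (store, x) group,
# mimicking Agg + RollingMean(window_size=3).
windows = defaultdict(lambda: deque(maxlen=3))

def agg_rolling_mean(x):  # hypothetical helper, not part of river
    key = (x["store"], x["x"])
    windows[key].append(x["y"])
    w = windows[key]
    return {"y_rollingmean_3_by_store_and_x": sum(w) / len(w)}

data = [
    {"store": 1, "x": 1, "y": 1},
    {"store": 2, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 0},
    {"store": 1, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 0},
    {"store": 2, "x": 1, "y": 1},
    {"store": 1, "x": 1, "y": 1},
]

for x in data:
    agg_rolling_mean(x)

# After the stream above, store 1 has seen y = 1, 0, 0, 1, so its
# window holds [0, 0, 1] (mean 1/3); store 2 has seen y = 0, 1, 0, 1,
# so its window holds [1, 0, 1] (mean 2/3).
```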
Thank you if someone can help me.
Replies: 1 comment 1 reply
-
Hello friend! It's not a bug, it's a feature. In online learning, unsupervised stuff is updated at prediction time, not at learning time. I know it's not intuitive; we've probably all gotten used to the batch fit/predict paradigm. But if you think about it, it actually makes a lot of sense for online learning.
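To see why the printed values keep changing, here is a minimal plain-Python sketch (my own toy, not river's implementation) of a transformer whose `transform_one` also updates its rolling window, which is the behavior described above for the unsupervised `Agg` step at prediction time:

```python
from collections import deque

class RollingMeanTransformer:
    """Toy stateful transformer: transforming an example also feeds it
    into the rolling window, so repeated calls shift the statistic."""

    def __init__(self, window_size=3):
        self.window = deque(maxlen=window_size)

    def transform_one(self, x):
        # The statistic is updated with the incoming value before the
        # feature is emitted, i.e. the state changes at "predict" time.
        self.window.append(x["y"])
        return {"y_rollingmean_3": sum(self.window) / len(self.window)}

t = RollingMeanTransformer(window_size=3)
# Same y values as the test stream in the question: 1, 1, 1, 0, 0, 0.
outs = [t.transform_one({"y": v}) for v in (1, 1, 1, 0, 0, 0)]
# The emitted feature drifts from 1.0 down to 0.0 as the window fills
# with zeros, even though "learning" is over.
```

So each `transform_one` call moves the window forward, which is exactly why the prints in the question differ from one call to the next.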