Allow sampling #7

Open
y2kappa opened this issue Oct 19, 2020 · 2 comments
Labels: good first issue · help wanted

Comments

@y2kappa
Owner

y2kappa commented Oct 19, 2020

I need to run this in production because I simply cannot profile locally. If my server gets many hits, dumping trace information on every request takes a big toll on my log quota, so I'd like to sample how often it is dumped.

I think the user should be able to provide a sampling rate, e.g. 500, so that each trace has a 1/500 chance of being recorded.

I think this should be done in the new() constructor:

pub fn new(id: &str, processor: Option<fn(&str)>, stats: Option<TracingStats>) -> Trace {
    let trace = Trace {
        id: id.into(),
        processor: processor.unwrap_or(|x: &str| {
            println!("{}", x);
        }),
        stats: stats.unwrap_or(TracingStats::None),
    };
    Self::register(&trace.id);
    trace
}

Something like

// assuming the rand crate (with use rand::Rng;): a 1-in-sampling_rate chance
if rand::thread_rng().gen_range(0..self.sampling_rate) == 0 {
    Self::register(&trace.id);
}
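
Putting the two pieces together, here is a minimal sketch of what a sampled constructor could look like. It assumes a hypothetical extra sampling_rate parameter and the rand crate's 0.8 API; names and defaults are illustrative, not a final design:

use rand::Rng;

pub fn new(
    id: &str,
    processor: Option<fn(&str)>,
    stats: Option<TracingStats>,
    sampling_rate: Option<u32>, // hypothetical new parameter
) -> Trace {
    let trace = Trace {
        id: id.into(),
        processor: processor.unwrap_or(|x: &str| {
            println!("{}", x);
        }),
        stats: stats.unwrap_or(TracingStats::None),
    };
    // Only register (and therefore record) this trace with probability
    // 1/sampling_rate; None or 1 keeps today's always-on behaviour,
    // so existing callers are unaffected.
    let rate = sampling_rate.unwrap_or(1);
    if rate <= 1 || rand::thread_rng().gen_range(0..rate) == 0 {
        Self::register(&trace.id);
    }
    trace
}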
y2kappa added the good first issue and help wanted labels on Oct 19, 2020
@yonip23
Contributor

yonip23 commented Oct 20, 2020

Why not calculate statistics for each function instead - min, max, and avg?
I don't like the idea of using randomness in such a deterministic crate...

@y2kappa
Owner Author

y2kappa commented Oct 20, 2020

Hey, thanks for the response. Let me clarify.

If, for example, you run this on a production task, you may not want to dump these stats on every invocation of your service/endpoint; instead you could do it roughly every 100 requests, especially if your service gets many hits. The decision happens once, at the beginning of the request, when register is called for the first time: if the random condition passes, the entire execution path of that request is profiled.

This helps with logging, for example if you have a limited log quota. And if you want to calculate statistics on your requests, even a sample may give a good approximation thanks to the law of large numbers. In my use case I want one trace roughly every 100 requests, and I then aggregate them all using log queries.
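
Since the goal is literally "every 100 requests", the same opt-in could also be implemented deterministically rather than randomly, which would sidestep the objection above. A minimal sketch, using a process-wide counter and an illustrative should_sample helper:

use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide request counter (illustrative; it could instead live on the Trace type).
static CALL_COUNT: AtomicU64 = AtomicU64::new(0);

/// Samples exactly one call out of every `rate`, with no randomness.
fn should_sample(rate: u64) -> bool {
    rate <= 1 || CALL_COUNT.fetch_add(1, Ordering::Relaxed) % rate == 0
}

A counter gives an exact 1-in-N cadence but correlates samples with traffic patterns (always the Nth request), whereas random sampling avoids that bias; either would fit behind the same opt-in knob.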

However, it's an opt-in feature and it would definitely not be enabled by default.

I hope you're not confusing this with the issue of calculating stats for the collected data.

If you have a different idea about this, let me know; my use case may not be that widespread.
