Which stats and aggregates should we maintain? #1697

Open
snarfed opened this issue Jan 16, 2025 · 0 comments

snarfed commented Jan 16, 2025

We'd like to start tracking aggregates and statistics for Bridgy Fed usage. We do a bit of that now, e.g. total users, broken down per network, current activity rate, etc. But we'd like more, notably historical trends over time, e.g. MAU/DAU and activity rate. Sadly our data volume is already high enough (our current datastore is ~1T) that we don't keep all the raw data we'd need to calculate those kinds of things, either historically or going forward.

These kinds of stats would be useful for a number of things, notably conversations with potential funders.

(As a counterexample, Bridgy classic is small enough that I have kept all the historical raw data, mostly in cold storage, and I've regularly crunched and reported statistics for a long time: https://snarfed.org/2023-06-02_bridgy-stats-update-8 , https://snarfed.org/?s=bridgy%20stats . It's fun! Mostly 😁)

I don't expect to store all of BF's raw data long term, in any form, but I can estimate some stats historically, and more importantly I can start tracking them going forward. So, what do we want? Assume these are both totals and broken down by source and destination network:

  • active users per day, month, ...?
  • counts of activities/interactions across the bridge, broken down by type
  • account migrations in, out, and across (eventually)
  • custom usernames/handles
  • remote "instances" seen, interacted with
  • explicitly opted out users, instances?
  • total domains seen/interacted with...?
  • total interactions seen/handled, whether or not they get bridged?
  • costs, both total and per user, per active user, per interaction
  • ...?
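
To make the first two bullets concrete, here is a minimal sketch of a daily rollup that keeps only aggregates so the raw events can be discarded. It is an illustration under stated assumptions, not Bridgy Fed's actual code or schema: `Event`, `Rollup`, the field names, and the network labels are all hypothetical, and a real version would likely store hashed ids or an approximate distinct counter rather than in-memory sets of user ids.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Event:
    """One interaction crossing the bridge. All field names are hypothetical."""
    day: date
    user: str
    source: str  # source network, e.g. 'web'
    dest: str    # destination network, e.g. 'fediverse'
    type: str    # activity type, e.g. 'post', 'like'


class Rollup:
    """Keeps per-day aggregates so that raw events need not be stored."""

    def __init__(self):
        # (day, source, dest) -> set of user ids active that day
        self.active = defaultdict(set)
        # (day, source, dest, type) -> number of interactions
        self.counts = Counter()

    def record(self, event):
        self.active[(event.day, event.source, event.dest)].add(event.user)
        self.counts[(event.day, event.source, event.dest, event.type)] += 1

    def dau(self, day, source=None, dest=None):
        """Distinct users active on `day`, optionally filtered by network."""
        return len(self._users(day, day, source, dest))

    def mau(self, day, source=None, dest=None, window=30):
        """Distinct users active in the `window` days ending on `day`."""
        start = day - timedelta(days=window - 1)
        return len(self._users(start, day, source, dest))

    def _users(self, start, end, source, dest):
        # Union the per-day sets so a user active on several days counts once.
        users = set()
        for (d, s, t), ids in self.active.items():
            if start <= d <= end and source in (None, s) and dest in (None, t):
                users |= ids
        return users


rollup = Rollup()
rollup.record(Event(date(2025, 1, 15), 'alice', 'web', 'fediverse', 'post'))
rollup.record(Event(date(2025, 1, 16), 'alice', 'web', 'fediverse', 'like'))
rollup.record(Event(date(2025, 1, 16), 'bob', 'bluesky', 'fediverse', 'post'))
print(rollup.dau(date(2025, 1, 16)), rollup.mau(date(2025, 1, 16)))
```

The design choice worth noting is that MAU can't be derived by summing DAU, since the same user may be active on many days. Some per-user signal (a set, a last-active date, or a sketch like HyperLogLog) has to survive the rollup. The per-cost bullets would then just divide a billing total by these counts.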

cc @anujahooja

@snarfed snarfed added the infra label Jan 16, 2025