Given that Mastodon tries to avoid algorithms that impose preferences onto users, it would be good to thoroughly document the algorithms used in the "Explore" sections.
The current documentation describes them only in vague terms:
- "View hashtags that are currently being used more frequently than usual."
- "Links that have been shared more than others."
- "Tags that are being used more frequently within the past week." (this also looks incorrect: the code below compares a tag's usage with the previous day, not with the past week)
The links at the bottom of the documentation page lead to code with no comments.
What is the threshold? How is frequency determined? How is the parent pool of hashtags found?
After digging for an hour, I found these functions in the code:
    expected = tag.history.get(at_time - 1.day).accounts.to_f
    expected = 1.0 if expected.zero?
    observed = tag.history.get(at_time).accounts.to_f
    max_time = tag.max_score_at
    max_score = tag.max_score
    max_score = 0 if max_time.nil? || max_time < (at_time - options[:max_score_cooldown])
    score = if expected > observed || observed < options[:threshold]
      0
    else
      ((observed - expected)**2) / expected
    end
    if score > max_score
      max_score = score
      max_time = at_time
      # Not interested in triggering any callbacks for this
      tag.update_columns(max_score: max_score, max_score_at: max_time)
    end
    decaying_score = max_score * (0.5**((at_time.to_f - max_time.to_f) / options[:max_score_halflife].to_f))
    next unless decaying_score >= options[:decay_threshold]
    items << { score: decaying_score, item: tag }
From this, I can see that the admin can alter the behaviour through the options `threshold`, `max_score_cooldown`, `max_score_halflife` and `decay_threshold`.
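To make the formula easier to discuss, here is a minimal plain-Ruby sketch of how I read the snippet above. It is not the Mastodon implementation: the method names `raw_score` and `decayed_score` are my own, the option values are made up for illustration (they are not Mastodon defaults), and the cooldown and database update are left out. The score has the shape of a single term of Pearson's chi-squared statistic, comparing today's account count (observed) with yesterday's (expected).

```ruby
# Plain-Ruby sketch of the tag scoring quoted above; no Rails required.
# The option values are hypothetical, chosen only to make the arithmetic easy.
OPTIONS = {
  threshold: 5,               # minimum number of accounts using the tag today
  max_score_halflife: 14_400  # seconds for the peak score to halve (4 hours)
}.freeze

# How far today's usage (observed) exceeds yesterday's (expected).
# Returns 0 when usage fell or when it is below the threshold.
def raw_score(observed, expected, threshold)
  expected = 1.0 if expected.zero?
  return 0.0 if expected > observed || observed < threshold

  ((observed - expected)**2) / expected
end

# Exponential decay of the peak score since the moment it was reached.
def decayed_score(max_score, seconds_since_max, halflife)
  max_score * (0.5**(seconds_since_max / halflife.to_f))
end

peak  = raw_score(30.0, 10.0, OPTIONS[:threshold])                 # (30 - 10)^2 / 10
later = decayed_score(peak, 14_400, OPTIONS[:max_score_halflife]) # one half-life later
puts "peak=#{peak} after one half-life=#{later}"
```

With these example numbers a tag used by 30 accounts today and 10 yesterday peaks at 40.0 and has decayed to 20.0 four hours later. A tag stays in the list only while the decayed value is at or above `decay_threshold`.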
I think it would be nice to be transparent about the algorithms used, both to users and to developers.
I suggest two improvements:
- On the "Explore" page, add a "more information" link to the "These are posts from across the social web that are gaining traction today. Newer posts with more boosts and favorites are ranked higher." popup, leading to a documentation page that presents the algorithm configuration used on this instance. Do the same for each of the Posts, Hashtags, People and News tabs.
- On that documentation page, show the algorithm together with the option values configured on the instance.
For the posts, this could look something like:
<details><summary>These are posts from across the social web that are gaining traction today. Newer posts with more boosts and favorites are ranked higher (click for details)</summary>
<p>This instance uses the algorithm below with the options:</p>
<ul>
<li>threshold = 100</li>
<li>score_halflife = 1234s</li>
</ul>
<p>The popularity score of an eligible post is computed from its number of reblogs and favourites and its age in seconds:</p>
<blockquote><pre>
expected = 1.0
observed = reblogs_count + favourites_count
if expected > observed or observed < threshold:
    score = 0
else:
    score = ((observed - expected)**2) / expected * (0.5**(age / score_halflife))
</pre></blockquote>
</details>
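For anyone who wants to try the numbers, the pseudocode in that example can be written as runnable Ruby. This is only a sketch of the proposed documentation text, not code taken from Mastodon: `post_score` is a hypothetical name, and the threshold and half-life are the example values from the mock-up above.

```ruby
# Runnable Ruby version of the pseudocode in the mock-up above.
# Parameter names follow the pseudocode; all values below are illustrative.
def post_score(reblogs_count:, favourites_count:, age:, threshold:, score_halflife:)
  expected = 1.0
  observed = (reblogs_count + favourites_count).to_f
  # Posts below the interaction threshold never trend.
  return 0.0 if expected > observed || observed < threshold

  # Squared excess over the baseline, halved every score_halflife seconds.
  ((observed - expected)**2) / expected * (0.5**(age / score_halflife.to_f))
end

fresh = post_score(reblogs_count: 60, favourites_count: 41, age: 0,
                   threshold: 100, score_halflife: 1234)
aged  = post_score(reblogs_count: 60, favourites_count: 41, age: 1234,
                   threshold: 100, score_halflife: 1234)
puts "fresh=#{fresh} aged=#{aged}"
```

With 101 interactions the fresh post scores 10000.0, and the same post one half-life (1234 seconds) later scores 5000.0, which shows how strongly the ranking favours newer posts.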
The documentation statements quoted above come from https://docs.joinmastodon.org/methods/trends/.