Feat: improve influencers endpoint #234

amirRamirfatahi · 2024-11-29T21:43:28Z

Pre-submission Checklist

For tests to work, you need working Neo4j and Redis instances with the example dataset in docker/db-graph.

Testing: Implement and pass new tests for the new features/fixes, cargo test.
Performance: Ensure new code has relevant performance benchmarks, cargo bench

tipogi

For future reference, it would be helpful to avoid merging one PR into another, as it makes the changes confusing to review.

tipogi · 2024-12-06T14:47:18Z

src/models/user/influencers.rs

The Influencers struct can remain minimal, as its primary purpose is limited to this module. Unnecessary data structures like the HashMap can be removed, simplifying the model and reducing overhead.

The Influencers struct is only used within this module, so there are no external dependencies on its structure. The stream module (src/user/stream.rs) only needs the pubkys to create UserViews afterwards, so we can return a vector of tuples (Vec<(String, u64)>).

All required functionality, including filtering and ranking influencers by timeframe, can be achieved directly with Redis Sorted Sets.

The Cypher query for influencers, filtered by timeframe, can be optimized to return an array of arrays (e.g., [[pubky, score], [...]]). This eliminates the need for Influencers struct conversions.

let result = retrieve_from_graph::<Vec<(String, u64)>>(query, "influencers").await?;

All Redis indexes that have an expiration time are prefixed with the CACHE keyword for clarity and consistency throughout the project. This prefix marks them as caches and helps distinguish them from other keys in Redis.

tipogi · 2024-12-06T14:48:43Z

src/routes/v0/stream/users.rs

@@ -31,7 +34,10 @@ pub struct UserStreamQuery {
        ("skip" = Option<usize>, Query, description = "Skip N followers"),
        ("limit" = Option<usize>, Query, description = "Retrieve N followers"),
        ("source" = Option<UserStreamSource>, Query, description = "Source of users for the stream."),
-        ("depth" = Option<usize>, Query, description = "User trusted network depth, user following users distance. Numbers bigger than 4, will be ignored")
+        ("source_reach" = Option<StreamReach>, Query, description = "The target reach of the source. Supported in 'influencers' source."),


To stay consistent with the naming conventions, this should be reach.

This reach is related to the source, so it's the source's reach, hence the naming.

The reach parameter name is concise, aligns with existing conventions, and avoids unnecessary redundancy. The swagger description already clarifies its relationship to the source. Another point: source_reach cannot be combined with the other UserStreamSource values, so if you want to add a prefix, a better candidate would be influencer_reach, but I do not see the need for either...

tipogi · 2024-12-06T14:52:41Z

src/routes/v0/stream/users.rs

        query.depth,
+        Some(timeframe),
+        query.preview,
    )
    .await


If there are no influencers in the selected timeframe, it throws a 500 error, when it should be a 404.

Is it using the same function we already have? I don't think so, as this function returns None if the inner user id list is empty. Am I missing something?

OK, fixed it.
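A minimal sketch of the 404-versus-500 point above, assuming a hypothetical ApiError type and helper name (neither is taken from the project). It treats both a missing list and an empty list as "not found" rather than letting the case surface as an internal error.

```rust
// Hypothetical error type; the project's real error enum may differ.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ApiError {
    NotFound(String),
    Internal(String),
}

impl ApiError {
    // HTTP status code the route layer would answer with.
    fn status_code(&self) -> u16 {
        match self {
            ApiError::NotFound(_) => 404,
            ApiError::Internal(_) => 500,
        }
    }
}

// Treat both a missing list (None) and an empty list as "not found".
fn influencers_or_not_found(ids: Option<Vec<String>>) -> Result<Vec<String>, ApiError> {
    match ids {
        Some(list) if !list.is_empty() => Ok(list),
        _ => Err(ApiError::NotFound(
            "no influencers in the selected timeframe".to_string(),
        )),
    }
}

fn main() {
    // An empty result maps to a 404-style error instead of a 500.
    let empty = influencers_or_not_found(Some(Vec::new()));
    println!("{:?}", empty.map_err(|e| e.status_code()));
}
```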

tipogi · 2024-12-13T14:40:29Z

src/models/user/influencers.rs

+use crate::queries;
+use crate::RedisOps;
+
+const GLOBAL_INFLUENCERS_PREFIX: &str = "Cached:Influencers";


Wrong cache prefix: it has to be Cache.
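One way to keep the prefix from drifting is to build every expiring key through a single helper. This is only a sketch: cache_key and the "today" segment are hypothetical, and the ':' separator is assumed from the constant in the diff above.

```rust
// Prefix for Redis keys that carry an expiration time (per the review comment).
const CACHE_PREFIX: &str = "Cache";

// Hypothetical helper: joins the prefix and the key parts with ':'.
fn cache_key(parts: &[&str]) -> String {
    let mut segments = vec![CACHE_PREFIX];
    segments.extend_from_slice(parts);
    segments.join(":")
}

fn main() {
    // "today" is an illustrative timeframe segment, not a real key part.
    println!("{}", cache_key(&["Influencers", "today"]));
}
```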

tipogi · 2024-12-13T14:44:41Z

src/models/user/influencers.rs

+#[derive(Deserialize, Serialize, ToSchema, Debug, Clone)]
+pub struct Influencer {
+    pub id: String,
+    pub score: f64,
+}
+
+// Define a newtype wrapper
+#[derive(Serialize, Deserialize, Debug, ToSchema, Default, Clone)]
+pub struct Influencers(pub Vec<Influencer>);
+
+impl RedisOps for Influencers {}
+
+// Create a Influencers instance directly from an iterator of Influencer items
+// Need it in collect()
+impl FromIterator<Influencer> for Influencers {
+    fn from_iter<I: IntoIterator<Item = Influencer>>(iter: I) -> Self {
+        Influencers(iter.into_iter().collect())
+    }
+}
+
+// Implement Deref so Influencers can be used like Vec<String>
+impl Deref for Influencers {
+    type Target = Vec<Influencer>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
+


The Influencer struct feels unnecessary, as a tuple (score, pubky) seems to capture the same logic effectively. A vector of such tuples, Vec<(score, pubky)>, in place of the current Influencers struct, would suffice, reducing complexity without losing clarity.

I wouldn't call it unnecessary.
Having a struct for it means in the future we can easily add more properties to it if needed, without having to refactor a lot of code.
This isn't hurting readability, performance or consistency, so I don't think it hurts to keep it here.
What do you think?

You know what I have thought from the beginning 😁. That user data will always be used to fetch UserViews, so I do not think we will need that struct, now or in the future. @SHAcollision, any opinion on that?

tipogi · 2024-12-13T14:46:24Z

src/models/user/influencers.rs

+            Some(user_id) => {
+                Influencers::get_influencers_by_reach(
+                    user_id,
+                    reach.unwrap(),


What if reach is None? Risk of panic.

reach couldn't be None because of previous checks.
Nonetheless, you are correct. Will fix it.
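A sketch of replacing the unwrap() with an explicit error, assuming hypothetical stand-ins for StreamReach (the variant names are invented here) and for the error type. Option::ok_or combined with ? turns a missing reach into an Err instead of a panic.

```rust
// Hypothetical stand-ins for the project's types; variant names are assumptions.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum StreamReach {
    Followers,
    Following,
    Friends,
}

#[derive(Debug, PartialEq)]
enum StreamError {
    MissingReach,
}

// Returns an error instead of panicking when `reach` is None.
fn describe_query(user_id: &str, reach: Option<StreamReach>) -> Result<String, StreamError> {
    let reach = reach.ok_or(StreamError::MissingReach)?;
    Ok(format!("{}:{:?}", user_id, reach))
}

fn main() {
    println!("{:?}", describe_query("alice", Some(StreamReach::Friends)));
    println!("{:?}", describe_query("alice", None));
}
```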

tipogi · 2024-12-13T16:03:24Z

src/models/user/stream.rs

-            .map(|set| set.into_iter().map(|(user_id, _score)| user_id).collect()),
+            .map(|result| {
+                result
+                    .iter()
+                    .map(|influencer| influencer.id.clone())
+                    .collect()
+            }),


I think the previous implementation was correct. We do not need to clone() the user pubky, nor create new structs.

#234 (comment)

Yes, that depends on the Influencers struct 👍
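For illustration, the tuple-based version discussed in this thread can be sketched as below. into_ids is a hypothetical name, and the (String, u64) shape follows the retrieve_from_graph call quoted earlier in the review. Consuming the vector with into_iter() moves each pubky out, so no String is cloned.

```rust
// Consumes the scored list and moves each pubky out; no String is cloned.
fn into_ids(scored: Vec<(String, u64)>) -> Vec<String> {
    scored
        .into_iter()
        .map(|(user_id, _score)| user_id)
        .collect()
}

fn main() {
    // Sample data for illustration only.
    let scored = vec![("pk_a".to_string(), 12), ("pk_b".to_string(), 7)];
    let ids = into_ids(scored);
    println!("{:?}", ids);
}
```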

tipogi · 2024-12-13T16:07:59Z

src/routes/v0/stream/users.rs

+    match UserStream::get_by_id(&UserStreamInput {
+        user_id: query.user_id.clone(),
+        viewer_id: query.viewer_id,
+        skip: Some(skip),
+        limit: Some(limit),
+        source: source.clone(),
+        source_reach: query.source_reach,
+        depth: query.depth,
+        timeframe: Some(timeframe),
+        preview: query.preview,
+    })


This function exceeds the recommended maximum of 7 parameters, which may cause a clippy warning

That's why it's a struct and the function is just getting one argument.

Oops! You are right, I missed that struct haha
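The pattern being described, sketched with a trimmed-down and partly hypothetical input struct (only a few of the fields from the diff, with invented fallback values): the function takes one argument, and callers fill in only the fields they care about.

```rust
// Trimmed-down input struct; field names follow the diff above,
// but the types and defaults here are assumptions.
#[allow(dead_code)]
#[derive(Debug, Default)]
struct UserStreamInput {
    user_id: Option<String>,
    skip: Option<usize>,
    limit: Option<usize>,
    preview: Option<bool>,
}

// A single struct argument keeps the function signature short.
fn describe(input: &UserStreamInput) -> String {
    format!(
        "skip={} limit={}",
        input.skip.unwrap_or(0),
        input.limit.unwrap_or(10)
    )
}

fn main() {
    // Struct update syntax fills the remaining fields with defaults.
    let input = UserStreamInput {
        skip: Some(5),
        ..Default::default()
    };
    println!("{}", describe(&input));
}
```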

SHAcollision · 2024-12-19T10:27:43Z

src/models/user/stream.rs

 use crate::{db::kv::index::sorted_sets::SortOrder, RedisOps};
 use crate::{get_neo4j_graph, queries};
 use serde::{Deserialize, Serialize};
 use tokio::task::spawn;
 use utoipa::ToSchema;

 pub const USER_MOSTFOLLOWED_KEY_PARTS: [&str; 2] = ["Users", "MostFollowed"];
-pub const USER_PIONEERS_KEY_PARTS: [&str; 2] = ["Users", "Pioneers"];
+pub const USER_INFLUENCERS_KEY_PARTS: [&str; 2] = ["Users", "Influencers"];


This is a breaking change to the index schemas and we do not have a "migration". What is the plan? Is this rename needed just yet? If we update our staging server, the feature will stop working.

This was fixed this way because the graph is still too small for a reindex to be costly.

amirRamirfatahi added 7 commits November 29, 2024 14:23

rename pioneers to influencers

22519d0

Merge branch 'feature/hot-tags' into feature/influencers

65d5bf6

add influencers

233bbf3

Add queries

97a8a65

fix build

188fe22

Add reindex

e3f19c4

Use new influencers in user stream

d62f98f

JeanlChristophe mentioned this pull request Dec 3, 2024

Feat: improve pioneers/influencers #89

Open

amirRamirfatahi marked this pull request as ready for review December 3, 2024 13:48

amirRamirfatahi self-assigned this Dec 3, 2024

amirRamirfatahi added the 🔮 nexus label Dec 3, 2024

Merge branch 'main' into feature/influencers

4549801

amirRamirfatahi added 🌐 service refactor Non-Feature Update labels Dec 3, 2024

amirRamirfatahi added 3 commits December 3, 2024 10:00

fix lint issues

42d9688

Add more tests and fix review issues

99e0a59

Merge branch 'feature/hot-tags' into feature/influencers

1fdac36

amirRamirfatahi force-pushed the feature/influencers branch from c39dc2a to 1fdac36 Compare December 5, 2024 08:02

amirRamirfatahi added 4 commits December 5, 2024 03:21

Add preview option to influencers

b76cdd2

fix broken test from merge conflict

bd37289

remove reach from influencer enum and move it to query param

4d52912

fix query string syntax error

85f7465

tipogi self-requested a review December 6, 2024 09:21

tipogi reviewed Dec 6, 2024

amirRamirfatahi added 4 commits December 11, 2024 00:37

Merge branch 'main' into feature/influencers

6b91172

Refactor Influencers cache

2b4d5a2

fix lint issues

4f18737

Add tests and fix found bugs

1b49dbd

amirRamirfatahi force-pushed the feature/influencers branch from a0403f7 to 1b49dbd Compare December 11, 2024 22:08

tipogi reviewed Dec 13, 2024

amirRamirfatahi added 4 commits December 13, 2024 17:00

rename source_reach to reach

f762204

fix cache prefixes using cached instead of cache

072a65a

remove unsafe unwraps

66ab11c

Merge branch 'main' into feature/influencers

7caf20f

SHAcollision linked an issue Dec 19, 2024 that may be closed by this pull request

Feat: improve pioneers/influencers #89

Open

SHAcollision changed the title ~~Feature/influencers~~ feat (service): add reach timeframe query params to influencer endpoint Dec 19, 2024

SHAcollision reviewed Dec 19, 2024

amirRamirfatahi changed the title ~~feat (service): add reach timeframe query params to influencer endpoint~~ influencers Dec 20, 2024

SHAcollision changed the title ~~influencers~~ Feat: improve influencers endpoint Jan 1, 2025

amirRamirfatahi added 2 commits January 13, 2025 08:08

Merge branch 'main' into feature/influencers

4162acc

Merge branch 'main' into feature/influencers

c47b9dd

SHAcollision requested a review from tipogi January 14, 2025 11:11
