-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: improve influencers endpoint #234
base: main
Are you sure you want to change the base?
Conversation
c39dc2a
to
1fdac36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future reference, it would be helpful to avoid merging one PR into another, as it can make things a bit confusing to review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Influencers
struct can remain minimal, as its primary purpose is limited to this module. Unnecessary data structures like the HashMap
can be removed, simplifying the model and reducing overhead.
- The
Influencers
struct is only used within this module, there are no external dependencies on its structure. The stream module (src/user/stream.rs
), only needspubky
s to after createUserView
s so we can return the vector of tuples (Vec<(String, u64)>
) - All required functionality, including filtering and ranking influencers by timeframe, can be achieved directly with Redis Sorted Sets.
- The cypher query for influencers, filtered by timeframe, can be optimized to return an array of arrays (e.g., [[pubky, score], [...]]). This eliminates the need for
Influencers
struct conversions.
let result = retrieve_from_graph::<Vec<(String, u64)>>(query, "influencers").await?;
- All Redis indexes that have an expiration time are prefixed with
CACHE
keyword for clarity and consistency throughout the project. This prefix denotes their cache nature and helps distinguish them from other keys in Redis
src/routes/v0/stream/users.rs
Outdated
@@ -31,7 +34,10 @@ pub struct UserStreamQuery { | |||
("skip" = Option<usize>, Query, description = "Skip N followers"), | |||
("limit" = Option<usize>, Query, description = "Retrieve N followers"), | |||
("source" = Option<UserStreamSource>, Query, description = "Source of users for the stream."), | |||
("depth" = Option<usize>, Query, description = "User trusted network depth, user following users distance. Numbers bigger than 4, will be ignored") | |||
("source_reach" = Option<StreamReach>, Query, description = "The target reach of the source. Supported in 'influencers' source."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To continue with the naming conventions, it should be reach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this reach
is related to the source
so it's source's reach, hence the naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reach
parameter name is concise, aligns with existing conventions, and avoids unnecessary redundancy. The swagger description already clarifies its relationship to the source. Another point, source_reach
cannot be combined with the other UserStreamSource
so better candidate, if you want to add a prefix, would be influencer_reach
but I do not see neither...
query.depth, | ||
Some(timeframe), | ||
query.preview, | ||
) | ||
.await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it does not have any influencers in the selected timeframe
, it throws a 500 error which it should be 404 type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's using the same function we already have? I don't think so as this function is returning a None if the inner user id list is empty. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok fixed it.
a0403f7
to
1b49dbd
Compare
src/models/user/influencers.rs
Outdated
use crate::queries; | ||
use crate::RedisOps; | ||
|
||
const GLOBAL_INFLUENCERS_PREFIX: &str = "Cached:Influencers"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong cache prefix, it has to be Cache
#[derive(Deserialize, Serialize, ToSchema, Debug, Clone)] | ||
pub struct Influencer { | ||
pub id: String, | ||
pub score: f64, | ||
} | ||
|
||
// Define a newtype wrapper | ||
#[derive(Serialize, Deserialize, Debug, ToSchema, Default, Clone)] | ||
pub struct Influencers(pub Vec<Influencer>); | ||
|
||
impl RedisOps for Influencers {} | ||
|
||
// Create a Influencers instance directly from an iterator of Influencer items | ||
// Need it in collect() | ||
impl FromIterator<Influencer> for Influencers { | ||
fn from_iter<I: IntoIterator<Item = Influencer>>(iter: I) -> Self { | ||
Influencers(iter.into_iter().collect()) | ||
} | ||
} | ||
|
||
// Implement Deref so Influencers can be used like Vec<String> | ||
impl Deref for Influencers { | ||
type Target = Vec<Influencer>; | ||
|
||
fn deref(&self) -> &Self::Target { | ||
&self.0 | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of Influencer struct feels unnecessary, as a tuple (score, pubky)
seems to capture the same logic effectively. A vector of such tuples, Vec<(score, pubky)>
, Influencers struct now, would suffice, reducing complexity without losing clarity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't call it unnecessary.
Having a struct for it means in the future we can easily add more properties to it if needed, without having to refactor a lot of code.
This isn't hurting readability, performance or consistency, so I don't think it hurts to keep it here.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know what I think from the beginning 😁. That data about users will always go to get UserViews
so now and in the future, I do not think that we will need that struct. @SHAcollision any opinion about that?
src/models/user/influencers.rs
Outdated
Some(user_id) => { | ||
Influencers::get_influencers_by_reach( | ||
user_id, | ||
reach.unwrap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about if reach
is None
. Risk of panic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reach couldn't be none because of previous checks.
Nonetheless, you are correct. will fix it.
.map(|set| set.into_iter().map(|(user_id, _score)| user_id).collect()), | ||
.map(|result| { | ||
result | ||
.iter() | ||
.map(|influencer| influencer.id.clone()) | ||
.collect() | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the previous implementation was correct. We do not need to clone()
the user pubky plus do not the create new structs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that depends in the Influencers
struct 👍
src/routes/v0/stream/users.rs
Outdated
match UserStream::get_by_id(&UserStreamInput { | ||
user_id: query.user_id.clone(), | ||
viewer_id: query.viewer_id, | ||
skip: Some(skip), | ||
limit: Some(limit), | ||
source: source.clone(), | ||
source_reach: query.source_reach, | ||
depth: query.depth, | ||
timeframe: Some(timeframe), | ||
preview: query.preview, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function exceeds the recommended maximum of 7 parameters, which may cause a clippy
warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why it's a struct and the function is just getting one argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ups! you are right, I miss that struct haahh
use crate::{db::kv::index::sorted_sets::SortOrder, RedisOps}; | ||
use crate::{get_neo4j_graph, queries}; | ||
use serde::{Deserialize, Serialize}; | ||
use tokio::task::spawn; | ||
use utoipa::ToSchema; | ||
|
||
pub const USER_MOSTFOLLOWED_KEY_PARTS: [&str; 2] = ["Users", "MostFollowed"]; | ||
pub const USER_PIONEERS_KEY_PARTS: [&str; 2] = ["Users", "Pioneers"]; | ||
pub const USER_INFLUENCERS_KEY_PARTS: [&str; 2] = ["Users", "Influencers"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change of Index schemas and we do not have a "migration". What is the plan? Is this rename needed just yet? If we update or staging server the feature will stop working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was fixed like this because the graph is still too small for a reindex to be costly
Pre-submission Checklist
cargo test
.cargo bench