Hi there,
I am still learning reclin's functionality but have been pretty happy with this package, so thank you for your work and for making this utility available.
I am working on validating the linkage and choosing a proper threshold, and in doing so I am trying to retain the weights, but for some reason I am losing them. I am looking for confirmation that the approach I am taking should work. When I join the Linked.. data frame with the P_Link_Atts.. data frame on Id.x and Id.y, I would expect every linked record to pick up its weight, but the vast majority of the Linked.. records don't get one. Looking into the p object and the associated P_Link_Atts.. data frame, many linkages shown in the Linked.. data frame are not in the weights.
My presumption is that the x and y values are row names, so I create separate columns titled "Id.x" and "Id.y" as joining keys, but maybe that's where I am going wrong.
My goal is just to retain the weight values after applying the link() function, so I can check how my linkage performs by weight and adjust the threshold. Thanks for any help; I hope this issue is clear. Sorry that I can't supply data, as it is filled with PII, but if a more workable example is necessary I can build some vignette data.
library(reclin)
library(dplyr) # for %>%, mutate() and left_join()
#Blocking
p <- pair_blocking(Select_Ems..,Select_Partic.., c("County","Crash_Week"), large = FALSE)
#Compare the records on their linkage keys - basic
#p <- compare_pairs(p, by = c("First_Name","Middle_Initial","Last_Name","DOB","Sex"))
#Compare using Jaro-Winkler
#p <- compare_pairs(p, by = c("First_Name","Middle_Initial","Last_Name","DOB","Sex","Crash_Date"),
#  default_comparator = jaro_winkler(0.9), overwrite = TRUE)
#Compare using Jaro-Winkler, with DOB split into day, month and year
p <- compare_pairs(p,
  by = c("First_Name","Middle_Initial","Last_Name","DOB_Day","DOB_Month","DOB_Year","Sex","Crash_Date"),
  default_comparator = jaro_winkler(0.9), overwrite = TRUE)
#Force 1 to 1 linkage
p_4 <- select_n_to_m(p, "weight", var = "ntom", threshold = 2.2)
#Keep only links with x id
Linked.. <- link(p_4, all_x = TRUE, all_y = FALSE)
#Create a data frame object of linked data attributes
P_Link_Atts.. <- as.data.frame(p) %>% mutate(Id.x = as.character(x), Id.y = as.character(y))
#Join weights
Linked.. <- left_join(Linked.., P_Link_Atts.., by = c("Id.x","Id.y"))