AlignK not aligning clusters across k #93

Open
RvV1979 opened this issue Sep 26, 2023 · 2 comments
RvV1979 commented Sep 26, 2023

I am having trouble aligning clusters across K in my slist (attached in RDS format here: https://github.com/royfrancis/pophelper/files/12670714/slist.RDS.zip).

My code

slist <- readRDS("slist.RDS")
plotQ(slist, returnplot = TRUE, exportplot = FALSE, imgoutput = "join")

generated the plot below:
[image: plotq_slist]

The same call with cluster alignment added,

plotQ(alignK(slist), returnplot = TRUE, exportplot = FALSE, imgoutput = "join")

generated the plot below:
[image: plotQ_alignK_slist]

As you can see, the clusters are now aligned within each k. However, they are still far from aligned across different values of k. Do you have any advice on how to improve this?

Thanks

@henni164

Just wanted to bump this. I have encountered the same issue with alignK across K when using ADMIXTURE input.

My fix was to rename the headers in the qlist dataframes for subsequent Ks based on the cluster names from the smallest K... a bit manual but it did work, since the cluster colors match the cluster names when you have not done alignK().

In your case, you could swap the column names Cluster3 and Cluster4 at K = 4, with something like:

colnames(slist[[5]]) <- c("Cluster1","Cluster2","Cluster4","Cluster3")
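
As a minimal sketch of that relabelling, here it is on a made-up data frame standing in for one K = 4 run. The q4 object and its values are invented for illustration; in practice it would be the relevant element of your own qlist.

```r
# Toy stand-in for one K = 4 run (values are invented, rows sum to 1).
q4 <- data.frame(
  Cluster1 = c(0.70, 0.10, 0.10),
  Cluster2 = c(0.10, 0.70, 0.10),
  Cluster3 = c(0.05, 0.05, 0.10),
  Cluster4 = c(0.15, 0.15, 0.70)
)

# Swap the labels of the third and fourth columns.
# Only the names change; the values stay in their original positions.
colnames(q4) <- c("Cluster1", "Cluster2", "Cluster4", "Cluster3")

print(q4)
```

After this, the column that used to be called Cluster3 carries the name Cluster4 and vice versa, which is the swap described above.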

Hopefully this will help you at least get the look you are going for.

Cheers

RvV1979 commented Nov 15, 2023

Thanks for your reply. I was specifically looking for a non-manual solution. In the end, I implemented a fix based on the fix_colours() function from https://github.com/TCLamnidis/AdmixturePlotter, but applied to an slist. This seems to work for me (I am not a coder, however, so there may still be plenty of room for improvement).

# Attempt to replicate fix_colours() from the AdmixturePlotter script.
# Assumes the runs in slist are ordered by increasing k.
slistk <- tabulateQ(slist)$k
for (i in seq_along(slistk)[-1]) {
  k <- slistk[i]
  prevk <- slistk[i - 1]
  slistattributes <- attributes(slist[[i]]) # copy attributes
  # rows: clusters of the previous run, columns: clusters of the current run
  cor_mat <- cor(slist[[i - 1]], slist[[i]])
  # for each previous cluster, the best-correlated current cluster
  component_order <- apply(cor_mat, 1, which.max)
  while (any(duplicated(component_order))) {
    duplicates <- as.numeric(component_order[which(duplicated(component_order))])
    for (duplicate in duplicates) {
      condition <- TRUE
      while (condition) {
        # remove lowest-scoring duplicate by setting correlation to -100
        cor_mat[cor_mat == min(cor_mat[component_order == duplicate, duplicate])] <- -100
        component_order <- apply(cor_mat, 1, which.max)
        # keep going as long as this column is still claimed more than once
        if (!(duplicate %in% as.numeric(component_order[which(duplicated(component_order))]))) {
          condition <- FALSE
        }
      }
    }
  }
  # add the additional component(s) when k is increased
  if (k != prevk) {
    component_order <- c(component_order, setdiff(1:k, component_order))
  }
  slist[[i]] <- slist[[i]][, component_order]
  attributes(slist[[i]]) <- slistattributes # restore attributes (including column names)
}
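
To show the core idea of the loop above in isolation, here is a small self-contained sketch in base R on invented matrices. The match_clusters() helper, the prev and curr objects, and all values are hypothetical and not part of pophelper or AdmixturePlotter. For brevity the sketch stops when two previous clusters pick the same column, rather than resolving the conflict as the loop above does.

```r
# Hypothetical helper: return the column order of `curr` that lines its
# clusters up with those of `prev`, based on per-sample correlation.
match_clusters <- function(prev, curr) {
  cor_mat <- cor(prev, curr)           # rows: prev clusters, cols: curr clusters
  ord <- apply(cor_mat, 1, which.max)  # best curr column for each prev cluster
  stopifnot(!any(duplicated(ord)))     # this sketch does not resolve conflicts
  c(ord, setdiff(seq_len(ncol(curr)), ord))  # unmatched (new) clusters go last
}

# Invented K = 2 run for five samples.
prev <- cbind(a = c(0.9, 0.8, 0.1, 0.2, 0.1),
              b = c(0.1, 0.2, 0.9, 0.8, 0.9))

# Invented K = 3 run for the same samples, with clusters in a different order.
curr <- cbind(x = c(0.1, 0.1, 0.5, 0.7, 0.2),
              y = c(0.0, 0.1, 0.4, 0.1, 0.7),
              z = c(0.9, 0.8, 0.1, 0.2, 0.1))

ord <- match_clusters(prev, curr)
aligned <- curr[, ord]  # columns reordered to follow prev, new cluster last
print(colnames(aligned))
```

Here column z is identical to cluster a of the previous run, so it moves to the first position, and the cluster with no counterpart in the K = 2 run is appended at the end.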
