In what scenarios should QuantManagedCollisionEmbeddingCollection be used? #2724

Open
tiankongdeguiji opened this issue Feb 5, 2025 · 5 comments

Comments

@tiankongdeguiji
Contributor

I believe that when we need to use a quantized ManagedCollisionEmbeddingCollection, we can configure the mapping from EmbeddingCollection to QuantEmbeddingCollection within quantize_dynamic. This makes the existence of a QuantManagedCollisionEmbeddingCollection class unnecessary. In what scenarios should QuantManagedCollisionEmbeddingCollection be used?
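
The argument above can be illustrated with a minimal, pure-Python sketch. It assumes a type-to-type mapping is applied recursively to child modules, so the inner collection is swapped while the managed-collision wrapper keeps its own class. All `Toy*` classes and `swap_by_type` are hypothetical stand-ins written for illustration; they are not the TorchRec or PyTorch APIs.

```python
from typing import Dict, Type


class ToyModule:
    """Minimal container with named children (hypothetical stand-in for a module)."""

    def __init__(self, **children: "ToyModule") -> None:
        self.children: Dict[str, "ToyModule"] = dict(children)


class ToyEmbeddingCollection(ToyModule):
    """Stand-in for the float embedding collection."""


class ToyQuantEmbeddingCollection(ToyModule):
    """Stand-in for the quantized embedding collection."""

    @classmethod
    def from_float(cls, module: ToyEmbeddingCollection) -> "ToyQuantEmbeddingCollection":
        # A real conversion would quantize the weights; here we only change the type.
        return cls()


class ToyManagedCollisionEmbeddingCollection(ToyModule):
    """Stand-in for the wrapper that holds an embedding collection."""


def swap_by_type(root: ToyModule, mapping: Dict[Type[ToyModule], Type[ToyModule]]) -> ToyModule:
    """Replace every descendant whose exact type is a key in `mapping`."""
    for name, child in list(root.children.items()):
        target = mapping.get(type(child))
        if target is not None:
            root.children[name] = target.from_float(child)
        else:
            swap_by_type(child, mapping)
    return root


# The wrapper is left untouched; only the inner collection is replaced.
mc_ec = ToyManagedCollisionEmbeddingCollection(
    embedding_collection=ToyEmbeddingCollection()
)
swap_by_type(mc_ec, {ToyEmbeddingCollection: ToyQuantEmbeddingCollection})
```

In this sketch the wrapper type never changes, which is why a dedicated quantized wrapper class looks redundant from the mapping's point of view.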

@tiankongdeguiji
Contributor Author

Hi @iamzainhuda @kausv @dstaay-fb @henrylhtsang @PaulZhang12, could you take a look at this question?

@kausv
Contributor

kausv commented Feb 11, 2025

@tiankongdeguiji QuantManagedCollisionEmbeddingCollection is a stronger indication that the underlying EC is quantized. It is also intended for inference, since it implicitly calls reset_inference_mode().
The same functionality can be achieved without the class if needed.

@tiankongdeguiji
Contributor Author

> @tiankongdeguiji QuantManagedCollisionEmbeddingCollection is a stronger indication that the underlying EC is quantized. It is also to be used for inference as it implicitly calls reset_inference_mode(). The functionality can be achieved without using the class too if needed.

Thanks! I believe that during inference the MC module does not call profile(). Could you explain the rationale for setting self._evicted = False and self._last_eviction_iter = -1 in reset_inference_mode()? As far as I can tell, self._evicted and self._last_eviction_iter are only used inside profile().

@kausv
Contributor

kausv commented Feb 12, 2025

I think reset_inference_mode() should also set self.training to False, which we have missed.
Since profile() is called in forward, we want to skip it at inference time. We also don't want to perform any evictions.
This function does a lot more in another instance of ManagedCollision that I am working on open sourcing.
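
A minimal sketch of the behavior described in this comment. It assumes that forward only calls profile() while self.training is True and that eviction bookkeeping lives entirely inside profile(). `ToyManagedCollisionModule`, its modulo-based remap, and the interval-based eviction rule are hypothetical; only the attribute names `_evicted`, `_last_eviction_iter`, `training`, and the method names come from the thread.

```python
from typing import List


class ToyManagedCollisionModule:
    """Hypothetical stand-in; not the TorchRec implementation."""

    def __init__(self, table_size: int, eviction_interval: int = 2) -> None:
        self.training = True
        self._table_size = table_size
        self._eviction_interval = eviction_interval
        self._evicted = False
        self._last_eviction_iter = -1
        self._current_iter = 0
        self.profile_calls = 0  # counter kept only to make the sketch observable

    def profile(self, ids: List[int]) -> None:
        # Training-only bookkeeping: count the step and "evict" periodically.
        self.profile_calls += 1
        self._current_iter += 1
        if self._current_iter % self._eviction_interval == 0:
            self._evicted = True
            self._last_eviction_iter = self._current_iter

    def remap(self, ids: List[int]) -> List[int]:
        # Toy remapping: fold raw ids into the table range.
        return [i % self._table_size for i in ids]

    def forward(self, ids: List[int]) -> List[int]:
        if self.training:
            self.profile(ids)
        return self.remap(ids)

    def reset_inference_mode(self) -> None:
        # Clear eviction state and, as proposed in the thread, leave training
        # mode so that forward skips profile() and no eviction can occur.
        self._evicted = False
        self._last_eviction_iter = -1
        self.training = False


mc = ToyManagedCollisionModule(table_size=4)
mc.forward([1, 5])
mc.forward([9])            # second training step triggers the toy eviction
mc.reset_inference_mode()
remapped = mc.forward([6, 7])  # inference: remap only, no profiling
```

Under these assumptions, resetting the two eviction fields is harmless at inference time, and setting `training` to False is what actually prevents profile() from running.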

@tiankongdeguiji
Contributor Author

> I think reset_inference_mode() should also set self.training to False which we have missed. Since profile() is called in forward we want to skip that. We also don't want to perform any evictions. This function does a lot more in another instance of ManagedCollision that I am working on open sourcing.

Thanks! Could you also take a look at #2669?


2 participants