
Creating embeddings instead of output prediction #86

Open
JorritWillaert opened this issue Nov 20, 2022 · 1 comment

Comments

@JorritWillaert

Hi! I was wondering whether a GPT Code Clippy model could generate embeddings instead of output predictions?
The purpose is to embed code in a semantic space, so that the embedding can be used as a feature for another neural network. I have done the same with BERT (more as a baseline, since that model is not trained on code) and with the OpenAI Codex model (via a paid API), and would therefore love to use one of your models as well.

Thank you!
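For readers with the same goal: a common recipe (a sketch of the general technique, not a documented Code Clippy API) is to run the model with hidden-state output enabled and mean-pool the last hidden layer over the non-padding tokens. The pooling step itself is simple; here it is in plain Python, where `hidden` stands in for one sequence's per-token vectors and `mask` is the attention mask:

```python
# Mean-pool per-token vectors into a single fixed-size embedding,
# ignoring padding positions (mask == 0).
def mean_pool(hidden, mask):
    dim = len(hidden[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(hidden, mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    return [t / count for t in total]

# Two real tokens and one padding token.
hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(hidden, mask))  # [2.0, 3.0]
```

In practice `hidden` would come from the model (e.g. the last hidden state when calling it with `output_hidden_states=True` in Hugging Face Transformers), and the pooled vector is what you would feed to the downstream network.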

ncoop57 (Collaborator) commented Nov 26, 2022

Hi @JorritWillaert !

You could use our models for embeddings, but I would not recommend it. I'd suggest checking out GraphCodeBERT from Microsoft instead; it achieves good performance across a wide range of code-related tasks: https://huggingface.co/microsoft/graphcodebert-base
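Following that suggestion, a minimal sketch of extracting a code embedding from the linked GraphCodeBERT checkpoint with Hugging Face Transformers might look like this (mean-pooling the encoder's last hidden state; the model download requires network access):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the recommended GraphCodeBERT encoder and its tokenizer.
tok = AutoTokenizer.from_pretrained("microsoft/graphcodebert-base")
model = AutoModel.from_pretrained("microsoft/graphcodebert-base")

code = "def add(a, b):\n    return a + b"
inputs = tok(code, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# Mean-pool over the token dimension; with a single unpadded sequence
# every position is a real token, so a plain mean is fine.
embedding = out.last_hidden_state.mean(dim=1).squeeze(0)
print(embedding.shape)  # 768-dimensional (RoBERTa-base hidden size)
```

For batches of sequences you would weight the mean by the attention mask so padding tokens do not dilute the embedding.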
