
[Question] Can we reconstruct the original audio from the latent embeddings? #3

Open
sleepingcat4 opened this issue Sep 10, 2024 · 7 comments


@sleepingcat4

I wanted to use your project to create a dataset, but if the original audio can be reconstructed, then I can't. That's why I need to know before getting started. [I want to make latents for 20-30 TB of data]

@marcoppasini
Collaborator

Yes, latents can be decoded back to waveforms (this is the main goal of Music2Latent), so you may want to explore other models for your use case.
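For reference, the round trip is roughly `waveform -> encode -> latents -> decode -> waveform`, and you can quantify how faithful the reconstruction is with a signal-to-noise ratio. A minimal sketch below; the `EncoderDecoder` calls in the comment follow the README-style usage and are assumptions here, and the "reconstruction" is a synthetic stand-in so the snippet runs on its own:

```python
import numpy as np

# In practice the round trip would use music2latent, roughly:
#   from music2latent import EncoderDecoder
#   encdec = EncoderDecoder()
#   latents = encdec.encode(wv)      # waveform -> latents
#   wv_rec = encdec.decode(latents)  # latents -> waveform

def reconstruction_snr(wv, wv_rec):
    """Signal-to-noise ratio (dB) between original and decoded audio."""
    noise = wv - wv_rec
    return 10.0 * np.log10(np.sum(wv ** 2) / np.sum(noise ** 2))

# Synthetic stand-in: a 440 Hz sine plus a small "reconstruction" error.
sr = 44100
t = np.arange(sr) / sr
wv = np.sin(2 * np.pi * 440.0 * t)
wv_rec = wv + 0.01 * np.random.default_rng(0).standard_normal(sr)

print(f"{reconstruction_snr(wv, wv_rec):.1f} dB")
```

A high SNR on the decoded audio means the source material is effectively recoverable from the latents, which is the concern raised above.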

@sleepingcat4
Author

sleepingcat4 commented Sep 10, 2024

@marcoppasini are you planning to open-source the original training code? If so, then I may use your model for training and completely hide the source of my audio.

@marcoppasini
Collaborator

Hey @sleepingcat4 I have just released the training code under the 'training' branch, feel free to try it out!

@sleepingcat4
Author

@marcoppasini btw, did you consider classification with your model? I tried, but didn't get good results compared to Wav2vec.

@marcoppasini
Collaborator

@sleepingcat4 I tried some music-related downstream tasks in the paper (https://arxiv.org/abs/2408.06500), but I did not explore much more unfortunately.
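A common recipe for such downstream tasks is to pool each track's latent sequence over time into a fixed-size vector and fit a lightweight classifier on top. A minimal sketch below; the per-track latents would come from Music2Latent's encoder in practice (that call and the `(channels, frames)` shape are assumptions), so random stand-ins are used here to keep it self-contained:

```python
import numpy as np

# In practice, per-track latents would come from music2latent, e.g.
#   latents = encdec.encode(wv)  # assumed shape: (channels, frames)
# Here we use random stand-ins with that shape.
rng = np.random.default_rng(0)

def pool_latent(latent):
    """Mean + std pooling over time -> fixed-size feature vector."""
    return np.concatenate([latent.mean(axis=1), latent.std(axis=1)])

# Two synthetic "classes" with slightly different latent statistics.
class_a = [rng.normal(0.0, 1.0, (64, 200)) for _ in range(20)]
class_b = [rng.normal(0.5, 1.0, (64, 200)) for _ in range(20)]

X = np.stack([pool_latent(l) for l in class_a + class_b])
y = np.array([0] * 20 + [1] * 20)

# Nearest-centroid classifier: enough to sanity-check separability.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
accuracy = (pred == y).mean()
print(accuracy)
```

Swapping the nearest-centroid step for a logistic regression or small MLP is the usual next step once the pooled features show any separation.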

@sleepingcat4
Author

@marcoppasini do you have any experience with the feature extractor and how it can be used for downstream tasks like classification or analysis in general?

Because if you have some examples to show, I have enough data to run some good further experiments.

@sleepingcat4
Author

@marcoppasini btw I wanted to ask if you are available for a collaboration. I have been planning to release a dataset in the next few days that is supposed to be an improvement on LAION-DISCO 12M in many regards; both mine and LAION-DISCO 12M are based on music.

The lab I am working with now, LAION AI, is looking for someone who can help us and lend a hand in training and creating a foundational music generative model and a few other generative models in both music and human speech, including several dataset releases.

I love this Music2latent project as it removes the fundamental burden of modelling music through naive terms (tempo, time differences, components, etc.), so having you with us would definitely be interesting.

We have the compute and resources, but finding someone with a niche in music is a bit hard, so we would definitely appreciate any help.
