
Expected results? #4

Open
brianc118 opened this issue Oct 2, 2019 · 20 comments

@brianc118

I get the following when evaluating on MAPS after training the model over 100k iterations.

These metrics appear quite low, especially the frame metrics, which are 0.65/0.65/0.64, whereas the MAESTRO paper reports 0.90/0.95/0.81.

Is this expected?

Thanks!

                            note precision                : 0.795 ± 0.096
                            note recall                   : 0.756 ± 0.109
                            note f1                       : 0.773 ± 0.096
                            note overlap                  : 0.541 ± 0.101
               note-with-offsets precision                : 0.362 ± 0.127
               note-with-offsets recall                   : 0.345 ± 0.126
               note-with-offsets f1                       : 0.352 ± 0.125
               note-with-offsets overlap                  : 0.808 ± 0.092
              note-with-velocity precision                : 0.739 ± 0.093
              note-with-velocity recall                   : 0.704 ± 0.110
              note-with-velocity f1                       : 0.719 ± 0.096
              note-with-velocity overlap                  : 0.543 ± 0.102
  note-with-offsets-and-velocity precision                : 0.341 ± 0.123
  note-with-offsets-and-velocity recall                   : 0.325 ± 0.124
  note-with-offsets-and-velocity f1                       : 0.332 ± 0.122
  note-with-offsets-and-velocity overlap                  : 0.807 ± 0.092
                           frame f1                       : 0.636 ± 0.108
                           frame precision                : 0.649 ± 0.163
                           frame recall                   : 0.654 ± 0.102
                           frame accuracy                 : 0.475 ± 0.115
                           frame substitution_error       : 0.106 ± 0.058
                           frame miss_error               : 0.240 ± 0.108
                           frame false_alarm_error        : 0.337 ± 0.338
                           frame total_error              : 0.683 ± 0.337
                           frame chroma_precision         : 0.686 ± 0.155
                           frame chroma_recall            : 0.696 ± 0.102
                           frame chroma_accuracy          : 0.516 ± 0.106
                           frame chroma_substitution_error: 0.064 ± 0.033
                           frame chroma_miss_error        : 0.240 ± 0.108
                           frame chroma_false_alarm_error : 0.337 ± 0.338
                           frame chroma_total_error       : 0.641 ± 0.315
@jongwook
Owner

jongwook commented Oct 2, 2019

I have also noticed low performance on MAPS. However, the fair comparison in the MAESTRO paper is the second row of Table 6, the one tested on MAPS without data augmentation: 0.82/0.83/0.61.

I suspect this is partly related to #3, although I haven't had the bandwidth to verify that.

I'll be able to get back to this this month, before ISMIR starts.

@brianc118
Author

@jongwook I'm going to try the proposed fix in #3. Wouldn't the fairer comparison be the one trained on MAESTRO, not MAPS (fourth row of Table 6)?

@jongwook
Owner

jongwook commented Oct 4, 2019

Thanks! As far as I understand, all of their experiments are trained on MAESTRO, and Table 6 (rows 1-2) shows how the model generalizes to the MAPS dataset; I wasn't able to reproduce the same level of generalizability with my implementation. When trained and tested on MAESTRO, this implementation can achieve performance similar to row 4 of Table 6.

@hanjuTsai

I have also trained the model for 100k iterations, but I got the following result. Should I evaluate an earlier iteration's checkpoint? Or maybe something went wrong?

/home/hanju/miniconda3/envs/hanju/lib/python3.6/site-packages/mir_eval/transcription.py:167: UserWarning: Estimated notes are empty.
  warnings.warn("Estimated notes are empty.")
/home/hanju/miniconda3/envs/hanju/lib/python3.6/site-packages/mir_eval/multipitch.py:275: UserWarning: Estimate frequencies are all empty.
  warnings.warn("Estimate frequencies are all empty.")
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [02:55<00:00,  2.24s/it]
                            note precision                : 0.000 ± 0.000
                            note recall                   : 0.000 ± 0.000
                            note f1                       : 0.000 ± 0.000
                            note overlap                  : 0.000 ± 0.000
               note-with-offsets precision                : 0.000 ± 0.000
               note-with-offsets recall                   : 0.000 ± 0.000
               note-with-offsets f1                       : 0.000 ± 0.000
               note-with-offsets overlap                  : 0.000 ± 0.000
              note-with-velocity precision                : 0.000 ± 0.000
              note-with-velocity recall                   : 0.000 ± 0.000
              note-with-velocity f1                       : 0.000 ± 0.000
              note-with-velocity overlap                  : 0.000 ± 0.000
                           frame f1                       : 0.000 ± 0.000
                           frame precision                : 0.000 ± 0.000
                           frame recall                   : 0.000 ± 0.000
                           frame accuracy                 : 0.000 ± 0.000
                           frame substitution_error       : 0.000 ± 0.000
                           frame miss_error               : 1.000 ± 0.000
                           frame false_alarm_error        : 0.000 ± 0.000
                           frame total_error              : 1.000 ± 0.000
                           frame chroma_precision         : 0.000 ± 0.000
                           frame chroma_recall            : 0.000 ± 0.000
                           frame chroma_accuracy          : 0.000 ± 0.000
                           frame chroma_substitution_error: 0.000 ± 0.000
                           frame chroma_miss_error        : 1.000 ± 0.000
                           frame chroma_false_alarm_error : 0.000 ± 0.000
                           frame chroma_total_error       : 1.000 ± 0.000

@brianc118
Author

@hanjuTsai see #1

@jongwook
Owner

jongwook commented Nov 4, 2019

Thanks @brianc118!
@hanjuTsai I've updated requirements.txt to install the latest mir_eval commit directly from GitHub. Now just running

pip install -r requirements.txt

should install all required dependencies.
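For reference, a pip requirement that installs a package straight from a Git repository looks roughly like this (the exact pin may differ from what is now in requirements.txt):

    # Install mir_eval directly from its GitHub repository instead of PyPI
    git+https://github.com/craffel/mir_eval.git#egg=mir_eval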

@justachetan

Hi @jongwook, I have been trying to train on MAPS since yesterday; however, I am still seeing the UserWarning shown above. Additionally, the metrics.json file in my case is completely empty. Could you point out what I might be doing wrong?

The command I am using is:

python3 train.py with train_on="MAPS" logdir=runs/baseline iterations=10000 validation_interval=100 checkpoint_interval=100

I checked up to 1100 iterations, but there was no change.

@jongwook
Owner

@justachetan Is your loss decreasing? You'll need ~100,000 iterations (ideally more) to see sensible results.

@justachetan

I am not even able to see the loss. The metrics.json file that gets generated is completely empty. As I understand it, when running the above command the metrics should get logged every 100 iterations, right? I saw in another issue that someone could see results after 500 iterations, once the UserWarning about empty reference frames went away.

Hence, I was confused as to why this is happening.

@jongwook
Owner

train.py writes TensorBoard logs, from which you should be able to see the loss curves.
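If you prefer reading the values programmatically, a minimal sketch along these lines should work (the logdir matches the one from your command; the scalar tag name is an assumption, so print the available tags to see what the run actually logged):

    # Minimal sketch: read loss values back out of a TensorBoard event log.
    # 'runs/baseline' matches the logdir used earlier in this thread; the
    # scalar tag below is a guess -- print the available tags to find yours.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    acc = EventAccumulator('runs/baseline')
    acc.Reload()                     # parse the event files on disk
    print(acc.Tags()['scalars'])     # list the scalar tags that were logged
    for event in acc.Scalars('loss/loss-total'):   # hypothetical tag name
        print(event.step, event.value)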

@justachetan

justachetan commented Mar 21, 2020

[Screenshot: loss plot]

This is what the loss plot looks like. The metrics.json file is still empty, though.

EDIT: The same plot, slightly zoomed out. It fluctuates wildly, and I don't know why.

[Screenshot: the same loss plot, zoomed out]

EDIT 2: Additionally, the plots of frame metrics and note metrics don't even get loaded, I am guessing because of the UserWarning from mir_eval. These results are after around 4.1k iterations of training on the MAPS dataset.

@jongwook
Owner

I think you'll need to let the training run for a lot longer, until at least 100k iterations. The loss will stay at ~0.2 for a short while and then decrease steadily:

[Image: smoothed loss curve]

BTW, I don't remember saving metrics.json during training. Are you sure that file is generated by my code?

@justachetan

Yep, I have not made any changes to your code as of now. I assumed that it was getting generated from there.

@justachetan

> I think you'll need to let the training run for a lot longer, until at least 100k iterations. The loss will stay at ~0.2 for a short while and then decrease steadily:
>
> [Image: smoothed loss curve]
>
> BTW, I don't remember saving metrics.json during training. Are you sure that file is generated by my code?

Your loss plot does not seem to have as many fluctuations as mine. Is this while training on MAPS only?

@jongwook
Owner

FYI, your plot contains lines from multiple TensorBoard log files, which is why it looks messy. Also note that my plots are smoothed significantly; the dim curves in the background are the actual data points. If you train until ~100k iterations, it will look similar.
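To inspect a single run, point TensorBoard at just that run's directory, e.g. using the logdir from earlier in this thread:

    tensorboard --logdir runs/baseline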

@justachetan

So even in your case, while training on MAPS, the Accuracy/Recall plots for notes and frames are not available till 100k iterations?

@jongwook
Owner

I don't have the numbers for MAPS at hand, but it will be broadly similar. See Figure 6 of https://arxiv.org/pdf/1906.08512.pdf; the blue baseline curve is for the MAESTRO dataset.

@justachetan

justachetan commented Mar 21, 2020

The plot seems to indicate that you were not getting any values for Frame F1 or Note F1 until about 100k iterations, probably due to the mir_eval issue itself. Could you confirm whether this is correct?

@jongwook
Owner

You'll get sensible frame/note F1 values after around 100k iterations, as I said earlier. mir_eval will stop complaining once the predictions start containing some notes.
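For what it's worth, the warning itself is easy to reproduce: mir_eval returns all-zero note metrics (along with the "Estimated notes are empty." warning) whenever the estimated note list is empty, which is exactly what an under-trained model produces. A minimal sketch:

    # Minimal sketch: mir_eval note metrics with an empty estimate, which
    # reproduces the "Estimated notes are empty." warning seen above.
    import numpy as np
    import mir_eval

    # One reference note: 440 Hz sounding from t=0.0 s to t=1.0 s.
    ref_intervals = np.array([[0.0, 1.0]])
    ref_pitches = np.array([440.0])

    # An under-trained model predicts no notes at all.
    est_intervals = np.empty((0, 2))
    est_pitches = np.empty(0)

    precision, recall, f1, overlap = mir_eval.transcription.precision_recall_f1_overlap(
        ref_intervals, ref_pitches, est_intervals, est_pitches)
    print(precision, recall, f1)  # 0.0 0.0 0.0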

@AkasaTanabe

Hi @jongwook, I have read your paper "Adversarial Learning for Improved Onsets and Frames Music Transcription". In it, you report good experimental results for both Onsets and Frames and your proposed method.
I think you may have solved this problem already; if so, could you share how you did it?
Thanks!
