When troubleshooting, it's important to have a basic understanding of how ly2video works.
At a high level, the process is essentially this:
- Perform some pre-processing on the input .ly file, including the following:
  - remove any headers
  - remove any line/page breaks
  - enable one-line-breaking mode
  - disable generation of page numbers
  - unfold repeats
  - set paper margins
  - include a special dump-spacetime-info.ly library which causes spatial and temporal data about each NoteHead and ChordName grob to be dumped to STDOUT in a special format which can easily be parsed by ly2video.
- Run LilyPond on the result to render to PDF, PNG, and MIDI. The PDF / PNG files will contain one extremely long line of music.
- Parse the space-time grob data dumped to standard output by the above LilyPond run.
- If a beatmap file is provided, run midi-rubato to splice tempo change events into the MIDI file.
- Parse the MIDI file to extract relevant events.
- Convert each grob's coordinate into an index, which is the x-coordinate of the center of the grob in the PNG file which contains it.
- Align the grob at each index with the notes in each MIDI tick.
- For each index in the notation, generate a series of video still frames, where the first one in the series has the cursor line centred on the notes at that index, and the last is the final frame before the cursor line is centred on the notes at the next index. The MIDI event stream is used to ensure that these video frames have the correct timing to be synchronized with the corresponding MIDI ticks aligned in the preceding step.
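The last step can be illustrated with a small sketch. This is not ly2video's actual code: the function name cursor_positions, the linear interpolation between neighbouring indices, and the fixed frame rate are all assumptions made for the example.

```python
def cursor_positions(sync_points, fps):
    """Return one cursor x-coordinate per video frame.

    sync_points: list of (x_index, time_in_seconds) pairs, sorted by time.
    fps: video frames per second.
    Hypothetical helper: moves the cursor linearly between indices.
    """
    positions = []
    for (x0, t0), (x1, t1) in zip(sync_points, sync_points[1:]):
        first_frame = round(t0 * fps)
        next_first_frame = round(t1 * fps)
        n_frames = next_first_frame - first_frame
        # Frame 0 of the series sits exactly on x0; the last frame of the
        # series is the one just before the cursor would reach x1.
        for i in range(n_frames):
            positions.append(x0 + (x1 - x0) * i / n_frames)
    return positions
```

With sync points at x=100 (0s), x=200 (1s) and x=250 (1.5s), the cursor covers 100 pixels in the first second but only 50 in the next half second, so the scrolling speed follows the spacing of the notation rather than staying constant.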
Notated music is sufficiently complex that synchronizing video frames of notated output from LilyPond with MIDI events generated by LilyPond is a non-trivial problem.
Here are some examples of difficulties which need to be solved:
- Since music notation is not (normally) proportionally spaced, the speed at which the video progresses through the notated music depends on the "density" of notes within a given time interval. For example, a video of a series of semi-breves (whole notes) will progress much slower than a video of a series of semi-quavers (16th notes).
- The speed of the video is also affected by MIDI tempo change events, and this is further complicated by the fact that such events can occur in between MIDI NoteOn events.
- Each tie between two notes results in one less MIDI NoteOn event.
- Notes in a chord usually appear in the same MIDI tick, but appear in different locations in the notated music. Sometimes they share the same note stem but appear on opposite sides of it, resulting in slightly different X co-ordinates.
- In contrast, grace notes (acciaccaturas and appoggiaturas) also result in very close X co-ordinates but distinct ticks for the corresponding MIDI events.
- ChordNames are notated as text, but result in multi-note chords.
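To make the tempo change difficulty more concrete, here is a minimal sketch of converting a MIDI tick into seconds when the tempo changes part-way through. It relies only on the standard MIDI convention that tempo is expressed in microseconds per quarter note; the function name tick_to_seconds and the shape of the tempo_changes list are assumptions for illustration, not ly2video's own data structures.

```python
def tick_to_seconds(tick, tempo_changes, ticks_per_beat):
    """Convert an absolute MIDI tick to elapsed seconds.

    tempo_changes: list of (tick, microseconds_per_beat) pairs sorted by
    tick, with the first entry at tick 0 (hypothetical format).
    ticks_per_beat: MIDI resolution (ticks per quarter note).
    """
    seconds = 0.0
    for i, (start, tempo) in enumerate(tempo_changes):
        if start >= tick:
            break
        # This tempo applies until the next tempo change or the target
        # tick, whichever comes first.
        end = tick
        if i + 1 < len(tempo_changes):
            end = min(tick, tempo_changes[i + 1][0])
        seconds += (end - start) * tempo / (ticks_per_beat * 1_000_000)
    return seconds
```

Note that a tempo change may fall between two NoteOn events, in which case the interval between those two notes is made up of two stretches at different tempos; the loop above handles that by summing each stretch separately.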
When ly2video fails to synchronize the audio and video correctly, there are a number of things which can be investigated. The first step is to enable debugging by running ly2video with the --debug option, and to prevent it from removing all the temporary files on exit via the --keep option. Temporary files are stored in the ly2video.tmp/ subdirectory of the directory from which ly2video was invoked.
The next step is to identify the point in the audio/video at which synchronization breaks. With debugging enabled, ly2video will output multiple sets of debug lines explaining the synchronization process.
For example, a healthy synchronization point looks like this:
index 739, tick 1792
midiPitches: {63: midi.NoteOnEvent(tick=1792, channel=1, data=[63, 92])}
indexPitches: {63.0: (u"ef'", 55, 2)}
matched 'ef'' @ 55:2 to MIDI pitch 63
all pitches matched in this MIDI tick!
This shows that ly2video was expecting the note at index 739 (i.e. x=739 in the PNG rendered by LilyPond) to match up with the NoteOn event in MIDI tick 1792. You could then look at ly2video.tmp/sanitised-page0001.png to see which music appears at 739 pixels from the left. For example, if you open the image in GIMP, you can move the mouse pointer from left to right until the X co-ordinate displayed at the bottom left of the image window shows 739.
The notated note and the MIDI event both had the same pitch (E flat), so synchronization proceeded.
In contrast, an unhealthy synchronization point might look like this:
index 739, tick 0
midiPitches: {73: midi.NoteOnEvent(tick=0, channel=0, data=[73, 90]), 66: midi.NoteOnEvent(tick=0, channel=0, data=[66, 90]), 70: midi.NoteOnEvent(tick=0, channel=0, data=[70, 90]), 63: midi.NoteOnEvent(tick=0, channel=1, data=[63, 63])}
indexPitches: {63.0: (u"ef'", 55, 2)}
matched 'ef'' @ 55:2 to MIDI pitch 63
WARNING: only matched 1/5 MIDI notes at index 739 tick 0
pitch 73 length 2
pitch 66 length 2
pitch 70 length 2
Whilst one note at tick 0 (the very beginning of the MIDI stream) matched the note in the PNG / PDF files, there were four other NoteOn events in the same MIDI tick with no corresponding NoteHead grob in the PNG / PDF notation. In this case it occurred because there was a ChordName symbol at the beginning of the piece which was rendered as text in the PNG / PDF but as a four-note chord in the MIDI stream. (Incidentally, this has since been fixed, as can be seen by running ly2video on test/regressions/chord-start/input.ly.)
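When the debug output is long, it can help to scan it mechanically for the first point where a warning appears. The following is a rough sketch, assuming you have captured the debug output as text and that its lines look like the examples above; the function name find_warnings is made up for this illustration and is not part of ly2video.

```python
import re

# Matches lines such as "index 739, tick 1792" from the debug output.
INDEX_TICK = re.compile(r"^\s*index (\d+), tick (\d+)")


def find_warnings(debug_lines):
    """Yield (index, tick, warning_text) for each WARNING line.

    debug_lines: any iterable of strings, e.g. an open file containing
    the captured debug output (assumed format, as shown above).
    """
    current = None
    for line in debug_lines:
        m = INDEX_TICK.match(line)
        if m:
            # Remember which synchronization point we are inside.
            current = (int(m.group(1)), int(m.group(2)))
        elif "WARNING:" in line and current is not None:
            yield current[0], current[1], line.strip()
```

The first tuple yielded gives the index to look for in the PNG and the tick to look for in the MIDI stream.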
If you get an error like
ERROR: Wanted to skip 5 consecutive MIDI ticks which suggests a catastrophic loss of synchronization; aborting.
then look at the debug output which precedes the ERROR. It should contain entries like this:
index 969, tick 6336
midiPitches: {65: midi.NoteOnEvent(tick=6336, channel=1, data=[65, 92])}
indexPitches: {58.0: (u'bf', 56, 29)}
WARNING: skipping MIDI tick 6336; contents:
pitch 65 length 2
This shows that ly2video was expecting the note at index 969 to match up with the note in MIDI tick 6336, but there was a mismatch in note pitches. Sometimes a mismatch is OK, for example if a note was hidden by \hideNotes in the .ly source, it will still appear in the MIDI output. ly2video will tolerate this by skipping up to 5 consecutive ticks. However if it encounters more than 5 mismatches in a row, it assumes that somehow it got the audio and video irreparably out of sync, at which point it gives up.
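The skipping behaviour described above can be sketched roughly as follows. This is a deliberately simplified model, not ly2video's real matching code: the names align and SyncError are hypothetical, pitches are plain sets of MIDI numbers, and any shared pitch is treated as a match.

```python
MAX_SKIPS = 5  # tolerance described above


class SyncError(Exception):
    """Raised when too many consecutive MIDI ticks fail to match."""


def align(index_pitches, midi_ticks, max_skips=MAX_SKIPS):
    """Pair each notation index with a MIDI tick.

    index_pitches: list of (index, set_of_pitches) from the notation.
    midi_ticks: list of (tick, set_of_pitches) from the MIDI file.
    Returns a list of (index, tick) pairs.
    """
    pairs = []
    ticks = iter(midi_ticks)  # shared, so each index resumes where we left off
    for index, wanted in index_pitches:
        skipped = 0
        for tick, sounding in ticks:
            if wanted & sounding:
                pairs.append((index, tick))
                break
            # Mismatch, e.g. a hidden note which still sounds in the MIDI.
            skipped += 1
            if skipped > max_skips:
                raise SyncError(
                    "skipped %d consecutive MIDI ticks at index %d"
                    % (skipped, index)
                )
    return pairs
```

In this model, the mismatch at index 969 shown above would simply cause tick 6336 to be skipped, and synchronization would resume if a later tick contained pitch 58.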
The indexPitches line above indicates that a B flat note should appear at this position, which is equivalent to MIDI pitch 58 (middle C is 60). However, the midiPitches line immediately preceding it shows that MIDI pitch 65 was encountered in the MIDI stream.
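When comparing the two lines by hand, it helps to be able to convert a LilyPond pitch name into a MIDI pitch number. Here is a minimal sketch, assuming absolute pitches written with the short English note names seen in the debug output above (f for flat, s for sharp); the function name lily_to_midi is hypothetical, and other note name languages are not handled.

```python
import re

# Semitones above C for each natural note.
NATURAL = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}


def lily_to_midi(name):
    """Convert an absolute LilyPond pitch such as "ef'" or "bf" to MIDI.

    Assumes short English note names: a letter, then any number of
    "f" (flat) or "s" (sharp), then octave marks (' up, , down).
    """
    m = re.fullmatch(r"([a-g])(f*|s*)('*|,*)", name)
    if m is None:
        raise ValueError("unsupported pitch name: %r" % name)
    letter, accidentals, octaves = m.groups()
    # In LilyPond, c with no octave mark is the C below middle C (MIDI 48).
    pitch = 48 + NATURAL[letter]
    pitch += accidentals.count("s") - accidentals.count("f")
    pitch += 12 * (octaves.count("'") - octaves.count(","))
    return pitch
```

For the two examples in this document, lily_to_midi("bf") gives 58 and lily_to_midi("ef'") gives 63, matching the keys shown in the indexPitches lines.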
Rather than trying to second-guess the relationships between LilyPond's rendered grobs and the MIDI events it generates, it would be better to extend LilyPond's Translators to output this information in a format which ly2video can consume.
Other approaches to generating video from LilyPond may or may not take a better approach; see also issue #67 (Decide on best-of-breed future approach to .ly video generation).