MP4 Mux and Demux Bugs #37

Open
daniellovera opened this issue Jan 8, 2025 · 8 comments

Comments

@daniellovera
Contributor

Hi,

I was testing https://zhaohappy.github.io/libmedia/test/avtranscoder.html with the attached file (bbb_input.mp4) and I noticed a couple bugs. The bugs change the duration, frame rate, and quality so I am sharing them here.

I used these settings:
[image: screenshot of the settings]

on this file
https://github.com/user-attachments/assets/a2bd89be-13cf-430d-a6fb-36f6164d59d9

got this log

set log level: 1
AVTranscoder pipelines started
connect stream 0, taskId: 26b34f0c-7373-465a-aa39-9df61fb6afa9
connect stream 1, taskId: 26b34f0c-7373-465a-aa39-9df61fb6afa9

AVTranscoder version v0.5.0-1-gb86adc4 Copyright (c) 2024-present the libmedia developers
Input #0, mp4, from 'bbb_input.mp4':
Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 405 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 277 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default)
Metadata:
creationTime: 0
modificationTime: 0
matrix: [1,0,0,0,1,0,0,0,1]
language: 21956
languageString: und
handlerName: GPAC ISO Video Handler
vendorId:
encoder: libmedia-v0.5.0-1-gb86adc4
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
creationTime: 0
modificationTime: 0
language: 21956
languageString: und
handlerName: GPAC ISO Audio Handler
vendorId:
Stream #0:0 -> #0:0 (h264 -> h264)
Stream #0:1 -> #0:1 (aac -> copy)

Output #0, mkv, from 'test_muxing.mkv':
Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 405 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 277 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default)
Metadata:
creationTime: 0
modificationTime: 0
matrix: [1,0,0,0,1,0,0,0,1]
language: 21956
languageString: und
handlerName: GPAC ISO Video Handler
vendorId:
encoder: libmedia-v0.5.0-1-gb86adc4
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
creationTime: 0
modificationTime: 0
language: 21956
languageString: und
handlerName: GPAC ISO Audio Handler
vendorId:

start demux loop, taskId: 26b34f0c-7373-465a-aa39-9df61fb6afa9
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=3 fps=3000.00 size=0kB time=00:00:00.000 bitrate=0.00kbps speed=0.00x progress=0.00%
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=136 fps=135.32 size=0kB time=00:00:02.216 bitrate=0.00kbps speed=2.20x progress=22.16%
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=264 fps=131.28 size=0kB time=00:00:04.366 bitrate=0.00kbps speed=2.17x progress=43.66%
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=390 fps=129.40 size=220.075kB time=00:00:06.466 bitrate=272.29kbps speed=2.15x progress=64.66%
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=517 fps=128.51 size=220.075kB time=00:00:08.566 bitrate=205.53kbps speed=2.13x progress=85.66%
demuxer ended, taskId: 26b34f0c-7373-465a-aa39-9df61fb6afa9
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=587 fps=116.63 size=220.075kB time=00:00:09.733 bitrate=180.89kbps speed=1.93x progress=97.33%
[26b34f0c-7373-465a-aa39-9df61fb6afa9] frame=587 fps=97.15 size=220.075kB time=00:00:09.733 bitrate=180.89kbps speed=1.61x progress=97.33%
video hardware decoder flush failed, ignore it, taskId: 349cd90e-d541-428b-9b84-6290f55649e3
video decoder ended, taskId: 349cd90e-d541-428b-9b84-6290f55649e3
video decoder ended, taskId: 349cd90e-d541-428b-9b84-6290f55649e3
video encoder ended, taskId: 349cd90e-d541-428b-9b84-6290f55649e3
26b34f0c-7373-465a-aa39-9df61fb6afa9
transcode ended, taskId: 26b34f0c-7373-465a-aa39-9df61fb6afa9, cost: 00:00:06.943

Created this file (I had to zip the mkv to attach):
test_muxing.zip

Issues:

  1. The bitrate is auto-detected incorrectly: instead of re-encoding the video at 866.226 kbps (the bitrate mediainfo.js reports), it encodes at 277 kbps. This makes the quality poor.
  2. The duration of bbb_input.mp4 is 10, but it's being calculated as Duration: 00:00:10.026
  3. The mkv has a frame count of 602 - I think the ref frames are being incorrectly transcoded

MP4 can have negative DTS, and may need more than 12 samples due to how MP4 interleaves data. I think these are the issues, if it's helpful.

if (!streamFirstGotMap[stream.index]) {
  stream.firstDTS = avpacket.dts
  stream.startTime = avpacket.pts
  streamFirstGotMap[stream.index] = true
}
if (avpacket.pts < stream.startTime) {
  stream.startTime = avpacket.pts
}

Should that be

if (!streamFirstGotMap[stream.index]) {
  stream.firstDTS = avpacket.dts
  stream.startTime = avpacket.pts 
  streamFirstGotMap[stream.index] = true
} else if (avpacket.pts < stream.startTime) {
  stream.startTime = avpacket.pts
}

And for MP4, the duration calculation should use PTS instead of DTS. So both occurrences of

const duration = Number(avpacket.dts - stream.firstDTS) * stream.timeBase.num / stream.timeBase.den

should, I think, be

const duration = Number(avpacket.pts - stream.startTime) * stream.timeBase.num / stream.timeBase.den
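
The change proposed above can be sketched in isolation. This is a minimal illustration, not libmedia's code: `StreamTiming`, `PacketTiming`, `observePacket`, `secondsByDts`, and `secondsByPts` are hypothetical names, and timestamps are assumed to be bigint ticks in the stream time base (as the `Number(...)` conversion suggests).

```typescript
// Hypothetical, simplified shapes; libmedia's real stream and packet types carry more fields.
interface Rational { num: number; den: number }
interface PacketTiming { pts: bigint; dts: bigint }
interface StreamTiming {
  timeBase: Rational
  firstDTS: bigint
  startTime: bigint
  gotFirst: boolean
}

// Record the first packet's dts and pts once, then lower startTime whenever
// a later packet presents earlier than anything seen so far.
function observePacket(stream: StreamTiming, packet: PacketTiming): void {
  if (!stream.gotFirst) {
    stream.firstDTS = packet.dts
    stream.startTime = packet.pts
    stream.gotFirst = true
  } else if (packet.pts < stream.startTime) {
    stream.startTime = packet.pts
  }
}

// Elapsed time measured on the decode timeline (the current formula).
function secondsByDts(stream: StreamTiming, packet: PacketTiming): number {
  return Number(packet.dts - stream.firstDTS) * stream.timeBase.num / stream.timeBase.den
}

// Elapsed time measured on the presentation timeline (the proposed formula).
function secondsByPts(stream: StreamTiming, packet: PacketTiming): number {
  return Number(packet.pts - stream.startTime) * stream.timeBase.num / stream.timeBase.den
}

// Made-up data: three 60 fps frames in a 1/15000 time base, in decode order I, P, B.
// The first dts is negative because of B-frame reordering.
const demoStream: StreamTiming = {
  timeBase: { num: 1, den: 15000 },
  firstDTS: 0n,
  startTime: 0n,
  gotFirst: false
}
const demoPackets: PacketTiming[] = [
  { pts: 0n, dts: -250n },
  { pts: 500n, dts: 0n },
  { pts: 250n, dts: 250n }
]
for (const packet of demoPackets) {
  observePacket(demoStream, packet)
}

// For the P frame, the two timelines disagree by one frame duration.
const byDts = secondsByDts(demoStream, demoPackets[1]) // 250 ticks / 15000
const byPts = secondsByPts(demoStream, demoPackets[1]) // 500 ticks / 15000
```

With reordered frames the two formulas give different answers for the same packet, which is why the choice of timeline matters for the reported duration.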

I am still working on building libmedia myself so I am unable to test these just yet and wanted to share.

Thank you, I appreciate libmedia and it is great.

@zhaohappy
Owner

The 12-sample case applies only when the fastOpen option is set. By default, analysis continues up to maxAnalyzeDuration, but there is a bug here. Currently maxAnalyzeDuration is set to 10 seconds, but the video is only 6 seconds long, so the bitrate is not recalculated when analyzeStreams exits and the value calculated from 12 samples is used. I will fix this so that the bitrate is recalculated when IO end is reached in analyzeStreams. I also agree that pts should be used for the calculation here.
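
To illustrate why an estimate from the first 12 samples can be far from the real value, here is a small sketch with made-up sample sizes. It is not the analyzeStreams implementation; `SampleInfo` and `estimateBitrate` are hypothetical names used only for this example.

```typescript
// Hypothetical per-sample record: payload size in bytes and duration in seconds.
interface SampleInfo { size: number; durationSeconds: number }

// Average bitrate in bits per second over the given samples.
function estimateBitrate(samples: SampleInfo[]): number {
  let bytes = 0
  let seconds = 0
  for (const sample of samples) {
    bytes += sample.size
    seconds += sample.durationSeconds
  }
  return seconds > 0 ? (bytes * 8) / seconds : 0
}

// Made-up 60 fps stream: 12 small opening samples followed by 48 larger ones.
const frameSeconds = 1 / 60
const demoSamples: SampleInfo[] = [
  ...Array.from({ length: 12 }, () => ({ size: 600, durationSeconds: frameSeconds })),
  ...Array.from({ length: 48 }, () => ({ size: 2000, durationSeconds: frameSeconds }))
]

// Estimate from the first 12 samples only, as in the early-exit case.
const earlyEstimate = estimateBitrate(demoSamples.slice(0, 12))
// Estimate recalculated over everything read before IO end.
const fullEstimate = estimateBitrate(demoSamples)
```

In this made-up data the early estimate is roughly a third of the full one, which is the same kind of gap as the 277 vs 866 kbps reported above.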

For duration, I used the pts + duration of the last sample to overwrite the value in the mdhd box. Will the subtle difference here cause any problems?

Regarding "The mkv has a frame count of 602": I don't understand this error. In my testing, the file in test_muxing.zip has only 600 video frames.
[image: screenshot]

@daniellovera
Contributor Author

I'm glad we found the bug, thank you for fixing it. I checked and it did calculate bitrate more accurately for that video. Also, we can ignore

  3. The mkv has a frame count of 602 - I think the ref frames are being incorrectly transcoded

As that's a bad statement on my part and caused by me not understanding mediainfo.js vs checking frames individually.

That said, I should back up and restate the issue I am concerned about. A key use of libmedia for me is to transcode and transmux files. A common use case is a user who wants to upload a video to a website (for example, TikTok) that has a 10 minute duration limit for files. Let's say the original file is exactly 10 minutes but is MKV, and TikTok will not accept MKV and requires MP4. Since the original file is exactly 10 minutes, the user would expect it to meet TikTok's duration requirement once it is transmuxed to MP4. If I transmux the file for them and the duration is now 10 minutes and 0.033 seconds, that is a problem, because the file will be rejected for not meeting the duration requirement. Another example: the user has a 10 minute MP4 but the codec is not acceptable to TikTok (AV1 or MPEG-2) and needs me to transcode the file. In that case the transcoded file cannot increase in duration either.

I hope I explained that well. What this means is that a transcoded or transmuxed file with a longer than expected duration will be rejected, and my service fails to help the user. So, to me, the file being longer than the original is a problem. If it's shorter than the original, it's less of a problem: either the AV sync was improved (a good thing) or some frames at the beginning or end of the video were dropped (a bad thing, but probably not noticeable to the user, and it does not block uploading the video).

So with that understanding, let me address the items individually.

For duration, I used the pts + duration of the last sample to overwrite the value in the mdhd box. Will the subtle difference here cause any problems?

My original question

The duration of bbb_input.mp4 is 10, but it's being calculated as Duration: 00:00:10.026

was bad. I misunderstood how that worked and asked the question badly. Using pts + duration for mdhd will not cause any issue for me. In fact, it's good because, as I explained above, I'd rather know the longer, more accurate duration so that I can intentionally crop if necessary. I looked at the frame-level data and realized that the video is 10 seconds but the audio is 10.026 seconds, and that the video stream has a 21 ms delay before starting and a 5 ms increase in the last frame's duration. My apologies for missing this the first time and not realizing that mediainfo.js was incorrect.

But I did notice something kind of weird, and possibly unintended, while working through this.

I did this with the transcoder demo:

(File 1) original MP4 file -> (File 2) mkv file, copying both streams

AVTranscoder version v0.5.0-5-g4ccb6e5 Copyright (c) 2024-present the libmedia developers
Input #0, mp4, from 'bbb_input.mp4':
Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 998 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 870 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default)
Metadata:
creationTime: 0
modificationTime: 0
matrix: [1,0,0,0,1,0,0,0,1]
language: 21956
languageString: und
handlerName: GPAC ISO Video Handler
vendorId:
encoder:
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
creationTime: 0
modificationTime: 0
language: 21956
languageString: und
handlerName: GPAC ISO Audio Handler
vendorId:
Stream #0:0 -> #0:0 (h264 -> copy)
Stream #0:1 -> #0:1 (aac -> copy)

Output #0, mkv, from 'test_muxing-mp4-to-mkv.mkv':
Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 998 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 870 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default)
Metadata:
creationTime: 0
modificationTime: 0
matrix: [1,0,0,0,1,0,0,0,1]
language: 21956
languageString: und
handlerName: GPAC ISO Video Handler
vendorId:
encoder:
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
creationTime: 0
modificationTime: 0
language: 21956
languageString: und
handlerName: GPAC ISO Audio Handler
vendorId:

[f40f04fd-f4da-4e72-a8d6-808ac714c009] frame=0 fps=0.00 size=0kB time=00:00:00.000 bitrate=0.00kbps speed=0.00x progress=0.00%
f40f04fd-f4da-4e72-a8d6-808ac714c009
transcode ended, taskId: f40f04fd-f4da-4e72-a8d6-808ac714c009, cost: 00:00:00.436

and then this
(File 1) original file -> (File 3) mkv, copying audio and re-encoding video as h264

AVTranscoder version v0.5.0-5-g4ccb6e5 Copyright (c) 2024-present the libmedia developers
Input #0, mp4, from 'bbb_input.mp4':
  Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 998 kbps/s
  Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 870 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default) 
    Metadata:
      creationTime: 0
      modificationTime: 0
      matrix: [1,0,0,0,1,0,0,0,1]
      language: 21956
      languageString: und
      handlerName: GPAC ISO Video Handler
      vendorId: 
      encoder: libmedia-v0.5.0-5-g4ccb6e5
      naluLengthSizeMinusOne: 3
  Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default) 
    Metadata:
      creationTime: 0
      modificationTime: 0
      language: 21956
      languageString: und
      handlerName: GPAC ISO Audio Handler
      vendorId: 
  Stream #0:0 -> #0:0 (h264 -> h264)
  Stream #0:1 -> #0:1 (aac -> copy)

Output #0, mkv, from 'test_muxing-mp4-to-mkv-encodeh264':
  Duration: 00:00:10.026, start: 00:00:00.000, bitrate: 998 kbps/s
  Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 870 kbps/s, 60.00 fps, 60.00 tbr, 15k tbn (default) 
    Metadata:
      creationTime: 0
      modificationTime: 0
      matrix: [1,0,0,0,1,0,0,0,1]
      language: 21956
      languageString: und
      handlerName: GPAC ISO Video Handler
      vendorId: 
      encoder: libmedia-v0.5.0-5-g4ccb6e5
      naluLengthSizeMinusOne: 3
  Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default) 
    Metadata:
      creationTime: 0
      modificationTime: 0
      language: 21956
      languageString: und
      handlerName: GPAC ISO Audio Handler
      vendorId: 

[9a3bc80b-10de-41e1-a877-9b097d2626a3] frame=2 fps=2.00 size=0kB time=00:00:00.000 bitrate=0.00kbps speed=NaNx progress=0.00%
[9a3bc80b-10de-41e1-a877-9b097d2626a3] frame=147 fps=146.56 size=0kB time=00:00:02.400 bitrate=0.00kbps speed=2.39x progress=24.00%
[9a3bc80b-10de-41e1-a877-9b097d2626a3] frame=283 fps=141.15 size=0kB time=00:00:04.700 bitrate=0.00kbps speed=2.34x progress=47.00%
[9a3bc80b-10de-41e1-a877-9b097d2626a3] frame=417 fps=138.58 size=655.461kB time=00:00:06.933 bitrate=756.34kbps speed=2.30x progress=69.33%
[9a3bc80b-10de-41e1-a877-9b097d2626a3] frame=565 fps=140.83 size=655.461kB time=00:00:09.366 bitrate=559.86kbps speed=2.33x progress=93.66%
9a3bc80b-10de-41e1-a877-9b097d2626a3
transcode ended, taskId: 9a3bc80b-10de-41e1-a877-9b097d2626a3, cost: 00:00:04.804

and I got two MKV files with different durations.
Per libmedia, the MKV from copying streams has a duration of 10.016 and the MKV with re-encoded video has a duration of 10.005.
Both have 600 video frames.

Where this gets weird is when I transmux both of those MKV files back to MP4, copying the audio and video streams:

(File 2) -> (File 4)

AVTranscoder version v0.5.0-5-g4ccb6e5 Copyright (c) 2024-present the libmedia developers
Input #0, mkv, from 'test_muxing-mp4-to-mkv.mkv':
Duration: 00:00:10.016, start: 00:00:00.000, bitrate: 994 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 866 kbps/s, 60.00 fps, 60.00 tbr, 1000 tbn (default)
Metadata:
name: 21956
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
name: 21956
Stream #0:0 -> #0:0 (h264 -> copy)
Stream #0:1 -> #0:1 (aac -> copy)

Output #0, mp4, from 'test_muxing-mp4-to-mkv-back2mp4.mp4':
Duration: 00:00:10.016, start: 00:00:00.000, bitrate: 994 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 866 kbps/s, 60.00 fps, 60.00 tbr, 1000 tbn (default)
Metadata:
name: 21956
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
name: 21956

[dfec039d-fed0-4b52-96d4-c512f8f75be3] frame=0 fps=0.00 size=0kB time=00:00:00.000 bitrate=0.00kbps speed=NaNx progress=0.00%
dfec039d-fed0-4b52-96d4-c512f8f75be3
transcode ended, taskId: dfec039d-fed0-4b52-96d4-c512f8f75be3, cost: 00:00:00.411

and

(File 3) -> (File 5)

AVTranscoder version v0.5.0-5-g4ccb6e5 Copyright (c) 2024-present the libmedia developers
Input #0, mkv, from 'test_muxing-mp4-to-mkv-encodeh264.mkv':
Duration: 00:00:10.005, start: 00:00:00.000, bitrate: 1071 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 943 kbps/s, 60.00 fps, 60.00 tbr, 1000 tbn (default)
Metadata:
name: 21956
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
name: 21956
Stream #0:0 -> #0:0 (h264 -> copy)
Stream #0:1 -> #0:1 (aac -> copy)

Output #0, mp4, from 'test_muxing-mp4-to-mkv-encodeh264-back2mp4.mp4':
Duration: 00:00:10.005, start: 00:00:00.000, bitrate: 1071 kbps/s
Stream #0:0 Video: h264 (High), yuv420p(tv, bt709), 1280x720 [SAR: 1:1 DAR 16:9], 943 kbps/s, 60.00 fps, 60.00 tbr, 1000 tbn (default)
Metadata:
name: 21956
naluLengthSizeMinusOne: 3
Stream #0:1 Audio: aac (LC), 48000 Hz, stereo, float-planar, 128 kbps/s (default)
Metadata:
name: 21956

[7b4b9fe2-fba3-47fa-a6b6-c4e5a7d633f1] frame=0 fps=0.00 size=16.799kB time=00:00:00.000 bitrate=0.00kbps speed=NaNx progress=0.00%
7b4b9fe2-fba3-47fa-a6b6-c4e5a7d633f1
transcode ended, taskId: 7b4b9fe2-fba3-47fa-a6b6-c4e5a7d633f1, cost: 00:00:00.387

Here's the summary:
File 1 MP4, 600 video frames, 469 audio frames, 10.026 duration
File 2 MKV, 600 video frames, 470 audio frames, 10.016 duration
File 3 MKV, 600 video frames, 470 audio frames, 10.005 duration
File 4 MP4, 599 video frames, 469 audio frames, 10.046 duration
File 5 MP4, 599 video frames, 469 audio frames, 10.030 duration

Both File 4 and File 5 have their first frame with a negative dts, which is why for MP4 I think duration needs to be calculated with pts and not dts. I think this is how libavformat/mov.c in ffmpeg handles it:
https://github.com/FFmpeg/FFmpeg/blob/251de1791e645f16e80b09d82999d4a5e24b1ad1/libavformat/mov.c#L4291-L4296

We're beyond my knowledge at this point, so I'm mostly sharing this detail in case it helps. I think the expectation, and please let me know if I'm wrong, is that duration should not change when transcoding or transmuxing a file unless a different duration is specified, and if that's not the expectation then I need to adjust accordingly so my users don't get uploads rejected.

Thank you!

Supporting code and files below:

The command I used to check frame data:

setlocal enabledelayedexpansion
@echo off
::for %%a in (*.mp4,*.mpg,*.flv) Do (
for %%a in ("%~dpnx1") Do (
set /a count=0
cd %%~dpa
echo frame,media_type,stream_index,key_frame,pkt_pts,pkt_pts_time,pkt_dts,pkt_dts_time,best_effort_timestamp,^
best_effort_timestamp_time,pkt_duration,pkt_duration_time,pkt_pos,pkt_size,^
Width,Height,pix_fmt,sample_aspect_ratio,pict_type,coded_picture_number,display_picture_number,^
interlaced_frame,top_field_first,repeat_pict,color_range,color_space,color_primaries,color_transfer,^
chroma_location > "%%~na_ffprobe.csv"
echo No.  pts_time  type > "%%~na_AllFrames.txt"
ver > nul
set /a Number=0
ffprobe.exe -v quiet -select_streams v:0 -print_format csv -show_entries frame "%%~nxa"  >> "%%~na_ffprobe.csv"
for /F "tokens=4,6,18,19 delims=," %%b in ('findstr "video" "%%~na_ffprobe.csv"') do (
set "x=%%b"
set "y=%%c"
set "z=%%d"
set "w=%%e"
set "v=%%f"
set "u=%%g"
set sort=!y:~0,-7!
::if !sort! GEQ 0 if !sort! LEQ 10 echo !count!  !y:~0,13!  !x! !w:~0,1! >> "%%~na_AllFrames.txt"
echo !count! !y:~0,13!  !x! !w:~0,1! >> "%%~na_AllFrames.txt"
set /a count+=1
 set /a ekko=count%%100
 if !ekko! EQU 0 echo Frame !count! processed
)
)
rem pause

and the relevant output files:

bbb_input_AllFrames.txt
test_muxing-mp4-to-mkv_AllFrames.txt
test_muxing-mp4-to-mkv-encodeh264_AllFrames.txt
test_muxing-mp4-to-mkv-encodeh264-back2mp4_AllFrames.txt
test_muxing-mp4-to-mkv-back2mp4_AllFrames.txt

and

bbb_input_ffprobe.csv
test_muxing-mp4-to-mkv_ffprobe.csv
test_muxing-mp4-to-mkv-encodeh264_ffprobe.csv
test_muxing-mp4-to-mkv-encodeh264-back2mp4_ffprobe.csv
test_muxing-mp4-to-mkv-back2mp4_ffprobe.csv

@daniellovera
Contributor Author

daniellovera commented Jan 9, 2025

This might be helpful. If ctts is used to calculate pts before calculating duration for MP4, note that an MP4 file may be missing the ctts atom. This is the mkvmerge implementation that calculates pts when ctts is missing: https://gitlab.com/mbunkus/mkvtoolnix/-/commit/1813e97cbf8f3691c55207f6c72d046ec00a2381

It would also help AVPlayer handle damaged MP4 files without dropping frames.

Edit: I'm working on building libmedia locally so I can test and contribute better, but am currently having some issues with using build-package.js to build cheap. Will post a follow up once it's clearer but right now it can't find the wat2wasm files, and also is calling import.meta for the CJS package.

@zhaohappy
Owner

There are some errors in libmedia's calculation of track duration. I have fixed the calculation for video, but there are still some questions about the calculation for audio. In the test file above, the audio has 469 frames of 1024 samples and 1 frame of 768 samples, for a total of 469 * 1024 + 768 = 481024 samples. The sample rate is 48000, so the calculated duration is 481024 / 48000 = 10.0213 seconds. Why, then, is the reported duration 10 seconds here? The duration in the mdhd box of a file remuxed with FFmpeg is also 10 seconds. I may have missed some details here and need more time to work them out.

@zhaohappy
Owner

zhaohappy commented Jan 13, 2025

I understand the details now. Duration cannot be determined from the maximum pts alone; the start time in the elst box must also be considered. In the test file above, the start time of the audio track in the elst box is 1024, so 1024 must be subtracted from the audio duration, which gives 10 seconds. I have now fixed this: the duration of an mp4 track excludes the part where the pts is negative.
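
The arithmetic can be written out as a short sketch using the numbers quoted in this thread. The function name `presentedSeconds` is hypothetical, and the elst start time is assumed to be expressed in the same timescale as the sample counts (48000 here).

```typescript
// Values quoted in the comments above for the audio track of the test file.
const sampleRate = 48000
const fullFrames = 469        // frames of 1024 samples
const samplesPerFrame = 1024
const lastFrameSamples = 768  // the one shorter frame
const elstStartTime = 1024    // start time of the audio track in the elst box

// Duration in seconds after removing the part that precedes the elst start time.
function presentedSeconds(totalSamples: number, startTime: number, timescale: number): number {
  return (totalSamples - startTime) / timescale
}

const totalSamples = fullFrames * samplesPerFrame + lastFrameSamples      // 481024
const rawSeconds = totalSamples / sampleRate                              // about 10.0213
const audioSeconds = presentedSeconds(totalSamples, elstStartTime, sampleRate) // 480000 / 48000
```

Subtracting the 1024-sample start offset leaves 480000 samples, which is exactly 10 seconds at 48000 Hz.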

@daniellovera
Contributor Author

Brilliant, thank you. I sent a sponsorship via paypal.

I have a couple questions on building libmedia so I can contribute better and also verify without waiting for https://zhaohappy.github.io/libmedia/test/avtranscoder.html to be updated. Do you build with npm or pnpm? I ask because npm doesn't support the "workspace:*" syntax in package.json. Or if it does, I cannot figure out how to make it work.

For example in /avtranscoder/package.json I had to change the dependencies to look like this to run with npm

 "dependencies": {
    "@libmedia/common": "file:../common",
    "@libmedia/cheap": "file:../cheap",
    "@libmedia/avutil": "file:../avutil",
    "@libmedia/avprotocol": "file:../avprotocol",
    "@libmedia/avcodec": "file:../avcodec",
    "@libmedia/avformat": "file:../avformat",
    "@libmedia/avfilter": "file:../avfilter",
    "@libmedia/avpipeline": "file:../avpipeline",
    "@libmedia/avnetwork": "file:../avnetwork",
    "@libmedia/audioresample": "file:../audioresample",
    "@libmedia/audiostretchpitch": "file:../audiostretchpitch",
    "@libmedia/videoscale": "file:../videoscale"
  }

I also had to put:

  "workspaces": [
    "src/*"
  ],

into the main package.json file in libmedia/

If you are building with pnpm I can build that way as well to eliminate any potential variations. If you can share the .npmrc file (remove user names, passwords, and tokens if they are in there) then I'll mimic exactly.

Also, when I built it using "node build-package.js --package=all" it put all the types files (and only the types files) into /dist/src/* for each package. It also created a /dist/ in each /src/* folder with all the compiled files, including the types files. Is that the correct and intended behavior?

Finally, when using the packages, such as @libmedia/avtranscoder, in a separate project, I'll be importing the ES6 modules into my Svelte/SvelteKit project, which uses Vite for dev and vite build for prod. The avtranscoder library is only used in the browser, so it does not get server-side rendered or run in Node.js. When I read https://github.com/daniellovera/libmedia/blob/master/site/docs/guide/quick-start.en-US.md and https://github.com/daniellovera/libmedia/blob/master/site/docs/guide/threads.en-US.md I could not tell whether I need to use the transformer with Vite. If I build @libmedia/avtranscoder the same way as you do and then import the package into my main project, does my main project need to use the transformer on the package again?

@zhaohappy
Owner

Currently, the package.json files in the subdirectories under the src directory are used only for publishing npm packages. The package.json and webpack.config.js in the root directory are used for compilation and testing. These are the current steps for publishing npm packages; if you need to publish to a private npm repository, you can follow them:

  1. Run node scripts/update-version.js --feature to update the version in package.json in all subdirectories, commit the changes, and create a git tag with the same version.
  2. Run node scripts/build-package.js --package=all to compile all packages. This step generates the files to be published in the dist directory of each subdirectory.
  3. Run node scripts/update-dependencies.js to update the dependency versions declared in each package's package.json, that is, replace workspace:* with the real version.
  4. Run sh ./build/publish.sh to push all packages to the npm repository. You can log in to your private repository with npm before running it.

Whether you need the transformer depends on whether you access pointer-type data. If you only use @libmedia/avtranscoder, you don't need it, but the getWasm callback needs the AVCodecID const enum. The esbuild used by Vite does not support TypeScript's const enum, so you can either switch to tsc for compilation or use the enum value directly. If you use the APIs of other packages, you usually need to access pointer-type data, and then you need the transformer.
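
Using the enum value directly could look like the sketch below. It rests on assumptions: the numeric IDs are FFmpeg's values for H.264 and AAC, which libmedia's AVCodecID is assumed to mirror (check the package's own declaration before relying on them), and `CodecId`, `wasmUrlFor`, and the URL layout are hypothetical. The real getWasm callback signature should be taken from the libmedia documentation.

```typescript
// Plain object instead of a TypeScript const enum, so the build does not
// depend on const enum inlining.
// ASSUMPTION: values copied from FFmpeg's AVCodecID; verify against libmedia's enum.
const CodecId = {
  H264: 27,
  AAC: 86018
} as const

// Hypothetical helper that maps a codec ID to a wasm file location.
// The URL layout is made up for this example.
function wasmUrlFor(kind: 'decoder' | 'encoder', codecId: number): string | undefined {
  switch (codecId) {
    case CodecId.H264:
      return `/wasm/${kind}/h264.wasm`
    case CodecId.AAC:
      return `/wasm/${kind}/aac.wasm`
    default:
      // Unknown codec: let the caller decide how to report it.
      return undefined
  }
}

const h264DecoderUrl = wasmUrlFor('decoder', CodecId.H264)
const unknownUrl = wasmUrlFor('decoder', 0)
```

A helper like this could be called from inside the getWasm callback, keeping the numeric IDs in one place so they are easy to correct if they differ from libmedia's.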

Finally, thank you for your sponsorship. This is my first sponsorship :)

@daniellovera
Contributor Author

First of many, no doubt.

That makes sense about using the transformer. I'll simply use the enum value directly, which saves complicating my build step.

I found two other small items as part of replicating the build process that I'll send PRs for:

In build-package.js, the buildAll command is missing avutil
In cheap/webassembly/WebAssemblyRunner.ts it's using __dirname and then being compiled by webpack
https://github.com/zhaohappy/cheap/blob/a3d5a92099cf05b9e26de9fb86d7fc5f44275c0f/webassembly/WebAssemblyRunner.ts#L496

I checked that build process above and it will work, thank you.
