Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HACK] Rewrite vimeo iframe with video html tag #173

Closed
wants to merge 3 commits into from

Conversation

mgautierfr
Copy link
Contributor

@mgautierfr mgautierfr commented Feb 2, 2024

The PR is HACK (a.k.a. a hammer to smash a fly).
It replace the <iframe> loading the vimeo js player with simple html video tag pointing to the video file.

It doesn't really respect the initial content, but it works damn well.

The code is a bit ugly and need some re-organisation. But I will do it if we agree with the idea.

Your comments are welcomed.

(ping @Jaifroid)

We also are a bit more selective in how we set the rewrite_context.
Vimeo iframe embedded a js player. But we don't need it, what we need
is to read the video, html5 video tag is enough.
@benoit74
Copy link
Collaborator

benoit74 commented Feb 2, 2024

Some first comments:

  • I don't get how this relates to Fix Vimeo video #172 which is also supposed to fix the situation
  • Does it means that we rely on browser support of HTML5 videos ?
  • I imagine that we have different controls (position, playback sound, speed if it exists, ...) than in the original player?

@Jaifroid
Copy link

Jaifroid commented Feb 2, 2024

@mgautierfr Personally I'm a pragmatist, and not averse to hacks, though I feel sorry for the squashed fly (!). If it works, and you think it's generalizable (though nothing is really generalizable with WARC-based ZIMs), then the concept seems fine to me! Goodness knows, I've written hundreds of hacks to get Zimit ZIMs working in old browsers, including some that don't even support wombat.js, so this hack looks quite tame and orderly in comparison. 🤔

@rgaudin
Copy link
Member

rgaudin commented Feb 2, 2024

Haven't looked at the implementation but just from the PR description, I'd say it's a bad idea. This looks like the opposite of what zimit should be about and honestly if we introduce this, I'm afraid that hammer will become too handy too often and we'll drift away from source fidelity.

@kelson42 your feedback is requested here so @mgautierfr knows on which PR he should focus his efforts.

@mgautierfr
Copy link
Contributor Author

I don't get how this relates to #172 which is also supposed to fix the situation

The problem of #172 is that we have to detect that a plain js content is used as a module and we must rewrite it differently. And I don't know how to do this right now. This is a point Ilya warned use about.

For now, #172 use a kind of hardcoded value/hack ("module.js" in url) and even with that, video doesn't not always play on the first time.

Does it means that we rely on browser support of HTML5 videos ?

I cannot be certain here as I don't know all internal thing of youtube/vimeo player but I would say that we already rely on browser support.

The js player does a lot of things we don't want or is not relevant here:

  • Phone home to send statistics
  • Switch video resolution/bitrate (from user interaction or automatically). We don't care here as we have only one video source.
  • Display related videos the user may want to watch after.
  • Display a link "Watch on youtube" (maybe the only thing we want to keep)

But at the end, there is a html video tag and it is the browser playing the video.

I imagine that we have different controls (position, playback sound, speed if it exists, ...) than in the original player?

Yes, we have the native controls. On firefox it is :

  • Play/Pause button
  • progress bar
  • Current "position" (time) on total time
  • Sound
  • Fullscreen

Haven't looked at the implementation but just from the PR description, I'd say it's a bad idea. This looks like the opposite of what zimit should be about and honestly if we introduce this, I'm afraid that hammer will become too handy too often and we'll drift away from source fidelity.

I had the same feeling at first. But the more I think about that, the more I think it is kind of ok.

We already drift from source in other scrappers. Mwoffliner use custom css and do some html cleanup. Sotoki works from dump, not html. We reencode image to reduce size....

@kelson42
Copy link
Contributor

kelson42 commented Feb 3, 2024

If this works fine with the replayer, then it should work fine here as well without this kind of ugly hack. Otherwise we have a most serious problem which deserves to be properly described.

@Jaifroid
Copy link

Jaifroid commented Feb 4, 2024

I've just tested with Kiwix Serve an old copy of mes-quartiers-chinois_fr_all_2023-05.zim. The video doesn't play there either but it is scraped and in the ZIM. It looks like this bug is or was in warc2zim prior to the switch to Zimit v2. I mean, clearly it's still a bug, just not sure where it could or should be fixed.

Further info: I saw this error in 2023-05 version, which seems to imply that the error there could be due to not allowing the domain of the Vimeo player to be included in the ZIM:

Refused to load the script 'https://f.vimeocdn.com/p/4.23.24/js/player.module.js' because it violates the following Content Security Policy directive: "default-src 'unsafe-eval' 'unsafe-inline' 'self' data: blob: mediastream: ws: wss:". Note that 'script-src-elem' was not explicitly set, so 'default-src' is used as a fallback.

EDIT2: I did some more investigation of video playback issues (YouTube this time) in #144. That issue affects Android, but it's a similar case where probably replacing the player with a simple HTML video element would allow YouTube videos to play on Android, though it might depend on the browser / webview capabilities.

@Jaifroid
Copy link

Jaifroid commented Feb 4, 2024

Side comment: not quite sure what to make of this size difference. Surely it can't be accounted for just due to lack of small textual headers stored in ZIM?

image

@benoit74
Copy link
Collaborator

benoit74 commented Feb 5, 2024

Size difference of "mes_quartiers_chinois" is "my fault", I changed the recipe configuration to stop archiving bullshit. I just started again Zimit 1 on this new recipe configuration so that we can compare "same things", it keeps bothering everyone. Sorry for that.

Regarding the PR and the more general situation of #144, it think the situation clearly begins to be "worrying". I mean, the reason to move to Zimit 2 was two-fold AFAIK: get rid of service workers and have proper videos playback on all readers. First objective seems to be almost OK, while there is still much needed for second one.

Given all remarks of @Jaifroid and @mgautierfr, I begin to wonder if it makes really sense to try to fix the JS for videos. Youtube seems to still not be ok on all readers except kiwix-serve / desktop. We are still failing to make Vimeo work properly. We never tried Twitter. And there are probably many more players in the wild (and they will get updated).

All these players have very complex JS code, which was already hard to fix for the desktop. Might be that webrecorder teams never looked after other devices. And might be that some specific JS files are requested depending on the detected device, and this will never be done for all devices by Browsertrix crawler (which simulates a device, and only one at a time obviously).

I think two efforts have to be considered:

  • continue to invest time in understanding (or explaining) why webrecorder replayer is succeeding where we are failing
  • consider replacing all videos (not only Vimeo ones) by something else ; I wonder if it would make sense to rely on video.js player to have better support of the various devices + more features in the player (with less dependency on the browser / device).

@Jaifroid
Copy link

Jaifroid commented Feb 5, 2024

  • consider replacing all videos (not only Vimeo ones) by something else ; I wonder if it would make sense to rely on video.js player to have better support of the various devices + more features in the player (with less dependency on the browser / device).

I quite like @mgautierfr's suggestion to use the HTML5 <video> element, because it's simple, very widely supported, and doesn't rely on plugins, just native video support. See https://caniuse.com/?search=video%20element.

Of course it would need testing in practice, and some provision for subtitle tracks (supported natively in the element) would need to be provided. It might even cure #144 (video formats for Android) if my preliminary tests with playing the MP4s "as-is" on Android proves to be the case also for the <video> element. The current problem on Android seems to be specifically that the video frame's (Google-provided) JS tries to "optimize" playback on Android, and it goes off looking for a version of the video that wasn't scraped.

ISTM that there are only two ways round this: 1. add code in the scraper to get the Android (and potentially iOS, etc.) versions alongside the desktop version; 2. override the playback JS, or the playback iframe, so that it uses a simpler process and just gets what's available. I think the HTML5 proposal would fulfil the second solution elegantly (subject to testing).

@mgautierfr
Copy link
Contributor Author

For information, here 3 zims files generated from same warc:

@Jaifroid
Copy link

Jaifroid commented Feb 5, 2024

Thanks @mgautierfr, I tested these on desktop PWA, official Android app, and PWA on Android.

In short, the wabac.js version doesn't work on any of the tested platforms (as we know). The Vimeo player version works patchily on both desktop PWA and Android (official app and PWA). I found it tricky to get the Vimeo player version to play, and had to reload the page a few times. I noticed an error in console.log (on BrowserStack in the PWA) about the js failing to get the bounding client rectangle, which seemed to block it if it tripped.

The native HTML5 version is a smoother experience on the tested platforms, though different browsers seem to have different codec support. A Samsung S23 played the HTML5 version very well, whereas a midrange Samsung didn't seem to be able to handle the format in one browser, but did work in another (both Chromium-based). There could be something else going on here in the PWA code that's unrelated -- it needs testing on Kiwix Serve in browsers on Android (can't test these that way).

Overall, the native widget still gave the best experience on average.

Unrelated note: on the official Android app in dark mode, entering full-screen video causes the video in the HTML5 version to become inverted (colour-wise), but I suspect this is easy to fix with a stylesheet rule in the client. This didn't happen with the PWA on Android, but it uses darkReader which probably handles this case.

@Jaifroid
Copy link

Jaifroid commented Feb 5, 2024

I quickly tested the native and Vimeo-player versions in iOS 17 as well (PWA). Here the news isn't so good.

  1. The Vimeo player shows the error in first screenshot below.
  2. With the native player, it refuses to respond, and in DEV tools there is a curious message inserted into the HTML Your browser doesn't support the video tag (bottom screenshot). However, this is patently false (see caniuse), so I'm confused. EDIT: I see you inserted that message in the PR here. In fact, if I switch to our old DOM Injection Mode (JQuery mode), the video plays fine. In this mode, we go get the video ourselves, transform it into a blob: URL, and insert it into the video tag's source src. Hey presto.

Long story short: someone needs to check these in the iOS app, as there may be other issues at work specific to the PWA on iOS Safari.

image
image

@Jaifroid
Copy link

Jaifroid commented Feb 6, 2024

@benoit wrote:

Youtube seems to still not be ok on all readers except kiwix-serve / desktop.

@benoit74 I think I have a significant finding on this front in #144 (comment), and it's not due to the Zimit2 code. In sum, an incompatibility with Android players appears to have been introduced round about 8th December, and the incompatibility on iOS seems to be related to an incompatible regex in the new code.

@benoit74
Copy link
Collaborator

benoit74 commented Feb 6, 2024

@Jaifroid I think we now have the same understanding. FYI, we have agreed yesterday in a team meeting that the Vimeo situation needs little / no attention for the moment because the Youtube ones (there are many as you said) are way more important since it used to work better with zimit1 + it represent a high percentage of "market share" + we are not even sure Vimeo worked once.

This kind of only works only if you have only youtube video in the zim
file. (Or at least, we will always take the first one)
@benoit74
Copy link
Collaborator

benoit74 commented May 2, 2024

Superseeded by #228

@benoit74 benoit74 closed this May 2, 2024
@benoit74 benoit74 deleted the vimeo_as_video branch May 21, 2024 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants