Feature : provides .epub files preview pictures #21

Closed
7 changes: 7 additions & 0 deletions lib/AppInfo/Application.php
@@ -8,6 +8,7 @@
use OCA\Epubviewer\Listener\BeforeTemplateRenderedListener;
use OCA\Epubviewer\Listener\FilesLoadAdditionalScriptsListener;
use OCA\Epubviewer\Listener\PublicShareBeforeTemplateRenderedListener;
use OCA\Epubviewer\Preview\EPubPreview;
use OCA\Files\Event\LoadAdditionalScriptsEvent;

use OCP\AppFramework\App;
@@ -27,6 +28,8 @@ public function __construct() {
public function register(IRegistrationContext $context): void {
include_once __DIR__ . '/../../vendor/autoload.php';

$this->registerProvider($context);

// “Emitted before the rendering step of each TemplateResponse. The event holds a flag that specifies if a user is logged in.”
// See: https://docs.nextcloud.com/server/latest/developer_manual/basics/events.html#oca-settings-events-beforetemplaterenderedevent
$context->registerEventListener(BeforeTemplateRenderedEvent::class, BeforeTemplateRenderedListener::class);
@@ -44,6 +47,10 @@ public function register(IRegistrationContext $context): void {
// Hooks::register();
}

private function registerProvider(IRegistrationContext $context): void {
$context->registerPreviewProvider(EPubPreview::class, '/^application\/epub\+zip$/');
}

public function boot(IBootContext $context): void {
}
}
248 changes: 248 additions & 0 deletions lib/Preview/EPubPreview.php
@@ -0,0 +1,248 @@
<?php
/**
*
* @author Sebastien Marinier <[email protected]>
*
* @license AGPL-3.0
*
* This code is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License, version 3,
* as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License, version 3,
* along with this program. If not, see <http://www.gnu.org/licenses/>
*
*/

namespace OCA\Epubviewer\Preview;

//.epub
use OC\Archive\ZIP;
use OCP\Files\File;
use OCP\Files\FileInfo;
use OCP\IImage;
use OCP\Preview\IProviderV2;
use OCP\ITempManager;

class EPubPreview implements IProviderV2 {
private ?ZIP $zip = null;

Check failure (Code scanning / Psalm): UndefinedClass Error
Class, interface or enum named OC\Archive\ZIP does not exist
Owner

Ditto


/**
* {@inheritDoc}
*/
public function getMimeType(): string {
return '/application\/epub\+zip/';
}

/**
* Check if a preview can be generated for $path
*
* {@inheritDoc}
*/
public function isAvailable(FileInfo $file): bool {
return true;
}

/**
* @inheritDoc
*/
public function getThumbnail(File $file, int $maxX, int $maxY): ?IImage {
$image = $this->extractThumbnail($file, '');
if ($image && $image->valid()) {
return $image;
}
return null;
}

/**
* extractThumbnail from complicated epub format
*/
private function extractThumbnail(File $file, string $path): ?IImage {
$tmpManager = \OC::$server->get(ITempManager::class);

Check failure (Code scanning / Psalm): UndefinedClass Error
Class, interface or enum named OC does not exist
Owner (@devnoname120, Jun 25, 2024)

@smarinier The ITempManager instance should be obtained through Nextcloud's dependency injection. See here for an example: https://github.com/nextcloud/assistant/pull/71/files

Contributor Author

Hmm. There is also an example in an existing preview provider, which I think is where I took it from: lib/private/Preview/Office.php
So it's a 50% chance ;) (I think the two are equivalent, at least). I'm not sure the problem doesn't come from Psalm itself.

Could you confirm that you want this? It is less efficient: with \OC::$server->get(), the service is only resolved when the preview is actually built (and the preview image is then stored in the cache). With injection in the constructor, it is resolved twice for every file (to check the MIME type).

Owner

> Hmm. There is also an example in an existing preview provider, which I think is where I took it from: lib/private/Preview/Office.php

Old code.

> It is less efficient: with \OC::$server->get(), the service is only resolved when the preview is actually built (and the preview image is then stored in the cache). With injection in the constructor, it is resolved twice for every file (to check the MIME type).

Unless there is a bug, the performance impact should be negligible. The file access takes many orders of magnitude more time than the near-instant dependency injection. The \OC::$server->get() notation is being phased out, so it's not future-proof.
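
A minimal sketch of the constructor injection discussed in this thread, assuming Nextcloud builds the provider through its dependency injection container as the owner's comment implies. It uses only `ITempManager::getTemporaryFile()` and the `IProviderV2` methods already present in the diff; the `$tempManager` property name is illustrative and the extraction logic is left out. It only runs inside a Nextcloud installation.

```php
<?php

declare(strict_types=1);

namespace OCA\Epubviewer\Preview;

use OCP\Files\File;
use OCP\Files\FileInfo;
use OCP\IImage;
use OCP\ITempManager;
use OCP\Preview\IProviderV2;

class EPubPreview implements IProviderV2 {
    // Injected by the container; replaces \OC::$server->get(ITempManager::class).
    // The property name is an assumption made for this sketch.
    public function __construct(
        private ITempManager $tempManager,
    ) {
    }

    public function getMimeType(): string {
        return '/application\/epub\+zip/';
    }

    public function isAvailable(FileInfo $file): bool {
        return true;
    }

    public function getThumbnail(File $file, int $maxX, int $maxY): ?IImage {
        // The temporary file is still only requested when a thumbnail is built.
        $sourceTmp = $this->tempManager->getTemporaryFile();
        // The PR's extraction logic would continue from here (omitted in this sketch).
        return null;
    }
}
```

With this shape the constructor only stores a reference, so the cost raised by the contributor is limited to resolving one service when the provider is instantiated.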

$sourceTmp = $tmpManager->getTemporaryFile();

try {
$content = $file->fopen('r');
file_put_contents($sourceTmp, $content);

Check failure on line 71 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyFalseArgument
lib/Preview/EPubPreview.php:71:34: PossiblyFalseArgument: Argument 2 of file_put_contents cannot be false, possibly array<array-key, string>|resource|string value expected (see https://psalm.dev/104)
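
One way to address this annotation is to check the stream before copying it. The following is a minimal sketch in plain PHP; `copyStreamToFile` is a hypothetical helper, not part of the PR, and it takes whatever the `fopen('r')` call returned.

```php
<?php

declare(strict_types=1);

/**
 * Copy an already opened stream into $targetPath.
 *
 * @param resource|false $stream result of an fopen()-style call
 * @return bool true when the data was written, false otherwise
 */
function copyStreamToFile($stream, string $targetPath): bool {
    // fopen() returns false on failure; file_put_contents() must not receive that.
    if (!is_resource($stream)) {
        return false;
    }
    try {
        // file_put_contents() accepts a stream resource as its data argument.
        return file_put_contents($targetPath, $stream) !== false;
    } finally {
        // Always release the stream, whether or not the copy succeeded.
        fclose($stream);
    }
}
```

In the PR this would translate to returning `null` from `extractThumbnail` when the stream cannot be opened.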

$this->zip = new ZIP($sourceTmp);

$img_data = null;
$contentPath = $this->getContentPath();
if ($contentPath) {
$package = $this->extractXML($contentPath);

Check failure on line 78 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): ArgumentTypeCoercion
lib/Preview/EPubPreview.php:78:34: ArgumentTypeCoercion: Argument 1 of OCA\Epubviewer\Preview\EPubPreview::extractXML expects 'META-INF/container.xml', but parent type non-falsy-string provided (see https://psalm.dev/193)
if ($package) {
$path = $contentPath;
$img_src = $cover = null;
// Try first through <manifest>
$items = $package->manifest->children();
foreach($items as $item) {

Check failure on line 84 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyNullIterator
lib/Preview/EPubPreview.php:84:14: PossiblyNullIterator: Cannot iterate over nullable var SimpleXMLElement|null (see https://psalm.dev/097)
if (($item['id'] == 'cover' || $item['id'] == 'cover-image') && preg_match('/image\//', (string) $item['media-type'])) {
$img_src = (string) $item['href'];
break;
}
}

// in references
if (!$img_src) {
$references = $package->guide->children();
foreach($references as $reference) {

Check failure on line 94 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyNullIterator
lib/Preview/EPubPreview.php:94:15: PossiblyNullIterator: Cannot iterate over nullable var SimpleXMLElement|null (see https://psalm.dev/097)
if ($reference['type'] == 'cover' || $reference['type'] == 'title-page') {
$cover = (string) $reference['href'];
break;
}
}
}

// no cover ? no image ? take the first page
if (!$img_src && !$cover) {
$first_page_id = (string) $package->spine->itemref['idref'];
if ($first_page_id) {
foreach($items as $item) {

Check failure on line 106 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyNullIterator
lib/Preview/EPubPreview.php:106:16: PossiblyNullIterator: Cannot iterate over nullable var SimpleXMLElement|null (see https://psalm.dev/097)
if ($item['id'] == $first_page_id) {
$cover = (string) $item['href'];
break;
}
}

}
}

// have we a "cover" file ?
if ($cover) {
// relative to container
$img_src = null;
$path = $this->resolvePath($path, $cover);
$dom = $this->extractHTML($path);
if ($dom) {
// search img
$images = $dom->getElementsByTagName('img');
if ($images->length) {
$img_src = $images[0]->getAttribute('src');

Check failure on line 126 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyNullReference
lib/Preview/EPubPreview.php:126:32: PossiblyNullReference: Cannot call method getAttribute on possibly null value (see https://psalm.dev/083)

Check failure on line 126 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): UndefinedMethod
lib/Preview/EPubPreview.php:126:32: UndefinedMethod: Method DOMNode::getAttribute does not exist (see https://psalm.dev/022)
} else {
$images = $dom->getElementsByTagName('image');
if ($images->length) {
$img_src = $images[0]->getAttribute('xlink:href');

Check failure on line 130 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): PossiblyNullReference
lib/Preview/EPubPreview.php:130:33: PossiblyNullReference: Cannot call method getAttribute on possibly null value (see https://psalm.dev/083)

Check failure on line 130 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): UndefinedMethod
lib/Preview/EPubPreview.php:130:33: UndefinedMethod: Method DOMNode::getAttribute does not exist (see https://psalm.dev/022)
}
}
}
}// cover

// img ?
if ($img_src) {
$img_src = $this->resolvePath($path, $img_src);
$img_data = $this->extractFileData($img_src);

Check failure on line 139 in lib/Preview/EPubPreview.php (GitHub Actions / static-code-analysis): ArgumentTypeCoercion
lib/Preview/EPubPreview.php:139:42: ArgumentTypeCoercion: Argument 1 of OCA\Epubviewer\Preview\EPubPreview::extractFileData expects 'META-INF/container.xml', but parent type string provided (see https://psalm.dev/193)
}
}
}

// Pfff. Make a pause
if ($img_data) {
$image = new \OC_Image();
$image->loadFromData($img_data);
return $image;
}
return null;
} catch (\Exception $e) {
return null;
}
}
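
The three PossiblyNullIterator annotations in `extractThumbnail` come from iterating the children of `manifest` and `guide` without first checking that the element exists. Below is a minimal sketch of a guarded manifest lookup using only SimpleXML. `findCoverHref` is a hypothetical helper name, and the `cover` / `cover-image` ids are the ones the PR already checks. Whether this exact guard satisfies Psalm depends on its SimpleXML stubs, so treat it as a starting point rather than a verified fix.

```php
<?php

declare(strict_types=1);

/**
 * Return the href of the manifest item flagged as cover image, or null.
 */
function findCoverHref(\SimpleXMLElement $package): ?string {
    // Accessing a missing child yields an empty element, so test for it explicitly.
    if (!isset($package->manifest)) {
        return null;
    }
    foreach ($package->manifest->item as $item) {
        $id = (string)$item['id'];
        $mediaType = (string)$item['media-type'];
        // Strict comparisons on strings instead of == on SimpleXMLElement objects.
        if (in_array($id, ['cover', 'cover-image'], true) && str_starts_with($mediaType, 'image/')) {
            return (string)$item['href'];
        }
    }
    return null;
}
```

Note that `str_starts_with` is slightly stricter than the PR's `preg_match('/image\//', ...)`, which matches `image/` anywhere in the media type.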

/**
* find the main content XML (usually "content.opf")
*/
private function getContentPath() : ?string {
$xml_container = $this->extractXML('META-INF/container.xml');
if (is_object($xml_container)) {
$full_path = $xml_container->rootfiles->rootfile['full-path'][0];

Check notice (Code scanning / Psalm): PossiblyNullArrayAccess Note
Cannot access array value on possibly null variable $xml_container->rootfiles->rootfile['full-path'] of type SimpleXMLElement|null
if ($full_path) {

Check notice (Code scanning / Psalm): RiskyTruthyFalsyComparison Note
Operand of type SimpleXMLElement|null contains type SimpleXMLElement, which can be falsy and truthy. This can cause possibly unexpected behavior. Use strict comparison instead.
return $full_path->__toString();
}
}
return null;
}
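
Both notices on `getContentPath` concern reading `full-path` without a strict check. A minimal sketch of a stricter variant follows, assuming the caller has already read `META-INF/container.xml` into a string; `contentPathFromContainer` is a hypothetical name, not part of the PR.

```php
<?php

declare(strict_types=1);

/**
 * Read the OPF path from the contents of META-INF/container.xml.
 */
function contentPathFromContainer(string $containerXml): ?string {
    // Parser warnings are suppressed; malformed input is reported as false.
    $container = @simplexml_load_string($containerXml);
    if ($container === false) {
        return null;
    }
    // isset() is false as soon as an element or the attribute is missing.
    if (!isset($container->rootfiles->rootfile['full-path'])) {
        return null;
    }
    // Cast once and compare strictly instead of relying on object truthiness.
    $fullPath = (string)$container->rootfiles->rootfile['full-path'];
    return $fullPath !== '' ? $fullPath : null;
}
```

The explicit string cast and `!== ''` comparison replace the `[0]` access and the truthiness test that Psalm flags.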

/**
* extract HTML from Zip path
* @param string $path
* @return \DOMDocument|null
*/
protected function extractHTML(string $path): \DOMDocument|null {
$html = $this->extractFileData($path);

Check notice (Code scanning / Psalm): ArgumentTypeCoercion Note
Argument 1 of OCA\Epubviewer\Preview\EPubPreview::extractFileData expects 'META-INF/container.xml', but parent type string provided
if (is_string($html)) {
$dom = new \DOMDocument('1.0', 'utf-8');
$dom->strictErrorChecking = false;
if (@$dom->loadHTML($html)) {

Check notice (Code scanning / Psalm): ArgumentTypeCoercion Note
Argument 1 of DOMDocument::loadHTML expects non-empty-string, but parent type string provided
return $dom;
}
}
return null;
}
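
The document returned by `extractHTML` is later queried with `$images[0]->getAttribute(...)`, which Psalm rejects because a node list item is typed as `DOMNode|null`. A minimal sketch of a type-checked lookup follows, using only the DOM extension; `firstImageSource` is a hypothetical helper name.

```php
<?php

declare(strict_types=1);

/**
 * Return the source of the first <img>, else of the first SVG <image>, else null.
 */
function firstImageSource(\DOMDocument $dom): ?string {
    // Tag name => attribute holding the image path, in order of preference.
    $candidates = ['img' => 'src', 'image' => 'xlink:href'];
    foreach ($candidates as $tag => $attribute) {
        $node = $dom->getElementsByTagName($tag)->item(0);
        // item() may return null, and getAttribute() only exists on DOMElement.
        if ($node instanceof \DOMElement && $node->hasAttribute($attribute)) {
            return $node->getAttribute($attribute);
        }
    }
    return null;
}
```

One behavioural difference from the PR: if the first `<img>` has no `src`, this sketch still tries `<image>`, whereas the PR only tries `<image>` when no `<img>` exists at all.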

/**
* extract XML from Zip path
*
* @psalm-param 'META-INF/container.xml' $path
*/
private function extractXML(string $path): \SimpleXMLElement|false|null {
$xml = $this->extractFileData($path);
if (is_string($xml)) {
return simplexml_load_string($xml);
}
return null;
}

/**
* get unzipped data
*
* @param string $path file path in zip
*
* @psalm-param 'META-INF/container.xml' $path
*
* @return false|null|string
*/
private function extractFileData(string $path): string|false|null {
if ($this->zip === null) {
return null;
}
$fp = $this->zip->getStream($path, 'r');

Check failure (Code scanning / Psalm): UndefinedClass Error
Class, interface or enum named OC\Archive\ZIP does not exist
Owner

@smarinier Does OC\Archive\ZIP exist at all if the app files_zip isn't installed?

Owner (@devnoname120, Jun 25, 2024)

OK so I guess it's a private API: https://github.com/nextcloud/server/blob/952271929d888c8333f5b64aa676f802a8b682af/lib/private/Archive/ZIP.php#L13

How much effort would it require to change the code to use PHP's ZipArchive directly?

Contributor Author

I think that would be a pity, as the ZIP class wraps ZipArchive properly and logs errors in Nextcloud. The Psalm error is probably just a matter of a missing definition on the Psalm side.

Owner

The problem is that it's a private file that should not be used in apps. If we rely on this implementation, it can break in an update without any warning. I don't want to maintain code that relies on non-standard internal APIs.
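
To gauge the effort asked about in this thread, here is a minimal sketch of reading one archive entry with PHP's `ZipArchive` directly. `readZipEntry` is a hypothetical helper, not part of the PR. Unlike the private ZIP class, it does no logging, so any error reporting would have to be added by the caller.

```php
<?php

declare(strict_types=1);

/**
 * Read one entry from a ZIP archive, or return null when it cannot be read.
 */
function readZipEntry(string $archivePath, string $entryName): ?string {
    $zip = new \ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($archivePath) !== true) {
        return null;
    }
    try {
        // getFromName() returns the entry contents, or false if it is missing.
        $data = $zip->getFromName($entryName);
        return $data === false ? null : $data;
    } finally {
        $zip->close();
    }
}
```

A call such as `readZipEntry($sourceTmp, 'META-INF/container.xml')` would stand in for the `getStream()` / `stream_get_contents()` pair in `extractFileData`.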

if ($fp) {
$content = stream_get_contents($fp);
fclose($fp);
return $content;
}
return null;
}

/**
* Resolve relative $relPath from $path (removes ./, ../)
*
* @param string $path reference path
* @param string $relPath relative path
* @return string
*/
private function resolvePath(string $path, string $relPath): string {
$path = dirname($path).'/'.$relPath;
$pieces = explode('/', $path);
$parents = [];
foreach($pieces as $dir) {
switch($dir) {
case '.':
// Don't need to do anything here
break;
case '..':
array_pop($parents);
break;
default:
$parents[] = $dir;
break;
}
}
return implode('/', $parents);
}
}
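
To make the behaviour of `resolvePath` easier to check, the sketch below copies its logic into a standalone function that can be run outside Nextcloud. The name `resolveEpubPath` and the sample paths are illustrative only.

```php
<?php

declare(strict_types=1);

/**
 * Standalone copy of the PR's resolvePath() logic, for experimentation only.
 */
function resolveEpubPath(string $path, string $relPath): string {
    // Start from the directory of the reference file, then append the relative path.
    $pieces = explode('/', dirname($path) . '/' . $relPath);
    $parents = [];
    foreach ($pieces as $piece) {
        if ($piece === '.') {
            continue; // current directory: nothing to do
        }
        if ($piece === '..') {
            array_pop($parents); // step up one directory
            continue;
        }
        $parents[] = $piece;
    }
    return implode('/', $parents);
}
```

For example, `resolveEpubPath('OEBPS/text/cover.xhtml', '../images/cover.jpg')` gives `OEBPS/images/cover.jpg`. When the OPF file sits at the archive root, `dirname()` returns `.`, so `resolveEpubPath('content.opf', 'cover.jpg')` gives `cover.jpg`; this is the case the `.` branch exists for.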