Convert to modular drop-in Zotero translation code repo
adomasven committed Jul 26, 2021
1 parent cc93cd7 commit 9643728
Showing 34 changed files with 5,200 additions and 1,507 deletions.
6 changes: 6 additions & 0 deletions .gitmodules
@@ -1,3 +1,9 @@
[submodule "modules/zotero"]
path = modules/zotero
url = [email protected]:zotero/zotero.git
[submodule "modules/utilities"]
path = modules/utilities
url = [email protected]:zotero/utilities.git
[submodule "modules/--force"]
path = modules/--force
url = [email protected]:zotero/utilities.git
41 changes: 38 additions & 3 deletions README.md
@@ -1,7 +1,42 @@
Run
# Zotero Translate

This repository contains the Zotero translation architecture code responsible for
parsing Zotero translators and running them on live and static web pages to retrieve
Zotero items.

A consumer of this repository needs to implement the following interfaces:

- `Zotero.Translators` found in `translators.js`
- `Zotero.HTTP` found in `http.js`
- `Zotero.Translate.ItemSaver` found in `translation/translate_item.js`

You also need to:
- Call `Zotero.Schema.init(data)` with Zotero `schema.json`.
- If running in a ModuleJS environment (e.g. Node.js) call `require('./cachedTypes').setTypeSchema(typeSchema)`
with the result of `resource/zoteroTypeSchemaData.js`.

Please bundle translators and Zotero schema with the translation architecture.
Do not load them from a remote server.
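
As a rough sketch of the bundling advice above, the schema can be passed to `Zotero.Schema.init(data)` from an object shipped with the application rather than fetched at runtime. The `ZoteroStub` object, the `initFromBundledSchema` helper and the placeholder schema contents are all hypothetical stand-ins so the snippet runs on its own; a real consumer would use the actual `Zotero` global and the real `schema.json`.

```javascript
// Hypothetical stand-in for the real Zotero global, only so this sketch is self-contained.
const ZoteroStub = {
	Schema: {
		data: null,
		// Mirrors the documented call shape Zotero.Schema.init(data)
		init(data) {
			this.data = data;
		}
	}
};

// Hypothetical helper: initialize the schema from an object bundled with the app
// instead of requesting it from a remote server.
function initFromBundledSchema(zotero, bundledSchema) {
	if (!bundledSchema || typeof bundledSchema !== 'object') {
		throw new Error('Bundled schema.json is missing or was not parsed');
	}
	zotero.Schema.init(bundledSchema);
	return zotero;
}

// Placeholder contents, not the real Zotero schema.
const bundledSchema = { version: 1, itemTypes: [] };
initFromBundledSchema(ZoteroStub, bundledSchema);
```

Failing early on a missing schema keeps a packaging mistake from surfacing later as a confusing translation error.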

You may also want to reimplement or modify:

- `Zotero.Repo` found in `repo.js` to set up periodic translator update retrieval
- `Zotero.Debug` found in `debug.js` to customize debug logging
- `Zotero` and `Zotero.Prefs` found in `zotero.js` to set up the environment and
long-term preference storage
- `Zotero.Translate.ItemGetter` found in `translation/translate_item.js` for export
translation
- `Zotero.Translate.SandboxManager` found in `translation/sandboxManager.js` for
a tighter Sandbox environment if available on your platform

### Example

See `example/index.html` for file loading order.

To run the example:
```bash
$ git submodule update --init
$ google-chrome --disable-web-security --user-data-dir=/tmp/chromeTemp
```

Open `src/index.html` in the CORS ignoring Google Chrome
Open `example/index.html` in the CORS-ignoring Google Chrome instance
249 changes: 249 additions & 0 deletions example/http.js
@@ -0,0 +1,249 @@
/*
***** BEGIN LICENSE BLOCK *****
Copyright © 2021 Corporation for Digital Scholarship
Vienna, Virginia, USA
http://zotero.org
This file is part of Zotero.
Zotero is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Zotero is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with Zotero. If not, see <http://www.gnu.org/licenses/>.
***** END LICENSE BLOCK *****
*/

/**
* Functions for performing HTTP requests, both via XMLHTTPRequest and using a hidden browser
* @namespace
*/
Zotero.HTTP = new function() {
this.StatusError = function(xmlhttp, url) {
this.message = `HTTP request to ${url} rejected with status ${xmlhttp.status}`;
this.status = xmlhttp.status;
try {
this.responseText = typeof xmlhttp.responseText == 'string' ? xmlhttp.responseText : undefined;
} catch (e) {}
};
this.StatusError.prototype = Object.create(Error.prototype);

this.TimeoutError = function(ms) {
this.message = `HTTP request has timed out after ${ms}ms`;
};
this.TimeoutError.prototype = Object.create(Error.prototype);

/**
* Get a promise for an HTTP request
*
* @param {String} method The method of the request ("GET", "POST", "HEAD", or "OPTIONS")
* @param {String} url URL to request
* @param {Object} [options] Options for HTTP request:<ul>
* <li>body - The body of a POST request</li>
* <li>headers - Object of HTTP headers to send with the request</li>
* <li>debug - Log response text and status code</li>
* <li>logBodyLength - Length of request body to log</li>
* <li>timeout - Request timeout specified in milliseconds [default 15000]</li>
* <li>responseType - The response type of the request from the XHR spec</li>
* <li>responseCharset - The charset the response should be interpreted as</li>
* <li>successCodes - HTTP status codes that are considered successful, or FALSE to allow all</li>
* </ul>
* @return {Promise<XMLHttpRequest>} A promise resolved with the XMLHttpRequest object if the
* request succeeds, or rejected if the browser is offline or a non-2XX status response
* code is received (or a code not in options.successCodes if provided).
*/
this.request = function(method, url, options = {}) {
// Default options
options = Object.assign({
body: null,
headers: {},
debug: false,
logBodyLength: 1024,
timeout: 15000,
responseType: '',
responseCharset: null,
successCodes: null
}, options);


let logBody = '';
if (['GET', 'HEAD'].includes(method)) {
if (options.body != null) {
throw new Error(`HTTP ${method} cannot have a request body (${options.body})`)
}
} else if(options.body) {
options.body = typeof options.body == 'string' ? options.body : JSON.stringify(options.body);

if (!options.headers) options.headers = {};
if (!options.headers["Content-Type"]) {
options.headers["Content-Type"] = "application/x-www-form-urlencoded";
}
else if (options.headers["Content-Type"] == 'multipart/form-data') {
// Allow XHR to set Content-Type with boundary for multipart/form-data
delete options.headers["Content-Type"];
}

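// Parenthesize the conditional: `+` binds tighter than `>`, so without the
// parentheses the concatenated string is compared to logBodyLength instead
logBody = `: ${options.body.substring(0, options.logBodyLength)}` +
(options.body.length > options.logBodyLength ? '...' : '');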
// TODO: make sure below does its job in every API call instance
// Don't display password or session id in console
logBody = logBody.replace(/password":"[^"]+/, 'password":"********');
logBody = logBody.replace(/password=[^&]+/, 'password=********');
}
Zotero.debug(`HTTP ${method} ${url}${logBody}`);

var xmlhttp = new XMLHttpRequest();
xmlhttp.timeout = options.timeout;
var promise = Zotero.HTTP._attachHandlers(url, xmlhttp, options);

xmlhttp.open(method, url, true);

for (let header in options.headers) {
xmlhttp.setRequestHeader(header, options.headers[header]);
}

xmlhttp.responseType = options.responseType || '';

// Maybe should provide "mimeType" option instead. This is xpcom legacy, where responseCharset
// could be controlled manually
if (options.responseCharset) {
xmlhttp.overrideMimeType("text/plain; charset=" + options.responseCharset);
}

xmlhttp.send(options.body);

return promise.then(function(xmlhttp) {
if (options.debug) {
if (xmlhttp.responseType == '' || xmlhttp.responseType == 'text') {
Zotero.debug(`HTTP ${xmlhttp.status} response: ${xmlhttp.responseText}`);
}
else {
Zotero.debug(`HTTP ${xmlhttp.status} response`);
}
}

let invalidDefaultStatus = options.successCodes === null &&
(xmlhttp.status < 200 || xmlhttp.status >= 300);
let invalidStatus = Array.isArray(options.successCodes) && !options.successCodes.includes(xmlhttp.status);
if (invalidDefaultStatus || invalidStatus) {
throw new Zotero.HTTP.StatusError(xmlhttp, url);
}
return xmlhttp;
});
};
/**
* Send an HTTP GET request via XMLHTTPRequest
*
* @deprecated Use {@link Zotero.HTTP.request}
* @param {String} url URL to request
* @param {Function} onDone Callback to be executed upon request completion
* @param {String} responseCharset
* @param {N/A} cookieSandbox Not used in Connector
* @param {Object} headers HTTP headers to include with the request
* @return {Boolean} True if the request was sent, or false if the browser is offline
*/
this.doGet = function(url, onDone, responseCharset, cookieSandbox, headers) {
Zotero.debug('Zotero.HTTP.doGet is deprecated. Use Zotero.HTTP.request');
this.request('GET', url, {responseCharset, headers})
.then(onDone, function(e) {
onDone({status: e.status, responseText: e.responseText});
throw (e);
});
return true;
};

/**
* Send an HTTP POST request via XMLHTTPRequest
*
* @deprecated Use {@link Zotero.HTTP.request}
* @param {String} url URL to request
* @param {String|Object[]} body Request body
* @param {Function} onDone Callback to be executed upon request completion
* @param {Object} headers Request HTTP headers
* @param {String} responseCharset
* @return {Boolean} True if the request was sent, or false if the browser is offline
*/
this.doPost = function(url, body, onDone, headers, responseCharset) {
Zotero.debug('Zotero.HTTP.doPost is deprecated. Use Zotero.HTTP.request');
this.request('POST', url, {body, responseCharset, headers})
.then(onDone, function(e) {
onDone({status: e.status, responseText: e.responseText});
throw (e);
});
return true;
};


/**
* Wraps a document in an ES6 Proxy that adds a `location` attribute
* @param {Document} doc Document to wrap
* @param {String} docURL URL to expose as `doc.location`
*/
this.wrapDocument = function(doc, docURL) {
let url = require('url');
docURL = url.parse(docURL);
// An arrow function does not rebind `this`, so reference docURL directly
docURL.toString = () => docURL.href;
var wrappedDoc = new Proxy(doc, {
get: function (t, prop) {
if (prop === 'location') {
return docURL;
}
else if (prop == 'evaluate') {
// If you pass the document itself into doc.evaluate as the second argument
// it fails, because it receives a proxy, which isn't of type `Node` for some reason.
// Native code magic.
return function() {
if (arguments[1] == wrappedDoc) {
arguments[1] = t;
}
return t.evaluate.apply(t, arguments)
}
}
else {
if (typeof t[prop] == 'function') {
return t[prop].bind(t);
}
return t[prop];
}
}
});
return wrappedDoc;
};


/**
* Adds request handlers to the XMLHttpRequest and returns a promise that resolves when
* the request is complete. xmlhttp.send() still needs to be called, this just attaches the
* handler
*
* See {@link Zotero.HTTP.request} for parameters
* @private
*/
this._attachHandlers = function(url, xmlhttp, options) {
var deferred = Zotero.Promise.defer();
xmlhttp.onload = () => deferred.resolve(xmlhttp);
xmlhttp.onerror = xmlhttp.onabort = function() {
var e = new Zotero.HTTP.StatusError(xmlhttp, url);
if (options.successCodes === false) {
deferred.resolve(xmlhttp);
} else {
deferred.reject(e);
}
};
xmlhttp.ontimeout = function() {
var e = new Zotero.HTTP.TimeoutError(xmlhttp.timeout);
Zotero.logError(e);
deferred.reject(e);
};
return deferred.promise;
};
}
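
The `successCodes` handling at the end of `Zotero.HTTP.request` can be restated as a small standalone function, which makes the three cases (default, explicit list, `false`) easier to see. This is only a sketch of the logic shown above; `isFailureStatus` is a hypothetical name and not part of `Zotero.HTTP`.

```javascript
// Standalone restatement of the status check in Zotero.HTTP.request.
// `isFailureStatus` is a hypothetical helper, not an existing Zotero function.
function isFailureStatus(status, successCodes = null) {
	// Default (successCodes === null): anything outside 2XX is a failure
	const invalidDefault = successCodes === null && (status < 200 || status >= 300);
	// Explicit list: any status not in the list is a failure
	const invalidListed = Array.isArray(successCodes) && !successCodes.includes(status);
	// successCodes === false matches neither branch, so every status is accepted
	return invalidDefault || invalidListed;
}

console.log(isFailureStatus(404));        // true
console.log(isFailureStatus(404, [404])); // false
console.log(isFailureStatus(500, false)); // false
```

Note that an explicit list replaces the 2XX default rather than extending it, so a caller that passes `[404]` would see a 200 response rejected.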
78 changes: 78 additions & 0 deletions example/index.html
@@ -0,0 +1,78 @@
<!DOCTYPE html>
<!--
***** BEGIN LICENSE BLOCK *****
Copyright © 2019 Center for History and New Media
George Mason University, Fairfax, Virginia, USA
http://zotero.org
This file is part of Zotero.
Zotero is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Zotero is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with Zotero. If not, see <http://www.gnu.org/licenses/>.
***** END LICENSE BLOCK *****
-->
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Zotero Translate</title>
</head>
<script src="../src/zotero.js"></script>
<script src="../src/promise.js"></script>
<script src="../modules/utilities/openurl.js"></script>
<script src="../modules/utilities/date.js"></script>
<script src="../modules/utilities/xregexp-all.js"></script>
<script src="../modules/utilities/xregexp-unicode-zotero.js"></script>
<script src="../modules/utilities/utilities.js"></script>
<script src="../modules/utilities/utilities_item.js"></script>
<script src="../modules/utilities/schema.js"></script>
<script src="../src/utilities_translate.js"></script>
<script src="../src/debug.js"></script>
<script src="../src/http.js"></script>
<script src="../src/translator.js"></script>
<script src="../src/translators.js"></script>
<script src="../src/repo.js"></script>
<script src="../src/translation/translate.js"></script>
<script src="../src/translation/sandboxManager.js"></script>
<script src="../src/translation/translate_item.js"></script>
<script src="../src/tlds.js"></script>
<script src="../src/proxy.js"></script>
<script src="../src/rdf/init.js"></script>
<script src="../src/rdf/uri.js"></script>
<script src="../src/rdf/term.js"></script>
<script src="../src/rdf/identity.js"></script>
<script src="../src/rdf/n3parser.js"></script>
<script src="../src/rdf/rdfparser.js"></script>
<script src="../src/rdf/serialize.js"></script>
<script src="../src/resource/zoteroTypeSchemaData.js"></script>
<script src="../src/cachedTypes.js"></script>
<script src="http.js"></script>
<script src="translators.js"></script>
<script src="translate_item.js"></script>
<script src="index.js"></script>
<body>
<div>
<label>URL:
<input type="text" id="url">
</label>
<br/>
<label>HTML:
<textarea cols="120" rows="30" id="html"></textarea>
</label>
<br/>
<input type="button" value="Translate" onclick="doTranslate()">
</div>
<pre id="result" style="white-space: normal"></pre>
</body>
</html>
6 changes: 4 additions & 2 deletions src/index.js → example/index.js
@@ -98,7 +98,9 @@ function addLocationPropToDoc(doc, docURL) {
return wrappedDoc;
};

window.addEventListener('DOMContentLoaded', function() {
window.addEventListener('DOMContentLoaded', async function() {
Zotero.Debug.init(1);
Zotero.Repo.init();
await Zotero.Translators.init();
let xhr = await Zotero.HTTP.request('GET', 'https://api.zotero.org/schema', { responseType: 'json' });
Zotero.Schema.init(xhr.response);
});
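
The Proxy technique used by `addLocationPropToDoc` and `wrapDocument` can be illustrated outside the browser. The sketch below is a simplified variant under stated assumptions: it uses the standard `URL` class instead of `url.parse`, wraps a plain object rather than a real DOM document, and omits the special case for `evaluate`. The names `wrapWithLocation` and `fakeDoc` are hypothetical.

```javascript
// Hypothetical helper showing the Proxy pattern: expose a `location` property
// on an object that does not have one, and pass everything else through.
function wrapWithLocation(target, href) {
	const location = new URL(href);
	return new Proxy(target, {
		get(t, prop) {
			if (prop === 'location') {
				return location;
			}
			const value = t[prop];
			// Bind methods to the real target so native code never sees the proxy as `this`
			return typeof value === 'function' ? value.bind(t) : value;
		}
	});
}

// Plain object standing in for a parsed document.
const fakeDoc = {
	title: 'Example',
	getTitle() {
		return this.title;
	}
};

const wrappedDoc = wrapWithLocation(fakeDoc, 'https://example.com/a?b=1');
console.log(wrappedDoc.location.href); // https://example.com/a?b=1
console.log(wrappedDoc.getTitle());    // Example
```

Binding methods to the underlying target matters for real DOM documents, where native methods reject a proxy as their receiver.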