Convert to modular drop-in Zotero translation code repo
Showing 34 changed files with 5,200 additions and 1,507 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,9 @@ | ||
[submodule "modules/zotero"] | ||
path = modules/zotero | ||
url = [email protected]:zotero/zotero.git | ||
[submodule "modules/utilities"] | ||
path = modules/utilities | ||
url = [email protected]:zotero/utilities.git | ||
[submodule "modules/--force"] | ||
path = modules/--force | ||
url = [email protected]:zotero/utilities.git |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,42 @@ | ||
Run | ||
# Zotero Translate | ||
|
||
This repository contains the Zotero translation architecture code responsible for | ||
parsing Zotero translators and running them on live and static web pages to retrieve | ||
Zotero items. | ||
|
||
A consumer of this repository needs to implement the following interfaces: | ||
|
||
- `Zotero.Translators` found in `translators.js` | ||
- `Zotero.HTTP` found in `http.js` | ||
- `Zotero.Translate.ItemSaver` found in `translation/translate_item.js` | ||
|
||
You also need to: | ||
- Call `Zotero.Schema.init(data)` with Zotero `schema.json`. | ||
- If running in a ModuleJS environment (e.g. Node.js) call `require('./cachedTypes').setTypeSchema(typeSchema)` | ||
with the result of `resource/zoteroTypeSchemaData.js`. | ||
|
||
Please bundle translators and Zotero schema with the translation architecture. | ||
Do not load them from a remote server. | ||
|
||
You may also want to reimplement or modify: | ||
|
||
- `Zotero.Repo` found in `repo.js` to set up periodic translator update retrieval | ||
- `Zotero.Debug` found in `debug.js` to customize debug logging | ||
- `Zotero` and `Zotero.Prefs` found in `zotero.js` to set up the environment and | ||
long-term preference storage | ||
- `Zotero.Translate.ItemGetter` found in `translation/translate_item.js` for export | ||
translation | ||
- `Zotero.Translate.SandboxManager` found in `translation/sandboxManager.js` for | ||
a tighter Sandbox environment if available on your platform | |
|
||
### Example | ||
|
||
See `example/index.html` for file loading order. | ||
|
||
To run the example: | ||
```bash | ||
$ git submodule update --init | ||
$ google-chrome --disable-web-security --user-data-dir=/tmp/chromeTemp | ||
$ google-chrome --disable-web-security --user-data-dir=/tmp/chromeTemvar | ||
``` | ||
|
||
Open `src/index.html` in the CORS ignoring Google Chrome | ||
Open `example/index.html` in the CORS ignoring Google Chrome |
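
The initialization steps listed in the README could be wired together roughly as follows. This is a minimal sketch, not code from the repository: the helper `initTranslation` and its parameter names are hypothetical, and only the calls `Zotero.Schema.init(data)` and `setTypeSchema(typeSchema)` are taken from the README. It assumes the schema and type schema are bundled with the application and passed in as already-parsed objects, in line with the README's request not to load them from a remote server.

```javascript
// Hypothetical helper: the function name and parameter shape are illustrative only.
// The two calls it makes are the ones named in the README.
async function initTranslation({ Zotero, schemaData, cachedTypes = null, typeSchema = null }) {
	if (!schemaData) {
		throw new Error('Bundle schema.json with the application and pass its parsed contents');
	}
	// Assumption: init() may be synchronous or asynchronous; await works for both.
	await Zotero.Schema.init(schemaData);
	// Only needed in module environments such as Node.js, per the README.
	if (cachedTypes) {
		cachedTypes.setTypeSchema(typeSchema);
	}
}
```

In a browser build that loads the files via `<script>` tags, a caller would presumably omit `cachedTypes` and `typeSchema`; in Node.js it would pass `require('./cachedTypes')` and the bundled type schema data.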
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,249 @@ | ||
/* | ||
***** BEGIN LICENSE BLOCK ***** | ||
Copyright © 2021 Corporation for Digital Scholarship | ||
Vienna, Virginia, USA | ||
http://zotero.org | ||
This file is part of Zotero. | ||
Zotero is free software: you can redistribute it and/or modify | ||
it under the terms of the GNU Affero General Public License as published by | ||
the Free Software Foundation, either version 3 of the License, or | ||
(at your option) any later version. | ||
Zotero is distributed in the hope that it will be useful, | ||
but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
GNU Affero General Public License for more details. | ||
You should have received a copy of the GNU Affero General Public License | ||
along with Zotero. If not, see <http://www.gnu.org/licenses/>. | ||
***** END LICENSE BLOCK ***** | ||
*/ | ||
|
||
/** | ||
* Functions for performing HTTP requests, both via XMLHTTPRequest and using a hidden browser | ||
* @namespace | ||
*/ | ||
Zotero.HTTP = new function() { | ||
this.StatusError = function(xmlhttp, url) { | ||
this.message = `HTTP request to ${url} rejected with status ${xmlhttp.status}`; | ||
this.status = xmlhttp.status; | ||
try { | ||
this.responseText = typeof xmlhttp.responseText == 'string' ? xmlhttp.responseText : undefined; | ||
} catch (e) {} | ||
}; | ||
this.StatusError.prototype = Object.create(Error.prototype); | ||
|
||
this.TimeoutError = function(ms) { | ||
this.message = `HTTP request has timed out after ${ms}ms`; | ||
}; | ||
this.TimeoutError.prototype = Object.create(Error.prototype); | ||
|
||
/** | ||
* Get a promise for a HTTP request | ||
* | ||
* @param {String} method The method of the request ("GET", "POST", "HEAD", or "OPTIONS") | ||
* @param {String} url URL to request | ||
* @param {Object} [options] Options for HTTP request:<ul> | ||
* <li>body - The body of a POST request</li> | ||
* <li>headers - Object of HTTP headers to send with the request</li> | ||
* <li>debug - Log response text and status code</li> | ||
* <li>logBodyLength - Length of request body to log</li> | ||
* <li>timeout - Request timeout specified in milliseconds [default 15000]</li> | ||
* <li>responseType - The response type of the request from the XHR spec</li> | ||
* <li>responseCharset - The charset the response should be interpreted as</li> | ||
* <li>successCodes - HTTP status codes that are considered successful, or FALSE to allow all</li> | ||
* </ul> | ||
* @return {Promise<XMLHttpRequest>} A promise resolved with the XMLHttpRequest object if the | ||
* request succeeds, or rejected if the browser is offline or a non-2XX status response | ||
* code is received (or a code not in options.successCodes if provided). | ||
*/ | ||
this.request = function(method, url, options = {}) { | ||
// Default options | ||
options = Object.assign({ | ||
body: null, | ||
headers: {}, | ||
debug: false, | ||
logBodyLength: 1024, | ||
timeout: 15000, | ||
responseType: '', | ||
responseCharset: null, | ||
successCodes: null | ||
}, options); | ||
|
||
|
||
let logBody = ''; | ||
if (['GET', 'HEAD'].includes(method)) { | ||
if (options.body != null) { | ||
throw new Error(`HTTP ${method} cannot have a request body (${options.body})`) | ||
} | ||
} else if(options.body) { | ||
options.body = typeof options.body == 'string' ? options.body : JSON.stringify(options.body); | ||
|
||
if (!options.headers) options.headers = {}; | ||
if (!options.headers["Content-Type"]) { | ||
options.headers["Content-Type"] = "application/x-www-form-urlencoded"; | ||
} | ||
else if (options.headers["Content-Type"] == 'multipart/form-data') { | ||
// Allow XHR to set Content-Type with boundary for multipart/form-data | ||
delete options.headers["Content-Type"]; | ||
} | ||
|
||
logBody = `: ${options.body.substr(0, options.logBodyLength)}` + | |
(options.body.length > options.logBodyLength ? '...' : ''); | |
// TODO: make sure below does its job in every API call instance | ||
// Don't display password or session id in console | ||
logBody = logBody.replace(/password":"[^"]+/, 'password":"********'); | ||
logBody = logBody.replace(/password=[^&]+/, 'password=********'); | ||
} | ||
Zotero.debug(`HTTP ${method} ${url}${logBody}`); | ||
|
||
var xmlhttp = new XMLHttpRequest(); | ||
xmlhttp.timeout = options.timeout; | ||
var promise = Zotero.HTTP._attachHandlers(url, xmlhttp, options); | ||
|
||
xmlhttp.open(method, url, true); | ||
|
||
for (let header in options.headers) { | ||
xmlhttp.setRequestHeader(header, options.headers[header]); | ||
} | ||
|
||
xmlhttp.responseType = options.responseType || ''; | ||
|
||
// Maybe should provide "mimeType" option instead. This is xpcom legacy, where responseCharset | ||
// could be controlled manually | ||
if (options.responseCharset) { | ||
xmlhttp.overrideMimeType("text/plain; charset=" + options.responseCharset); | ||
} | ||
|
||
xmlhttp.send(options.body); | ||
|
||
return promise.then(function(xmlhttp) { | ||
if (options.debug) { | ||
if (xmlhttp.responseType == '' || xmlhttp.responseType == 'text') { | ||
Zotero.debug(`HTTP ${xmlhttp.status} response: ${xmlhttp.responseText}`); | ||
} | ||
else { | ||
Zotero.debug(`HTTP ${xmlhttp.status} response`); | ||
} | ||
} | ||
|
||
let invalidDefaultStatus = options.successCodes === null && | ||
(xmlhttp.status < 200 || xmlhttp.status >= 300); | ||
let invalidStatus = Array.isArray(options.successCodes) && !options.successCodes.includes(xmlhttp.status); | ||
if (invalidDefaultStatus || invalidStatus) { | ||
throw new Zotero.HTTP.StatusError(xmlhttp, url); | ||
} | ||
return xmlhttp; | ||
}); | ||
}; | ||
/** | ||
* Send an HTTP GET request via XMLHTTPRequest | ||
* | ||
* @deprecated Use {@link Zotero.HTTP.request} | ||
* @param {String} url URL to request | ||
* @param {Function} onDone Callback to be executed upon request completion | ||
* @param {String} responseCharset | ||
* @param {N/A} cookieSandbox Not used in Connector | ||
* @param {Object} headers HTTP headers to include with the request | ||
* @return {Boolean} True if the request was sent, or false if the browser is offline | ||
*/ | ||
this.doGet = function(url, onDone, responseCharset, cookieSandbox, headers) { | ||
Zotero.debug('Zotero.HTTP.doGet is deprecated. Use Zotero.HTTP.request'); | ||
this.request('GET', url, {responseCharset, headers}) | ||
.then(onDone, function(e) { | ||
onDone({status: e.status, responseText: e.responseText}); | ||
throw (e); | ||
}); | ||
return true; | ||
}; | ||
|
||
/** | ||
* Send an HTTP POST request via XMLHTTPRequest | ||
* | ||
* @deprecated Use {@link Zotero.HTTP.request} | ||
* @param {String} url URL to request | ||
* @param {String|Object[]} body Request body | ||
* @param {Function} onDone Callback to be executed upon request completion | ||
* @param {Object} headers Request HTTP headers | |
* @param {String} responseCharset | ||
* @return {Boolean} True if the request was sent, or false if the browser is offline | ||
*/ | ||
this.doPost = function(url, body, onDone, headers, responseCharset) { | ||
Zotero.debug('Zotero.HTTP.doPost is deprecated. Use Zotero.HTTP.request'); | ||
this.request('POST', url, {body, responseCharset, headers}) | ||
.then(onDone, function(e) { | ||
onDone({status: e.status, responseText: e.responseText}); | ||
throw (e); | ||
}); | ||
return true; | ||
}; | ||
|
||
|
||
/** | ||
* Adds an ES6 Proxied location attribute | |
* @param doc | |
* @param docURL | |
*/ | ||
this.wrapDocument = function(doc, docURL) { | ||
let url = require('url'); | ||
docURL = url.parse(docURL); | ||
docURL.toString = () => docURL.href; | |
var wrappedDoc = new Proxy(doc, { | ||
get: function (t, prop) { | ||
if (prop === 'location') { | ||
return docURL; | ||
} | ||
else if (prop == 'evaluate') { | ||
// If you pass the document itself into doc.evaluate as the second argument | ||
// it fails, because it receives a proxy, which isn't of type `Node` for some reason. | ||
// Native code magic. | ||
return function() { | ||
if (arguments[1] == wrappedDoc) { | ||
arguments[1] = t; | ||
} | ||
return t.evaluate.apply(t, arguments) | ||
} | ||
} | ||
else { | ||
if (typeof t[prop] == 'function') { | ||
return t[prop].bind(t); | ||
} | ||
return t[prop]; | ||
} | ||
} | ||
}); | ||
return wrappedDoc; | ||
}; | ||
|
||
|
||
/** | ||
* Adds request handlers to the XMLHttpRequest and returns a promise that resolves when | ||
* the request is complete. xmlhttp.send() still needs to be called, this just attaches the | ||
* handler | ||
* | ||
* See {@link Zotero.HTTP.request} for parameters | ||
* @private | ||
*/ | ||
this._attachHandlers = function(url, xmlhttp, options) { | ||
var deferred = Zotero.Promise.defer(); | ||
xmlhttp.onload = () => deferred.resolve(xmlhttp); | ||
xmlhttp.onerror = xmlhttp.onabort = function() { | ||
var e = new Zotero.HTTP.StatusError(xmlhttp, url); | ||
if (options.successCodes === false) { | ||
deferred.resolve(xmlhttp); | ||
} else { | ||
deferred.reject(e); | ||
} | ||
}; | ||
xmlhttp.ontimeout = function() { | ||
var e = new Zotero.HTTP.TimeoutError(xmlhttp.timeout); | ||
Zotero.logError(e); | ||
deferred.reject(e); | ||
}; | ||
return deferred.promise; | ||
}; | ||
} |
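
The `successCodes` option documented for `Zotero.HTTP.request` has three modes, and the status check in the file above can be restated as a small pure function. This is only a sketch for illustration: `isAcceptedStatus` is a hypothetical name that does not exist in the commit, and it covers the status check alone, not the network-error handling in `_attachHandlers`.

```javascript
// Hypothetical helper mirroring the status check inside Zotero.HTTP.request.
// successCodes === null  -> only 2xx statuses are accepted (the default)
// successCodes is array  -> only listed statuses are accepted
// successCodes === false -> every status is accepted
function isAcceptedStatus(status, successCodes = null) {
	const invalidDefaultStatus = successCodes === null
		&& (status < 200 || status >= 300);
	const invalidStatus = Array.isArray(successCodes)
		&& !successCodes.includes(status);
	return !(invalidDefaultStatus || invalidStatus);
}

console.log(isAcceptedStatus(204));             // true
console.log(isAcceptedStatus(404));             // false
console.log(isAcceptedStatus(404, [200, 404])); // true
console.log(isAcceptedStatus(500, false));      // true
```

When the check fails, `request` throws a `Zotero.HTTP.StatusError`, so callers that want to inspect non-2xx responses themselves would pass an explicit list or `false`.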
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
<!DOCTYPE html> | ||
<!-- | ||
***** BEGIN LICENSE BLOCK ***** | ||
Copyright © 2019 Center for History and New Media | ||
George Mason University, Fairfax, Virginia, USA | ||
http://zotero.org | ||
This file is part of Zotero. | ||
Zotero is free software: you can redistribute it and/or modify | ||
it under the terms of the GNU Affero General Public License as published by | ||
the Free Software Foundation, either version 3 of the License, or | ||
(at your option) any later version. | ||
Zotero is distributed in the hope that it will be useful, | ||
but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
GNU Affero General Public License for more details. | ||
You should have received a copy of the GNU Affero General Public License | ||
along with Zotero. If not, see <http://www.gnu.org/licenses/>. | ||
***** END LICENSE BLOCK ***** | ||
--> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="UTF-8"> | ||
<title>Zotero Translate</title> | ||
</head> | ||
<script src="../src/zotero.js"></script> | ||
<script src="../src/promise.js"></script> | ||
<script src="../modules/utilities/openurl.js"></script> | ||
<script src="../modules/utilities/date.js"></script> | ||
<script src="../modules/utilities/xregexp-all.js"></script> | ||
<script src="../modules/utilities/xregexp-unicode-zotero.js"></script> | ||
<script src="../modules/utilities/utilities.js"></script> | ||
<script src="../modules/utilities/utilities_item.js"></script> | ||
<script src="../modules/utilities/schema.js"></script> | ||
<script src="../src/utilities_translate.js"></script> | ||
<script src="../src/debug.js"></script> | ||
<script src="../src/http.js"></script> | ||
<script src="../src/translator.js"></script> | ||
<script src="../src/translators.js"></script> | ||
<script src="../src/repo.js"></script> | ||
<script src="../src/translation/translate.js"></script> | ||
<script src="../src/translation/sandboxManager.js"></script> | ||
<script src="../src/translation/translate_item.js"></script> | ||
<script src="../src/tlds.js"></script> | ||
<script src="../src/proxy.js"></script> | ||
<script src="../src/rdf/init.js"></script> | ||
<script src="../src/rdf/uri.js"></script> | ||
<script src="../src/rdf/term.js"></script> | ||
<script src="../src/rdf/identity.js"></script> | ||
<script src="../src/rdf/n3parser.js"></script> | ||
<script src="../src/rdf/rdfparser.js"></script> | ||
<script src="../src/rdf/serialize.js"></script> | ||
<script src="../src/resource/zoteroTypeSchemaData.js"></script> | ||
<script src="../src/cachedTypes.js"></script> | ||
<script src="http.js"></script> | ||
<script src="translators.js"></script> | ||
<script src="translate_item.js"></script> | ||
<script src="index.js"></script> | ||
<body> | ||
<div> | ||
<label>URL: | ||
<input type="text" id="url"> | ||
</label> | ||
<br/> | ||
<label>HTML: | ||
<textarea cols="120" rows="30" id="html"></textarea> | ||
</label> | ||
<br/> | ||
<input type="button" value="Translate" onclick="doTranslate()"> | ||
</div> | ||
<pre id="result" style="white-space: normal"></pre> | ||
</body> | ||
</html> |