diff --git a/data/JTEI/13_2020-22/jtei-cc-ra-wittern-189-source.xml b/data/JTEI/13_2020-22/jtei-cc-ra-wittern-189-source.xml index 988de0ad..73bb74af 100644 --- a/data/JTEI/13_2020-22/jtei-cc-ra-wittern-189-source.xml +++ b/data/JTEI/13_2020-22/jtei-cc-ra-wittern-189-source.xml @@ -355,7 +355,7 @@ excerpt from such a text) and a number of published versions derived from it have all been encoded in TEI, with the most recent version in TEI P5 and Unicode.The most recent version of the whole textual database is available in the CBETA XML P5 GitHub repository, accessed April 20, 2020, .

@@ -489,31 +489,33 @@

By the year 2010, the practice of using separate text files for different witnesses of a text had become well established in our workflow. For tracking changes to these files, we had used version control tools from the start. At some point, we realized that the modern - distributed variety of these tools, Git and GitHub, - not only had the potential to solve the problem of keeping track of changes made to a - file, but could also be used to hold all witnesses of a text in one repository, each of - them represented as a branch. (In the terminology of version control - software, a branch is one current state in the editing history of the file, which has been - given a name to make it easy to address it and to track changes along a specific - trajectory.)

+ distributed variety of these tools, Git and GitHub, not only had the potential + to solve the problem of keeping track of changes made to a file, but could also be used to + hold all witnesses of a text in one repository, each of them represented as a + branch. (In the terminology of version control software, a branch + is one current state in the editing history of the file, which has been given a name to + make it easy to address it and to track changes along a specific trajectory.)

The distributed nature of this toolchain, which unlike earlier version control systems does not require a central authority, also seemed to have the potential to solve another problem I had been trying to solve almost from the beginning of my work with digital texts. As stated already, one of the aims of my work from the outset was to make a digital version of a text at least as versatile as a printed scholarly edition. For me, this also included taking ownership of one specific copy of such an edition and tracking the work by - adding marginal notes, comments, and references directly into the book. With GitHub as a repository for texts and Git as a means to control the various maintenance - tasks, researchers interested in a text could clone the text, add their own marginal - notes, then make their version of the text available to us or any other researcher to - integrate, if we so chose.

-

A Git workflow can use any kind of digital material, - but it works better with textual material as opposed to images or videos, and even better - for texts that use lines as a structural element. This again is where the plain text we - used in the Daozang jiyao project worked better than did the XML tree - structure, which is at the core of every TEI file.

+ adding marginal notes, comments, and references directly into the book. With GitHub + as a repository for texts and Git as a means to control the various maintenance tasks, + researchers interested in a text could clone the text, add their own marginal notes, then + make their version of the text available to us or any other researcher to integrate, if we + so chose.

+

A Git + workflow can use any kind of digital material, but it works better with textual material + as opposed to images or videos, and even better for texts that use lines as a structural + element. This again is where the plain text we used in the Daozang + jiyao project worked better than did the XML tree structure, which is at the + core of every TEI file.
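The point about line-oriented text can be made concrete with Python's standard `difflib`, which performs the same kind of line-based comparison that Git relies on (the witness sigla and the variant reading below are invented for illustration):

```python
import difflib

# Two hypothetical witnesses of the same text, one reading per line.
# Line-based storage means a variant in one line leaves every other
# line untouched -- exactly the situation diff/merge tools handle well,
# and which a deeply nested XML tree makes harder to exploit.
witness_a = ["道可道，非常道。", "名可名，非常名。"]
witness_b = ["道可道，非常道。", "名可名，非恒名。"]

diff = list(difflib.unified_diff(witness_a, witness_b,
                                 fromfile="WIT-A", tofile="WIT-B",
                                 lineterm=""))
for line in diff:
    print(line)
# The diff isolates the single changed line; the shared first line
# appears only as context.
```

Git's own machinery is, of course, more sophisticated, but the unit of comparison is the same: the line.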

When I first presented this idea at the TEI conference in Würzburg in October 2011, I got this comment via a tweet from one of the most respected members of the TEI community (: @rahtz: interesting that @@ -525,32 +527,32 @@

As described in that talk (published as Wittern 2013), the text format used here is not simply plain text, but rather an extended - form of the text format used in the Emacs + Orgmode,Accessed May 18, 2020, . in spirit + comparable to the much more frequently seen Markdown, but better. The defining difference + here is the more elegant and functional choice of markup elements, and the fact that the + format was originally conceived as the base for a note-taking and scheduling application, + so the markup itself and the software that operates on it are essentially one unit, and + the development of the software (which is itself community driven) informs the choices and considerations for markup constructs. For the DZJY project, we added a few more conventions, to accommodate our specific needs, but without changing any of the essential - features. Org mode uses what I called an - implicit markup, which is exactly the opposite of XML. Org mode’s - markup is as short as possible and in many cases derived from context. An asterisk - * followed by a space at the start of a line indicates a heading of level - one, instead of TEI’s div followed by a headFor a full description - of this format, see The Mandoku Text Format, accessed April 20, - 2020, Org mode uses what I called an implicit markup, + which is exactly the opposite of XML. Org mode’s markup is as short as possible and in + many cases derived from context. 
An asterisk * followed by a space at the + start of a line indicates a heading of level one, instead of TEI’s div followed + by a headFor a full description of this format, see The + Mandoku Text Format, accessed April 20, 2020, . (and the corresponding closing tags to convey this information).
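This implicit-to-explicit correspondence can be sketched in a few lines (a simplified illustration only, not the actual Mandoku converter; a real conversion must also track where each div closes):

```python
import re

def org_heading_to_tei(line: str) -> str:
    """Map an Org-style heading to the equivalent TEI opening markup.
    One asterisk plus a space = level-one heading, two = level two, etc.
    Non-heading lines pass through unchanged."""
    m = re.match(r"^(\*+) (.*)$", line)
    if not m:
        return line
    stars, title = m.groups()
    return f'<div n="{len(stars)}"><head>{title}</head>'

print(org_heading_to_tei("* 太上感應篇"))
# <div n="1"><head>太上感應篇</head>
```

The asymmetry is the point: the Org source carries no closing markers at all, so the converter has to infer them from the heading levels.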

From the beginning, the DZJY was in my view itself a pilot project for a much larger project, on which preparatory work started in earnest in 2012: the Kanseki Repository (GitHub username @kanripo).Accessed June 24, 2020, Kanseki Repository (GitHub username + @kanripo).Accessed June 24, 2020, and . Kanseki here is the Japanese term for premodern Chinese texts, and @@ -560,22 +562,25 @@ foundation for the creation of digital textual artifacts, based mostly on the German tradition of scholarly editing and its distinction between documentary edition and interpretative edition. These two types are - distinguished through naming conventions for the Git branches. Documentary editions + are also represented through digital facsimiles, which can be called up to be displayed + side by side with the transcribed text. Interpretative editions may normalize the + characters used to modern forms, add punctuation, and also make it possible to add + translations and semantic annotations.

+ distinguished through naming conventions for the Git branches. Documentary editions + are also represented through digital facsimiles, which can be called up to be displayed + side by side with the transcribed text. Interpretative editions may normalize the + characters used to modern forms, add punctuation, and also make it possible to add + translations and semantic annotations.

From earlier textual projects, such as ZenBase, CBETA, and DZJY, but also from other sources available on the Internet, we have compiled an initial catalog of about 10,000 titles to be included in a first phase of the project; this catalog is also being supplemented by users who deposit whatever texts they are interested in into the - repository. Since the initial publication on GitHub in September 2015, and the launch of a dedicated website in March 2016, - usage has been increasing slowly but steadily.

+ repository. Since the initial publication on GitHub in September 2015, and the + launch of a dedicated website in March 2016, usage has been increasing slowly but + steadily.

Kanripo Project Details -

All the texts are freely available on GitHub in their - source form. This repository of texts can be accessed through the All the texts are freely available on GitHub in their source form. + This repository of texts can be accessed through the kanripo.org website, but also through a module of the Emacs editor called Mandoku. This allows users to query, access, clone, edit, and push the texts directly from their own computer. Reading, commenting, and editing do not @@ -587,31 +592,34 @@ the context of their aims—and authoritative vetting and editorial quality assurance.

demonstrate the concept and functions of the Kanseki Repository. On the website, users can search for - texts or browse the catalog. Once a text is found, the webserver reads it from the GitHub repository and serves it to the user. For - most texts, there are different editions to choose from; usually both documentary and - interpretative versions exist. For many texts, there is also a digital facsimile, which - can be called up alongside the text; if there is more than one edition documented with a - digital facsimile, the others can also be directly inspected on the page for the text on - the Kanseki Repository website.

+ texts or browse the catalog. Once a text is found, the webserver reads it from the GitHub repository and serves it to the user. For most texts, there are different + editions to choose from; usually both documentary and interpretative versions exist. For + many texts, there is also a digital facsimile, which can be called up alongside the + text; if there is more than one edition documented with a digital facsimile, the others + can also be directly inspected on the page for the text on the Kanseki Repository + website.

A text in the Kanseki Repository

In the screenshot in , there is a link at the - top of the page labeled GitHub, from - which the source of the text can be directly accessed. A user who wishes to make changes - to the text, by correcting, annotating, or even translating it, can transfer a copy of - this text from the public @kanripo account, either by cloning it to their - own account on GitHub, or by downloading it - locally.

-

The user can also log in to the Kanripo website with their Github credentials. When this is done for the first time, the user - has to grant the Kanseki Repository access to their repositories. In addition, a new - repository KR-Workspace is created; some settings related to the - use of the Kanseki Repository are stored here. (Most websites store this kind of - information in their own database, with no direct access to it for the user. KR does it - in this way + top of the page labeled GitHub, from which the source of the text can + be directly accessed. A user who wishes to make changes to the text, by correcting, + annotating, or even translating it, can transfer a copy of this text from the public + @kanripo account, either by cloning it to their own account on GitHub, or by downloading it locally.

+

The user can also log in to the Kanripo website with their GitHub + credentials. When this is done for the first time, the user has to grant the Kanseki + Repository access to their repositories. In addition, a new repository + KR-Workspace is created; some settings related to the use of the + Kanseki Repository are stored here. (Most websites store this kind of information in + their own database, with no direct access to it for the user. KR does it in this way + to allow the user control over their data and so that the user’s preferences and settings can be applied to different applications with which the user might access the KR. @@ -627,33 +635,36 @@ distant reading, text analysis, and similar purposes, a separate account @kr-shadowAccessed June 24, 2020, has been - created on GitHub. You will find here the texts - of the master branch, which is usually the normalized and edited - version of the text in a form that makes it easy to download the whole archive at - once.

+ created on GitHub. Here you will find the texts of the + master branch, which is usually the normalized and edited version + of the text in a form that makes it easy to download the whole archive at once.

- Mandoku -

As mentioned, the texts can also be accessed from the text editor Emacs, which is available on all major platforms. This is intended - for people who work intensely with a text, for example as the topic for a PhD thesis. - The Emacs module MandokuAccessed May 18, 2020, Mandoku +

As mentioned, the texts can also be accessed from the text editor Emacs, which is + available on all major platforms. This is intended for people who work intensely with a + text, for example as the topic for a PhD thesis. The Emacs module MandokuAccessed May 18, 2020, . provides ways to search the - KR, clone texts, create new branches, and many other functions. All other Emacs extensions and modules can also be used. shows an example of a text with its digital - facsimile, and shows the same poems, - rearranged by line, with a translation added. In the middle there is an example of an - inline note. And finally, shows the same text, - pushed to the user’s account and displayed from there on the Kanripo website.

+ KR, clone texts, create new branches, and many other functions. All other Emacs extensions and modules can also be used. shows an example of a text with its digital facsimile, and shows the same poems, rearranged by line, with a + translation added. In the middle there is an example of an inline note. And finally, + shows the same text, pushed to the user’s + account and displayed from there on the Kanripo website.

A text from the Kanseki Repository, side by side with a facsimile, - displayed using the Emacs module Mandoku + displayed using the Emacs module Mandoku
@@ -662,8 +673,9 @@
- The text with translation, now pulled from the user’s GitHub account + The text with translation, now pulled from the user’s GitHub account
diff --git a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml index 11f93144..d23b98a5 100644 --- a/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml +++ b/data/JTEI/14_2021-23/jtei-barabuccietal-196-source.xml @@ -113,11 +113,12 @@

This paper describes how we dealt with the encoding and transformation of the punctuation in the Early New High German edition of Marco Polo’s travel account. Technically, we implemented a set of general rules (as XSLT - templates) plus various exceptions (as descriptive instructions in XML attributes), - and applied them in an automated fashion (using XProc pipelines). In addition to - this, we discuss the philological foundation of this method and, contextually, we - address the topic of the transformation of a single original source into different + xml:id="R1" target="#xslt"/>XSLT templates) + plus various exceptions (as descriptive instructions in XML attributes), and applied + them in an automated fashion (using XProc pipelines). In addition to this, we + discuss the philological foundation of this method and, contextually, we address the + topic of the transformation of a single original source into different transcriptions: from a highly diplomatic edition to an interpretative one, going through a spectrum of intermediate levels of normalization. We also reflect on the separation between transcription and analysis, as well as on the role of the editor @@ -146,8 +147,9 @@

These issues quickly arose while editing:

the master TEI file became too big and its structure too complex, thus too hard - to navigate and maintain, even when using advanced XML editors such as Oxygen - XML; + to navigate and maintain, even when using advanced XML editors such as Oxygen XML; normalizing punctuation revealed itself as a complex task that required profound changes to the structure of the edited text. @@ -187,9 +189,10 @@

This paper provides an overview of our approach (section 3) and shows how our approach addresses both issues. In particular we show how normalizing punctuation represents a dramatic step beyond the more classical normalization of words. The - current implementation of our approach, based on XProc and XSLT, is also - presented in section 3.

+ current implementation of our approach, based on XProc and XSLT, is also presented in section 3.

Moving to an editorial workflow with such a level of automation requires a reevaluation of the role of the editor, from wordsmith to formalizer of rules (and exceptions). In section 4 we discuss how our approach fits the recent @@ -257,15 +260,17 @@ (revealing, opening up to the text) in its nature, as some specific aspects of the text are presented in such a way that the reader is granted more informed access to them.

-

The edition will be published online using a specifically tailored version of EVT - (Edition Visualization TechnologyA light-weight, - open source tool specifically designed to create digital editions from - XML-encoded texts (Rosselli Del Turco et al. 2013).) and will - present, on the one hand, each witness in its continuum from facsimile to multiple - levels of normalization and, on the other hand, the three main witnesses in - synopsis. From each module of the edition and from each of the texts composing the - editorial project it will be possible to access a twofold commentary: specific +

The edition will be published online using a specifically tailored version of EVT (Edition Visualization TechnologyA + light-weight, open source tool specifically designed to create digital + editions from XML-encoded texts + (Rosselli Del Turco et al. 2013).) and + will present, on the one hand, each witness in its continuum from facsimile to + multiple levels of normalization and, on the other hand, the three main witnesses + in synopsis. From each module of the edition and from each of the texts composing + the editorial project it will be possible to access a twofold commentary: specific notes referring to the named entities and the realia appearing in the text, and philological notes referring either to all of the three witnesses or to one witness in particular.

@@ -784,7 +789,8 @@ Multiple editions will be generated automatically from the master TEI file, with no manual intervention on the resulting files. The generated edition files will conform to the TEI subset understood by - EVT. + EVT.

Some of these desiderata clash with each other. For instance, the desire to directly edit the XML file makes it hard and error-prone to keep in a single @@ -848,8 +854,9 @@ level="m">Digital Vercelli Book, by Roberto Rosselli Del Turco (n.d.): here two levels of edition are offered, a diplomatic and a more interpretative one. The - user can compare the two editions visualizing them synoptically in the EVT - software used for the edition.

+ user can compare the two editions visualizing them synoptically in the EVT software used for the edition.

@@ -878,11 +885,12 @@ the edition files, despite being the main concrete output of the editorial project, are ephemeral and never modified directly. -

The implementation consists of a series of XSLT transformations, each - representing and implementing a single rule, coordinated by three different XProc - pipelines, one for each level of edition. The source code is available at .

+

The implementation consists of a series of XSLT transformations, each + representing and implementing a single rule, coordinated by three different XProc pipelines, one for each level of edition. The source code is available + at .

This methodology contrasts with the established editorial practice of mingling transcription, normalization, and critical amendments. Instead of just performing the desired normalization steps while transcribing and keeping track of them in an @@ -934,14 +942,14 @@ system that does not allow for this interaction to happen is not able to deal properly with normalization in general and punctuation in particular.

Each rule is implemented as a small and self-contained XSLT + xml:id="R8" target="#xslt"/>XSLT transformation. At the time of writing, the ENHG Marco Polo project comprises about a hundred rules, grouped in twenty macro categories. On average, the core of each rule is implemented in less than three lines of XSLT.

+ xml:id="R9" target="#xslt"/>XSLT.

To give the readers an impression of the simplicity of the rule implementation, we - show here the main parts of the XSLT that implement one of the example + show here the main parts of the XSLT that implement one of the example rules described above.

Example: Rule to Join Words Split at the End of a Line @@ -949,8 +957,8 @@ used to mark that a word has been split at the end of a line. In the diplomatic rendition we want to preserve this word division and the forced line break, while in other renditions we want to reconstruct the complete word.

-

The XSLT excerpt in The XSLT excerpt in Example 3 shows how split words are joined when a middle double oblique hyphen is found. The joining is performed in a lossless way: all information present in the original witness is preserved. This is possible @@ -993,8 +1001,8 @@ - XSLT implementation of the rule Join + XSLT implementation of the rule Join words split with a double oblique hyphen.
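The effect of the rule can also be sketched outside XSLT. The following Python fragment is an illustrative stand-in only: it assumes the double oblique hyphen is U+2E17 and, unlike the real lossless rule, ignores the surrounding TEI markup:

```python
DOUBLE_OBLIQUE_HYPHEN = "\u2e17"  # ⸗ -- assumed codepoint for the sign

def join_split_words(lines):
    """Rejoin words split across line breaks. The diplomatic rendition
    keeps the split; this sketch produces the normalized reading by
    dropping the hyphen and pulling up the continuation line."""
    out = []
    for line in lines:
        if out and out[-1].endswith(DOUBLE_OBLIQUE_HYPHEN):
            out[-1] = out[-1][:-1] + line.lstrip()
        else:
            out.append(line)
    return out

print(join_split_words(["vnge\u2e17", "horsam bleiben"]))
# ['vngehorsam bleiben']
```

The XSLT version is more involved precisely because it must keep all witness information (the lb and the original division) recoverable rather than discarding it as this sketch does.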

The rule in Example 3 is @@ -1085,13 +1093,16 @@ gray cogs indicate steps shared by all pipelines. The cogs with patterns identify level-specific steps. -

Each pipeline is implemented as an XProc pipeline. All the pipelines are simple - linear flows (i.e., the output of a rule is the input for the next rule). From a - methodological point of view, the XProc pipeline is a record of all the operations - that the scholar performs on the transcription. The creation of an edition level - is equivalent to replaying this record. Example 6 shows an excerpt of the XProc pipeline used to generate the - semidiplomatic edition.

+

Each pipeline is implemented as an XProc pipeline. All the + pipelines are simple linear flows (i.e., the output of a rule is the input for the + next rule). From a methodological point of view, the XProc + pipeline is a record of all the operations that the scholar performs on the + transcription. The creation of an edition level is equivalent to replaying this + record. Example 6 shows an excerpt + of the XProc pipeline used to generate the semidiplomatic edition.
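The record-and-replay idea is independent of XProc itself: an edition level is just a fold of the source transcription through an ordered list of rules. A minimal sketch (the rule names and normalizations below are invented examples, not the project's actual steps):

```python
def normalize_long_s(text):
    # Rule A: long s (ſ, U+017F) -> round s
    return text.replace("\u017f", "s")

def expand_abbrev(text):
    # Rule B: expand "vn̄" (n + combining macron U+0304) to "vnd"
    return text.replace("vn\u0304", "vnd")

# Each edition level is defined by the ordered rules applied to the
# same source transcription; replaying the pipeline re-creates it.
SEMIDIPLOMATIC = [normalize_long_s]
INTERPRETATIVE = [normalize_long_s, expand_abbrev]

def run_pipeline(text, rules):
    for rule in rules:
        text = rule(text)
    return text

src = "vn\u0304 \u017fo weiter"  # "vn̄ ſo weiter"
print(run_pipeline(src, SEMIDIPLOMATIC))
print(run_pipeline(src, INTERPRETATIVE))
```

Because the levels differ only in which rules run, comparing two levels reduces to comparing two lists of steps, which is the methodological payoff the paper describes.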

It is important to note that pipelines comprise three kinds of steps:

infrastructural steps: for example, the tokenize step that @@ -1138,17 +1149,19 @@ - Excerpt of the XProc pipeline used to generate the - semi-diplomatic edition. Steps marked A are steps that implement rules; the - step marked B takes care of exceptions. + Excerpt of the XProc pipeline used to + generate the semi-diplomatic edition. Steps marked A are steps that implement + rules; the step marked B takes care of exceptions. -

The fact that the editorial workflows for all the editions are formalized in XProc - pipelines makes it possible, for instance, to compare these pipelines and see in - detail (and with utmost precision) how they differ and what is, in this project, - the difference between the processes needed to establish a diplomatic, a - semi-diplomatic or an interpretative edition. Breaking down the traditional - analogue processes into unambiguous discrete steps can contribute to the scholarly - debate on edition typology.

+

The fact that the editorial workflows for all the editions are formalized in XProc pipelines makes it possible, for instance, to compare these + pipelines and see in detail (and with utmost precision) how they differ and what + is, in this project, the difference between the processes needed to establish a + diplomatic, a semi-diplomatic or an interpretative edition. Breaking down the + traditional analogue processes into unambiguous discrete steps can contribute to + the scholarly debate on edition typology.

@@ -1191,19 +1204,21 @@ like to experiment with creating declarative rule generators. Many rules are repetitive in their nature (for example, the normalization of single characters) and it should be possible to express them in a declarative fashion. These abstract - rules would then be translated into XSLT transformations. Another aspect we + rules would then be translated into XSLT transformations. Another aspect we would like to reflect on is how the transformation process directed by the pipelines influences the various levels of abstraction of the document being transformed, drawing parallels with stratified document models such as CMV+P (Barabucci 2019). A final thing we would like to test - is the replacement of the XProc pipelines with pure XSLT pipelines - (Birnbaum 2017). Replacing XProc - with XSLT pipelines would reduce the number of technologies that other - scholars have to be familiar with in order to understand the editorial process in its - entirety.

+ is the replacement of the XProc pipelines with pure XSLT pipelines + (Birnbaum 2017). Replacing XProc with XSLT pipelines would reduce the number of + technologies that other scholars have to be familiar with in order to understand the + editorial process in its entirety.

Another future development that we envision is the deconstruction of the visualization of the edition into a series of small, explicit steps, taking place one after the other, just like their counterparts in the pipelines: one click would show @@ -1288,8 +1303,8 @@ Polo. Prima edizione integrale. Firenze: Leo S. Olschki. Birnbaum, David J. 2017. - Patterns and Antipatterns in <ptr type="software" - xml:id="XSLT" target="#XSLT"/><rs type="soft.name" ref="#XSLT">XSLT</rs> + <title level="a">Patterns and Antipatterns in <ptr type="software" xml:id="R23" + target="#xslt"/><rs type="soft.name" ref="#R23">XSLT</rs> Micropipelining. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies @@ -1458,10 +1473,12 @@ The Digital Vercelli Book. Beta version. Accessed October 22, 2021. . - Rosselli Del Turco, Roberto, et al. + Rosselli Del Turco, + Roberto, et al. 2013. Edition Visualization Technology. - Accessed April 19, 2021.. + Accessed April 19, 2021.. Stella, Francesco, ed. 2020. Corpus Rhythmorum Musicum. Last modified July 28, 2020. Huitfeldt and Sperberg-McQueen 2003). In GODDAG, all children of the markup nodes are typically ordered, but TexMECS provides a - notation to mark certain markup nodes as unordered. The GODDAG processor - ignores the default linear order of these elements’ children, and therefore - TexMECS supports the representation of nonlinear structures. No known working - implementation of TexMECS, however, is currently available. At first glance, - EARMARK (Extremely Annotated RDF Markup) also seems to support the option to - represent nonlinearity: with EARMARK, users can express different linear - structures using RDF statements about text fragments, and in this way it is - possible to describe multiple text orders (Peroni and Vitali 2009, 4.1; Di - Iorio 2009). However, multi-orderedness is not the same as partial - orderedness: if a text is partially ordered, it means that (part of the) text - has no order. 
Multi-orderedness always implies a certain order. The EARMARK - specification as described in Peroni and - Vitali 2009 does not natively support partially ordered text, in the - sense that EARMARK users cannot mark the branching of the text stream. It is - also important to note that EARMARK is a metamarkup language, which means that - users encode their texts not in EARMARK but in an RDF - serialization.Recognizing the challenge of expressing literary texts - as RDF statements, Barabucci et al. developed the FRETTA approach, which is designed - to express EARMARK annotations in an embedded - syntax such as XML. It is unclear, however, whether this approach - has been further developed or implemented.

+ notation to mark certain markup nodes as unordered. The GODDAG + processor ignores the default linear order of these elements’ children, + and therefore TexMECS supports the representation of nonlinear structures. No + known working implementation of TexMECS, however, is currently available. At + first glance, EARMARK (Extremely Annotated RDF Markup) also seems to support + the option to represent nonlinearity: with EARMARK, users can express different + linear structures using RDF statements about text fragments, and in this way it + is possible to describe multiple text orders (Peroni and Vitali 2009, 4.1; Di Iorio 2009). However, multi-orderedness is not + the same as partial orderedness: if a text is partially ordered, it means that + (part of the) text has no order. Multi-orderedness always implies a certain + order. The EARMARK specification as described in Peroni and Vitali 2009 does not natively support + partially ordered text, in the sense that EARMARK users cannot mark the + branching of the text stream. It is also important to note that EARMARK is a + metamarkup language, which means that users encode their texts not in EARMARK + but in an RDF serialization.Recognizing the challenge of expressing + literary texts as RDF statements, Barabucci et al. developed the FRETTA approach, + which is designed to express EARMARK annotations + in an embedded syntax such as XML. It is unclear, however, + whether this approach has been further developed or implemented.

Discontinuity @@ -543,9 +544,11 @@

TAGML may resemble existing markup languages like XML, TexMECS, or LMNL, but TAGML is more expressive. For instance, in XML all annotation values are of type string, but TAGML offers data-typing of annotations. These data types are - expressed in UTF-8 and interpreted by the TAGML parser as different data types. - Encoders can distinguish between integer, string, or Boolean values ().

+ expressed in UTF-8 and interpreted by the TAGML parser as + different data types. Encoders can distinguish between integer, string, or + Boolean values ().
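How a parser might assign such types from the raw text of an annotation value can be sketched as follows (an illustration of the idea only, not the actual TAGML/Alexandria grammar):

```python
def type_annotation_value(raw: str):
    """Infer a typed value from an annotation's textual form, in the
    spirit of TAGML's data-typed annotations (illustrative only)."""
    s = raw.strip()
    if s in ("true", "false"):
        return s == "true"                    # Boolean
    if s.lstrip("-").isdigit():
        return int(s)                         # integer
    if s.startswith('"') and s.endswith('"') and len(s) >= 2:
        return s[1:-1]                        # string
    raise ValueError(f"unrecognized annotation value: {raw!r}")

print(type_annotation_value("1917"))      # 1917  (int)
print(type_annotation_value("true"))      # True  (bool)
print(type_annotation_value('"draft"'))   # draft (str)
```

In XML, by contrast, all three values would reach the application as the string type, and any further typing would be the schema's or the application's responsibility.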
Example of TAGML, featuring different types of annotation value. @@ -572,11 +575,17 @@ encoding complex textual features, TAGML is designed to make that modeling process as natural as possible. The markup language has the same compactness as XML and is independent of the user environment.TAGML can be edited in any - editor, but the open source text editor Sublime has a TAGML - syntax highlighting package, and the reference - implementation Alexandria can be used to parse and validate TAGML + editor, but the open source text editor Sublime has a TAGML syntax highlighting + package, and the reference + implementation Alexandria can be used to parse and validate TAGML documents and store them as a TAG hypergraph. Following the argument of Sperberg-McQueen and Huitfeldt and Peroni and Vitali, we did not @@ -1048,8 +1057,8 @@ retrieve all quotes together. The first would not pose a problem for TEI XML, but retrieving the disjointed quotations as one (merged) utterance would only be possible with additional, vocabulary-specific coding. Processing the two q elements - as a single q requires a set of XSLT instructions that check + as a single q requires a set of XSLT instructions that check the values of the xml:id and the next and prev attributes in order to know which q elements should be stitched together. In TAGML, both scenarios would be equally straightforward. The hypergraph can be queried @@ -1091,22 +1100,22 @@ TEI transcription of
To process the text of this fragment correctly, one needs to write a rather - complicated set of XSLT instructions. At the very least, these + complicated set of XSLT instructions. At the very least, these instructions need to match the values of the xml:id and prev in order to process the first part of the deletion, look for the second part of the deletion, and then concatenate their textual content. At the same time, one has to prevent the second part from being processed twice (first as the second part of the deletion, and the second time together with the regular del elements). After - some experimenting and consulting several XSLT specialists, we have - come to no less than three different sets of instructions.The authors are - grateful to Peter Boot, Vincent Neyt, and Frederike Neuber for sharing their - expertise and invaluable insights. And considering the ingenuity and - technical expertise of the TEI community, we are quite certain there are even more - ways. In short, it can be a challenging and time-consuming process to write and tweak - vocabulary-specific and schema-aware tools—a daunting task for any TEI XML user who - lacks a certain level of technical expertise.

+ some experimenting and consulting several XSLT specialists, we have come + to no less than three different sets of instructions.The authors are grateful + to Peter Boot, Vincent Neyt, and Frederike Neuber for sharing their expertise and + invaluable insights. And considering the ingenuity and technical expertise + of the TEI community, we are quite certain there are even more ways. In short, it can + be a challenging and time-consuming process to write and tweak vocabulary-specific + and schema-aware tools—a daunting task for any TEI XML user who lacks a certain level + of technical expertise.

Conclusion diff --git a/data/JTEI/16_2023_spa/jtei-rioriande-torresallen-250-source.xml b/data/JTEI/16_2023_spa/jtei-rioriande-torresallen-250-source.xml index a4206a01..47ffa640 100644 --- a/data/JTEI/16_2023_spa/jtei-rioriande-torresallen-250-source.xml +++ b/data/JTEI/16_2023_spa/jtei-rioriande-torresallen-250-source.xml @@ -167,10 +167,13 @@ de la TEI, que es parte de las iniciativas del Consorcio:Más información sobre este grupo de trabajo en: fecha de consulta 16 de julio de 2023, . en primer lugar, el desarrollo de una nueva infraestructura, - TranslateTEI,TranslateTEI ha sido desarrollada por Hugh Cayless (Duke - University) y está disponible en: fecha de consulta 16 de julio de 2023, + />. en primer lugar, el desarrollo de una nueva infraestructura, TranslateTEI,TranslateTEI ha sido + desarrollada por Hugh Cayless (Duke + University) y está disponible en: fecha de consulta 16 de julio de 2023, para mejorar la experiencia de usuario al colaborar con traducciones multilingües de las especificaciones de la TEI;Las especificaciones de la TEI son las descripciones de los elementos de los diferentes módulos. en @@ -235,9 +238,11 @@ para la comunicación científica sino también por aquellos insertos en proyectos de investigación que trabajan con la codificación o edición de textos en español con TEI desde cualquier país o institución.

-

El software utilizado para crear y distribuir la encuesta fue - Qualtrics,Qualtrics, fecha de consulta 16 de julio de 2023, y la licencia de uso fue +

El software utilizado para crear y distribuir la encuesta fue Qualtrics,Qualtrics, fecha de + consulta 16 de julio de 2023, y la licencia de uso fue proporcionada por la Universidad de Miami. 135 participantes iniciaron la encuesta, aunque sólo 107 respondieron a todas las secciones y preguntas. 77 de estas 107 respuestas se escribieron en español y 30 en inglés (MySQL, - PostgreSQL, etc.), 5. Transformaciones XSLT, 6. Transformaciones - XQuery, 7. Vocabularios controlados o tesauros, 8. Tecnologías sobre Sistemas de - Información Geográfica (SIG), 9. Tecnologías de la información y la comunicación, - 9. Tecnologías sobre Procesamiento del Lenguaje Natural (PLN), 10. Tecnologías - sobre web semántica, 11. Visualización de datos, 12. Otros. El objetivo detrás de - algunas de las opciones que dimos era confirmar si la comunidad de uso de la TEI - en español era de hecho consciente de las mejores prácticas ya en uso a nivel - internacional.

+ xml:id="R3" target="#mysql"/>MySQL, PostgreSQL, etc.), 5. Transformaciones XSLT, 6. + Transformaciones XQuery, 7. Vocabularios controlados o tesauros, + 8. Tecnologías sobre Sistemas de Información Geográfica (SIG), 9. Tecnologías de + la información y la comunicación, 9. Tecnologías sobre Procesamiento del Lenguaje + Natural (PLN), 10. Tecnologías sobre web semántica, 11. Visualización de datos, + 12. Otros. El objetivo detrás de algunas de las opciones que dimos era confirmar + si la comunidad de uso de la TEI en español era de hecho consciente de las mejores + prácticas ya en uso a nivel internacional.

Las respuestas revelaron que la mayoría de los proyectos recurren a personalizaciones del esquema TEI (40) y utilizan esquemas y ODD (39), que son algunos de los requisitos para un flujo de trabajo de marcado documentado, @@ -475,21 +482,26 @@ parece haber un mayor uso de bases de datos XML (23) por delante de las bases de datos relacionales más tradicionales (18). Esto está en consonancia con la evolución reciente y la llegada de productos de bases de datos XML de código - abierto como eXist.Exist Database, fecha de consulta 16 de julio de 2023, - + abierto como eXist.Exist Database, fecha de consulta 16 de julio de 2023,

En lo que respecta a la transformación y renderizado de XML, el lenguaje más - utilizado parece seguir siendo el ya veterano XSLT (29),XSLT, fecha de consulta 16 de julio de 2023, . mientras que las - transformaciones XQuery (11)XQuery, fecha de consulta 16 de julio de 2023, - . se utilizan menos. + utilizado parece seguir siendo el ya veterano XSLT (29),XSLT, fecha de consulta 16 de julio de 2023, + . mientras que las transformaciones XQuery + (11)XQuery, fecha de consulta 16 + de julio de 2023, . se utilizan menos. Esto no parece coincidir del todo con la pregunta anterior sobre el uso de bases - de datos XML, ya que la mayoría de las bases de datos XML nativas utilizan XQuery + de datos XML, ya que la mayoría de las bases de datos XML nativas utilizan XQuery como principal herramienta de recuperación de datos, en lugar de XSLT. Entre los participantes existe una curiosa mezcla entre las formas antiguas y las nuevas.

Otras prácticas que los participantes adoptan cuando trabajan con TEI son la @@ -498,7 +510,7 @@ (11), y el Procesamiento del Lenguaje Natural (12%). En Otros, algunos participantes explicaron que utilizaban scripts de interoperabilidad entre distintos esquemas (DCAT, DDI-CDI), anotación lingüística de corpus, XSLT - LaTex - PDF y Cocoon. Lamentablemente, el 9% eligió otros sin especificar más.

diff --git a/data/JTEI/8_2014-15/jtei-8-boschetti-source.xml b/data/JTEI/8_2014-15/jtei-8-boschetti-source.xml index 9b323184..2bd389ae 100644 --- a/data/JTEI/8_2014-15/jtei-8-boschetti-source.xml +++ b/data/JTEI/8_2014-15/jtei-8-boschetti-source.xml @@ -89,18 +89,18 @@ processing of marked-up documents. We describe the method used to design and implement the TeiCoPhiLib, outlining the design patterns as well as discussing general benefits of the overall architecture. Finally, we present case studies in which some components of our - library currently implemented in Java have been used.

+ library currently implemented in Java have been used.

Introduction

The TeiCoPhiLib library is a collection of components currently implemented in Java - (JSR 270), which parses documents encoded according to a basic subset of TEI tags defined + type="software" xml:id="R2" target="#Java"/>Java + (JSR 270), which parses documents encoded according to a basic subset of TEI tags defined in an ODD fileThe ODD files currently available can be downloaded from GitHub: . The TEI schema we intend eventually to adopt conforms to the EpiDoc vocabulary, following the policy of the Perseus Catalog (Crane et al. 2014). and @@ -112,8 +112,8 @@ visualization through a web browser by instantiating a collection of widgets rendered on the client through standard web technologies. Specifically, the server-side environment jointly processes data and visualization templates,Facelets XML templates are used - under the Java Server Faces 2.0 specification. and generates HTML pages + under the Java Server Faces 2.0 specification. and generates HTML pages rendered on the client. Special components are devoted to monitoring the behavior and interactions among the objects generated from the input TEI documents.

In distributed and collaborative environments, the maintenance of links and relations @@ -161,18 +161,25 @@ achieving great results and benefits for both scholars and developers. Among others, the open-source general-purpose framework Cocoon. and the native - XML database eXist-db. deserve to be mentioned. Specifically for - TEI-annotated documents, TUSTEP,. - TEIBoilerplate,. - TXM, - . and TAPAS. are prominent projects.

+ XML database eXist-db + . deserve to be mentioned. Specifically for + TEI-annotated documents, + TUSTEP,. + + TEIBoilerplate + ,. + + TXM, + . + and TAPAS + . are prominent projects.

For all of these initiatives, the transformation from an XML document structure to another format by XSLT can be considered the focal point.

@@ -216,8 +223,13 @@ results are validated by domain expert collaborations and test-driven development (both unit tests and acceptance tests) . The continuous integration and release are supported by open source Integrated - Development Environments (IDEs) like Eclipse or NetBeans and by a software configuration - management tool such as SVN or Git for versioning and revision control.

+ Development Environments (IDEs) like + Eclipse or + + NetBeans and by a software configuration + management tool such as + SVN or + Git for versioning and revision control.

The aforementioned paradigm is applied in the TeiCoPhiLib library by the implementation of a flexible importing and normalization module in the @@ -242,16 +254,16 @@ . It is important to point out that the new data structure is the result of transformations (by XSLT DOM transformations or SAX event-driven transformations) managed during the parsing process. Thus, the current implementation of the TeiCoPhiLib - exposes methods that parse the XML file and create Java objects. The resources are + exposes methods that parse the XML file and create Java objects. The resources are stored and maintained in a native XML database management system (i.e., eXist-db). The APIs and services provided by Lucene, a software + type="software" xml:id="R15" target="#existdb"/>eXist-db). The APIs and services provided by Lucene, a software library developed and hosted by the Apache Foundation, have been used for indexing the textual data.

For instance, the information conveyed by the following TEI snippet is distributed - among the appropriate Java objects that handle the four levels described + among the appropriate Java objects that handle the four levels described above:

[...]

Io nacqui veneziano ai 18 ottobre del 1775, giorno Style. The style is managed by separated renderers, which point to textual positions affected by stylistic features. For instance, the information extracted from the style attribute is used to instantiate the Java objects devoted to managing the rendering information. Behavior. Behaviors are handled by objects that process textual resources according to the current state of the data structure and the rules to manage @@ -309,7 +321,7 @@ of the most suitable algorithm for the current task. The general idea of object-oriented patterns is to encapsulate functionality and data inside an efficient and flexible collection of classes. The current implementation of the prototype exploits the Java programming language technologies.

The Model-View-Controller (MVC) pattern (Java snippet illustrates + interface. The following simplified Java snippet illustrates this concept programmatically. Observer annotation = new Annotation(); Observer comment = new Comment(); Subject teiDocument = new Document(); teiDocument.subscribe(ObserverType.ANNOTATION, annotation); @@ -424,8 +436,8 @@ uses TeiCoPhiLib APIs invokes the building method of the abstract Builder class. Moreover, the resulting document object is a concretization of an abstract class representing the current structure of the TEI-encoded resource, as illustrated in the - following Java statement: Document teiDocument = + following Java statement: Document teiDocument = AbstractBuilderFactory.buildDocument(new File("features.properties"),new File("teiDocument.xml"));

@@ -462,9 +474,11 @@

The case studies illustrated below have been implemented with the components already developed for our library.

- Euporia: Visualization, Editing, and Annotation of Parallel Texts for Didactic + + Euporia: Visualization, Editing, and Annotation of Parallel Texts for Didactic Purposes -

Euporia is a project aimed at visualizing, editing, and annotating bilingual texts +

+ Euporia is a project aimed at visualizing, editing, and annotating bilingual texts displayed in parallel. The original digital resources are stored and maintained in authoritative digital libraries available online, such as the Biblioteca Italiana and the Perseus Digital Library, or they are downloaded from social proofreading websites, @@ -564,15 +578,18 @@ -

Parallel texts are visualized and managed through EuporiaWebApp (), which is a server-side Java web application compliant - with the JSR 314 specification intended for educational purposes. Students, the end - users of Euporia, are allowed to query texts, both jointly and independently, through +

Parallel texts are visualized and managed through + EuporiaWebApp (), which is a server-side Java web application compliant + with the JSR 314 specification intended for educational purposes. Students, the end + users of + Euporia, are allowed to query texts, both jointly and independently, through multilingual or monolingual keywords.

- Euporia + + Euporia

Annotations can also be associated with linked chunks of text (such as a sentence and its translation: see ) or with single, @@ -582,13 +599,17 @@ under evaluation.

- Annotation in Euporia + Annotation in + Euporia
- Aporia: Adapting the Parallel Text Framework to Specific Scientific + + Aporia: Adapting the Parallel Text Framework to Specific Scientific Requirements -

Aporia is the enhanced version of Euporia, intended for research purposes. Accordingly, +

+ is the enhanced version of + Euporia, intended for research purposes. Accordingly, the Parallel Text framework has been adapted and extended to meet specific scientific requirements (Bozzi 2013). An experimental case study has been performed on Theodor Mommsen’s edition of the Res Gestae @@ -609,19 +630,24 @@ application.</p> <figure xml:id="figure6"> <graphic url="images/jtei-8-boschetti_06.png" height="414px" width="1390px"/> - <head type="legend">Aporia</head> + <head type="legend"><ptr type="software" xml:id="R30" target="#aporia"/> + <rs type="soft.name" ref="#R30">Aporia</rs></head> </figure> </div> <div xml:id="saussure"> - <head>Saussure Project: Supporting Genetic Criticism</head> - <p>The Saussure Project exploits the flexibility of Aporia in order to adapt the system to + <head><ptr type="software" xml:id="R32" target="#saussure"/> + <rs type="soft.name" ref="#R32">Saussure Project</rs>: Supporting Genetic Criticism</head> + <p>The <ptr type="software" xml:id="R33" target="#saussure"/> + <rs type="soft.name" ref="#R33">Saussure Project</rs> exploits the flexibility of <ptr type="software" xml:id="R31" target="#aporia"/> + <rs type="soft.name" ref="#R31">Aporia</rs> in order to adapt the system to the study of Saussurean autographs, making author’s variants searchable and creating multilingual indexes of ancient terms studied by the linguist.</p> <p>Instead of showing linked texts in parallel, the system shows the image of the manuscript and the related transcription (<ptr type="crossref" target="#figure7"/>).</p> <figure xml:id="figure7"> <graphic url="images/jtei-8-boschetti_07.png" height="743px" width="1430px"/> - <head type="legend">Saussure Project</head> + <head type="legend"><ptr type="software" xml:id="R34" target="#saussure"/> + <rs type="soft.name" ref="#R34">Saussure Project</rs></head> </figure> </div> </div> @@ -639,18 +665,18 @@ <p>Reusable software components promote the management of stand-off annotation at any level (such as editing, 
searching, or visualizing), improving the experience of the annotation and use of TEI documents.</p> - <p>The document parsing in the current <ptr type="software" xml:id="Java" target="#Java" - /><rs type="soft.name" ref="#Java">Java</rs> implementation takes place on the server - side, where the <ptr type="software" xml:id="Java" target="#Java"/><rs type="soft.name" - ref="#Java">Java</rs> virtual machine runs within the web application environment.</p> + <p>The document parsing in the current <ptr type="software" xml:id="R35" target="#Java" + /><rs type="soft.name" ref="#R35">Java</rs> implementation takes place on the server + side, where the <ptr type="software" xml:id="R36" target="#Java"/><rs type="soft.name" + ref="#R36">Java</rs> virtual machine runs within the web application environment.</p> <p> The marshalling and unmarshalling process handles the serialization of the object representation of the TEI document, in order to store and retrieve data on the filesystem - or in native XML databases, such as <ptr type="software" xml:id="eXist-db" - target="#existdb"/><rs type="soft.name" ref="#eXist-db">eXist-db</rs>.</p> + or in native XML databases, such as <ptr type="software" xml:id="R37" + target="#existdb"/><rs type="soft.name" ref="#R37">eXist-db</rs>.</p> <p>Performance measurement tools such as JMeter will help to optimize the performance of the library components.</p> <p> Software currently under development will be available on <ptr type="software" - xml:id="GitHub" target="#GitHub"/><rs type="soft.name" ref="#GitHub">GitHub</rs> at <ptr + xml:id="R38" target="#GitHub"/><rs type="soft.name" ref="#R38">GitHub</rs> at <ptr target="https://github.com/CoPhi/cophilib"/>.</p> </div> </body> @@ -665,11 +691,13 @@ Environment: Metadata, Vocabularies and Techniques in the Digital Humanities, article no. 11. New York: ACM. doi:10.1145/2517978.2517990. - Bozzi, Andrea. 2013. 
+ <rs type="soft.ref.bib" ref="#R39"> + <bibl xml:id="bozzi13"><author><rs type="soft.agent" ref="#R39">Bozzi, Andrea</rs></author>. <date>2013</date>. <title level="a" >G2A: A Web Application to Study, Annotate and Scholarly Edit Ancient Texts and Their Aligned Translations. Studia graeco-arabica 3:159–71. . + target="http://www.greekintoarabic.eu/uploads/media/BOZZI_SGA_3-2013.pdf"/>. Burbeck, Steve. 1992. Applications Programming in Smalltalk-80TM: How to Use Model-View-Controller (MVC). Last modified March 4, 1997. - TEI stylesheets (or the - - TEI Boilerplate software) to - very complex frameworks based on CMS and SQL search engines. Researchers of the Digital - Vercelli Book project started looking into a simple, user-friendly solution and eventually - decided to build their own: EVT (Edition Visualization Technology) has been under - development since about 2010 and has turned into a flexible tool that can be used to - create a web-based digital edition starting from transcription files encoded in TEI XML. - This paper describes why this tool was created, how it works, and developments planned for - the future.

+ simple HTML pages produced using the + TEI stylesheets (or the + TEI Boilerplate software) to very complex frameworks + based on CMS and SQL search engines. Researchers of the Digital Vercelli Book project + started looking into a simple, user-friendly solution and eventually decided to build + their own: EVT (Edition Visualization Technology) has been under development since about + 2010 and has turned into a flexible tool that can be used to create a web-based digital + edition starting from transcription files encoded in TEI XML. This paper describes why + this tool was created, how it works, and developments planned for the future.

For the purposes of the Italian academy, R. Rosselli Del Turco is responsible for @@ -162,35 +161,35 @@ />.

in favor of a web-based publication. While this decision was critical in that it allowed us to select the most supported and widely-used medium, we soon discovered that it did not make choices any simpler. On the one hand, the XSLT - stylesheets provided by TEI are great for HTML rendering, but do not include support for - image-related features (such as the text-image linking available thanks to the P5 - version of the TEI schema) and tools (including zoom in/out, magnifying lens, and hot - spots) that represent a significant part of a digital facsimile and/or diplomatic - edition; other features, such as an XML search engine, would have to be integrated - separately, in any case. On the other hand, there are powerful frameworks based on - CMS

The Omeka framework () supports publishing TEI documents; see also - Drupal () - and TEICHI ( + type="software" xml:id="R1" target="#teistylesheets"/>XSLT stylesheets provided by TEI are great for HTML rendering, but do not + include support for image-related features (such as the text-image linking available + thanks to the P5 version of the TEI schema) and tools (including zoom in/out, magnifying + lens, and hot spots) that represent a significant part of a digital facsimile and/or + diplomatic edition; other features, such as an XML search engine, would have to be + integrated separately, in any case. On the other hand, there are powerful frameworks + based on CMS

The Omeka framework () supports publishing TEI documents; see + also Drupal () and TEICHI ( ).

and other web technologies

Such as the - eXist XML database, .

which looked far too - complex and expensive, particularly when considering future maintenance needs, for our - project’s purposes. Other solutions, such as the eXist XML database, .

which looked far + too complex and expensive, particularly when considering future maintenance needs, for + our project’s purposes. Other solutions, such as the EPPT software

Edition Production and Presentation Technology, - .

developed by K. Kiernan or - the + .

developed by K. Kiernan + or the Elwood viewer

Elwood Viewer, + type="soft.url" ref="#R7"> .

created by G. Lyman, either were not yet ready or were unsuitable for other reasons (proprietary software, user interface @@ -239,7 +238,7 @@
EVT - v. 2.0: Rebooting the Project + v. 2.0: Rebooting the Project

To get out of the impasse we decided to completely reboot the project, removing secondary features and giving priority to fundamental ones. We also found a solution for the data-loading problem: instead of finding a way to load the data into the software we @@ -247,19 +246,20 @@ starting point means that the editor can focus on his work, marking up the transcription text, with very little configuration needed to create the edition. This approach also allowed us to quickly test XML files belonging to other edition projects, to check if - EVT could go beyond being a project-specific tool. The inspiration for these changes - came from work done in similar projects developed within the TEI community, namely + EVT could go beyond being a project-specific tool. The inspiration for these + changes came from work done in similar projects developed within the TEI community, + namely TEI Boilerplate,

TEI Boilerplate, - + .

- John A. Walsh’s collection of John A. Walsh’s collection of XSLT stylesheets,

tei2html - , , .

and Solenne Coutagne’s work for the Berliner @@ -290,11 +290,12 @@
How it Works

Our ideal goal was to have a simple, very user-friendly drop-in tool, requiring little - work and/or knowledge of anything beyond XML from the editor. To reach this goal, EVT is - based on a modular structure where a single stylesheet (evt_builder.xsl) - starts a chain of XSLT - 2.0 transformations calling in turn all the other + work and/or knowledge of anything beyond XML from the editor. To reach this goal, + EVT is based on a modular structure where a single + stylesheet (evt_builder.xsl) starts a chain of XSLT + 2.0 transformations calling in turn all the other modules. The latter belong to two general categories: those devoted to building the HTML site, and the XML processing ones, which extract the edition text lying between folios using the pb element and format it according to the edition level. All you can then apply the evt_builder.xsl stylesheet to your TEI XML document using the Oxygen XML editor or another XSLT 2–compliant - engine. + xml:id="R21" target="#XSLT"/>XSLT + 2–compliant engine.

@@ -341,8 +342,10 @@ ref="#R24">XSLT stylesheets

The transformation chain has two main purposes: generate the HTML files containing the edition and create the home page which will dynamically recall the other HTML files.

-

The EVT builder’s transformation system is composed of a modular collection of XSLT +

The + EVT builder’s transformation system is composed of + a modular collection of XSLT 2.0 stylesheets: these modules are designed to permit scholars to freely add their own stylesheets and to manage the different desired levels of the edition without influencing other parts of the system, for instance the @@ -381,11 +384,13 @@ edition_array variable.

The available edition levels are described in the software documentation. Adding another edition level requires providing the corresponding stylesheet.

-

Once the XML file is ready and the parameters are set, the EVT builder’s transformation - system uses a collection of stylesheets to divide the XML file containing the text of - the transcription into smaller portions, each one corresponding to the content of a - folio, recto or verso, of the manuscript. For each of these text fragments it creates as - many output files as requested by the file settings.

+

Once the XML file is ready and the parameters are set, the + EVT builder’s transformation system uses a + collection of stylesheets to divide the XML file containing the text of the + transcription into smaller portions, each one corresponding to the content of a folio, + recto or verso, of the manuscript. For each of these text fragments it creates as many + output files as requested by the file settings.

In order to create the contents of these files, templates required to handle the edition level are selected by using the value of their mode attribute. For example, by associating the transformation for the diplomatic edition to the @@ -395,8 +400,8 @@ xsl:apply-templates select="current-group()" mode="dipl" instruction before its content is inserted into the diplomatic output file.

-

Using XSLT modes it is possible to separate the rules for the different +

Using XSLT modes it is possible to separate the rules for the different transformations of a TEI element and to recall other XSLT stylesheets in order to manage the transformations or send different parts of a document to different parts of @@ -505,11 +510,12 @@ matter what view one is currently in, one can expand the desired frame to focus on its specific content, temporarily hiding the other components of the user interface. It is furthermore possible to collapse the frame toolbars to increase the space devoted to - content visualization; it is important to notice, however, that we recommend using EVT - in full-screen mode to see images and text at the maximum possible screen resolution. - The collapse and restore actions are triggered by icons embedded in the interface, but - one can also press the Esc key to instantly return to the default - layout.

+ content visualization; it is important to notice, however, that we recommend using + EVT in full-screen mode to see images and text at + the maximum possible screen resolution. The collapse and restore actions are triggered + by icons embedded in the interface, but one can also press the Esc key to + instantly return to the default layout.

On the image side, several tools are available to improve analysis of manuscript images. The zoom feature is always active, so that the user can zoom in or out at any time by means of the mouse scroll wheel or using the appropriate slider in the bottom @@ -529,11 +535,14 @@ available manuscript digitized folios.

-

The image-text feature is inspired by Martin Holmes’s Image Markup - Tool

The UVic Image Markup Tool Project, .

software - and was implemented in The image-text feature is inspired by + Martin Holmes’s + Image Markup Tool

The UVic Image Markup Tool Project, .

+ software and was implemented in XSLT and CSS; all the other features are achieved by using jQuery plug-ins.

In the text frame tool bar you can see three drop-down menus which are useful for @@ -553,8 +562,8 @@ accessible at .

soliciting feedback from all interested parties. Shortly afterwards, the version of the - EVT software we used, improved by more bug fixes and - small enhancements, was made available for the academic community on EVT
software we used, improved by more bug fixes + and small enhancements, was made available for the academic community on the project’s SourceForge site.

Edition Visualization Technology: Digital edition visualization software, EVT. Some of the planned features will require fundamental changes to the software architecture to be implemented effectively: this is - probably the case for the Digital Lightbox (see + Digital Lightbox (see ), which requires a client-server architecture (), instead of the current client-only model, to perform some of the existing and planned actions. The currently developed search engine ( + EVT could work as a desktop application or an off-line web application that could be accessed anywhere, and possibly distributed in optical formats (CD or DVD). Forcing the prerequisites of an Internet connection and of dependency on a server-based XML database would have undermined our original goal. Going the database route was no longer an option for a client-only EVT and we immediately felt the need to go back to our original architecture to meet this standard. This sudden turnaround marked another - chapter in the research process and brought us to the current implementation of EVT + chapter in the research process and brought us to the current implementation of + EVT Search.

However, this new vision also had obvious limitations and issues. An XML database could provide us with crucial functionality typical of every information retrieval system: an @@ -661,7 +676,8 @@ collections of web pages. It can function both offline and online, and it does not necessarily require a web server or a server-side programming/query language (such as SQL, PHP, or Python) in order to work. While technically a plug-in, its architecture - is quite interesting and versatile: Tipue uses a combination of client-side + Tipue uses a combination of client-side JavaScript for the actual bulk of the work, and JSON (or JavaScript @@ -859,7 +875,7 @@ target="http://lightbox-dev.dighum.kcl.ac.uk"> Digital Lightbox

A beta version is - available at .

is a + available at .

is a web-based visualization framework which aims to support historians, paleographers, art historians, and others in analyzing and studying digital reproductions of cultural heritage objects. The methodology of research inspiring development of this tool is to @@ -940,22 +956,26 @@ xml:id="R91" target="#EVT"/> EVT provides a rich and usable interface to browse and study manuscript texts together with the corresponding images, the tools offered by - the Digital Lightbox allow users to identify, gather, and analyze visual details which + the + Digital Lightbox allow users to identify, gather, and analyze visual details which can be found within the images, and which are important for inquiries relating, for instance, to the style of the handwriting, decorations on manuscript folia, or page layout.

-

An effort to adapt and integrate the Digital Lightbox into An effort to adapt and integrate the + Digital Lightbox into EVT is already underway, making it available as a - separate, image-centered view, but there is a major hurdle to overcome: some of the DL + separate, image-centered view, but there is a major hurdle to overcome: some of the + DL features are only possible within a client-server architecture. Since EVT or, more precisely, a separate version of - EVT will migrate to this architecture, at some point - in the future it will be possible to integrate a full version of the DL. Plans for the - current, client-only version envision implementing all those features that do not depend - on server software: even if this means giving up interesting features such as + EVT will migrate to this architecture, at some + point in the future it will be possible to integrate a full version of the + DL. Plans for + the current, client-only version envision implementing all those features that do not + depend on server software: even if this means giving up interesting features such as collaborative work and annotation, we believe that even a subset of the available tools will be an invaluable help for manuscript image analysis. Furthermore, as noted above, thanks to HTML5 and CSS3 it will become more and more feasible to implement features in @@ -1039,13 +1059,14 @@

- The EVT Team + The + EVT Team Roberto Rosselli Del Turco Roberto Rosselli Del Turco, Julia Kenny, and Raffaele Masotti - Julia Kenny and Raffaele Masotti @@ -1055,7 +1076,8 @@ Giancarlo Buomprisco -

EVT website:

+

+ EVT website:

diff --git a/data/JTEI/rolling_2022/jtei-mitiku-212-source.xml b/data/JTEI/rolling_2022/jtei-mitiku-212-source.xml index ffa8c151..f6423955 100644 --- a/data/JTEI/rolling_2022/jtei-mitiku-212-source.xml +++ b/data/JTEI/rolling_2022/jtei-mitiku-212-source.xml @@ -544,13 +544,18 @@ 2018–. Beta Maṣāḥǝft Guidelines. . - Wick, Christoph, Christian Reul, and - Frank Puppe. 2020. Calamari − A - High-Performance Tensorflow-based Deep Learning Package for Optical Character - Recognition. - Digital Humanities Quarterly, 14 (2), . + + Wick, + Christoph, Christian Reul, and Frank Puppe. 2020. <rs type="soft.name" ref="#R1">Calamari</rs> − A High-Performance + Tensorflow-based Deep Learning Package for Optical Character + Recognition. + Digital Humanities Quarterly, 14 (2), . Liuzzo, Pietro Maria. 2019. Digital Approaches to Ethiopian and Eritrean Studies. Supplement to Aethiopica 8. Wiesbaden: diff --git a/schema/tei_jtei_annotated.odd b/schema/tei_jtei_annotated.odd index 091f2a64..c27b4649 100644 --- a/schema/tei_jtei_annotated.odd +++ b/schema/tei_jtei_annotated.odd @@ -2247,7 +2247,9 @@ - + + + @@ -2321,6 +2323,7 @@ + @@ -2439,6 +2442,23 @@ + + + + + + + + + + + + + + + + + @@ -3761,7 +3781,7 @@ - + There's no local target for diff --git a/schema/tei_jtei_annotated.rng b/schema/tei_jtei_annotated.rng index 5db61162..fd488f19 100644 --- a/schema/tei_jtei_annotated.rng +++ b/schema/tei_jtei_annotated.rng @@ -5,7 +5,7 @@ xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes" ns="http://www.tei-c.org/ns/1.0">