Tagging Rationale #1

jcuenod · 2020-06-16T19:52:05Z

Could you add some of the logic guiding the tagging decisions to the readme? I assume it's somewhere in Martin's thesis but it seems like a good idea to keep that close to the data.

Also, it would be useful to list the genres that are tagged. I believe that it's:

instruction
list
poetry
prophetic
prose

codykingham · 2020-06-16T19:54:48Z

That would be really great. I'm not privy to the exact criteria used. Maybe @MartijnNaaijer or @PeursenWTvan could add something here that we could append to the documentation. I think the labels were built manually.

jcuenod · 2020-06-16T19:56:19Z

My impression was that it was manual tagging but that makes it more important in my mind to understand what was guiding the taggers. If you don't have the criteria, that's another story...

codykingham · 2020-06-16T19:57:13Z

Added tagging values to readme in c383395

codykingham · 2020-06-16T20:05:39Z

My impression is that the majority of cases are uncontroversial, given a coarse-grained approach. But there are probably a few spots (e.g. Isaiah 6) where I see a lot of room for interpretation. Would be great to get the criteria used.

jcuenod · 2020-06-17T05:27:56Z

I'm closing, assuming that it's going to be difficult to get the criteria.

codykingham · 2020-06-17T05:54:20Z

Hi James, let’s leave it open for now and see if Martijn or Wido has something to add. It’s an important question, and not too hard to at least get a basic answer!

MartijnNaaijer · 2020-06-17T07:02:15Z

Hi guys, we use three basic categories: prose, poetry and prophecy for complete books, so Genesis is prose, Psalms is poetry, Isaiah is prophecy, etc. This is not based on formal criteria, but on what you find generally in exegetical books. Within prose, we added lists, which are generally long lists of names of people, and instruction, which is a general name for (mainly) law texts, generally based on what one finds in exegetical commentaries. Also, we changed "prose" to "poetry" if there is a piece of poetry embedded in prose, such as in Genesis 49. So, the whole dataset is pretty basic and often based on intuitive criteria. It was meant to be coarse-grained, to avoid drowning in tiny details, but feel free to improve!
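As a rough illustration of the scheme described above (a book-level default genre, with passage-level overrides such as Genesis 49), here is a minimal Python sketch. The names `BOOK_GENRE`, `OVERRIDES`, and `genre_for` are hypothetical, and the dictionary layout is an assumption, not the dataset's actual format. It uses the tag value `prophetic` from the genre list earlier in the thread.

```python
# Hypothetical sketch: names and data layout are assumptions,
# not the dataset's actual format.

# Book-level defaults, using only the examples given in the thread.
BOOK_GENRE = {
    "Genesis": "prose",
    "Psalms": "poetry",
    "Isaiah": "prophetic",
}

# Passage-level overrides keyed by (book, chapter).
# Only Genesis 49 is mentioned in the thread; "list" and "instruction"
# passages would be recorded the same way.
OVERRIDES = {
    ("Genesis", 49): "poetry",
}


def genre_for(book: str, chapter: int) -> str:
    """Return the passage-level override if one exists, else the book default."""
    return OVERRIDES.get((book, chapter), BOOK_GENRE[book])
```

With this layout, `genre_for("Genesis", 49)` yields the override while any other Genesis chapter falls back to the book-level tag.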

jcuenod · 2020-06-17T13:46:36Z

Thanks!

codykingham · 2020-06-17T13:47:35Z

Thanks @MartijnNaaijer. I'll add this information to the readme before closing.

codykingham · 2020-06-17T13:55:45Z

I've added a methodology description to the readme in 5ca0851. Thank you both for improving the dataset!

jcuenod closed this as completed Jun 17, 2020

codykingham reopened this Jun 17, 2020

codykingham closed this as completed Jun 17, 2020

codykingham pinned this issue Jul 6, 2020
