Tagging Rationale #1

Closed
jcuenod opened this issue Jun 16, 2020 · 10 comments

Comments

@jcuenod

jcuenod commented Jun 16, 2020

Could you add some of the logic guiding the tagging decisions to the readme? I assume it's somewhere in Martin's thesis but it seems like a good idea to keep that close to the data.

Also, it would be useful to list the genres that are tagged. I believe they are:

  • instruction
  • list
  • poetry
  • prophetic
  • prose
@codykingham
Contributor

That would be really great. I'm not privy to the exact criteria used. Maybe @MartijnNaaijer or @PeursenWTvan could add something here that we could append to the documentation. I think the labels were built manually.

@jcuenod
Author

jcuenod commented Jun 16, 2020

My impression was that it was manual tagging but that makes it more important in my mind to understand what was guiding the taggers. If you don't have the criteria, that's another story...

@codykingham
Contributor

Added tagging values to readme in c383395

@codykingham
Contributor

My impression is that the majority of cases are uncontroversial, given a coarse-grained approach. But there are probably a few spots (e.g. Isaiah 6) where I see a lot of room for interpretation. Would be great to get the criteria used.

@jcuenod jcuenod closed this as completed Jun 17, 2020
@jcuenod
Author

jcuenod commented Jun 17, 2020

I'm closing, assuming that it's going to be difficult to get the criteria.

@codykingham
Contributor

Hi James, let’s leave it open for now and see if Martijn or Wido has something to add. It’s an important question, and not too hard to at least get a basic answer!

@codykingham codykingham reopened this Jun 17, 2020
@MartijnNaaijer

Hi guys, we use three basic categories: prose, poetry and prophecy for complete books, so Genesis is prose, Psalms is poetry, Isaiah is prophecy, etc. This is not based on formal criteria, but on what you generally find in exegetical books. Within prose, we added lists, which are generally long lists of names of people, and instruction, which is a general name for (mainly) law texts, generally based on what one finds in exegetical commentaries. Also, we changed "prose" to "poetry" if there is a piece of poetry embedded in prose, such as in Genesis 49. So, the whole dataset is pretty basic and often based on intuitive criteria. It was meant to be coarse-grained, to avoid drowning in tiny details, but feel free to improve!
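
For readers who want the scheme in a more concrete form, here is a rough sketch of the logic as described: a book-level default plus passage-level overrides. It is only an illustration, not the code used to build the dataset (the labels were assigned manually). The names `BOOK_DEFAULTS`, `OVERRIDES` and `genre_for` are hypothetical, chapter-level granularity is a simplifying assumption, and the label `prophetic` follows the value list given earlier in this thread.

```python
# Hypothetical sketch of the coarse-grained tagging described above.
# The three book examples and the Genesis 49 override come from the comment;
# the data layout, names, and chapter-level granularity are assumptions.

# Default genre per complete book.
BOOK_DEFAULTS = {
    "Genesis": "prose",
    "Psalms": "poetry",
    "Isaiah": "prophetic",
}

# Passage-level overrides, keyed by (book, chapter).
# Genesis 49 contains poetry embedded in a prose book.
OVERRIDES = {
    ("Genesis", 49): "poetry",
}


def genre_for(book, chapter):
    """Return the passage override if one exists, else the book default."""
    return OVERRIDES.get((book, chapter), BOOK_DEFAULTS[book])


print(genre_for("Genesis", 1))
print(genre_for("Genesis", 49))
```

The `list` and `instruction` labels would fit the same override table, for example as further `(book, chapter)` entries inside prose books.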

@jcuenod
Author

jcuenod commented Jun 17, 2020

Thanks!

@codykingham
Contributor

Thanks @MartijnNaaijer. I'll add this information to the readme before closing.

@codykingham
Contributor

I've added a methodology description to the readme in 5ca0851. Thank you both for improving the dataset!

@codykingham codykingham pinned this issue Jul 6, 2020