Tagging Rationale #1
Comments
That would be really great. I'm not privy to the exact criteria used. Maybe @MartijnNaaijer or @PeursenWTvan could add something here that we could append to the documentation. I think the labels were built manually.
My impression was that it was manual tagging but that makes it more important in my mind to understand what was guiding the taggers. If you don't have the criteria, that's another story...
Added tagging values to readme in c383395
My impression is that the majority of cases are uncontroversial, given a coarse-grained approach. But there are probably a few spots (e.g. Isaiah 6) where I see a lot of room for interpretation. Would be great to get the criteria used.
I'm closing, assuming that it's going to be difficult to get the criteria.
Hi James, let’s leave it open for now and see if Martijn or Wido has something to add. It’s an important question, and not too hard to at least get a basic answer!
Hi guys, we use three basic categories for complete books: prose, poetry, and prophecy. So Genesis is prose, Psalms is poetry, Isaiah is prophecy, etc. This is not based on formal criteria, but on what you generally find in exegetical books. Within prose, we added lists, which are generally long lists of names of people, and instruction, which is a general name for (mainly) law texts, again generally based on what one finds in exegetical commentaries. Also, we changed "prose" to "poetry" if there is a piece of poetry embedded in prose, such as in Genesis 49. So the whole dataset is pretty basic and often based on intuitive criteria. It was meant to be coarse-grained, to avoid drowning in tiny details, but feel free to improve!
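For readers who want to see the shape of that logic, here is a minimal sketch, not the actual tagging code (the thread indicates the labels were assigned manually). It assumes a book-level default genre plus passage-level overrides. The names `BOOK_GENRE`, `CHAPTER_OVERRIDES`, and `genre_for` are hypothetical, the label strings may differ from the values in the dataset, and chapter-level granularity is an assumption. Only the examples named in the comment above are included.

```python
# Hypothetical sketch of the coarse-grained scheme described in the thread.
# Book-level defaults: only the three books named in the comment are listed.
BOOK_GENRE = {
    "Genesis": "prose",
    "Psalms": "poetry",
    "Isaiah": "prophecy",
}

# Passage-level overrides (assumed here to be keyed by book and chapter).
# Genesis 49 is the only override the thread mentions; "list" and
# "instruction" passages would be recorded the same way.
CHAPTER_OVERRIDES = {
    ("Genesis", 49): "poetry",
}


def genre_for(book: str, chapter: int) -> str:
    """Return the override for a passage if one exists, else the book default."""
    return CHAPTER_OVERRIDES.get((book, chapter), BOOK_GENRE[book])
```

With this layout, `genre_for("Genesis", 1)` falls back to the book default, while `genre_for("Genesis", 49)` picks up the embedded-poetry override.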
Thanks!
Thanks @MartijnNaaijer. I'll add this information to the readme before closing.
I've added a methodology description to the readme in 5ca0851. Thank you both for improving the dataset!
Could you add some of the logic guiding the tagging decisions to the readme? I assume it's somewhere in Martin's thesis but it seems like a good idea to keep that close to the data.
Also, it would be useful to list the genres that are tagged. I believe that it's: