clarify how proof of a person is connected to proof of a fact #120

Open
EssyGreen opened this issue Feb 5, 2012 · 25 comments

Comments

@EssyGreen

The proof of a Fact which is related to a Person or Relationship is totally dependent upon the proof that the Persona(s) are the same Person(s). If any evidence (maybe discovered at a future date) disproves that Persona ABC relates to Person XYZ then all concluded facts which are based on this assumption are null and void. How can this be ascertained given that there is no connection between the Attribution in the Fact and that of the Person?

@stoicflame
Member

Thanks for bringing up the question. This needs to be clarified. I've been meaning to get around to that for a while now. You're not the first to bring it up. I'll try to tackle this for 0.10.0.

@EssyGreen
Author

My take on this is that a Person (Conclusion Model) is primarily a collection of source references where the Persona has been researched to deduce whether or not it is thought to match the Person being researched. The collection should include negative results as well as positive ones (ie should include why Persona1 is not thought to be Person ABC as well as why Persona2 is thought to match Person ABC).

From each Persona we/the application can track the Facts and Relationships from the Records so it is not essential to re-create these in the Conclusion Model.

The researcher typically (and selectively) builds the life event(s) of a Person from the Facts/Relationships/Persona but these do not necessarily correlate directly. For example:

  • Record R1 of a Census taken on 1st June 1841 at District 57, Bishport, Somerset states Joe Bloggs (R1.P1) (pit man, age 53, born same county) and Jessie Bloggs (R1.P2) (servant, age 50, born in foreign parts)
  • Record R2 of a Census taken on 30th March 1851 at High Street, Bedminster, Somerset states Joseph Bloggs (R2.P1) (householder, miner, age 62, born in Bedminster) and Jessica Bloggs (R2.P2) (wife, age 61, born in Jersey)
  • Record R3 of a Census taken on 1st April 1861 at 57, Mill Lane, Bedminster, Bristol states Jessica Bloggs (R3.P1) (widow, age 76, born in Channel Islands) [Interpretation would specify an unknown but deceased husband R3.P2]
  • Record R4 - an illustration of Bishport circa 1845
  • Record R5 - states that Bishport is an old name for Bedminster, Bristol

Conclusion Model might look like this:

Person CP1:
Source References: R1.P1(+3), R2.P1(+3), R3.P2(+3)
Name: Joseph Bloggs [R2.P1(+3)]
Known as: Joe [R1.P1(+3)]
Birth: Abt 1788 in Bedminster, Bristol [R1.P1(+1), R2.P1(+3)]
Marriage: to [CP2 link] before 1841 [R1.P1.Relationship(+3), R2.P1.Relationship(+3), R3.P2.Relationship(+1)]
Residence: High Street, Bedminster, Bristol from 1841 to 1851 - Living with his wife [CP2 link] in a small terraced house close to the centre of town. [R1.P1.Census(+3), R2.P1.Census(+3),R5(+3)] Picture of Bishport [R4.Image]
Occupation: Miner [R1.P1(+3), R2.P1(+3)]
Death: Before 1861 [R3.P2(+1) - his wife has been widowed]

Person CP2:
Source References: R1.P2(+3), R2.P2(+3), R3.P1(+3)
Name: Jessica [R1.P2(+3), R3.P1(+3)]
Known as: Jessie [R2.P2(+3)]
Birth: Abt 1790 in Jersey [R1.P1(+1), R2.P2(+3), R3.P1(+1) - age in 1861 thought to be an enumerator error or attempt to appear older to claim parish relief]
Marriage: to [CP1 link] before 1841 [R1.P2.Relationship(+3), R2.P2.Relationship(+3), R3.P1.Relationship(+1)]
Residence: High Street, Bedminster, Bristol from 1841 to 1851 - Living with her husband [CP1 link] in a small terraced house close to the centre of town. [R1.P2.Census(+3), R2.P2.Census(+3), R5(+3)] Picture of Bishport [R4.Image]
Residence: 57, Mill Lane, Bedminster, Bristol in 1861 [R3.P1.Census(+3)]
Occupation: Servant [R1.P2(+3)]
Death: unknown - but inferred deceased from infeasibility of age given current date and DoB

Supposing we discover that R3.P1 is in fact a different person ... so R3.P1 changes from being a positive assessment to a negative assessment in the collection against CP2.

Providing that this is done at the Person level (and not just the Fact level) we can then deduce and highlight what needs to be re-assessed (ie all Facts for CP2 which also reference R3.P1).

We can also deduce that since R3.P1 is in a Relationship with R3.P2 then all references to R3.P2 are now also suspect. Hence we can highlight that CP1's death is suspect.

All this depends on:

(a) holding a separate evaluation of the Person/Persona match for every resource referenced by a Fact etc for a Person. If we break this and allow a fact to reference a source without an evaluation of the Person match then the reference is ambiguous - does the Proof Statement refer to the Person/Persona match or to an assessment of the truth of the Record.Fact?
and
(b) referencing the Persona from the Record and not just a link to the Record

However, source references which do not link to any Persona do not need to be held/evaluated at the Person level (since they don't have a relevant Persona to link to). In the example above I used an illustration of a Place. If the illustration was later found to relate to another Place then this would be changed at the level of the Place entity and anything which referenced R4 could be highlighted as dubious.
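
The re-assessment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `related` mapping and the `suspect_facts` helper are hypothetical and are not part of GEDCOM X. It shows that once each Fact cites Personas (not just Records) and each Person holds an evaluation per Persona, a change of mind about R3.P1 is enough to find every Fact that needs another look.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    """A concluded fact on a Person, citing the personas it rests on."""
    label: str
    persona_refs: set


@dataclass
class Person:
    """A concluded Person with one confidence score per evaluated persona."""
    id: str
    # persona id -> confidence in the Person/Persona match (negative = not a match)
    evaluations: dict = field(default_factory=dict)
    facts: list = field(default_factory=list)


def suspect_facts(people, related, rejected):
    """Return (person id, fact label) pairs that need re-assessment.

    `related` maps a persona to the personas it has a Relationship with in
    the same record; those are treated as suspect as well.
    """
    tainted = {rejected} | related.get(rejected, set())
    return [
        (p.id, f.label)
        for p in people
        for f in p.facts
        if f.persona_refs & tainted
    ]


# A cut-down version of the CP1/CP2 example (only a few facts shown).
cp1 = Person("CP1", {"R1.P1": 3, "R2.P1": 3, "R3.P2": 3},
             [Fact("Occupation", {"R1.P1", "R2.P1"}),
              Fact("Death", {"R3.P2"})])
cp2 = Person("CP2", {"R1.P2": 3, "R2.P2": 3, "R3.P1": 3},
             [Fact("Birth", {"R1.P1", "R2.P2", "R3.P1"}),
              Fact("Residence 1861", {"R3.P1"}),
              Fact("Occupation", {"R1.P2"})])
related = {"R3.P1": {"R3.P2"}}  # widow and her deceased husband in record R3

cp2.evaluations["R3.P1"] = -3   # R3.P1 is now thought to be someone else
result = suspect_facts([cp1, cp2], related, "R3.P1")
print(result)
```

With this data the helper flags CP2's Birth and 1861 Residence and, through the R3 Relationship, CP1's Death, while leaving both Occupation facts alone.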

@EssyGreen
Author

An alternative way of describing Persons could be via a Wiki-type narrative rather than using Facts at all.

In the above example, we could have:

Person CP1:
Source References: R1.P1(+3), R2.P1(+3), R3.P2(+3)
Narrative: Joseph Bloggs [R2.P1(+3)] (otherwise known as Joe [R1.P1(+3)]) was born around 1788 in Bedminster, Bristol [R1.P1(+1), R2.P1(+3)]. Not much is known of his early life but by the age of 53 he was working as a miner [R1.P1(+3), R2.P1(+3)]. He married Jessica some time before 1841 [R1.P1.Relationship(+3), R2.P1.Relationship(+3), R3.P2.Relationship(+1)] and they lived together in a small terraced house in the High Street, close to the centre of town [R1.P1.Census(+3), R2.P1.Census(+3), R5(+3)] Picture of Bishport [R4.Image]. Joseph died between 1851 and 1861 [R3.P2(+1)].

Person CP2:
Source References: R1.P2(+3), R2.P2(+3), R3.P1(+3)
Narrative: Jessica [R1.P2(+3), R3.P1(+3)] (the wife of Joseph Bloggs [R1.P2.Relationship(+3), R2.P2.Relationship(+3), R3.P1.Relationship(+1)]) was born in Jersey circa 1790 [R1.P1(+1), R2.P2(+3), R3.P1(+1)]. She lived with her husband [CP1 link] in a small terraced house in the High Street, close to the centre of town [R1.P2.Census(+3), R2.P2.Census(+3), R5(+3)] Picture of Bishport [R4.Image]. She worked as a servant [R1.P2(+3)] until the age of 50. After her husband's death (between 1851 and 1861), she moved to 57, Mill Lane [R3.P1.Census(+3)].

For Persons whose narrative has not been created (say because the Person is a distant leaf in the researcher's tree), the Record model can be used to show what has been linked without the need for any Facts to be created by the user, e.g.:

Person CP1:
Source References: R1.P1(+3), R2.P1(+3), R3.P2(+3)
Name: Joe Bloggs
Birthplace: Somerset
Occupation: Pit man, Date: 1st June 1841
Census: Age: 53, Date: 1st June 1841 Place: District 57, Bishport, Somerset
Name: Joseph Bloggs [R2.P1(+3)]
Occupation: Miner, Date: 30th March 1851
Census: Householder, Age: 62, Date: 30th March 1851 Place: High Street, Bedminster, Somerset

In any of the above cases the Person.Facts are optional and are at the discretion of the software being used, but the Person/Persona links in the data model are critical.

@stoicflame
Member

I'm starting to take this issue on with consideration that the record model isn't "visible" from the conclusion model anymore, except as just another derivative source.

So I was thinking that a simple clarification would be to state that the sources of all conclusions (gender, name, fact) must refer to sources declared at the level of the person. Then I'd provide some recipes to illustrate this.

Would this be an adequate clarification? What's missing?

@EssyGreen
Author

Not sure ... I think what I'm needing is a direct Person->Persona link so that there is a connection not just to the source but to the Persona referenced within it. However, the 'Persona' I am referring to is one constructed by (or at least editable by) the researcher not necessarily by the vendor of the Record Model (something I was trying to understand/explore in #151)

For me, there is a distinction between a source cited as Evidence (ie a source which is intended to contribute towards proof of the Person/Fact) and a source which is cited as an illustration (e.g. a citation from an article on what life was like in a workhouse or a description of a place etc) - see my note in #141. Hence, in my model I have an Evidence object which links to the Persona; and a (separate) Citation which is just a link to the source. The Evidence can be validated as I illustrated above; the citation can't be validated since we don't even know if the Person was mentioned - it's effectively just a footnote to credit the author(s).

I suspect this may be too much detail and too specific an implementation for others so I'm happy to compromise but could we explore the concept of a Person->Persona link before we throw it out?

@stoicflame
Member

I think I'm confused. Can you clarify exactly what you would need in order for this issue to be considered "closed" in your mind?

Would some documentation that explains how to distinguish between a source cited as Evidence and a source cited as an illustration be adequate?

@EssyGreen
Author

Oh dear :( I'm sorry.

It may well be that this should be an application-specific thing rather than in the standard so maybe the short answer is for you just to say "No, we won't do this." But to give it one more go ...

My requirement would be for concluded Persons to have a collection of Evidence objects, each Evidence object comprising the following properties:

  • Persona (link)
  • Justification (explanation of why the Persona is thought to be/not to be the same as the concluded Person)
  • Confidence (positive or negative number indicating the degree of confidence in the match between the Persona and the Person - with negative meaning not a match)
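
As a rough illustration of that requirement, the three properties could be held in a small Python structure like the one below. Everything here is an assumption made for the sketch: the `Evidence` class, its field names, the sample justification texts and the choice to treat a confidence of zero as "not a match" are hypothetical and do not come from the GEDCOM X specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One evaluated Person/Persona match (field names are illustrative)."""
    persona: str        # link to the Persona, e.g. "R3.P1"
    justification: str  # why the Persona is / is not thought to be the Person
    confidence: int     # positive = match, negative = not a match

    @property
    def is_match(self) -> bool:
        # Assumption: zero confidence is treated as "not a match".
        return self.confidence > 0


# Hypothetical Evidence collection for the concluded Person CP2.
evidence_for_cp2 = [
    Evidence("R2.P2", "Wife of Joseph; age and Jersey birthplace agree", 3),
    Evidence("R3.P1", "Age in 1861 does not fit; thought to be a different person", -3),
]

matches = [e.persona for e in evidence_for_cp2 if e.is_match]
rejected = [e.persona for e in evidence_for_cp2 if not e.is_match]
print(matches, rejected)
```

Keeping the negative entry in the collection, not deleting it, is what preserves the record of why a Persona was ruled out.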

Does that make sense?

@stoicflame
Member

> Oh dear :( I'm sorry.

Not at all. I appreciate your patience with me.

> Does that make sense?

Yes. Thank you. I can do that.

@EssyGreen
Author

Phew! Many thanks :)

@stoicflame
Member

Hey, @EssyGreen, please consider the recipes that have been added to the recipe book to provide for the cases you've asked about.

What other questions, issues, or concerns still remain on this topic?

@EssyGreen
Author

I've been awol for a while - just getting back into it. Give me a while to catch up on the new stuff - I'll comment asap.

@EssyGreen
Author

I still can't find any examples of negative proof or any examples which allow the researcher to explain why the citation is thought to refer to (or not refer to) the relevant person. All the examples seem to be falling into the common trap of "I found it therefore it must be true". I understand the need for simple examples but think there is a desperate need to exemplify the importance and complexity of deeper investigation and analysis.

@jralls
Contributor

jralls commented Jun 9, 2012

> I understand the need for simple examples but think there is a desperate need to exemplify the importance and complexity of deeper investigation and analysis.

+1!

To expand a bit, inference is critical to good genealogy, and GedcomX needs to be able to document inferred conclusions which depend on indirect evidence from several sources -- including the absence of evidence.

@stoicflame
Member

> examples of negative proof

Indeed. Not modeled yet. That's tracked at #127.

> any examples which allow the researcher to explain why the citation is thought to refer to (or not refer to) the relevant person

Indeed. I'll add that concept to the examples. My bad that it got left out.

> there is a desperate need to exemplify the importance and complexity of deeper investigation and analysis.

That's fine.

Thanks for your patience. I suffer from being too close to the project to understand what it's like for people who are approaching it from a distance. Things that are clear in my head are hard for me to articulate in a way that it can be clear for everybody else.

I'd be interested to know how you'd like to see things articulated. Would more of a narrative help? What if the recipes included some text that told a story? E.g. "Sarah is doing research on George Washington and found this source that included information about his birth date and she does some analysis, etc., etc. etc."

Would you be willing to write some narratives for me and I can fill in the technical details?

@EssyGreen
Author

@stoicflame - I suffer from being awol for too long hehe ...

It's actually interesting coming back to it afresh ... which may give some insight into articulation ... as an "outsider" coming back to it, it feels that there is a very detailed technical format there but very little in terms of overview and/or summary to lead you in ... you go from a high level diagram slap bang into XML/RDF/FOAF definitions and it's really difficult to see (a) how it hangs together and (b) how to use it. This applies to both the conceptual spec and the recipe book.

Also, (and perhaps more importantly) it feels that it's all about the technology and we've lost the genealogy.

Taking the baking a cake example ... Here I am in my genealogy kitchen ready to make my cake ... I'm already a cook so I understand the basics of cake making but I've just bought my GEDCOM X Cookbook 'cos it was supposed to make the most wonderful cakes and get around all those pesky problems of the cake sinking in the middle or getting burnt. So I open it up and start reading ...

There are some nice pictures of a couple of generic cakes just inside the cover (=model diagrams) so I'm encouraged to read on ...

But the bulk of the book is loads of technical information about the equipment I need to use (=detailed definitions in the spec) .. It goes on about exactly the size of the baking tins, the thickness of the grease-proof paper, what make of oven I must use and the health and safety equipment I must have in my kitchen. ... Er I think I'll skip that part.

Right here's a recipe (=recipe in the recipe book) ... er ... it doesn't mention the ingredients ... it just gives the chemical compound of the finished cake!! Eesh I guess I can make out that there was some flour, sugar and currants in there somewhere but where did the eggs go?

What would I like to see?

More narrative? Not necessarily - more genealogy - real genealogical data/situations are needed. A "How Do I..." approach e.g. "... enter a simple BMD certificate", "Record a Census record which includes unrelated people in the household", "Enter the possible matches for an IGI record", "Record the secondary roles mentioned on a marriage certificate", "Transcribe a detailed Will where some of the bequests are illegible", "Log my search results across multiple repositories", etc

What is missing for me is the "method" part of the recipe - or to put it another way the process part of genealogy ... e.g. You currently have:

"The following example illustrates how to cite an online record. Evidence for Israel Hoyt Heaton is found in the 1920 U.S. Census. The URI to the record is "https://familysearch.org/pal:/MM9.1.1/M8PT-4GN". The URI for a description of the record is "https://familysearch.org/platform/sources/GGG-GGGG"."

You seem to start with an existing person:
Person: KWCD-QBC
Name: Israel Hoyt Heaton [STV-WXZY]
Gender: Male
Birth: 30th January 1880, Orderville, Utah [BCD-FGHJ]
Death: 29th August 1936, Kanab, Kane, UT [KLM-NPQR]

And a source record: 1920 US Census https://familysearch.org/pal:/MM9.1.1/M8PT-4GN (no idea what is in the source record tho and presumably nowhere to record this in GEDCOM X!!!!)

These seem to end up as:
Person: KWCD-QBC [M8PT-4GN]
Name: Israel Hoyt Heaton [STV-WXZY]
Gender: Male
Birth: 30th January 1880, Orderville, Utah [BCD-FGHJ]
Death: 29th August 1936, Kanab, Kane, UT [KLM-NPQR]

What did this mean? What did the researcher do? It looks like they just happened to find a guy with the same name in the 1920 Census and went "Yes it must be a match I'll add it as a citation" and they bunged the reference at the end of the record to "prove" it.
Regardless of how it is coded up, I couldn't interpret what the researcher meant/found with any degree of certainty or clarity.
Surely this is not what genealogy is all about?

How about this instead:

  1. Pre-requisites:
    Person: KWCD-QBC
    Name: Israel Hoyt Heaton
    Gender: Male
    Birth: Abt January 1910, Orderville, Utah
    Mother: Mary
    Father: Isambard Heaton
    Baptised: 20th February 1911, Orderville Church
    Residence: 1900 to 1901 Orderville, Utah
  2. Search 1920 Census and record search criteria and results: - no idea where to put these in GEDCOM X but here's something like what I would want:
    Source: M8PT-4GN
    Collection: 1920 US Census
    Date: 1st January 1920
    Place: 5612 Kane Road, Big Water, Utah
    Item 1: Isambard Heaton, Widower, Iron-Worker, Age 52
    Item 2: Israel Hoyt Heaton, Age 15
  3. Link citation to person:
    Person: KWCD-QBC
    Name: Israel Hoyt Heaton
    Gender: Male
    Birth: Abt January 1910, Orderville, Utah
    Mother: Mary
    Father: Isambard Heaton
    Baptised: 20th February 1911, Orderville Church
    Residence: 1900 to 1901 Orderville, Utah
    Residence: 1920 5612 Kane Road, Big Water, Utah [M8PT-4GN Possible - Name, age and father seem to fit but DoB is a few years out - could be transcription error?]
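
To make the "method" step concrete, here is a minimal Python sketch of the kind of check behind the "Possible" verdict in step 3. It is an illustration only: the `CensusEntry` class, the `assess_match` helper, the two-year tolerance and the exact-name comparison are all assumptions for the example, not anything defined by GEDCOM X or FamilySearch, and a real comparison would also weigh the father's name, the place and name variants.

```python
from dataclasses import dataclass


@dataclass
class CensusEntry:
    """One line of a census household (hypothetical structure)."""
    source_id: str
    name: str
    age: int
    census_year: int


def implied_birth_year(entry):
    """Approximate birth year implied by a stated age (ignores month and day)."""
    return entry.census_year - entry.age


def assess_match(person_name, person_birth_year, entry, tolerance=2):
    """Return (verdict, justification) for linking a census entry to a person."""
    if entry.name != person_name:
        return "Rejected", "Name does not match"
    gap = abs(implied_birth_year(entry) - person_birth_year)
    if gap <= tolerance:
        return "Probable", f"Name fits; birth year within {gap} year(s)"
    return "Possible", f"Name fits but birth year is {gap} years out"


# Item 2 from the search results in step 2, compared with the person in step 1.
entry = CensusEntry("M8PT-4GN", "Israel Hoyt Heaton", age=15, census_year=1920)
verdict, why = assess_match("Israel Hoyt Heaton", 1910, entry)
print(f"[{entry.source_id} {verdict} - {why}]")
```

The point of returning the justification alongside the verdict is that the reasoning travels with the citation, so a later reader can see why the link was made and how far to trust it.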

How does this fit with GEDCOM-X? I don't know ... and that's my problem ... it's not the syntax which is a problem - it's the model.

Would I be willing to write some? I'd love to but time as always is the problem ... How about starting a thread for use cases that we can all add to and you can pick from these to code up and include in the recipe book?

@stoicflame
Member

Excellent comments. Like you said, these kinds of things take a lot of time to put together, so they have to be prioritized according to availability of resources.

It should be no surprise that the team working on GEDCOM X is very much focused on development and technical details. And we're also not working on just GEDCOM X, but also the new FamilySearch Developer Platform. So while we're very strong on the technical details and real, working code, we fall short in our abilities to gather all of the "real genealogical data/situations" that you express a need for. We've got a set of requirements for FamilySearch application(s), but that's just one vendor's perspective on the world.

> How about starting a thread for use cases that we can all add to and you can pick from these to code up and include in the recipe book?

Can we not just use the issue tracker? Open up a new issue for each of those different scenarios?

@EssyGreen
Author

> Can we not just use the issue tracker?

Sure, that's sort of what I meant.

> Open up a new issue for each of those different scenarios?

I suspect a new thread for each different case might swamp things - one new "Issue" with a post per use case is what I had in mind?

@stoicflame
Member

> I suspect a new thread for each different case might swamp things - one new "Issue" with a post per use case is what I had in mind?

Okay.

I'm thinking it would be nice to create a "template" so others can know what kind of details we need to create a real recipe. I'm thinking there might be other slow people like me that don't know how to write a recipe, and a guide or a template could help.

I wonder, could we leverage a Google Doc (document or spreadsheet)? Or maybe the BetterGEDCOM wiki? Would either of those solutions work better?

I'm trying to think of the most efficient way to organize the effort.

@jralls
Contributor

jralls commented Jun 12, 2012

Uh, there's already a wiki provided by Github. Why not use that? I don't think it's fair to the BetterGedcom folks to barge in on theirs, though it would be nice to invite them to contribute to yours. Yank the recipes.html page and redo it as a top-level page in the GedcomX wiki, pasting in the list from recipes.html as a starting point. Post an announcement on your blog asking folks to try to figure out from the spec how to do something and to create a recipe documenting it... and to write an issue if they can't figure it out.

@EssyGreen
Author

> Yank the recipes.html page and redo it as a top-level page in the GedcomX wiki, pasting in the list from recipes.html as a starting point. Post an announcement on your blog asking folks to try to figure out from the spec how to do something and to create a recipe documenting it... and to write an issue if they can't figure it out.

+1

@stoicflame
Member

Okay, let's try it out.

Would you guys be willing to create a wiki page that looks like an actual recipe so we can put together some clear instructions that others can follow?

@jralls
Contributor

jralls commented Jun 15, 2012

Absolutely... once I fully grok the spec! In particular, I need an answer to the questions about precedence, etc., which I asked yesterday in #165 and which got buried in the discussion about RDF.

@stoicflame
Member

> Absolutely... once I fully grok the spec!

Oh, I didn't want to impose that as a prerequisite :-).

I was just hoping you could come up with the text, inputs, outputs, expectations, scenarios, whatever needs to go in. Leave big holes for where the code needs to go and I'll fill those in (if possible).

@EssyGreen
Author

@stoicflame - Give me a place to do it and I'll detail some cakes I need recipes for :)

@jralls
Contributor

jralls commented Jun 16, 2012

OK, I can do that.
