Add type hints to db classes #1919
base: master
I've created this PR whilst it is still a work in progress to get some input on the impact. Having done that, code such as this Lines 1959 to 1962 in 1280aa4
now gets (correctly) flagged as an error because it uses person without checking whether it is None.
The solution I've taken so far in this PR is to amend the code to handle a return of None. Before I go any further, is that the correct approach or is there a better way?
I thought about alternative approaches where One way would return, say, a Another approach would be to remove the handle from the reference object. It might be possible to pre-process the entire DB, caching the proper references and removing anything pointing to a private object, and replacing sensitive values, like "Living". This would probably be faster and less error prone in the end. |
(Note, I haven't tried to understand the logic in this particular case, so my comment is just on general principles). If the code doesn't currently try to deal with a return of I don't think the code should be guarded with The problem seems (to me) to be that a Proxy could correctly return |
How about using assert person is not None? Am I right in thinking that assert is only executed if debug is on? In which case, this would not impose a performance penalty. However, it would provide documentation of what was expected, and presumably would allow the static type checking to work correctly. If it is only on with debug, then the CI test modules should all be run with debug on (I don't know whether this is done or not).
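A minimal sketch of the assert idea, using hypothetical stand-ins (`Person`, `get_person_from_handle` over a plain dict) rather than the project's real classes. In CPython, assert statements run by default and are stripped only when the interpreter is started with `-O`, so the check costs a comparison in normal runs while letting a type checker narrow `Optional[Person]` to `Person`.

```python
from typing import Dict, Optional


class Person:
    """Hypothetical stand-in for a database object."""

    def __init__(self, handle: str) -> None:
        self.handle = handle


def get_person_from_handle(people: Dict[str, Person], handle: str) -> Optional[Person]:
    """Return the matching person, or None when the handle is unknown."""
    return people.get(handle)


def person_handle_or_fail(people: Dict[str, Person], handle: str) -> str:
    person = get_person_from_handle(people, handle)
    # Narrows Optional[Person] to Person for the type checker.
    # Executed by default; removed only under `python -O`.
    assert person is not None
    return person.handle
```

With the assert in place, code that would previously have crashed on a None still fails at the same point, just with an AssertionError instead of an AttributeError.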
e72da20
to
f0a828a
Compare
I was trying to restrict this PR to just adding type hints. I've not quite succeeded but I'm reluctant to make larger scale code changes if I can avoid it.
As I understand it a proxy db is substitutable for the real db. So all higher level (business logic if you like) code should be written to correctly handle a None return. If we add an alternative set of methods, which can return None, won't we still have to support a None return type in much of the code?
gramps/gen/proxy/proxybase.py
Outdated
         """
         Return an iterator over database handles, one handle for each Person in
         the database.
         """
-        return filter(self.include_person, self.db.iter_person_handles())
+        return (_ for _ in filter(self.include_person, self.db.iter_person_handles()))
return (_ for _ in filter(self.include_person, self.db.iter_person_handles()))
Flagging this change for comment.
Method: def iter_person_handles(self) -> Generator[PersonHandle]
This code previously returned a List. The equivalent method in DbGeneric returns a Generator[PersonHandle].
I've standardised on returning a generator. Is this the right choice?
Sounds good to me.
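A small sketch of the generator choice discussed in this thread. The function name and the use of `str` in place of `PersonHandle` are assumptions for illustration. In Python 3, `filter` returns a lazy iterator that is not a generator object, so wrapping it in a generator expression is one way to make the returned value match a `Generator[...]` annotation.

```python
import inspect
from typing import Callable, Generator, Iterable


def iter_included_handles(
    include: Callable[[str], bool], handles: Iterable[str]
) -> Generator[str, None, None]:
    """Lazily yield only the handles accepted by `include`."""
    # The generator expression wraps the filter iterator in a real generator.
    return (handle for handle in filter(include, handles))
```

Annotating the method as `Iterator[...]` instead would accept the bare `filter(...)` result as well; which annotation to standardise on is a design choice for the PR authors.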
@@ -78,10 +105,12 @@ class DBAPI(DbGeneric):
     Database backends class for DB-API 2.0 databases
     """

+    dbapi: Any  # might be better to have a specific ConnectionBase class
dbapi: Any # might be better to have a specific ConnectionBase class
Flagging this line for comment.
It's not correct to type hint dbapi as Any, but I could not find an appropriate superclass to use.
Cursor functions such as
|
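One possible alternative to `Any` when no common superclass exists is a structural type. The sketch below is only an illustration: `ConnectionLike` and `fetch_one_value` are hypothetical names, and the method set is an assumption, not the interface the project's connection classes actually expose.

```python
import sqlite3
from typing import Any, Protocol, Sequence


class ConnectionLike(Protocol):
    """Hypothetical structural type for whatever `dbapi` holds."""

    def execute(self, query: str, params: Sequence[Any] = (), /) -> Any: ...

    def commit(self) -> None: ...

    def close(self) -> None: ...


def fetch_one_value(dbapi: ConnectionLike, query: str) -> Any:
    """Run a query and return the first column of the first row."""
    cursor = dbapi.execute(query)
    return cursor.fetchone()[0]
```

Any object with matching methods can be passed without inheriting from `ConnectionLike`; here an in-memory `sqlite3` connection serves as the example object at runtime.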
I don't see how protecting the code locally with When Nick wrote that
I presumed that he meant that anywhere that
From what Nick had said, I imagined that the code is effectively divided into code that can be called with a proxy db, and code that cannot be called with a proxy db. And I imagined that all the code that could be called from a proxy db already dealt with the possibility that So, I think that either you should have a statement like |
This is incorrect. Much code can be called with a limiting proxy (Private, Living) in place. This most likely happens in reports, filters, some utils (like alive.py), exports, and tools.
@stevenyoungs I think your approach is sound. We can work on proxies and their ability to return None in another initiative.
         # Derived fields
         if table == "Person":
-            given_name, surname = self._get_person_data(obj)
+            given_name, surname = self._get_person_data(cast(Person, obj))
             sets.append("given_name = ?")
             values.append(given_name)
             sets.append("surname = ?")
             values.append(surname)
         if table == "Place":
-            handle = self._get_place_data(obj)
+            handle = self._get_place_data(cast(Place, obj))
             sets.append("enclosed_by = ?")
             values.append(handle)
Wouldn't it be better if this were written as if type(obj) is Person etc.? (If I am right that this works the same.)
I appreciate that this would be changing existing code, but surely no more than adding cast. The existing code is not very clear: accessing the type through table = obj.__class__.__name__ is complex and obscure, the conversion to name is not really necessary, and table is not a good name for the variable.
Testing the type in the if statement would (AIUI) automatically provide the static type checking, and if anyone wanted to add more code in the future, that wouldn't need to be type confirmed by something like a cast.
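A sketch of the suggestion above, with hypothetical stand-in classes and a hypothetical `derived_fields` helper. It uses `isinstance` rather than `type(obj) is Person`: type checkers narrow on `isinstance`, but note that it also accepts subclasses, so the two tests are not strictly equivalent.

```python
from typing import Dict, Union


class Person:
    """Hypothetical stand-in with only the fields used below."""

    def __init__(self, given_name: str, surname: str) -> None:
        self.given_name = given_name
        self.surname = surname


class Place:
    """Hypothetical stand-in with only the field used below."""

    def __init__(self, enclosed_by: str) -> None:
        self.enclosed_by = enclosed_by


def derived_fields(obj: Union[Person, Place]) -> Dict[str, str]:
    """Collect derived column values without any cast()."""
    fields: Dict[str, str] = {}
    if isinstance(obj, Person):
        # The checker treats obj as Person inside this branch.
        fields["given_name"] = obj.given_name
        fields["surname"] = obj.surname
    elif isinstance(obj, Place):
        # Likewise narrowed to Place here.
        fields["enclosed_by"] = obj.enclosed_by
    return fields
```

Code added to either branch later inherits the narrowed type, which is the maintenance benefit the comment describes.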
Well, if you are going to work on the ability of proxies to return |
gramps/gen/db/base.py
Outdated
        """
        Remove a person as either the father or mother of a family,
        deleting the family if it becomes empty; trans is compulsory.
        """
        person = self.get_person_from_handle(person_handle)
        family = self.get_family_from_handle(family_handle)
        if person and family:
Please put code changes in a separate PR. Keep this PR for adding type hints only.
Ok. The code changes will need to be applied first. The type hints on their own will not pass mypy; the code changes are the minimal set strictly required for mypy to pass.
As @stevenyoungs mentioned, he is only adding type hints in this PR (and adjusting code just enough to pass mypy tests). If someone reports a crash because some general code is failing because it isn't properly handling proxies that return None, the immediate fix is to add code that checks the return value. A complete rethink of proxies is going to take some time and energy.
Maybe that 'immediate fix' is right, and maybe it isn't. I'm sorry to point out something that I am sure you already know! I'm sure you don't mean only just blindly patch the code where the crash manifests itself. Not handling proxies that return That's why I don't think you should just add an if statement to skip part of the code. That may well turn out to be the necessary and sufficient solution to the problem, but it will probably take some time and energy (and perhaps another initiative) to determine it. I'm really excited about the static typing and especially how it is forcing clarification of the code, as mentioned by several of the comments in this PR. However, if the code was going to crash before, then it should still crash after the type checking has been added. Hence my suggestion of |
The problem is that the type hinted code does not pass type checking unless you modify the code to handle a None return.
f0a828a
to
ca3f71d
Compare
As suggested, I've split the code changes out into a separate PR #1934
I think that allowing A better approach would be to make the proxies return empty objects instead. Then a check like This doesn't actually fix the problem with proxies leaking information, but that could be dealt with at a later date. |
ca3f71d
to
8301b28
Compare
Is there a heuristic that can be used to check if a |
An empty object would have its handle set to
|
Agreed!
Doesn't this just move the crash to a different place? If you return an empty object, then isn't it possible that the next piece of code will try to use the non-existent handle and consequently crash? I suppose the question we might ask is "does this handle point to an accessible object?" before trying to call
If you modify the code to add an assert, will this pass type checking? My theory is that such a modification would not alter the behaviour of the code (unless debug was switched on), so is the minimal change without extensive modification to fix the underlying problem.
@Nick-Hall said:
Maybe (I mentioned that above). An even better approach would be that we would never have a reference handle that pointed to an inaccessible object. @kulath said:
I don't think we want to surround every BUT, we could rethink the proxies to remove all handles to inaccessible objects. Let's consider that before changing every get method. |
Default to returning True so that all objects are included by default
gfilter_none can return None; gfilter does not return None and raises an error
Use `yield from ()` rather than `return` and `return []`
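A brief illustration of why that commit title matters, using hypothetical function names. A function whose body is only `return []` hands back a list, whereas a body of `yield from ()` makes the function a generator function that yields nothing, so callers always receive a generator.

```python
import inspect
from typing import Generator, List


def handles_as_list() -> List[str]:
    """Returns an empty list, not a generator."""
    return []


def handles_as_generator() -> Generator[str, None, None]:
    """Returns a generator that yields no items."""
    # `yield from ()` delegates to an empty tuple, so nothing is produced,
    # but the presence of `yield` makes this a generator function.
    yield from ()
```

Both results are empty when iterated; only the second matches a `Generator[...]` return annotation.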
…ded in the result
…ck if a handle is valid
41a36f8
to
e13ef86
Compare
I think this is ready for review. This PR is to review the final commit only "Add type hints to db methods". It only adds type hints.
…check for handle existence. As a result it can get the object directly rather than convert the raw data into an object
In future, this allows the return type hint of <object> | None for consistency with other db implementations
e13ef86
to
903219d
Compare
cf983a9
to
25f76d1
Compare
25f76d1
to
ec701c8
Compare
Add type hints to the various db classes.
This builds upon, and requires, a minimal set of code changes introduced as PR #1934. (These changes were originally included in this PR)
All type hints are in the final commit, which only contains type hints.