-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update parser to handle hydrates, rational subscripts, complexes, and caged species #205
Conversation
Add failing tests for complexes and fraction subscripts. Reorganize ``test_formula_to_composition`` into several thematic tests. Add a few examples. Signed-off-by: Jeremy A Gray <[email protected]>
Expand regular expression for legibility and add a results name to find the elements after parsing. Signed-off-by: Jeremy A Gray <[email protected]>
Deprecate the use of slashes in charge strings and add tests for enforcement. Remove use of slash in charges in other tests. Signed-off-by: Jeremy A Gray <[email protected]>
Parse square bracket and curly brace groups, without added information. Add tests to check composition and to detect grouping symbol mismatch. Signed-off-by: Jeremy A Gray <[email protected]>
Require complete parsing of formula input string to prevent typographical errors. Add a new `primes` group to match one or more `*` or `'` (attached to the species and before any charge or state symbol) used to indicate excited or otherwise special species. Signed-off-by: Jeremy A Gray <[email protected]>
Parse fractional subscripts, provisionally changing the hydrate separator to `..` from `.` to disambiguate from decimal number subscripts. Add multiple `sympy.nsimplify()` calls in chemistry.py and test_chemistry.py to convert decimals to rationals to integers. BREAKING CHANGE: Change hydrate separator from `.` to `..`. Closes: closes bjodah#176 Signed-off-by: Jeremy A Gray <[email protected]>
Parse species like `Li@C60` indicated one species trapped by another, such as a lithium atom inside a buckminsterfullerene. Closes: closes bjodah#202 Signed-off-by: Jeremy A Gray <[email protected]>
Add more parser tests including a negative test for bad charges (`Na+Cl-`). Signed-off-by: Jeremy A Gray <[email protected]>
Signed-off-by: Jeremy A Gray <[email protected]>
Fix hydrate example in docstring to reflect change from `.` to `..`. Signed-off-by: Jeremy A Gray <[email protected]>
Pass subscripts as integers if at all possible to prevent problems with ``sympy.linsolve()``. Fix some linting issues. Signed-off-by: Jeremy A Gray <[email protected]>
Parse and suppress state symbols to avoid problems parsing formulas with states that are not in provided suffix lists, as revealed in the doctests in chempy/chemistry.py. Add additional chemistry and parsing tests to check parsing and duplicate doctests. Fix some flake8 lint warnings. Signed-off-by: Jeremy A Gray <[email protected]>
Signed-off-by: Jeremy A Gray <[email protected]>
55f7299
to
ac0f5da
Compare
Great, thanks @jeremyagray! I just visually verified that some tests we discussed (
I did a check for a simple case, and it doesn't seem to require all of the statements to raise that error. For example, this test passes:
despite the second division statement not raising ZeroDivisionError. So unless I'm missing some context (I might be!), I think each of those checks might need to be done individually. To continue the good practice of avoiding re-writing the test code, the test can be parametrized, for example like this:
|
No, @bertiewooster you're right about the tests. I was just adding tests for this PR without much thought for organization as I planned to do another PR on some testing issues that needed attention. All the parsing tests and probably some others should be parameterized and I thought to do that and inspect the coverage in another issue/PR. If there are no other problems with this PR, one of us should probably get around to merging it in the not to distant future. |
Hi @jeremyagray is there anything else that needs to be done to push all your hard work across the finish line and justify a new release, or does it just need to be merged? |
No, not that I know of. I just left it to allow (hopefully) someone else
to verify it works before I finally push it to the main branch.
…On Sat, Mar 12, 2022 at 11:16 AM Jeremy Monat ***@***.***> wrote:
Hi @jeremyagray <https://github.com/jeremyagray> is there anything else
that needs to be done to push all your hard work across the finish line and
justify a new release, or does it just need to be merged?
—
Reply to this email directly, view it on GitHub
<#205 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AOQCHS7LJHOEODMH2X33XZLU7TGOJANCNFSM5N4FDHJQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Jeremy A. Gray
|
@jeremyagray Could you parametrize the tests which contain so that each parameter set is tested, that it actually fails, and not just one parameter set? I believe that should be done for the following:
|
I just parametrized all the parsing tests. They probably need some more organization, but this at least gets all the tests running as they should and has the added benefit of boosting the test statistics. The checks are failing because I've got some version and configuration drift between my local black and flake8 and the CI machine's versions. I'll fix that shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I visually reviewed the changes and they seem good, so once the checks pass, I recommend merging this commit. I like the elegance of the tests now that they're parametrized, for example
assert formula_to_composition(species) == composition
Thanks for putting in the extra effort to parametrize all the tests!
Parametrize all parsing tests. Signed-off-by: Jeremy A Gray <[email protected]>
Changes seem fine and checks passed. Unless @jeremyagray has any more changes in the works, recommend merging. |
Some really nice work here, thank you @jeremyagray for getting this done! |
Update parser to handle hydrates (via the
..
operator, not.
), rational subscripts (subscripts as decimals, with.
), coordination complexes (uses grouping symbols()
,[]
, and{}
but does not attach special meaning), and caged species (via the@
operator). Add parsing of some state symbols to the main parser, but suppress them for now so that the phase identification code in chemistry.py does not need changed; this does not affect the existing state symbol handling. Update parsing tests to reflect changes and make minor updates to the rest of the code and other tests to accommodate parser changes. Implement the planned deprecation of/
in charge strings.Breaking change: Breaks the old hydrate operator,
.
. Hydrates must now be indicated with the..
operator, as inBaCl2..2H2O
. The old hydrate operator,.
, is now used for rational subscripts. As warned in the last deprecation cycle,/
is now invalid in charge strings.Closes: closes #202, #176