Batch Edit: Support for editing basic fields #5417

Open · wants to merge 93 commits into base: production

Conversation

@sharadsw (Contributor) commented Nov 26, 2024

Fixes #5413
Fixes #6209
Uses code from #4929

⚠️ Note: This PR affects database migrations. See migration testing instructions.

This PR contains largely the same code as #4929, updated against production. To minimize the testing scope, batch edit for relationships is disabled in this PR and will be enabled in the follow-up PR (batch edit for relationships). There are still some merge conflicts, caused by newer code in production, which will be resolved in the next PR.

Checklist

  • Self-review the PR after opening it to make sure the changes look good
    and self-explanatory (or properly documented)
  • Add automated tests
  • Add relevant issue to release milestone

Testing instructions

  • Create a query
  • Click Batch Edit. This will create a new Batch Edit dataset with columns involving relationships disabled (read-only)
  • Change values you wish to edit
  • Click Validate
  • Verify changed values are highlighted as updated cells
  • Click Commit. This will apply your changes
  • Re-run the original query and verify the values changed as expected
  • Click rollback
  • Verify values were rolled back to original

Regression tests

@sharadsw changed the title from "Batch Edit: Support for editing basic fields" to "[WIP] Batch Edit: Support for editing basic fields" on Nov 26, 2024
@sharadsw (Contributor, Author) commented Dec 2, 2024

POTENTIAL IMPROVEMENTS?

  • Data Mapper in Batch Edit?
    • Should the Data Mapper be accessible in Batch Edit datasets? Since Batch Edit datasets can only be created from a query, it seems confusing that users can still access the Data Mapper from a Batch Edit dataset when most of the mapper's functionality is disabled. The mapper seems to be needed only for changing Batch Edit preferences; the preferences dialog could be moved to the Editor page instead.

@realVinayak (Contributor) commented Dec 2, 2024

Nice observation. My fault for keeping a hidden feature secret: another use for the data mapper is to control "ignore when blank"-type behavior on a per-field basis. That's actually supported in that PR (even for the batch edit data mapper, you can modify this per-field setting; ask Grant for more context).

The only things not read-only are:

  1. Batch edit prefs
  2. Field behavior (never ignore, …)
  3. must match (also, could you leave a note somewhere to make trees must match by default?)

Search for usages of "readonlyspec".

For the purposes of this PR, you can theoretically skip adding the data mapper (you also don't need batch edit prefs; they don't do anything when there are no relationships).

@sharadsw (Contributor, Author) commented Feb 7, 2025

@emenslin I cannot reproduce the interactions dialog and app resource errors locally, either on the same db or on other dbs. Can you try making a new copy of the original db for testing?

The agent title error should now be fixed, and there should be a validation error for records whose titles don't belong to the picklist. However, there might be cases where an agent title record has the same title as a picklist entry but a different value. In such cases there won't be a validation error, but the record will be updated to the picklist entry's value despite the user making no changes.
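
A minimal sketch of that edge case, with hypothetical names and data (this is not Specify's validation code): a cell whose text matches an item's title passes validation, but committing writes the item's value, so the stored field changes even though the user edited nothing.

    # Hypothetical illustration -- not Specify's code.
    picklist = [{"title": "Curator", "value": "cur"}]

    def resolve(cell_text):
        for item in picklist:
            if cell_text in (item["title"], item["value"]):
                return item["value"]  # accepted: no validation error raised
        raise ValueError(f"{cell_text!r} is not a picklist entry")

    print(resolve("Curator"))  # -> "cur": the record is updated to the
                               #    picklist value despite no user change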

@combs-a (Collaborator) left a comment

  • Verify changed values are highlighted as updated cells
  • Verify values were rolled back to original

Didn't get to the regression test since I was testing other fields.

I wasn't able to recreate this issue a second time for some reason, but I had a case in which batch edit set a preparation count above the amount available.

i.e.: I used a prep record which had a count of 3 with only 2 available preps. A gift preparation record was created, and the quantity of this gift preparation was 2. Opening this in batch edit, I was able to set the quantity on the gift preparation to 5 and save with no warning.

Checking the previous records, the quantity on the gift prep was 5, and the original preparation record now had a count of 5. Rolling back the batch edit did not fix the preparation record.

Here's the audit log for that. Though the old and new values are not displayed, I know for sure that the count was 3 before the batch edit, and there shouldn't have been two edits; just one from adding gift preps.

[screenshot: audit log entries for the preparation record]

Link to the prep record: https://sdnhmherps2824batch2-issue-5413.test.specifysystems.org/specify/view/preparation/80361/

@emenslin (Collaborator) commented Feb 7, 2025

Relating to the latlong issue: I know you said you already wrote up an issue for it, but is there any way it could be fixed in this PR, or at least in the same release as batch edit? I know no values were actually changed, but I feel like this could be very confusing and concerning to users. Although I used a simple example in the video, this still happens if you make just one change to the dataset; users would have no real way of knowing whether they made a change or whether it just looked like they did.

@sharadsw (Contributor, Author) commented Feb 7, 2025

@emenslin Yeah, we can do it in the same release/PR. My guess is that the problem exists in other places as well (even for non-decimal values) and we need a more nuanced way to compare old and new values. Will add it to this PR if it's a small fix.

@realVinayak (Contributor) commented

we need a more nuanced way to compare old and new values.

Correct, but we already have one (for the API)! See:

def fld_change_info(obj, field, val) -> Optional[FieldChangeInfo]:
    if field.name != 'timestampmodified':
        value = prepare_value(field, val)
        if isinstance(field, FloatField) or isinstance(field, DecimalField):
            value = None if value is None else float(value)
        old_value = getattr(obj, field.name)
        if str(old_value) != str(value):  # ugh
            return FieldChangeInfo(field_name=field.name, old_value=old_value, new_value=value)
    return None

Should just be a mechanical change to use it here. Also, the

    if isinstance(field, DateTimeField) and isinstance(val, str):

branch is irrelevant to us (trace out why).

EDIT: I just saw you pushed 4814f11. Would still recommend the change from api.py (to make things consistent)
/////////////////

There is actually a separate bug here in general. Previously, Elizabeth mentioned this error in the #4929 review: #4929 (review). That is, some localities were flagged as new. This happens because, when parsing latitude and longitude fields, we also artificially set the lat1text and long1text fields. That is, you'd observe that if we do a workbench upload and set only the latitude1 and longitude1 fields, lat1text and long1text will also be set.

Relevant code is here:

def parse_latlong(field: Field, value: str) -> ParseResult:
    parsed = parse_coord(value)
    if parsed is None:
        return ParseFailure('coordinateBadFormat', {'value': value})
    coord, unit = parsed
    if field.name.startswith('lat') and abs(coord) > 90:
        return ParseFailure("latitudeOutOfRange", {'value': value})
    if field.name.startswith('long') and abs(coord) > 180:
        return ParseFailure('longitudeOutOfRange', {'value': value})
    return ParseSucess({field.name.lower(): coord,
                        'originallatlongunit': unit,
                        field.name.lower().replace('itude', '') + 'text': parse_string(value)})

Generically, parsing one field can result in multiple new fields. This is why ParseResult (which contains the result for just one field) has its upload attribute as a dict. This is also why we have a nested loop when collecting the attributes that belong to an uploadTable.

Relevant code:

attrs = {
    fieldname_: value
    for parsedField in self.parsedFields
    for fieldname_, value in parsedField.upload.items()
}
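
A worked example of that merge, with illustrative values (the unit code is made up): one cell mapped to latitude1 expands into three columns, and the comprehension flattens every parsed field's upload dict into a single attrs dict.

    # Illustrative values only: one parsed cell can span several columns.
    parsed_uploads = [
        {"latitude1": 34.07, "originallatlongunit": 0, "lat1text": "34.07"},
        {"localityname": "San Diego"},
    ]
    attrs = {k: v for upload in parsed_uploads for k, v in upload.items()}
    # attrs == {"latitude1": 34.07, "originallatlongunit": 0,
    #           "lat1text": "34.07", "localityname": "San Diego"}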

Using the above, you might think: "Oh, if we add extra lat1text and long1text, doesn't that mean we would fail to match existing localities which have those fields set to null?" That is, say {"latitude1": 34.07} is parsed as {"latitude1": 34.07, "lat1text": "34.07"}; then during matching, we would fail to match this to a locality with {"latitude1": 34.07, "lat1text": None}. And you'd be correct -- that's a bug. However, things weren't always this way. The above faulty behavior was introduced in #4548. Before that, the extra lat1text (and long1text) were just added to the filter part. Relevant ancient code:

def parse_latlong(field, value: str, column: str) -> Union[ParseResult, ParseFailure]:
    parsed = parse_coord(value)
    if parsed is None:
        return ParseFailure('coordinateBadFormat', {'value': value}, column)
    coord, unit = parsed
    if field.name.startswith('lat') and abs(coord) >= 90:
        return ParseFailure('latitudeOutOfRange', {'value': value}, column)
    if field.name.startswith('long') and abs(coord) >= 180:
        return ParseFailure('longitudeOutOfRange', {'value': value}, column)
    text_filter = {field.name.replace('itude', '') + 'text': parse_string(value)}
    return ParseResult(
        text_filter,
        {field.name: coord, 'originallatlongunit': unit, **text_filter},
        None,
        column,
        None
    )

I'd encourage you to trace that code out from v7.9.0 (and see how the text field is just added to the filter part) --- this is also the reason why we have separate filter and upload attributes on ParseResult.

All of the above is a precursor to the bug (in batch edit):

When you have latitude1 and/or longitude1, the relevant text fields will also be edited -- but the user would not know that they were (since we don't report that). I'm not sure about the best way to fix this. Can this be discussed internally to determine whether it needs to be fixed at all? I'm neutral on it:

  1. It changes data without user's permission (bad).
  2. It fixes possible incorrect data (good).

////////////////////////

Now, for the bug in #4929 (review).

You can probably see what the source of Elizabeth's bug was. IIRC, it was definitely because lat1text and long1text were empty (but we were matching with artificially added values -- which were not null). Can this be tested again when batch edit supports relationships?

So, in conclusion, three bugs are related to locality.

  1. Decimal and value equality check. Can be fixed with that code from api.py.
  2. Extra fields are edited, without the user knowing.
  3. Extra fields are added when matching (partially caused by #4548, Locality and Geocoorddetail batch import tool), which causes some localities to not be matched.

@sharadsw (Contributor, Author) commented

@combs-a I'm not able to recreate it. I run into a validation error if I try to set the preparation quantity greater than the count or the availability:
[screenshot, 2025-02-13: validation error when preparation quantity exceeds availability]

What I did was:

  • Create a CO with preparation count 5
  • Create 2 Gift records
    • One with preparation qty 3
    • One with preparation qty 2
  • Tried to update preparation qty > 5 in Batch edit

@sharadsw (Contributor, Author) commented

@realVinayak Doesn't look like fld_change_info can be used here without refactoring it to do something similar to is_equal from 4814f11. I'm happy to keep is_equal for now and note this in #6170.

def fld_change_info(obj, field, val) -> Optional[FieldChangeInfo]:
    if field.name != 'timestampmodified':
        value = prepare_value(field, val)
        if isinstance(field, FloatField) or isinstance(field, DecimalField):
            value = None if value is None else float(value)
        old_value = getattr(obj, field.name)
        if str(old_value) != str(value):  # ugh
            return FieldChangeInfo(field_name=field.name, old_value=old_value, new_value=value)
    return None

  • The string comparison is unreliable when comparing an object with a primitive. Not sure how we can handle that while keeping it generic.
    • We get a scope-change error when comparing discipline/scope values, since the comparison becomes Discipline object (3) == 3 => false
  • For Decimal values, the old value remains a Decimal while the new value is a float, so it still gets detected as a change for lat/long (see the sketch below)
    • i.e. str(Decimal("33.1493800000")) == str(33.14938) => "33.1493800000" == "33.14938" => false
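
For reference, a minimal sketch of the shape such a comparison could take -- an assumption on my part, not the PR's actual is_equal: numeric types are compared numerically, everything else falls back to strings.

    from decimal import Decimal

    def is_equal_sketch(old, new):
        # None should only equal None (str() would make None == "None").
        if old is None or new is None:
            return old is new
        numeric = (int, float, Decimal)
        if isinstance(old, numeric) and isinstance(new, numeric):
            # Compare numerically: Decimal("33.1493800000") == 33.14938
            # even though their str() forms differ.
            return Decimal(str(old)) == Decimal(str(new))
        return str(old) == str(new)

    assert is_equal_sketch(Decimal("33.1493800000"), 33.14938)
    assert not is_equal_sketch(None, 0)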

Made a note of the lat/long text issues in #6126, thanks!

@sharadsw (Contributor, Author) commented

@specify/testing The lat/long decimal issue should now be fixed. The lat/long text update and matching issues are added to #6126 and will be tested when we enable relationships.

@sharadsw requested a review from combs-a on February 13, 2025
@realVinayak (Contributor) commented Feb 13, 2025

Ah yes, you're right -- it's more involved than I thought; sorry for wasting your time. I assumed that the previous code was bug-free, but it is not (I tested this on sp7demofish, and latitudes are always considered changed when saving a locality form -- that's another bug, unrelated to batch edit). I agree; just keep what you have in that case.

Probably also need to document the lat/long text update (plus the other fields that get updated; refer to the code for them). Maybe the Discourse documentation is enough, or there could be a warning when you use Locality as the base table.

Just for reference (no changes needed):

RE: We get a scope change error when comparing discipline/scope values since the comparison becomes Discipline object (3) == 3

You can replace field.name with field.attname, and that'll fix this. attname will convert discipline to discipline_id (whose value will be 3).
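
For reference, a short standalone sketch of the name/attname distinction in Django (model names are illustrative, not Specify's actual classes):

    import django
    from django.conf import settings

    settings.configure()  # minimal config just so models can be declared
    django.setup()

    from django.db import models

    class Discipline(models.Model):
        class Meta:
            app_label = "sketch"

    class Collection(models.Model):
        discipline = models.ForeignKey(Discipline, on_delete=models.CASCADE)

        class Meta:
            app_label = "sketch"

    f = Collection._meta.get_field("discipline")
    assert f.name == "discipline"        # getattr(obj, "discipline") -> Discipline instance
    assert f.attname == "discipline_id"  # getattr(obj, "discipline_id") -> raw id, e.g. 3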

@combs-a (Collaborator) left a comment

@sharadsw I've tried to recreate my issue again a few times and wasn't able to either -- it should be kept in mind, but it might've just been something very specific.

  • Verify changed values are highlighted as updated cells
  • Re-run the original query and verify the values changed as expected
  • Verify values were rolled back to original

Regression tests

LatLon has been mostly fixed, but there's also a potential issue with integers: ints are being converted to floats, which marks them as updated cells when they shouldn't be.

[screenshots: integer values shown as updated cells]

Also, I tried out COs that use COTs with the AlphaNumByYear type, and batch edit flagged the correct format as invalid and incorrect formats as valid. Is fixing this within the scope of this PR? Wanted to check since the same thing works fine in the Workbench.

The incorrect format is AAAAAA, which was the default catno format before I changed it for FunnyTurtles.

[screenshot: catalog number format flagged incorrectly]

@sharadsw (Contributor, Author) commented Feb 18, 2025

@combs-a Catalog number validation is broken because relationships are disabled, so the value gets validated with the collection's default formatter. I created a new issue for it (#6248), but it will be fixed when relationships are enabled in #6126.

About the lat/long issue, I'm not sure how we can reliably handle all the cases there. The underlying cause of that row being marked as updated is the issue Vinny mentioned, where lat1text and long1text get updated internally.
Specifically, here lat1text and long1text get updated from 12 to 12.0, but latitude1 and longitude1 are not detected as changed since 12.0 == 12.

We could potentially parse the values the way the frontend does, so that any integer values pretending to be floats get converted to integers (i.e. 12.0 -> 12) in the dataset, but that leads to another issue: a value actually saved as 12.0 would also get converted to 12, causing a change to be detected in batch edit. See the sketch below.
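
A tiny illustration of why that normalization is lossy (plain Python, just showing the round-trip):

    # Parsing "12" as a coordinate yields a float, so the derived text
    # becomes "12.0": the text column changes even though the number didn't.
    text_in = "12"
    coord = float(text_in)   # 12.0
    text_out = str(coord)    # "12.0" != "12" -> row flagged as updated

    # Collapsing integral floats back to ints "fixes" that case...
    normalized = str(int(coord)) if coord == int(coord) else str(coord)
    assert normalized == "12"

    # ...but a value genuinely saved as "12.0" would also collapse to "12",
    # and batch edit would then detect a change the user never made.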

Another potential solution could be to add comments on cells in the frontend when lat/long text values update, indicating the reason the row is shown as updated -- similar to how we have comments on validation-error cells. Unfortunately, we don't have a way to do that right now, but it could be considered in the future.

Either way, I have created a separate issue for the underlying cause, which will have to be looked at in a later milestone: #6251

@sharadsw requested a review from combs-a on February 18, 2025
@combs-a (Collaborator) left a comment

  • Verify changed values are highlighted as updated cells
  • Re-run the original query and verify the values changed as expected
  • Verify values were rolled back to original

Regression tests


All tests passed, good work!

@emenslin (Collaborator) left a comment

I didn't run into any new issues, but since this is such a big PR I would like it if we could get another testing review just to make sure nothing was missed.

@emenslin requested a review from a team on February 21, 2025
@lexiclevenger (Collaborator) left a comment

  • Verify changed values are highlighted as updated cells
  • Re-run the original query and verify the values changed as expected
  • Verify values were rolled back to original

Regression tests


Looks good, nice job!

Labels: Migration (PRs that contain migration)
Projects: Status: Dev Attention Needed