Ghostwriter equivalence tests for functions that mutate their arguments #4113

Open · sternj opened this issue Sep 22, 2024 · 2 comments
Labels: enhancement (it's not broken, but we want it to be better)

sternj commented Sep 22, 2024

Equivalence test generation at the moment doesn't account for side effects on the parameters.

For instance, consider the following example.py:

from typing import List

def first_function(a: List[int]) -> List[int]:
    for i in range(len(a)):
        a[i] = a[i] + 1
    return a

def second_function(a: List[int]) -> List[int]:
    for i in range(len(a)):
        a[i] = a[i] * 2
    return a

Running hypothesis write --style pytest --equivalent example.first_function example.second_function gives me:

import example
import typing
from hypothesis import given, strategies as st


@given(a=st.lists(st.integers()))
def test_equivalent_first_function_second_function(a: typing.List[int]) -> None:
    result_first_function = example.first_function(a=a)
    result_second_function = example.second_function(a=a)
    assert result_first_function == result_second_function, (
        result_first_function,
        result_second_function,
    )

When I run this test, it passes, even though the two functions are clearly different. This is because the same List instance, a, is passed into both functions and mutated in place; both functions then return that same object, so the assertion compares a list against itself and is trivially true.
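To make the aliasing concrete (a quick illustration, not part of the generated test; the variable names here are mine):

import example

a = [1, 2]
result_first = example.first_function(a=a)    # a is mutated in place to [2, 3]
result_second = example.second_function(a=a)  # a is mutated again, to [4, 6]
assert result_first is result_second  # both names refer to the very same list
assert result_first == [4, 6]         # so the equality assertion is vacuous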

I've been thinking about this problem for a while, and I don't think there's necessarily a clean solution, but people will want to write property-based tests for functions with side effects often enough that it's worth addressing. I know these tests are only meant to be a skeleton of what good tests could look like, but I think the frequency of this sort of situation warrants some automation.

Some thoughts on how to deal with this:

First, I think it's reasonable to assume that anything one would need to compare is a return value of the function. This is basically (I think) what Ghostwriter already assumes when writing the tests.

Second, I'd lean towards assuming that no two parameters to a given call reference the exact same underlying object. I can see this happening in a pretty contrived case, but I can't come up with a compelling reason why someone would actually do that.

The most obvious answer I can come up with is to generate a call to copy.deepcopy for each parameter, so that each compared function receives its own copy. (I know you can compare more than two functions; this approach works just as well repeated n times.)
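For example (hypothetical output, to sketch the idea rather than pin down exactly what Ghostwriter should emit; this variant simply deep-copies the argument for every call), the generated test above would become:

import copy
import example
import typing
from hypothesis import given, strategies as st


@given(a=st.lists(st.integers()))
def test_equivalent_first_function_second_function(a: typing.List[int]) -> None:
    result_first_function = example.first_function(a=copy.deepcopy(a))
    result_second_function = example.second_function(a=copy.deepcopy(a))
    assert result_first_function == result_second_function, (
        result_first_function,
        result_second_function,
    )

With the copies in place, this test fails on the two functions above for e.g. a=[0] (first returns [1], second returns [0]), as it should.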

Any thoughts on this? I'm definitely inclined to try and write up a proof of concept, but if I'm missing something obvious I'd definitely like to know!

@Zac-HD Zac-HD added the enhancement it's not broken, but we want it to be better label Sep 23, 2024
@Zac-HD Zac-HD changed the title Aliasing in Python Ghostwriter equivalence tests Mutable arguments in Python Ghostwriter equivalence tests Sep 23, 2024
@Zac-HD Zac-HD changed the title Mutable arguments in Python Ghostwriter equivalence tests Ghostwriter equivalence tests for functions that mutate their arguments Sep 23, 2024
Zac-HD (Member) commented Sep 23, 2024

This was a deliberate design decision, because I think the cost of writing less elegant code (e.g. inserting deepcopy(...)) by default is larger than the benefit of supporting the relatively rare and discouraged functions which mutate their arguments yet still make sense to compare by return value.

That said, I'd still be interested in adding an argument called e.g. copy_args: bool = False to the ghostwriter.equivalent() function.

Instead of exposing this through the CLI, we can try some heuristics for when to turn it on - e.g. do the usual ast.parse(inspect.getsource(fn)) trick and then look for argname.method(...) calls which are not part of an assignment or call, or argname[...] = .... We can probably get the false-alarm rate low enough to default copy_args=None and include this logic inside the function rather than in the CLI...
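A rough sketch of what that heuristic could look like (mutates_args is a hypothetical helper, not an existing Hypothesis internal, and this is deliberately conservative and best-effort):

import ast
import inspect
import textwrap


def mutates_args(fn):
    # Best-effort: return the set of parameter names that fn appears to mutate.
    tree = ast.parse(textwrap.dedent(inspect.getsource(fn)))
    func = tree.body[0]
    args = func.args
    params = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
    mutated = set()
    for node in ast.walk(func):
        # argname[...] = ... or argname.attr = ..., plus augmented assignments
        if isinstance(node, (ast.Assign, ast.AugAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            for t in targets:
                if (
                    isinstance(t, (ast.Subscript, ast.Attribute))
                    and isinstance(t.value, ast.Name)
                    and t.value.id in params
                ):
                    mutated.add(t.value.id)
        # bare argname.method(...) statements whose result is discarded
        elif isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            f = node.value.func
            if (
                isinstance(f, ast.Attribute)
                and isinstance(f.value, ast.Name)
                and f.value.id in params
            ):
                mutated.add(f.value.id)
    return mutated

On the functions above, mutates_args(example.first_function) returns {'a'}, which could flip copy_args on for that test.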

If you're interested in making a PR, that would be awesome! Err on the side of posting unfinished work and asking for help and feedback when that'd be useful; I and other maintainers are happy to help (to the extent that our free time lines up at least) 🙂

sternj (Author) commented Sep 23, 2024

This seems like a principled way of doing things. While comparing functions that re-return one of their parameters is rare, I'd guess that a parameter being mutated inside a function is more common than you'd expect. Consider, for instance, a version of one of those functions that does something like:

from typing import List

def first_function(buf: List[int], n: int) -> int:
    accum = 0
    for i in range(len(buf)):
        accum += n
        buf[i] = accum
    return sum(buf)

It is admittedly a tad contrived, but I've seen patterns like this before. Even setting aside cases like the original example, where the return values are the mutated objects themselves, I suspect that mutation is a more widespread practice than we'd hope.

I think looking for a parameter on the LHS of an ast.Assign, or anywhere in an ast.Call, is the best heuristic (perhaps restricting it to an ast.Assign whose LHS is an ast.Subscript or an ast.Attribute, and also including ast.AugAssign and all of the other fun stuff).
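For reference, the LHS node for subscript mutation is an ast.Subscript; on Python 3.9+ (where ast.Index is gone from the grammar) ast.dump shows it directly:

import ast

target = ast.parse("a[i] = a[i] + 1").body[0].targets[0]
print(ast.dump(target))
# Subscript(value=Name(id='a', ctx=Load()), slice=Name(id='i', ctx=Load()), ctx=Store())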

I'll open a PR shortly, when I have the chance to write it out!
