Add support for np.random.Generator #6566

Open
wants to merge 11 commits into
base: main

NoureldinYosri
Collaborator

@NoureldinYosri NoureldinYosri commented Apr 22, 2024

This PR addresses #6531

I don't modify support for the old RANDOM_STATE_OR_SEED_LIKE; instead, I create a new PRNG_OR_SEED_LIKE type.

The reasons for that are:

  • RANDOM_STATE_OR_SEED_LIKE implies that the result will be a RandomState. The code enforces this by forcing a cast to RandomState in the parse_random_state function:
    return cast(np.random.RandomState, random_state)
  • parse_random_state defaults to np.random when the seed is None. This is problematic when using multiprocessing/multithreading, since the internal state of the np.random module becomes shared state that makes the results of experiments correlated. It also hurts performance, since the threads/processes will be blocked on write operations to the internal state of the module:
    return cast(np.random.RandomState, np.random)
  • Using np.random is also problematic in other ways, as demonstrated in #3717, "Can't use cirq.Simulator() in a multiprocessing closure (unable to pickle)".
  • The RANDOM_STATE_OR_SEED_LIKE type is just an alias of Any, which makes it impossible to use @overload to specify the return type.
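
A minimal sketch of the shared-state and pickling concerns listed above, using only NumPy and the standard library. The helper names `draw_from_module` and `draw_from_generator` are hypothetical and are not part of Cirq or of this PR.

```python
import pickle

import numpy as np


def draw_from_module() -> float:
    """Draws from the hidden module-level state of np.random."""
    return float(np.random.random())


def draw_from_generator(rng: np.random.Generator) -> float:
    """Draws from a generator that the caller owns explicitly."""
    return float(rng.random())


# Every user of the np.random module shares one hidden state, so reseeding
# it anywhere in the process changes what all other callers will see.
np.random.seed(0)
first = draw_from_module()
np.random.seed(0)
assert draw_from_module() == first

# Separately constructed Generator objects carry their own state.
rng_a = np.random.default_rng(1)
rng_b = np.random.default_rng(1)
assert draw_from_generator(rng_a) == draw_from_generator(rng_b)

# A module object cannot be pickled, while a Generator can.
try:
    pickle.dumps(np.random)
    module_pickles = True
except (TypeError, pickle.PicklingError):
    module_pickles = False

restored = pickle.loads(pickle.dumps(np.random.default_rng(2)))
```

The pickling difference matters for multiprocessing, which sends arguments to worker processes by pickling them.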

The solution

  • The new PRNG_OR_SEED_LIKE is a union of None, int, np.random.RandomState, np.random.Generator, and _CUSTOM_PRNG_T. The first two are seed types that always yield an np.random.Generator; the rest are objects that are returned as is. Using @overload, we specify the correct return type.
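
A rough sketch of how @overload can tie the return type to the argument type, as described above. This is an illustration under assumptions, not the code in this PR: the name `parse_prng_sketch` is hypothetical, and the `_CUSTOM_PRNG_T` member of the union is left out.

```python
from typing import Optional, Union, overload

import numpy as np

# Hypothetical stand-in for the union described above (custom PRNG type omitted).
PRNG_OR_SEED_LIKE = Union[None, int, np.random.RandomState, np.random.Generator]


@overload
def parse_prng_sketch(prng_or_seed: Optional[int]) -> np.random.Generator: ...
@overload
def parse_prng_sketch(prng_or_seed: np.random.Generator) -> np.random.Generator: ...
@overload
def parse_prng_sketch(prng_or_seed: np.random.RandomState) -> np.random.RandomState: ...


def parse_prng_sketch(
    prng_or_seed: PRNG_OR_SEED_LIKE,
) -> Union[np.random.Generator, np.random.RandomState]:
    # Seeds (None or int) always become a new Generator.
    if prng_or_seed is None or isinstance(prng_or_seed, int):
        return np.random.default_rng(prng_or_seed)
    # Existing PRNG objects are returned unchanged.
    return prng_or_seed
```

With the overloads in place, a type checker can infer `np.random.Generator` for `parse_prng_sketch(42)` and `np.random.RandomState` when a RandomState is passed, which is not possible while the parameter type is an alias of Any.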

Deprecating RANDOM_STATE_OR_SEED_LIKE is not possible at the moment because of how widely it is used. Instead, we should start to prefer PRNG_OR_SEED_LIKE, and maybe deprecate RANDOM_STATE_OR_SEED_LIKE in Cirq 2.0 and remove it by Cirq 3.0.

codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.82%. Comparing base (5d22a53) to head (770e8fe).
Report is 75 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6566   +/-   ##
=======================================
  Coverage   97.82%   97.82%           
=======================================
  Files        1066     1068    +2     
  Lines       91864    91905   +41     
=======================================
+ Hits        89862    89903   +41     
  Misses       2002     2002           


@NoureldinYosri NoureldinYosri marked this pull request as ready for review April 22, 2024 19:21
@NoureldinYosri NoureldinYosri requested review from vtomole, cduck and a team as code owners April 22, 2024 19:21
@pavoljuhas
Collaborator

NoureldinYosri requested a review from pavoljuhas ...

sorry about the delay, I will do the review this afternoon.

Collaborator

@pavoljuhas pavoljuhas left a comment

Instead of introducing a new type, we should consider properly defining and extending RANDOM_STATE_OR_SEED_LIKE and providing parsers that express it as either a RandomState or a Generator as needed. Eventually we may transition completely to Generators and deprecate/remove the parser to RandomState.

cirq-core/cirq/value/prng.py (outdated)
Comment on lines 63 to 65
def parse_prng(
    prng_or_seed: PRNG_OR_SEED_LIKE,
) -> Union[np.random.Generator, np.random.RandomState, _CUSTOM_PRNG_T]:
Collaborator

A return type that is a union of different types tends to be a code smell. Such a returned value is less useful for type checking. In addition, Generator and RandomState (not to mention _CUSTOM_PRNG_T) have different APIs, so the parse_prng caller would still need some isinstance check to ascertain the actual type and figure out which methods can be called.

I would propose an alternative approach:

(1) convert the RANDOM_STATE_OR_SEED_LIKE type from Any to the Union of numpy types that can be converted to RandomState and the np.random.Generator type. Hopefully this can be done without too much hassle with typechecks, because the current Any type skips them completely.

(2) extend parse_random_state to accept a Generator object and convert it to RandomState.
Generator-s have bit_generator attribute that can be used to create RandomState.

(3) add method parse_random_generator to the cirq.value.random_state module which would take RANDOM_STATE_OR_SEED_LIKE argument and convert it to a Generator object.
np.random.RandomState() has a _bit_generator attribute that can be used for creating a Generator.
If in some configurations the _bit_generator is not present, we can just use RandomState.randint to get a seed for the np.random.default_rng()

With these steps in place, we can keep all the existing interfaces that take RANDOM_STATE_OR_SEED_LIKE and just start replacing its interpretation from parse_random_state to parse_random_generator as needed.

This would also avoid bifurcation between RANDOM_STATE_OR_SEED_LIKE and PRNG_OR_SEED_LIKE types that may need several major releases to clear up.
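
Steps (2) and (3) above could look roughly like the following sketch. The function names `to_random_state` and `to_generator` are hypothetical, and the code relies on `RandomState._bit_generator`, a private NumPy attribute, so the fallback path is kept for configurations where it is missing.

```python
import numpy as np


def to_random_state(gen: np.random.Generator) -> np.random.RandomState:
    """Wraps the bit generator of a Generator in a legacy RandomState."""
    # RandomState accepts a BitGenerator in place of a seed, so both
    # objects then draw from the same underlying stream.
    return np.random.RandomState(gen.bit_generator)


def to_generator(rs: np.random.RandomState) -> np.random.Generator:
    """Wraps the bit generator of a RandomState in a Generator."""
    bit_generator = getattr(rs, "_bit_generator", None)  # private attribute
    if bit_generator is not None:
        return np.random.Generator(bit_generator)
    # Fallback: draw an integer from the RandomState and use it as a seed.
    return np.random.default_rng(rs.randint(2**31))


gen = np.random.default_rng(42)
legacy_view = to_random_state(gen)

legacy = np.random.RandomState(42)
modern_view = to_generator(legacy)
```

Because the wrapper and the original share one bit generator, drawing from either advances both, which is different from seeding a fresh generator from a single `randint` draw.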

Collaborator Author

ptal

cirq-core/cirq/value/prng.py (outdated)
cirq-core/cirq/value/prng_test.py (outdated)
@CirqBot CirqBot added the size: M 50< lines changed <250 label May 5, 2024
@NoureldinYosri
Collaborator Author

@pavoljuhas I updated parse_random_state to accept a Generator and turn it into a RandomState, and parse_prng to always return a Generator.

However, I think that we do need the new type PRNG_OR_SEED_LIKE.

Collaborator

@pavoljuhas pavoljuhas left a comment

If at all feasible we should avoid Any in the PRNG_OR_SEED_LIKE type and discourage the np.random module use as a parse_prng argument.

Otherwise LGTM.

Comment on lines +29 to +31
If is an integer or None, turns into a `np.random.Generator` seeded with that value.
If is an instance of `np.random.Generator` or a subclass of it, return as is.
If is an instance of `np.random.RandomState` or has a `randint` method, returns
Collaborator

Suggested change
- If is an integer or None, turns into a `np.random.Generator` seeded with that value.
- If is an instance of `np.random.Generator` or a subclass of it, return as is.
- If is an instance of `np.random.RandomState` or has a `randint` method, returns
+ If an integer or None, turns into a `np.random.Generator` seeded with that value.
+ If an instance of `np.random.Generator` or a subclass of it, return as is.
+ If an instance of `np.random.RandomState` or has a `randint` method, returns



def parse_prng(
    prng_or_seed: Union[PRNG_OR_SEED_LIKE, RANDOM_STATE_OR_SEED_LIKE]
Collaborator

@pavoljuhas pavoljuhas May 10, 2024

Can we just have only the PRNG_OR_SEED_LIKE type?

RANDOM_STATE_OR_SEED_LIKE is Any so it turns off type checking of the argument.

Comment on lines +59 to +60
if prng_or_seed is None or isinstance(prng_or_seed, numbers.Integral):
    return np.random.default_rng(prng_or_seed if prng_or_seed is None else int(prng_or_seed))
Collaborator

Can we return a singleton Generator object for None?
The None arg is going to be frequently used as a default for optional arguments.
Singleton would prevent creation of potentially large number of Generator objects.
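
One possible shape for the singleton suggested above, as a sketch only. The names `_DEFAULT_GENERATOR`, `default_generator`, and `parse_seed_sketch` are hypothetical and do not come from this PR.

```python
from typing import Optional

import numpy as np

# Lazily created, process-wide generator shared by all None-seeded callers.
_DEFAULT_GENERATOR: Optional[np.random.Generator] = None


def default_generator() -> np.random.Generator:
    """Returns the shared default generator, creating it on first use."""
    global _DEFAULT_GENERATOR
    if _DEFAULT_GENERATOR is None:
        _DEFAULT_GENERATOR = np.random.default_rng()
    return _DEFAULT_GENERATOR


def parse_seed_sketch(seed: Optional[int]) -> np.random.Generator:
    """Maps None to the shared generator and an int to a fresh generator."""
    if seed is None:
        return default_generator()
    return np.random.default_rng(int(seed))
```

A shared singleton is again process-wide state, so callers that need independent streams would still pass an explicit seed or their own Generator.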

if prng_or_seed is None or isinstance(prng_or_seed, numbers.Integral):
    return np.random.default_rng(prng_or_seed if prng_or_seed is None else int(prng_or_seed))
if isinstance(prng_or_seed, np.random.RandomState):
    return np.random.default_rng(prng_or_seed.randint(2**31))
Collaborator

We can reuse the bit generator for a more genuine conversion.

Suggested change
- return np.random.default_rng(prng_or_seed.randint(2**31))
+ return np.random.default_rng(prng_or_seed._bit_generator)

randint = getattr(prng_or_seed, "randint", None)
if randint is not None:
    return np.random.default_rng(randint(2**31))
raise TypeError(f"{prng_or_seed} can't be converted to a pseudorandom number generator")
Collaborator

Nit: maybe state the actual class here?

Suggested change
- raise TypeError(f"{prng_or_seed} can't be converted to a pseudorandom number generator")
+ raise TypeError(f"{prng_or_seed} cannot be converted to the numpy.random.Generator")

Comment on lines +23 to +24
def _sample(prng):
    return tuple(prng.random(10))
Collaborator

I don't think we need this. One output from random() is enough to check if 2 generators are at the same seed.
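
A small illustration of this point, using plain NumPy with arbitrarily chosen seeds: generators built from the same seed agree on their very first draw, while generators built from different seeds do not.

```python
import numpy as np

a = np.random.default_rng(42)
b = np.random.default_rng(42)
c = np.random.default_rng(43)

# A single draw is enough to tell whether two generators started
# from the same seed.
same_seed_match = a.random() == b.random()
different_seed_match = np.random.default_rng(42).random() == c.random()
```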

Comment on lines +30 to +33
# An `np.random.Generator` or a seed.
group_inputs: List[Union[int, np.random.Generator]] = [42, np.random.default_rng(42)]
group: List[np.random.Generator] = [cirq.value.parse_prng(s) for s in group_inputs]
eq.add_equality_group(*[_sample(g) for g in group])
Collaborator

Let us not check cross-group inequality. Following the test_parse_random_state style is a bit more readable

Suggested change
- # An `np.random.Generator` or a seed.
- group_inputs: List[Union[int, np.random.Generator]] = [42, np.random.default_rng(42)]
- group: List[np.random.Generator] = [cirq.value.parse_prng(s) for s in group_inputs]
- eq.add_equality_group(*[_sample(g) for g in group])
+ # An `np.random.Generator` or a seed.
+ prngs = [
+     cirq.value.parse_prng(42),
+     cirq.value.parse_prng(np.int32(42)),
+     cirq.value.parse_prng(np.random.default_rng(42)),
+ ]
+ vals = [prng.random() for prng in prngs]
+ eq = cirq.testing.EqualsTester()
+ eq.add_equality_group(*vals)


# A None seed.
prng = cirq.value.parse_prng(None)
eq.add_equality_group(_sample(prng))
Collaborator

This is a noop check for a single value. Perhaps replace with

assert prng is cirq.value.parse_prng(None)

if you are OK with the previous suggestion to have a singleton generator for None.

Comment on lines +39 to +41
# RandomState PRNG.
prng = cirq.value.parse_prng(np.random.RandomState(42))
eq.add_equality_group(_sample(prng))
Collaborator

We can check reproducibility here -

Suggested change
- # RandomState PRNG.
- prng = cirq.value.parse_prng(np.random.RandomState(42))
- eq.add_equality_group(_sample(prng))
+ # RandomState PRNG.
+ prngs = [
+     cirq.value.parse_prng(np.random.RandomState(42)),
+     cirq.value.parse_prng(np.random.RandomState(42)),
+ ]
+ vals = [prng.random() for prng in prngs]
+ eq = cirq.testing.EqualsTester()
+ eq.add_equality_group(*vals)

Comment on lines +43 to +45
# np.random module
prng = cirq.value.parse_prng(np.random)
eq.add_equality_group(_sample(prng))
Collaborator

We should not support creating a generator from a module; it is not good practice.
The use of np.random module was causing pickle havoc in #3717.

I don't quite see a need for it, users can pass None for a default generator.
I'd be open to throwing TypeError for a module argument.
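
A sketch of how a parser could reject module arguments explicitly. The function name `parse_prng_rejecting_modules` is hypothetical, and the branches are a simplified illustration rather than the code under review. The module check comes first because the np.random module has a `randint` attribute and would otherwise pass a duck-typed `randint` test.

```python
import numbers
import types
from typing import Any

import numpy as np


def parse_prng_rejecting_modules(prng_or_seed: Any) -> np.random.Generator:
    """Converts a seed or PRNG object to a Generator; modules are refused."""
    if isinstance(prng_or_seed, types.ModuleType):
        raise TypeError(
            f"module {prng_or_seed.__name__!r} cannot be converted to "
            "numpy.random.Generator; pass None for a default generator"
        )
    if prng_or_seed is None or isinstance(prng_or_seed, numbers.Integral):
        return np.random.default_rng(None if prng_or_seed is None else int(prng_or_seed))
    if isinstance(prng_or_seed, np.random.Generator):
        return prng_or_seed
    if isinstance(prng_or_seed, np.random.RandomState):
        return np.random.default_rng(prng_or_seed.randint(2**31))
    raise TypeError(
        f"{type(prng_or_seed).__name__} cannot be converted to numpy.random.Generator"
    )
```

Calling `parse_prng_rejecting_modules(np.random)` raises TypeError, while None, integers, Generator, and RandomState inputs still yield a Generator.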

Suggested change
- # np.random module
- prng = cirq.value.parse_prng(np.random)
- eq.add_equality_group(_sample(prng))

@github-actions github-actions bot removed the Stale label Jul 12, 2024
@github-actions github-actions bot added the Stale label Aug 11, 2024
@github-actions github-actions bot removed the Stale label Aug 12, 2024
@github-actions github-actions bot added Stale and removed Stale labels Sep 11, 2024
@github-actions github-actions bot added Stale and removed Stale labels Oct 12, 2024
@github-actions github-actions bot added the Stale label Nov 13, 2024
@NoureldinYosri NoureldinYosri added triage/accepted there is consensus amongst maintainers that this is a real bug or a reasonable feature to add and removed Stale labels Nov 14, 2024