Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unique_constraints, a more versatile replacement for django_get_or_create and sqlalchemy_get_or_create #794

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
8 changes: 6 additions & 2 deletions docs/changelog.rst
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
ChangeLog
=========

3.1.1 (unreleased)
3.2.0 (unreleased)
------------------

- Nothing changed yet.
*New:*

- Add a generic "get-or-create" mechanism through :attr:`~FactoryOptions.unique_constraints`.

- Add support for composite unique constraints; addressing :issue:`241`.


3.1.0 (2020-10-02)
Expand Down
22 changes: 21 additions & 1 deletion docs/internals.rst
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,27 @@ Instantiating, Step 2: Preparing values
encountering a :class:`SubFactory`.


Instantiating, Step 3: Building the object
Instantiating, step 3: Lookup existing data (optional)
------------------------------------------------------

Once the ``StepBuilder`` and its ``Resolver`` are ready, the builder looks at the
:attr:`~Factory.Meta.unique_constraints`:

1. It provides the call-time parameters to the :meth:`FactoryOptions.get_lookup_groups`
method;
2. That method computes groups of lookups to try:

- Any call-time parameter that appear in any unique constraint will *always* be included;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Any call-time parameter that appear in any unique constraint will *always* be included;
- Any call-time parameter that appears in a unique constraint will *always* be included;

- The first lookups are performed on the unique constraintss sharing the most parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The first lookups are performed on the unique constraintss sharing the most parameters
- The first lookups are performed on the unique constraints sharing the most parameters

with the call-time parameters

3. For each group, the ``StepBuilder`` computes the value of the required parameters,
and performs a database lookup;
4. If any lookup returns an instance, it is used; otherwise, the remaining parameters
will be resolved by the ``Resolver``.


Instantiating, Step 4: Building the object
------------------------------------------

1. The ``StepBuilder`` fetches the attributes computed by the ``Resolver``.
Expand Down
91 changes: 91 additions & 0 deletions docs/reference.rst
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,86 @@ Meta options

.. versionadded: 2.6.0

.. attribute:: unique_constraints

Some models have unicity constraints on parts of their fields.
Instead of failing to save an instance, factories may define
these unicity constraints here:

.. code-block:: python

class EmployeeFactory(factory.Factory):
class Meta:
model = Employee
unique_constraints = [
['email'],
['access_card_id'],
['company', 'employee_id'],
]

Each group is a list of fields which should be *collectively unique*;
in this example, the ``employee_id`` is unique within each company.

When a new instance is required, the factory will start by resolving
the parameter used for each unique constraint, and attempt a lookup for this
combination of fields:
Comment on lines +159 to +160
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if an example should illustrate the combination of fields? When reading the unique constraints, it may not be obvious they will be combined with an AND.


.. code-block:: pycon

>>> e1 = EmployeeFactory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifying the access_card_id in the first factory explains why e1 is returned when e2 is built.

Suggested change
>>> e1 = EmployeeFactory()
>>> e1 = EmployeeFactory(access_card_id=42)

>>> e2 = EmployeeFactory(access_card_id=42)
lookup(email='[email protected]') -> None; continue
lookup(access_card_id=42) -> e1; return
>>> e1 == e2
True

If the lookup succeeds, the instance is used as is; post-generation hooks
will still be called.

.. note:: The :attr:`~FactoryOptions.unique_constraints` feature requires
a :meth:`Factory._lookup` definition on the factory class
or one of its parents.

.. tip:: Group parameters are resolved lazily: in the above example,
the `company` declaration will only be evaluated if the `email`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EmployeeFactory does not have declarations. It would be useful to document a least a SubFactory to Company, to illustrate that the SubFactory, which usually creates a database row, will not be called.

and `access_card_id` lookup failed.

This avoids polluting the database with unneeded objects.

.. note::

**Lookup priority**

Unique constraints have a specific interaction with call-time parameters:
if a parameter is explicitly set when calling the factory, and appears
in any unique constraint, it will be included in each lookup: it is likely
a unique identifier of some sort.

Moreover, unique constraints containing more call-time parameters will be
tried first: they are more likely to succeed.

With the above definition, the following will happen:

.. code-block:: pycon

>>> EmployeeFactory()
# 1. lookup(email=...)
# 2. lookup(access_card_id=...)
# 3. lookup(company=..., employee_id=...)

>>> EmployeeFactory(access_card_id='42')
# 1. lookup(access_card_id=42)
# 2. lookup(access_card_id=42, email=...)
# 3. lookup(access_card_id=42, company=..., employee_id=...)

>>> EmployeeFactory(access_card_id='42', name="John Doe")
# The `name=` is not included in lookups, since it doesn't appear
# in unique constraints.
# 1. lookup(access_card_id=42)
# 2. lookup(access_card_id=42, email=...)
# 3. lookup(access_card_id=42, company=..., employee_id=...)

.. versionadded:: 3.2.0

.. attribute:: strategy

Expand Down Expand Up @@ -319,6 +399,17 @@ Attributes and methods

.. OHAI_VIM*

.. classmethod:: _lookup(cls, model_class, strategy, fields)

Lookup an instance from the database, using the passed-in fields.

This is required for the :attr:`~FactoryOptions.unique_constraints` feature.

The ``strategy`` field can be used to disable lookups with specific
strategies - typically for :data:`BUILD_STRATEGY` and :data:`STUB_STRATEGY`.

.. versionadded:: 3.2.0

.. classmethod:: _after_postgeneration(cls, obj, create, results=None)

:arg object obj: The object just generated
Expand Down
42 changes: 38 additions & 4 deletions factory/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -238,7 +238,7 @@ def is_model(meta, value):
OptionDefault('exclude', (), inherit=True),
OptionDefault('rename', {}, inherit=True),
OptionDefault(
'lookup_groups', [], inherit=True,
'unique_constraints', [], inherit=True,
normalize=lambda groups: [set(group) for group in groups],
),
]
Expand Down Expand Up @@ -371,6 +371,15 @@ def reset_sequence(self, value=None, force=False):
value = self.counter_reference.factory._setup_next_sequence()
self._counter.reset(value)

def apply_renames(self, mapping):
"""Rename items based on self.renames.

Mutates the passed-in mapping in place.
"""
for old_name, new_name in self.rename.items():
if old_name in mapping:
mapping[new_name] = mapping.pop(old_name)

def prepare_arguments(self, attributes):
"""Convert an attributes dict to a (args, kwargs) tuple."""
kwargs = dict(attributes)
Expand All @@ -384,9 +393,7 @@ def prepare_arguments(self, attributes):
}

# 3. Rename fields
for old_name, new_name in self.rename.items():
if old_name in kwargs:
kwargs[new_name] = kwargs.pop(old_name)
self.apply_renames(kwargs)

# 4. Extract inline args
args = tuple(
Expand All @@ -396,6 +403,26 @@ def prepare_arguments(self, attributes):

return args, kwargs

def get_lookup_groups(self, params):
"""Retrieve the list of lookup fields based on provided params."""
# Extract call-time params that appear in any unique constraint
lookup_params = set(params) & set().union(*self.unique_constraints)

# Sort constraint groups: start with those containing the most
# call-time params
by_distance = sorted(
self.unique_constraints,
key=lambda group: -len(group & set(params)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find reverse more explicit than negating the key. Up to you.

Suggested change
key=lambda group: -len(group & set(params)),
key=lambda group: len(group & set(params)),
reverse=True,

)
for group in by_distance:
yield group | lookup_params

def lookup(self, fields, strategy):
kwargs = dict(fields)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fields is always a dict. The original fields dict is not reused later, I don’t think it needs to be copied. If that was the intent, dict.copy() is more explicit.

Suggested change
kwargs = dict(fields)

self.apply_renames(kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be great to get a test to verify the renaming.

model = self.get_model_class()
return self.factory._lookup(model, strategy, fields=kwargs)

def instantiate(self, step, args, kwargs):
model = self.get_model_class()

Expand Down Expand Up @@ -593,6 +620,13 @@ def _create(cls, model_class, *args, **kwargs):
"""
return model_class(*args, **kwargs)

@classmethod
def _lookup(cls, model_class, params):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _lookup(cls, model_class, params):
def _lookup(cls, model_class, strategy, params):

raise NotImplementedError(
"Using `class Meta: unique_constraints` is not available on class %s: "
"no `def _lookup()` method has been provided." % cls,
)

@classmethod
def build(cls, **kwargs):
"""Build an instance of the associated class, with overriden attrs."""
Expand Down
42 changes: 29 additions & 13 deletions factory/builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -199,21 +199,22 @@ def parse_declarations(decls, base_pre=None, base_post=None):


class BuildStep:
def __init__(self, builder, sequence, parent_step=None):
def __init__(self, *, builder, sequence, declarations, parent_step=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing the BuildStep arguments as kw-only would stand as an improvement on its own to keep this patch focused.

self.builder = builder
self.sequence = sequence
self.attributes = {}
self.declarations = declarations
self.parent_step = parent_step
self.stub = None

def resolve(self, declarations):
self.stub = Resolver(
declarations=declarations,
declarations=self.declarations,
step=self,
sequence=self.sequence,
)

for field_name in declarations:
def resolve(self, only=None):
only = only or self.declarations

for field_name in only:
self.attributes[field_name] = getattr(self.stub, field_name)

@property
Expand Down Expand Up @@ -272,16 +273,31 @@ def build(self, parent_step=None, force_sequence=None):
builder=self,
sequence=sequence,
parent_step=parent_step,
declarations=pre,
)
step.resolve(pre)

args, kwargs = self.factory_meta.prepare_arguments(step.attributes)
instance = None

instance = self.factory_meta.instantiate(
step=step,
args=args,
kwargs=kwargs,
)
for group in self.factory_meta.get_lookup_groups(self.extras.keys()):
step.resolve(only=group)
fields = {
field: step.attributes[field]
for field in group
}
instance = self.factory_meta.lookup(fields, strategy=self.strategy)
if instance is not None:
break

if instance is None:
step.resolve()

args, kwargs = self.factory_meta.prepare_arguments(step.attributes)

instance = self.factory_meta.instantiate(
step=step,
args=args,
kwargs=kwargs,
)

postgen_results = {}
for declaration_name in post.sorted():
Expand Down
1 change: 1 addition & 0 deletions tests/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,7 @@ class AbstractFactory(base.Factory):
self.assertEqual((), AbstractFactory._meta.inline_args)
self.assertEqual((), AbstractFactory._meta.exclude)
self.assertEqual(enums.CREATE_STRATEGY, AbstractFactory._meta.strategy)
self.assertEqual([], AbstractFactory._meta.unique_constraints)

# Non-declarative attributes
self.assertEqual({}, AbstractFactory._meta.pre_declarations.as_dict())
Expand Down
Loading