Baseline Django Field Tests #62

NoiSek · 2023-05-05T21:28:42Z

This PR adds tests that exercise the Django ORM and confirm that all of its field types work. mixer has been added as a dev dependency for test fuzzing.

I've opened this as a draft for discussion on a few points.


NoiSek · 2023-05-05T21:37:58Z

django_scylla/cql/compiler.py

 from django.db.models.sql import compiler

+from datetime import timedelta
+from cassandra.util import Duration
+

 def unique_rowid():
    # TODO: guarantee that this is globally unique


This caused a failure in the test for SmallAutoField because the number generated was beyond the 32767 limit for short ints, which leads me to:

I don't think supporting SmallAutoField is very important, but I do think that models should be able to generate unique identifiers consistently without risk of collision. I think that row IDs generated with this driver should use the standard UUID, since Scylla does not actually support the paradigm of "unique counters".

Realistically, the only reasons to preserve the tradition of keeping row_ids as ints are that it gives you an idea of when an object was created chronologically and that it makes URL slugs cleaner. If you are consciously choosing to use Scylla, I think giving that up is an acceptable tradeoff, and you can model that requirement in better ways.

@r4fek are you opposed to the idea of making these UUID values by default?

Not at all. Let's use uuids by default.

As an update to this:

There are actually many major obstacles to using UUIDs as the default for AutoFields. Django assumes by default that all AutoFields are subclasses of IntegerField [1], and the workarounds for that require a lot of monkeypatching. The canonical approach is to have an AutoID field + a separate UUID field, but that sort of defeats the purpose since Scylla can't really do that anyway...

Transparently overriding AutoFields into UUIDs doesn't work well, because field validation in many places casts the generated UUID to an integer internally regardless (and a UUID translates to a 128-bit integer, far beyond even bigint's range).
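To make the range mismatch concrete, here is a small standard-library sketch comparing the integer form of a UUID with the limits mentioned above. The constant names are illustrative only and are not taken from Django or django_scylla.

```python
import uuid

# Illustrative constants (names are assumptions, not library identifiers).
SMALLINT_MAX = 2**15 - 1  # 32767, the SmallAutoField limit mentioned earlier
BIGINT_MAX = 2**63 - 1    # largest signed 64-bit integer

# A fixed example UUID so the result is reproducible.
example = uuid.UUID("12345678-1234-5678-1234-567812345678")
as_int = example.int  # the UUID viewed as a (up to) 128-bit integer

# Check whether the integer form would fit in a bigint column.
fits_in_bigint = as_int <= BIGINT_MAX
print(as_int.bit_length(), fits_in_bigint)
```

Any cast of a typical UUID to an integer therefore overflows a bigint column, which matches the behavior described above.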

I did try creating a UUIDAutoField that subclasses the existing AutoField so it could be used by setting DEFAULT_AUTO_FIELD to UUIDAutoField. With that I was able to override the existing validation methods and prevent the value from being cast to an integer, but even then it led to column conflict errors, since internal models still expect a BigAutoField value: a UUID would be generated and pass every internal validation check, then fail once the actual query executed.

Django devs seem to be aware of this and working on it, so I figured at this point it is probably best to wait for full support with an official UUIDAutoField and double down on the original approach (all AutoFields as BigInts, generating a suitably random number). I've reverted to the original approach and added a few steps that put more entropy into the generated values.

[1] https://code.djangoproject.com/ticket/32577
[2] https://code.djangoproject.com/ticket/33450

NoiSek · 2023-05-05T21:42:06Z

tests/test_base_fields.py

+
+    # Ensure that the other side of this relationship has been created and is selectable
+    for library in libraries:
+        assert library.book_set is not None


Aside from the SmallAutoField test, this is currently the only failure.

Creation works as expected, but the ORM's internal JOIN behavior is hard to work around when these values are being selected afterward.

NoiSek · 2023-05-15T17:15:50Z

django_scylla/cql/compiler.py

+    pid = getpid()
+    sys_random = SystemRandom().randint(0, 65535)
+
+    return timestamp + counter['value'] + pid + sys_random


Ideally this approach means that given the same code running simultaneously: each worker will return a different pid, each thread of that worker will return a different counter, and each call will return a different sys_random value.
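For readers without the full diff, the following is a minimal, self-contained sketch of how the pieces named above could fit together. It is not the PR's actual implementation: the function name `unique_rowid_sketch`, the `BIGINT_MAX` constant, and the choice of microseconds for `timestamp` are assumptions made for illustration.

```python
import time
from os import getpid
from random import SystemRandom

BIGINT_MAX = 2**63 - 1  # illustrative constant: largest signed 64-bit value


def unique_rowid_sketch(counter={"value": 0}):
    """Hypothetical sketch; the mutable default is deliberate (see below)."""
    # The dict default is created once, so the count persists across calls
    # within the same process.
    counter["value"] += 1

    # Assumption: microseconds since the epoch. The diff above does not show
    # how the PR derives `timestamp`.
    timestamp = time.time_ns() // 1_000

    pid = getpid()
    sys_random = SystemRandom().randint(0, 65535)

    # Simple addition, as in the diff above. This lowers the chance of a
    # collision but does not by itself guarantee global uniqueness.
    return timestamp + counter["value"] + pid + sys_random


rowid = unique_rowid_sketch()
```

With a microsecond timestamp the sum stays far below the signed 64-bit limit, so the value fits in a bigint column.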

The counter in particular is worth noting because it uses a little-known "pitfall" of Python: a mutable default argument such as a dict is created once, when the function is defined, so it persists between calls.

e.g.

>>> def foo(counter={'value':0}):
...     counter['value'] += 1
...     return counter['value']
...
>>> foo()
1
>>> foo()
2
>>> foo()
3
>>> foo()
4
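As a brief illustration of why this works, the sketch below shows where the default dict is stored and contrasts it with the common `None`-default idiom, which does not keep state. The function names are made up for this example.

```python
def next_value(counter={"value": 0}):
    # The same dict object is reused on every call that omits `counter`.
    counter["value"] += 1
    return counter["value"]


first = next_value()
second = next_value()

# The default lives on the function object itself.
shared = next_value.__defaults__[0]


def next_value_fresh(counter=None):
    # A new dict is built on each call, so nothing persists.
    if counter is None:
        counter = {"value": 0}
    counter["value"] += 1
    return counter["value"]


fresh_a = next_value_fresh()
fresh_b = next_value_fresh()
print(first, second, shared, fresh_a, fresh_b)
```

Because the dict belongs to the function object, the count is kept per process and is shared by every thread in that process; it resets whenever the process restarts.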

NoiSek added 9 commits May 3, 2023 17:42

Add models to demo for testing that cover all Django fields.

95c36b5

Merge branch 'master' into base_model_tests

17d880d

Add mixer to dependencies for fuzzy model generation.

910c724

Add basic Django field tests.

fac3d19

Modify default column mapping to be more accurate.

840bb2c

Fix an issue with DecimalField values being converted to strings implicitly.

41757ec

Add support for BinaryField and DurationField.

d058df8

Add Address model to test OneToOne relations.

c6f1b5f

Expand field tests to cover OneToOne fields and niche field types.

6beb9a8

NoiSek requested a review from r4fek May 5, 2023 21:28


NoiSek added 5 commits May 11, 2023 10:22

Use a more provably random basis for generating AutoField values.

8ac4258

Revert Integers to bigint by default to support unique AutoField IDs.

8e817f8

Fix minor errors in model definitions.

2d21531

Adjust Django settings.

22e006f

Adjust many to many test.

c592c45
