-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mitigate problems with EntityManager::flush()
reentrance since 2.16.0
#10906
Conversation
@mpdude this test fails for me on 2.16.x but works on 2.15.x I added a few more entities that are persisted so I guess the flush during postPersist messes with the persisting of the other entities following afterwards? <?php
declare(strict_types=1);
namespace Doctrine\Tests\ORM\Functional\Ticket;
use Doctrine\DBAL\Logging\LoggerChain;
use Doctrine\ORM\Event\PostPersistEventArgs;
use Doctrine\ORM\Events;
use Doctrine\Tests\DbalExtensions\SqlLogger;
use Doctrine\Tests\Models\CMS\CmsUser;
use Doctrine\Tests\OrmFunctionalTestCase;
use function sprintf;
class GH10869Test extends OrmFunctionalTestCase
{
protected function setUp(): void
{
$this->useModelSet('cms');
parent::setUp();
}
public function testPostPersistListenerUpdatingObject(): void
{
foreach (range(1, 3) as $i) {
$joe = new CmsUser();
$joe->username = "joe-$i";
$joe->name = "Joe $i";
$this->_em->persist($joe);
}
$this->_em->getEventManager()->addEventListener(Events::postPersist, new class {
public function postPersist(PostPersistEventArgs $args): void
{
$object = $args->getObject();
if (! $object instanceof CmsUser) {
return;
}
$objectManager = $args->getObjectManager();
if ($object->status === null) {
$object->status = sprintf('joe-id=%s', $object->id);
}
$objectManager->flush();
}
});
$this->_em->flush();
$this->_em->clear();
$joeReloaded = $this->_em->find(CmsUser::class, $joe->id);
self::assertNotNull($joeReloaded->status);
self::assertSame($joe->status, $joeReloaded->status);
}
} |
EntityManager::flush()
reentrance since 2.16.0
This PR addresses the issue brought up in doctrine#10869. It happens when users use event listeners – `postPersist` in particular – to make changes to entities and/or persist new entities and finally call `EntityManager::flush()`, while the `UnitOfWork` is currently within the commit phase. There is a discussion in doctrine#10900 whether this kind of reentrance should be deprecated in 2.x and prevented in 3.x. But, in order to prevent complete breakage with the 2.16.0 update, this PR tries to apply a band-aid 🩹. A few things changed inside the UoW commit implementation in 2.16, and for sure this PR does not restore all details of the previous behavior. Take it with a grain of salt. Here's the details. The issue appears when `UoW::commit()` is performing entity insertions, and `postPersist` listener causes `commit()` reentrance when there are pending insertions left. In that situation, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and `UoW::$entityInsertions` being empty. The entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to the reported failure down the road. The mitigation is to check for this condition and skip such entities. Before doctrine#10547 (pre-2.16), things worked a bit differently: The UoW did not build a list of entities (objects) as the commit order, but a sequence of _classes_ to process. For every entity _class_, it would find all entity instances in `UoW::$entityInsertions` and pass them to the persister in a batch. The persister, in turn, would execute the `INSERT` statements for _all_ those entities _before_ it dispatched the `postInsert` events. That means when a `postInsert` listener was notified pre-2.16, the UoW was in a state where 1) all insertions for a particular class had been written to the database already and 2) the UoW had not yet gathered the next round of entities to process.
@dmaicher Please check and report if this fixes the issue for you |
@mpdude thanks for all your efforts on this 👍 The SQL error with missing parameters is indeed gone. However I don't think the behavior is quite the same as on 2.15 😕 Now some other test expectations are failing that seem related to the new entities that are persisted within I think its too dangerous to flush again while another flush is ongoing. So instead of somehow fixing this I would definitely prefer a new event and do another clean flush. here a test that fails with your changes but works on 2.15: <?php
declare(strict_types=1);
namespace Doctrine\Tests\ORM\Functional\Ticket;
use Doctrine\ORM\Event\PostPersistEventArgs;
use Doctrine\ORM\Events;
use Doctrine\ORM\Mapping as ORM;
use Doctrine\Tests\OrmFunctionalTestCase;
use function sprintf;
class GH10869Test extends OrmFunctionalTestCase
{
protected function setUp(): void
{
parent::setUp();
$this->setUpEntitySchema([
GH10869Entity::class,
]);
}
public function testPostPersistListenerUpdatingObjectFieldWhileOtherInsertPending(): void
{
$entity1 = new GH10869Entity();
$this->_em->persist($entity1);
$entity2 = new GH10869Entity();
$this->_em->persist($entity2);
$this->_em->getEventManager()->addEventListener(Events::postPersist, new class {
public function postPersist(PostPersistEventArgs $args): void
{
$object = $args->getObject();
if ($object->field === 'foo') {
return;
}
$objectManager = $args->getObjectManager();
$object->field = sprintf('test %s', $object->id);
foreach (range(1, 2) as $ignored) {
$newObject = new GH10869Entity();
$newObject->field = 'foo';
$objectManager->persist($newObject);
}
$objectManager->flush();
}
});
$this->_em->flush();
$this->_em->clear();
$entity1Reloaded = $this->_em->find(GH10869Entity::class, $entity1->id);
self::assertSame($entity1->field, $entity1Reloaded->field);
$entity2Reloaded = $this->_em->find(GH10869Entity::class, $entity2->id);
self::assertSame($entity2->field, $entity2Reloaded->field);
$objectsPersistedDuringPostPersist = $this->_em->getRepository(GH10869Entity::class)->count(['field' => 'foo']);
self::assertSame(4, $objectsPersistedDuringPostPersist);
}
}
/**
* @ORM\Entity
*/
class GH10869Entity
{
/**
* @ORM\Id
* @ORM\GeneratedValue
* @ORM\Column(type="integer")
*
* @var ?int
*/
public $id;
/**
* @ORM\Column(type="text", nullable=true)
*
* @var ?string
*/
public $field;
} |
What exactly is this difference? |
you mean how my test case is different from yours? --- a/tests/Doctrine/Tests/ORM/Functional/GH10869Test.php
+++ b/tests/Doctrine/Tests/ORM/Functional/GH10869Test.php
@@ -35,8 +35,19 @@ class GH10869Test extends OrmFunctionalTestCase
{
$object = $args->getObject();
+ if ($object->field === 'foo') {
+ return;
+ }
+
$objectManager = $args->getObjectManager();
$object->field = sprintf('test %s', $object->id);
+
+ foreach (range(1, 2) as $ignored) {
+ $newObject = new GH10869Entity();
+ $newObject->field = 'foo';
+ $objectManager->persist($newObject);
+ }
+
$objectManager->flush();
}
});
@@ -49,6 +60,9 @@ class GH10869Test extends OrmFunctionalTestCase
$entity2Reloaded = $this->_em->find(GH10869Entity::class, $entity2->id);
self::assertSame($entity2->field, $entity2Reloaded->field);
+
+ $objectsPersistedDuringPostPersist = $this->_em->getRepository(GH10869Entity::class)->count(['field' => 'foo']);
+ self::assertSame(4, $objectsPersistedDuringPostPersist);
} the last assert basically fails. On your branch its 6 and on 2.15 its 4 which would be correct. |
No, I mean what exactly the difference is between 2.15 and 2.16... does it miss updates? Is only the ordering different? |
Hard to tell so far. It seems some new objects (added during postPersist) are not even persisted at all or so. I will check in more detail. |
@mpdude ok after further checking it has to do with the changed order of entity insertion. In the postPersist listener I expect some related entity to be already persisted (written to DB and assigned ID) and only then I can properly create the new entity. So I think I need a new event 😋 Too brittle to make (false as we see now) assumptions about the order in which entities are persisted. |
Yes, you were assuming that all other entities of the same class have already been written before the event is dispatched. Maybe we can move the |
actually in my case its not even about entities of the same class 😕 Its about another related class. The structure is a bit more complex. Its like
I listen to |
orm/lib/Doctrine/ORM/UnitOfWork.php Lines 1215 to 1217 in d2de4ec
This is the 2.15 version of |
Regarding #10906 (comment) – I cannot reproduce that. For me, it seems to pass also based on 1895cf8 |
PHPCS fails. 🤓 |
Converting to draft for the time being. Will open a new PR later since the solution we landed at is somewhat different from what I initially proposed. To keep the PR history understandable, I think it would be better to move the "new" solution to a dedicated discussion. Extra things I'd like to include:
|
….0 (Take 2) The changes from doctrine#10547, which landed in 2.16.0, cause problems for users that call `EntityManager::flush()` from within `postPersist` event listeners. * When `UnitOfWork::commit()` is re-entered, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and UoW::$entityInsertions being empty. After return, the entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to a strange-looking SQL error where the number of parameters used does not match the number of parameters bound. This has been reported as doctrine#10869. * The fixes made to the commit order computation may lead to a different entity insertion order than previously. `postPersist` listener code may be affected by this when accessing generated IDs for other entities than the one the event has been dispatched for. This ID may not yet be available when the insertion order is different from the one that was used before 2.16. This has been mentioned in doctrine#10906 (comment). This PR suggests to address both issues by dispatching the `postPersist` event only after _all_ new entities have their rows inserted into the database. Likewise, dispatch `postRemove` only after _all_ deletions have been executed. This solves the first issue because the sequence of insertions or deletions has been processed completely _before_ we start calling event listeners. This way, potential changes made by listeners will no longer be relevant. Regarding the second issue, I think deferring `postPersist` a bit until _all_ entities have been inserted does not violate any promises given, hence is not a BC break. In 2.15, this event was raised after all insertions _for a particular class_ had been processed - so, it was never an "immediate" event for every single entity. doctrine#10547 moved the event handling to directly after every single insertion. Now, this PR moves it back a bit to after _all_ insertions.
….0 (Take 2) The changes from doctrine#10547, which landed in 2.16.0, cause problems for users calling `EntityManager::flush()` from within `postPersist` event listeners. * When `UnitOfWork::commit()` is re-entered, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and `UoW::$entityInsertions` being empty. After return, the entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to a strange-looking SQL error where the number of parameters used does not match the number of parameters bound. This has been reported as doctrine#10869. * The fixes made to the commit order computation may lead to a different entity insertion order than previously. `postPersist` listener code may be affected by this when accessing generated IDs for other entities than the one the event has been dispatched for. This ID may not yet be available when the insertion order is different from the one that was used before 2.16. This has been mentioned in doctrine#10906 (comment). This PR suggests to address both issues by dispatching the `postPersist` event only after _all_ new entities have their rows inserted into the database. Likewise, dispatch `postRemove` only after _all_ deletions have been executed. This solves the first issue because the sequence of insertions or deletions has been processed completely _before_ we start calling event listeners. This way, potential changes made by listeners will no longer be relevant. Regarding the second issue, I think deferring `postPersist` a bit until _all_ entities have been inserted does not violate any promises given, hence is not a BC break. In 2.15, this event was raised after all insertions _for a particular class_ had been processed - so, it was never an "immediate" event for every single entity. doctrine#10547 moved the event handling to directly after every single insertion. Now, this PR moves it back a bit to after _all_ insertions.
Closing in favor of #10915. |
….0 (Take 2) The changes from doctrine#10547, which landed in 2.16.0, cause problems for users calling `EntityManager::flush()` from within `postPersist` event listeners. * When `UnitOfWork::commit()` is re-entered, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and `UoW::$entityInsertions` being empty. After return, the entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to a strange-looking SQL error where the number of parameters used does not match the number of parameters bound. This has been reported as doctrine#10869. * The fixes made to the commit order computation may lead to a different entity insertion order than previously. `postPersist` listener code may be affected by this when accessing generated IDs for other entities than the one the event has been dispatched for. This ID may not yet be available when the insertion order is different from the one that was used before 2.16. This has been mentioned in doctrine#10906 (comment). This PR suggests to address both issues by dispatching the `postPersist` event only after _all_ new entities have their rows inserted into the database. Likewise, dispatch `postRemove` only after _all_ deletions have been executed. This solves the first issue because the sequence of insertions or deletions has been processed completely _before_ we start calling event listeners. This way, potential changes made by listeners will no longer be relevant. Regarding the second issue, I think deferring `postPersist` a bit until _all_ entities have been inserted does not violate any promises given, hence is not a BC break. In 2.15, this event was raised after all insertions _for a particular class_ had been processed - so, it was never an "immediate" event for every single entity. doctrine#10547 moved the event handling to directly after every single insertion. Now, this PR moves it back a bit to after _all_ insertions.
….0 (Take 2) The changes from doctrine#10547, which landed in 2.16.0, cause problems for users calling `EntityManager::flush()` from within `postPersist` event listeners. * When `UnitOfWork::commit()` is re-entered, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and `UoW::$entityInsertions` being empty. After return, the entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to a strange-looking SQL error where the number of parameters used does not match the number of parameters bound. This has been reported as doctrine#10869. * The fixes made to the commit order computation may lead to a different entity insertion order than previously. `postPersist` listener code may be affected by this when accessing generated IDs for other entities than the one the event has been dispatched for. This ID may not yet be available when the insertion order is different from the one that was used before 2.16. This has been mentioned in doctrine#10906 (comment). This PR suggests to address both issues by dispatching the `postPersist` event only after _all_ new entities have their rows inserted into the database. Likewise, dispatch `postRemove` only after _all_ deletions have been executed. This solves the first issue because the sequence of insertions or deletions has been processed completely _before_ we start calling event listeners. This way, potential changes made by listeners will no longer be relevant. Regarding the second issue, I think deferring `postPersist` a bit until _all_ entities have been inserted does not violate any promises given, hence is not a BC break. In 2.15, this event was raised after all insertions _for a particular class_ had been processed - so, it was never an "immediate" event for every single entity. doctrine#10547 moved the event handling to directly after every single insertion. Now, this PR moves it back a bit to after _all_ insertions.
….0 (Take 2) The changes from doctrine#10547, which landed in 2.16.0, cause problems for users calling `EntityManager::flush()` from within `postPersist` event listeners. * When `UnitOfWork::commit()` is re-entered, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and `UoW::$entityInsertions` being empty. After return, the entity insertion order, an array computed at the beginning of `UnitOfWork::executeInserts()`, still contains entities that now have been processed already. This leads to a strange-looking SQL error where the number of parameters used does not match the number of parameters bound. This has been reported as doctrine#10869. * The fixes made to the commit order computation may lead to a different entity insertion order than previously. `postPersist` listener code may be affected by this when accessing generated IDs for other entities than the one the event has been dispatched for. This ID may not yet be available when the insertion order is different from the one that was used before 2.16. This has been mentioned in doctrine#10906 (comment). This PR suggests to address both issues by dispatching the `postPersist` event only after _all_ new entities have their rows inserted into the database. Likewise, dispatch `postRemove` only after _all_ deletions have been executed. This solves the first issue because the sequence of insertions or deletions has been processed completely _before_ we start calling event listeners. This way, potential changes made by listeners will no longer be relevant. Regarding the second issue, I think deferring `postPersist` a bit until _all_ entities have been inserted does not violate any promises given, hence is not a BC break. In 2.15, this event was raised after all insertions _for a particular class_ had been processed - so, it was never an "immediate" event for every single entity. doctrine#10547 moved the event handling to directly after every single insertion. Now, this PR moves it back a bit to after _all_ insertions.
This PR addresses the issue brought up in #10869. It happens when users use event listeners –
postPersist
in particular – to make changes to entities and/or persist new entities and finally callEntityManager::flush()
, while theUnitOfWork
is currently within the commit phase.There is a discussion in #10900 whether this kind of reentrance should be deprecated in 2.x and prevented in 3.x. But, in order to prevent complete breakage with the 2.16.0 update, this PR tries to apply a band-aid 🩹.
A few things changed inside the UoW commit implementation in 2.16, and for sure this PR does not restore all details of the previous behavior. Take it with a grain of salt.
Here's the details.
The issue appears when
UoW::commit()
is performing entity insertions, andpostPersist
listener causescommit()
reentrance when there are pending insertions left.In that situation, the "inner" (reentrant) call will start working through all changesets. Eventually, it finishes with all insertions being performed and
UoW::$entityInsertions
being empty.The entity insertion order, an array computed at the beginning of
UnitOfWork::executeInserts()
, still contains entities that now have been processed already. This leads to the reported failure down the road. The mitigation is to check for this condition and skip such entities.Before #10547 (pre-2.16), things worked a bit differently:
The UoW did not build a list of entities (objects) as the commit order, but a sequence of classes to process.
For every entity class, it would find all entity instances in
UoW::$entityInsertions
and pass them to the persister in a batch. The persister, in turn, would execute theINSERT
statements for all those entities before it dispatched thepostInsert
events.That means when a
postInsert
listener was notified pre-2.16, the UoW was in a state where 1) all insertions for a particular class had been written to the database already and 2) the UoW had not yet gathered the next round of entities to process.