-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: pubOp for creating and updating pubs more easily #965
Conversation
+ cleanup of api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are helper methods to make matching pubvalues easier.
eg instead of doing
expect(pub.values).toHaveLength(3);
pub.values.sort(/* sorting bc you want the order of the values to be stable over different tests */)
expect(pub.values).toMatchObject([
...
]);
you just do
expect(pub).toHaveValues([...])
Similarly for
expect(pubId).toExist()
just easier than manually looking up the pub yourself
it("should handle complex nested relation scenarios", async () => { | ||
const trx = getTrx(); | ||
// manual rollback try/catch bc we are manually setting pubIds, so a failure in the middle of this will leave the db in a weird state | ||
try { | ||
// Create all pubs with meaningful IDs | ||
const pubA = "aaaaaaaa-0000-0000-0000-000000000000" as PubsId; | ||
const pubB = "bbbbbbbb-0000-0000-0000-000000000000" as PubsId; | ||
const pubC = "cccccccc-0000-0000-0000-000000000000" as PubsId; | ||
const pubD = "dddddddd-0000-0000-0000-000000000000" as PubsId; | ||
const pubE = "eeeeeeee-0000-0000-0000-000000000000" as PubsId; | ||
const pubF = "ffffffff-0000-0000-0000-000000000000" as PubsId; | ||
const pubG = "11111111-0000-0000-0000-000000000000" as PubsId; | ||
const pubH = "22222222-0000-0000-0000-000000000000" as PubsId; | ||
const pubI = "33333333-0000-0000-0000-000000000000" as PubsId; | ||
const pubJ = "44444444-0000-0000-0000-000000000000" as PubsId; | ||
const pubK = "55555555-0000-0000-0000-000000000000" as PubsId; | ||
const pubL = "66666666-0000-0000-0000-000000000000" as PubsId; | ||
|
||
// create the graph structure: | ||
// A J | ||
// / \ | | ||
// / \ | | ||
// B C --> I | ||
// | / \ | ||
// G --> E D | ||
// / \ | ||
// F H | ||
// / \ | ||
// K --> L | ||
|
||
// create leaf nodes first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is by far the most complex test, and very curious whether you agree with whether this should be the behavior.
i spent a bit too long trying to figure this out haha
see my comment in pub-op.ts
for a more in depth explanation about the "algorithm" for determining orphans
* | ||
* Notably, E and I are not deleted, because | ||
* 1. E is the target of a relation from G, which, while still a relation itself, is not reachable from the C-tree | ||
* 2. I is the target of a relation from J, which, while still a relation itself, is not reachable from the C-tree | ||
* | ||
* So this should be the resulting graph: | ||
* | ||
* ``` | ||
* A J | ||
* ┌──┴ │ | ||
* ▼ ▼ | ||
* B I | ||
* │ | ||
* ▼ | ||
* G ─► E | ||
* │ | ||
* ▼ | ||
* F | ||
* ``` | ||
* | ||
* | ||
*/ | ||
private async cleanupOrphanedPubs( | ||
trx: Transaction<Database>, | ||
orphanedPubIds: PubsId[] | ||
): Promise<void> { | ||
if (orphanedPubIds.length === 0) { | ||
return; | ||
} | ||
|
||
const pubsToDelete = await trx | ||
.withRecursive("affected_pubs", (db) => { | ||
// Base case: direct connections from the to-be-removed-pubs down | ||
const initial = db | ||
.selectFrom("pub_values") | ||
.select(["pubId as id", sql<string[]>`array["pubId"]`.as("path")]) | ||
.where("pubId", "in", orphanedPubIds); | ||
|
||
// Recursive case: keep traversing outward | ||
const recursive = db | ||
.selectFrom("pub_values") | ||
.select([ | ||
"relatedPubId as id", | ||
sql<string[]>`affected_pubs.path || array["relatedPubId"]`.as("path"), | ||
]) | ||
.innerJoin("affected_pubs", "pub_values.pubId", "affected_pubs.id") | ||
.where((eb) => eb.not(eb("relatedPubId", "=", eb.fn.any("affected_pubs.path")))) // Prevent cycles | ||
.$narrowType<{ id: PubsId }>(); | ||
|
||
return initial.union(recursive); | ||
}) | ||
// pubs in the affected_pubs table but which should not be deleted because they are still related to other pubs | ||
.with("safe_pubs", (db) => { | ||
return ( | ||
db | ||
.selectFrom("pub_values") | ||
.select(["relatedPubId as id"]) | ||
.distinct() | ||
// crucial part: | ||
// find all the pub_values which | ||
// - point to a node in the affected_pubs | ||
// - but are not themselves affected | ||
// these are the "safe" nodes | ||
.innerJoin("affected_pubs", "pub_values.relatedPubId", "affected_pubs.id") | ||
.where((eb) => | ||
eb.not( | ||
eb.exists((eb) => | ||
eb | ||
.selectFrom("affected_pubs") | ||
.select("id") | ||
.whereRef("id", "=", "pub_values.pubId") | ||
) | ||
) | ||
) | ||
); | ||
}) | ||
.selectFrom("affected_pubs") | ||
.select(["id", "path"]) | ||
.distinctOn("id") | ||
.where((eb) => | ||
eb.not( | ||
eb.exists((eb) => | ||
eb | ||
.selectFrom("safe_pubs") | ||
.select("id") | ||
.where(sql<boolean>`safe_pubs.id = any(affected_pubs.path)`) | ||
) | ||
) | ||
) | ||
.execute(); | ||
|
||
if (pubsToDelete.length > 0) { | ||
await deletePub({ | ||
pubId: pubsToDelete.map((p) => p.id), | ||
communityId: this.options.communityId, | ||
lastModifiedBy: this.options.lastModifiedBy, | ||
trx, | ||
}); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is by far the most complex part of the PR, am curious if you think the general approach is correct! obviously very annoying to check the actual sql, but im more curious whether you think my definition of orphans
here tracks
protected async executeWithTrx(trx: Transaction<Database>): Promise<PubsId> { | ||
const operations = this.collectOperations(); | ||
|
||
await this.createAllPubs(trx, operations); | ||
await this.processStages(trx, operations); | ||
await this.processRelations(trx, operations); | ||
await this.processValues(trx, operations); | ||
|
||
return this.id; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is basically entirely what a pubOp
can do. First create all pubs, add them to stages, do relation stuff, then do value stuff (those last two might be able to be done at the same time, or changed)
notably, all of these things happen at the same time for all pubs. so no more recursive calls, we first collect all operations and then do these things one by one. much more easy to reason about that way i think
if (nullStages.length > 0) { | ||
await autoRevalidate( | ||
trx.deleteFrom("PubsInStages").where( | ||
"pubId", | ||
"in", | ||
nullStages.map(({ pubId }) => pubId) | ||
) | ||
).execute(); | ||
} | ||
|
||
const nonNullStages = stagesToUpdate.filter(({ stageId }) => stageId !== null); | ||
|
||
if (nonNullStages.length > 0) { | ||
await autoRevalidate( | ||
trx | ||
.with("deletedStages", (db) => | ||
db | ||
.deleteFrom("PubsInStages") | ||
.where((eb) => | ||
eb.or( | ||
nonNullStages.map((stageOp) => eb("pubId", "=", stageOp.pubId)) | ||
) | ||
) | ||
) | ||
.insertInto("PubsInStages") | ||
.values( | ||
nonNullStages.map((stageOp) => ({ | ||
pubId: stageOp.pubId, | ||
stageId: stageOp.stageId, | ||
})) | ||
) | ||
).execute(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully thisll be a bit simpler once #967 lands!
core/prisma/seed/createSeed.ts
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seperated this out into a separate file so you can createSeed
while still having the automatic rollbacks work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah missed this earlier, awesome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please look here for usage!
im really curious what you all think. please comment any ideas/feedback you have for better naming or anything else regarding the api! really interested in anything, especially "i hate this" or something!
@@ -2,12 +2,12 @@ import { describe, expect, it } from "vitest"; | |||
|
|||
import { CoreSchemaType, MemberRole } from "db/public"; | |||
|
|||
import type { Seed } from "~/prisma/seed/seedCommunity"; | |||
import { createSeed } from "~/prisma/seed/createSeed"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc importing createSeed
up here did make data persist whereas importing just the type { Seed }
didn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but i think that was bc they were both imported from /seedCommunity
. I separated out the createSeed to a different file, which (I did not explicitly check) i think should fix that issue.
i do think using createSeed
is slightly better as that preserves the type info once you pass it to seedCommunity
, eg if you have pubTypes: { 'SomepubType': ... }
in your seed, the output of seedcommunity will have pubTypes.SomePubType
as well
core/lib/__tests__/matchers.ts
Outdated
expected: Partial<ProcessedPub["values"][number]>[] | ||
) { | ||
if (typeof received === "string") { | ||
throw new Error("toHaveValues() can only be called with a ProcessedPub"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how come received: PubsId | ProcessedPub
up above instead of received: ProcessedPub
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! it was left over bc i think i wanted to be able to pass in both in either case, and was having some trouble with the types. i later decided to just use PubsId
for toExist and ProcessedPub
for toHaveValues, but never updated it! I'll fix it
const pub2 = await PubOp.upsert(crypto.randomUUID() as PubsId, { | ||
communityId: seededCommunity.community.id, | ||
pubTypeId: seededCommunity.pubTypes["Basic Pub"].id, | ||
lastModifiedBy: createLastModifiedBy("system"), | ||
}) | ||
.relate(seededCommunity.pubFields["Some relation"].slug, "test relations value", pub.id) | ||
.execute(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is quite nice! just so I understand, if pub2 were already created, would we also have to call .upsert(...).relate(...)
? i.e. there's nothing like PubOp.relate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah correct!
You'd more likely do PubOp.update(pub2Id).relate()
if you did not want to override the existing relations, but yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for context, initially i did only have upsert
, not create
and update
, but i thought it would be best to have separate ones bc
pubOp.upsert().unrelate()
is kind of weird I think- I wanted to have
upsert
more act like aPUT
than aPATCH
if the pub already exists, ie get rid of the existing values/relations by default if they are not mentioned in theupsert
. But i ofc did want to keep the functionality to only update a subset of values, so we would need anupdate
(or force consumers of this api to do a bunch of unnatural configuration, which is what i was doing before in feat: add proper pub upsert functions #914 ) - it's good to get an error if you are explicitly trying to create a pub that already exists, rather than silently update it when that's maybe not what you wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, thanks for the context, makes sense!
/** | ||
* remove pubs that have been disconnected/their value removed, | ||
* has `deleteOrphaned` set to true for their relevant relation operation, | ||
* AND have no other relations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's a better idea to have this be a configuration setting on the PubType
(or the pubField?)
like, for some relations this could be desireable, like versions and discussions in the current arcadia community. you probably want to delete all the versions of the pub if that pub is deleted.
but for others, like tag
or something, you would not want this. you wouldn't want to delete a Tag
pub if the you delete the last Pub that is relating to it, you probably want to keep it around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the UI i think we can just force the user to manually delete all the relations first, but i think it would be good in certain places to automatically decide this (import actions)
lastModifiedBy: createLastModifiedBy("system"), | ||
}) | ||
.set(seededCommunity.pubFields["Title"].slug, "Test") | ||
.relate(seededCommunity.pubFields["Some relation"].slug, "relation1", (pubOp) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's really nice! 🙌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests look like they cover most of the PubOp behavior and the logic for pub orphans seems sound. Looking forward to tomorrow's chat about the recursive CTE.
Issue(s) Resolved
Adds a new PubOp system for creating, updating and managing relationships between Pubs using a fluent-like API/
High-level Explanation of PR
Note
This PR is an RFC! Meaning I'm totally happy not merging it!
I'm more interested in your general thoughts on this approach, please nitpick it to hell!
This PR introduces a new
PubOp
system that provides a fluent, builder-style API for managing Pub operations. The system handles complex scenarios like creating multiple related pubs, managing nested relationships, and handling pub value updates in a single transaction.API
The API follows a builder pattern, allowing for chaining operations:
All builders
Currently there's 3 kinds of operations you can do:
create
update
upsert
For
create
, there's also an explicitcreateWithId
.For both
update
andupsert
, there's anupdateByValue
and anupsertByValue
. These allow you to, instead of selecting a pub by id, select a pub by a specific value, eg a google drive id.Basic capabilities
Currently, the
PubOp
allows you to do a bunch of things, here's a small table that shows what you can do with each of the operations:Set values
Set many values in one go
Relate pubs
Relate multiple pubs in one go
Nested relate
You can nest pub operations, allowing you to create, relate, and/or update multiple pubs in a single operation:
If you don't want to keep repeating
{ communityId: ..., lastModifiedBy: ... }
, you also define a function insteadThis just returns a new
PubOp
withcommunityId
etc already set.You'll still need to provide the
pubTypeId
forupsert
andcreate
operations.Test Plan
Look at the tests in
platform/core/lib/server/pub-op.db.test.ts
Lines 1 to 1329 in 64ebece
Write one extra test to get a feel for it (don't need to push it, but would be appreciated!)
Screenshots (if applicable)
Notes
Note
This new API isn't used anywhere yet, that comes later!