-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bulk create by copy operation for files #4483
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the permissions: I would imagine you should only need files/read on the source and files/write on the destination
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
.../src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/FilesRoutes.scala
Outdated
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Show resolved
Hide resolved
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
.../src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/FilesRoutes.scala
Outdated
Show resolved
Hide resolved
.../main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/CopyFilePayload.scala
Outdated
Show resolved
Hide resolved
...torage/src/test/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/FileFixtures.scala
Show resolved
Hide resolved
...s/storage/src/test/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/FilesSpec.scala
Outdated
Show resolved
Hide resolved
.../test/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/FilesRoutesSpec.scala
Outdated
Show resolved
Hide resolved
"destinationFilename": "{destinationFilename}", | ||
"sourceProjectRef": "{sourceOrg}/{sourceProj}", | ||
"sourceFileId": "{sourceFileId}", | ||
"sourceTag": "{sourceTagName}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some other places, we just pass the id suffixed by the rev/tag (example: multi-fetch)
...ugins/storage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/Files.scala
Outdated
Show resolved
Hide resolved
destFilesAttributes.traverse { destFileAttributes => | ||
for { | ||
iri <- generateId(pc) | ||
command = CreateFile(iri, dest.project, destStorageRef, destStorageTpe, destFileAttributes, c.subject, dest.tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be interesting to keep track that this file is a copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good shout, Cristina mentioned wanting that - this is already massive so I'm thinking do in a follow up?
.../src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/FilesRoutes.scala
Outdated
Show resolved
Hide resolved
...orage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/StoragePluginModule.scala
Outdated
Show resolved
Hide resolved
...age/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/batch/BatchCopy.scala
Show resolved
Hide resolved
...main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/routes/BatchFilesRoutes.scala
Outdated
Show resolved
Hide resolved
/** | ||
* Rejection returned when a storage cannot fetch a file's attributes | ||
*/ | ||
sealed abstract class CopyFileRejection(loggedDetails: String) extends StorageFileRejection(loggedDetails) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have it only on the FileRejection
side ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh as in ditch the existing pattern since it just adds more error types than are necessary? I guess this came from monix typed errors in the past
import io.circe.syntax._ | ||
import io.circe.{Encoder, Json, JsonObject} | ||
|
||
final case class BulkOperationResults[A](results: Seq[A]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end, it is not something that is restrained to bulk operations, we could have a more generic name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go the opposite direction and call it CopyFilesResult
😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not? The only purpose of this type was to encode the bulk-operation
context in a contained way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is a CopyFileResult
, it should go in the storage plugin (and the related context too)
Ref.of[IO, Option[CopyOperationFailed]](None).flatMap { errorRef => | ||
files | ||
.parTraverse { case c @ CopyBetween(source, dest) => | ||
copySingle(source, dest).onError(_ => errorRef.set(Some(CopyOperationFailed(c)))) | ||
copySingle(source, dest).onError(e => errorRef.set(Some(CopyOperationFailed(c, e)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the ref get used here rather than modifying the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think I understand the question 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we setting a ref when we could just return the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll refactor this afterwards
...src/test/scala/ch/epfl/bluebrain/nexus/delta/kernel/utils/TransactionalFileCopierSuite.scala
Outdated
Show resolved
Hide resolved
...src/test/scala/ch/epfl/bluebrain/nexus/delta/kernel/utils/TransactionalFileCopierSuite.scala
Show resolved
Hide resolved
...orage/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/StoragePluginModule.scala
Show resolved
Hide resolved
...age/src/main/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/files/batch/BatchCopy.scala
Show resolved
Hide resolved
...n/scala/ch/epfl/bluebrain/nexus/delta/plugins/storage/storages/operations/disk/package.scala
Show resolved
Hide resolved
val events = ListBuffer.empty[Event] | ||
val (sourceFileRes, sourceStorage) = genFileResourceAndStorage(sourceFileId, sourceProj.context, diskVal) | ||
val (user, aclCheck) = userAuthorizedOnProjectStorage(sourceStorage.value) | ||
|
||
val batchCopy = mkBatchCopy( | ||
fetchFile = stubbedFetchFile(sourceFileRes, events), | ||
fetchStorage = stubbedFetchStorage(sourceStorage, events), | ||
aclCheck = aclCheck, | ||
stats = stubbedStorageStats(storageStatEntry, events), | ||
diskCopy = stubbedDiskCopy(stubbedFileAttr, events) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big setup
val destStorage: DiskStorage = genDiskStorage() | ||
|
||
batchCopy.copyFiles(source, destStorage)(caller(user)).map { obtained => | ||
val obtainedEvents = events.toList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if an abstraction around the events list might be a bit clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is actually a really nice pure FP abstraction for this kinda of testing (no mutable state), but it relies on tagless final / monad transformers so not a good fit here. I'll think about how to make it clearer tho 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the mutable state is the problem, it just needs to read better. I was thinking just a tiny class / enclosure
test("batch copying should fetch storage, perform copy and evaluate create file commands") { | ||
val events = ListBuffer.empty[Event] | ||
val destProj: Project = genProject() | ||
val (destStorageRef, destStorage) = (genRevision(), genStorage(destProj.ref, diskVal)) | ||
val fetchFileStorage = mockFetchFileStorage(destStorageRef, destStorage.storage, events) | ||
val stubbedDestAttributes = genAttributes() | ||
val batchCopy = BatchCopyMock.withStubbedCopyFiles(events, stubbedDestAttributes) | ||
val destFileUUId = UUID.randomUUID() // Not testing UUID generation, same for all of them | ||
|
||
val batchFiles: BatchFiles = mkBatchFiles(events, destProj, destFileUUId, fetchFileStorage, batchCopy) | ||
implicit val c: Caller = Caller(genUser(), Set()) | ||
val (source, destination) = (genCopyFileSource(), genCopyFileDestination(destProj.ref, destStorage.storage)) | ||
val obtained = batchFiles.copyFiles(source, destination).accepted | ||
|
||
val expectedFileIri = destProj.base.iri / destFileUUId.toString | ||
val expectedCmds = stubbedDestAttributes.map( | ||
CreateFile(expectedFileIri, destProj.ref, destStorageRef, destStorage.value.tpe, _, c.subject, destination.tag) | ||
) | ||
|
||
// resources returned are based on file command evaluation | ||
assertEquals(obtained, expectedCmds.map(genFileResourceFromCmd)) | ||
|
||
val expectedActiveStorageFetched = ActiveStorageFetched(destination.storage, destProj.ref, destProj.context, c) | ||
val expectedBatchCopyCalled = BatchCopyCalled(source, destStorage.storage, c) | ||
val expectedCommandsEvaluated = expectedCmds.toList.map(FileCommandEvaluated) | ||
val expectedEvents = List(expectedActiveStorageFetched, expectedBatchCopyCalled) ++ expectedCommandsEvaluated | ||
assertEquals(events.toList, expectedEvents) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what to suggest but this feels like a really big test
import io.circe.syntax._ | ||
import io.circe.{Encoder, Json, JsonObject} | ||
|
||
final case class BulkOperationResults[A](results: Seq[A]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go the opposite direction and call it CopyFilesResult
😆
for { | ||
iri <- generateId(pc) | ||
command = | ||
CreateFile(iri, dest.project, destStorageRef, destStorageTpe, destFileAttributes, c.subject, dest.tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we want to preserve the original tag.
It would be nice in some way to know that this file has been copied from another
@@ -8,7 +8,7 @@ import ch.epfl.bluebrain.nexus.tests.iam.types.Permission | |||
import io.circe.Json | |||
import org.scalatest.Assertion | |||
|
|||
class DiskStorageSpec extends StorageSpec { | |||
class DiskStorageSpec extends StorageSpec with CopyFilesSpec { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be other test classes dedicated for the batch operation ?
Those are already complicated, it may be better to split and just keep the creation of storages in common between them ?
Fixes #4400 with a bulk file copy operation.
This PR unfortunately got massive and I'd be happy to pair with someone to go through anything not clear.
Docs still to come