Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List storage jobs query API #748

Merged
merged 11 commits into from
Jan 15, 2021

Conversation

asutula
Copy link
Member

@asutula asutula commented Jan 3, 2021

The StorageJobs API wasn't feeling right to me, and the main problem is that there was no way to query for a list of historical StorageJobs. This PR implements that new API and also takes a stab at the more general problem of data paging in APIs. This new ListStorageJobs API attempts to support query options for filter by API id, filter by cid, filter by job state, limit, ascending/descending, all while also supporting data paging. It's a bit of a bespoke solution for the particular use case of StorageJobs, but I feel like we can use similar ideas in other places in the API where data paging is required.

I want to get the bulk of the work out for preliminary PR feedback now, and will continue with the work still remaining:

  • Incorporate this new API into the client and CLI
  • Write a needed data migration in the sjstore
  • Deleting redundant/old/useless APIs that this new one replaces
  • Finish writing through tests for sjstore queries.

@asutula asutula requested a review from jsign January 3, 2021 01:24
@asutula asutula self-assigned this Jan 3, 2021
Signed-off-by: Aaron Sutula <[email protected]>
Signed-off-by: Aaron Sutula <[email protected]>
Signed-off-by: Aaron Sutula <[email protected]>
Copy link
Contributor

@jsign jsign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left some notes.

api/client/admin/storagejobs.go Show resolved Hide resolved
api/client/admin/storagejobs.go Show resolved Hide resolved
api/client/storagejobs.go Show resolved Hide resolved
api/server/admin/jobs.go Show resolved Hide resolved
api/server/admin/jobs.go Outdated Show resolved Hide resolved
ffs/scheduler/internal/sjstore/sjstore.go Show resolved Hide resolved
ffs/scheduler/internal/sjstore/sjstore.go Outdated Show resolved Hide resolved
ffs/scheduler/internal/sjstore/sjstore.go Outdated Show resolved Hide resolved
ffs/scheduler/internal/sjstore/sjstore.go Show resolved Hide resolved
ffs/scheduler/internal/sjstore/sjstore.go Show resolved Hide resolved
Signed-off-by: Aaron Sutula <[email protected]>
Signed-off-by: Aaron Sutula <[email protected]>
Signed-off-by: Aaron Sutula <[email protected]>
Signed-off-by: Aaron Sutula <[email protected]>

// V3StorageJobsIndexMigration contains the logic to upgrade a datastore from
// version 2 to version 3.
var V3StorageJobsIndexMigration = Migration{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this all the migration "3" naming so it doesn't conflict with migration 2 that already exists in jsign/import... will make my pending rebase easier.

"github.com/ipfs/go-datastore/query"
)

type storageJob struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Includes only the needed parts of ffs.StorageJob.

Comment on lines +23 to +52
Run: func(ds datastoreReaderWriter) error {
q := query.Query{Prefix: "/ffs/scheduler/sjstore/job"}
res, err := ds.Query(q)
if err != nil {
return fmt.Errorf("querying sjstore jobs: %s", err)
}
defer func() { _ = res.Close() }()

for r := range res.Next() {
if r.Error != nil {
return fmt.Errorf("iterating results: %s", r.Error)
}

var job storageJob
if err := json.Unmarshal(r.Value, &job); err != nil {
return fmt.Errorf("unmarshaling job: %s", err)
}

apiidKey := datastore.NewKey("/ffs/scheduler/sjstore/apiid").ChildString(job.APIID).ChildString(job.Cid.String()).ChildString(fmt.Sprintf("%d", job.CreatedAt))
cidKey := datastore.NewKey("/ffs/scheduler/sjstore/cid").ChildString(job.Cid.String()).ChildString(job.APIID).ChildString(fmt.Sprintf("%d", job.CreatedAt))

if err := ds.Put(apiidKey, []byte(job.ID)); err != nil {
return fmt.Errorf("putting apiid index record in datastore: %s", err)
}
if err := ds.Put(cidKey, []byte(job.ID)); err != nil {
return fmt.Errorf("putting cid index record in datastore: %s", err)
}
}
return nil
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very simple migration that simply reads all saved StorageJobs and writes the corresponding index records for them.

// V3StorageJobsIndexMigration contains the logic to upgrade a datastore from
// version 2 to version 3.
var V3StorageJobsIndexMigration = Migration{
UseTxn: true,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a "big" migration, so seems like this is a good idea.

Comment on lines +11 to +22
t.Parallel()

ds := tests.NewTxMapDatastore()

pre(t, ds, "testdata/v3_StorageJobs.pre")
txn, _ := ds.NewTransaction(false)

err := V3StorageJobsIndexMigration.Run(txn)
require.NoError(t, err)
require.NoError(t, txn.Commit())

post(t, ds, "testdata/v3_StorageJobs.post")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much to test here other than the data is what we expect according to the csv files.

@@ -155,6 +155,7 @@ func TestRealDataBadger(t *testing.T) {

migrations := map[int]Migration{
1: V1MultitenancyMigration,
2: V3StorageJobsIndexMigration,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge this with jsign/import and this migration will be key 3.

Signed-off-by: Aaron Sutula <[email protected]>
// it returns a nil *ffs.Job and no-error.
func (s *Store) Dequeue() (*ffs.StorageJob, error) {
func (s *Store) Dequeue(iid ffs.APIID) (*ffs.StorageJob, error) {
Copy link
Member Author

@asutula asutula Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @jsign the new apiid arg to Dequeue

cb func(t *testing.T, s *Store, createdJobs ...createdJobs)
}

var testsList = []test{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also FYI @jsign here's a declarative test setup I'm using to test List.

Comment on lines +409 to +424
t.Run("APIID", func(t *testing.T) {
t.Parallel()
s := create(t)

j1 := createJob(t, "apiid1", cid.Undef)
err := s.Enqueue(j1)
require.NoError(t, err)

j2 := createJob(t, "apiid2", cid.Undef)
err = s.Enqueue(j2)
require.NoError(t, err)

j, err := s.Dequeue(ffs.APIID("apiid2"))
require.NoError(t, err)
require.Equal(t, ffs.APIID("apiid2"), j.APIID)
})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New test to test dequeuing a job for a specific api id.

@asutula asutula merged commit 5ecc0ec into asutula/cid-info-summary-split Jan 15, 2021
@asutula asutula deleted the asutula/list-storage-jobs branch January 15, 2021 22:53
asutula added a commit that referenced this pull request Jan 16, 2021
* first cut a a list implementation for storage jobs

Signed-off-by: Aaron Sutula <[email protected]>

* bubble up list storage jobs api through scheduler, ffs, and apis

Signed-off-by: Aaron Sutula <[email protected]>

* client for list, better name for next page token

Signed-off-by: Aaron Sutula <[email protected]>

* list cli

Signed-off-by: Aaron Sutula <[email protected]>

* fix selector filter

Signed-off-by: Aaron Sutula <[email protected]>

* remove old/redundant code

Signed-off-by: Aaron Sutula <[email protected]>

* better alias

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* remove counts

Signed-off-by: Aaron Sutula <[email protected]>

* migration

Signed-off-by: Aaron Sutula <[email protected]>

* good tests for sjstore.List

Signed-off-by: Aaron Sutula <[email protected]>
asutula added a commit that referenced this pull request Jan 16, 2021
* first cut a a list implementation for storage jobs

Signed-off-by: Aaron Sutula <[email protected]>

* bubble up list storage jobs api through scheduler, ffs, and apis

Signed-off-by: Aaron Sutula <[email protected]>

* client for list, better name for next page token

Signed-off-by: Aaron Sutula <[email protected]>

* list cli

Signed-off-by: Aaron Sutula <[email protected]>

* fix selector filter

Signed-off-by: Aaron Sutula <[email protected]>

* remove old/redundant code

Signed-off-by: Aaron Sutula <[email protected]>

* better alias

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* remove counts

Signed-off-by: Aaron Sutula <[email protected]>

* migration

Signed-off-by: Aaron Sutula <[email protected]>

* good tests for sjstore.List

Signed-off-by: Aaron Sutula <[email protected]>
asutula added a commit that referenced this pull request Jan 18, 2021
* new summary api and simplified cid info

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* Proposed CLI package structure (#728)

* move data cli commands to packages, refactor common code

Signed-off-by: Aaron Sutula <[email protected]>

* add data summary command that was forgotten

Signed-off-by: Aaron Sutula <[email protected]>

* apply cli org to the rest

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* Data summary TUI (#729)

* basic tui for pow data summary

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* List storage jobs query API (#748)

* first cut a a list implementation for storage jobs

Signed-off-by: Aaron Sutula <[email protected]>

* bubble up list storage jobs api through scheduler, ffs, and apis

Signed-off-by: Aaron Sutula <[email protected]>

* client for list, better name for next page token

Signed-off-by: Aaron Sutula <[email protected]>

* list cli

Signed-off-by: Aaron Sutula <[email protected]>

* fix selector filter

Signed-off-by: Aaron Sutula <[email protected]>

* remove old/redundant code

Signed-off-by: Aaron Sutula <[email protected]>

* better alias

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* remove counts

Signed-off-by: Aaron Sutula <[email protected]>

* migration

Signed-off-by: Aaron Sutula <[email protected]>

* good tests for sjstore.List

Signed-off-by: Aaron Sutula <[email protected]>

* update pow docs

Signed-off-by: Aaron Sutula <[email protected]>

* fix compilation problems

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* comment exported func

Signed-off-by: Aaron Sutula <[email protected]>

* fix docs

Signed-off-by: Ignacio Hagopian <[email protected]>

* add missin cmd to cli, better name for admin jobs cli

Signed-off-by: Aaron Sutula <[email protected]>

Co-authored-by: Ignacio Hagopian <[email protected]>
jsign added a commit that referenced this pull request Jan 21, 2021
* new summary api and simplified cid info

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* Proposed CLI package structure (#728)

* move data cli commands to packages, refactor common code

Signed-off-by: Aaron Sutula <[email protected]>

* add data summary command that was forgotten

Signed-off-by: Aaron Sutula <[email protected]>

* apply cli org to the rest

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* Data summary TUI (#729)

* basic tui for pow data summary

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* List storage jobs query API (#748)

* first cut a a list implementation for storage jobs

Signed-off-by: Aaron Sutula <[email protected]>

* bubble up list storage jobs api through scheduler, ffs, and apis

Signed-off-by: Aaron Sutula <[email protected]>

* client for list, better name for next page token

Signed-off-by: Aaron Sutula <[email protected]>

* list cli

Signed-off-by: Aaron Sutula <[email protected]>

* fix selector filter

Signed-off-by: Aaron Sutula <[email protected]>

* remove old/redundant code

Signed-off-by: Aaron Sutula <[email protected]>

* better alias

Signed-off-by: Aaron Sutula <[email protected]>

* pr feedback

Signed-off-by: Aaron Sutula <[email protected]>

* remove counts

Signed-off-by: Aaron Sutula <[email protected]>

* migration

Signed-off-by: Aaron Sutula <[email protected]>

* good tests for sjstore.List

Signed-off-by: Aaron Sutula <[email protected]>

* update pow docs

Signed-off-by: Aaron Sutula <[email protected]>

* fix compilation problems

Signed-off-by: Aaron Sutula <[email protected]>

* update docs

Signed-off-by: Aaron Sutula <[email protected]>

* comment exported func

Signed-off-by: Aaron Sutula <[email protected]>

* fix docs

Signed-off-by: Ignacio Hagopian <[email protected]>

* add missin cmd to cli, better name for admin jobs cli

Signed-off-by: Aaron Sutula <[email protected]>

Co-authored-by: Ignacio Hagopian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants