Streaming data to/from IndexedDB #419

Open
dumbmatter opened this issue Apr 19, 2024 · 2 comments
Labels
TPAC2024 Topic for discussion at TPAC 2024

Comments

@dumbmatter

Now that the Streams API is widely supported, would it make sense to have some built-in IndexedDB API for streaming data to/from IndexedDB?

The problem is that this functionality is currently awkward and inefficient to write yourself. For example, if you want to create a ReadableStream that outputs all of the data in a giant object store, you can't just naively iterate over a cursor in ReadableStream.pull, because the transaction will automatically commit once it has no pending requests, for example while the stream waits for its consumer. So you wind up balancing two constraints: the stream wants only part of the data in memory at once, while IndexedDB closes any transaction that goes inactive. Something like this:

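// Note: `.store` on the transaction and `await cursor.continue()` are not part
// of raw IndexedDB; this appears to assume `db` comes from the `idb` promise
// wrapper library.
const makeReadableStream = (db, store) => {
  // Last key emitted so far, so the next pull can resume just after it
  let prevKey;

  return new ReadableStream({
    async pull(controller) {
      // Exclusive lower bound: skip everything already enqueued
      const range = prevKey !== undefined
        ? IDBKeyRange.lowerBound(prevKey, true)
        : undefined;

      const MIN_BATCH_SIZE = 100;
      let batchCount = 0;

      // Each pull opens a fresh transaction, since the previous one has
      // already closed by the time the consumer asks for more data
      let cursor = await db.transaction(store).store.openCursor(range);
      while (cursor) {
        controller.enqueue(`${JSON.stringify(cursor.value)}\n`);
        prevKey = cursor.key;
        batchCount += 1;

        // Keep going while the stream wants more data, and always read at
        // least MIN_BATCH_SIZE records per transaction
        if (controller.desiredSize > 0 || batchCount < MIN_BATCH_SIZE) {
          cursor = await cursor.continue();
        } else {
          break;
        }
      }

      console.log(`Done batch of ${batchCount} objects`);

      if (!cursor) {
        // Actually done with this store, not just paused
        console.log("Completely done");
        controller.close();
      }
    },
  }, {
    highWaterMark: 100,
  });
};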

In addition to that code being a little complicated to write, it's also probably slower than it needs to be due to creating many transactions over the course of a large stream.
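For comparison, here is a rough sketch of the same idea without a cursor, assuming raw IndexedDB rather than a wrapper library. It reads one fixed-size batch per pull with getAll and getAllKeys in a single short transaction, so nothing has to stay alive between pulls. The names readBatch and makeBatchedStream are hypothetical, not part of any API, and this still opens one transaction per batch, so it does not solve the efficiency concern above.

```javascript
// Hypothetical helper: read up to `limit` records whose keys are strictly
// greater than `afterKey`, using one short-lived readonly transaction.
const readBatch = (db, storeName, afterKey, limit) =>
  new Promise((resolve, reject) => {
    const range = afterKey !== undefined
      ? IDBKeyRange.lowerBound(afterKey, true)
      : undefined;
    const tx = db.transaction(storeName, "readonly");
    const store = tx.objectStore(storeName);
    // Both requests use the same range and count inside one transaction,
    // so keys[i] is the key of values[i].
    const keysReq = store.getAllKeys(range, limit);
    const valuesReq = store.getAll(range, limit);
    tx.oncomplete = () =>
      resolve({ keys: keysReq.result, values: valuesReq.result });
    tx.onabort = () => reject(tx.error);
  });

// Hypothetical helper: wrap any `fetchBatch(afterKey, limit)` function that
// resolves to { keys, values } in a ReadableStream of NDJSON lines.
const makeBatchedStream = (fetchBatch, batchSize = 100) => {
  let prevKey;
  return new ReadableStream({
    async pull(controller) {
      const { keys, values } = await fetchBatch(prevKey, batchSize);
      for (const value of values) {
        controller.enqueue(`${JSON.stringify(value)}\n`);
      }
      if (keys.length > 0) {
        prevKey = keys[keys.length - 1];
      }
      // A short batch means the store is exhausted
      if (keys.length < batchSize) {
        controller.close();
      }
    },
  });
};
```

In a page this might be used as makeBatchedStream((afterKey, limit) => readBatch(db, "players", afterKey, limit)), where "players" is a made-up store name. Keeping the stream logic separate from the IndexedDB call also makes it easy to exercise with a fake data source.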

I wrote a blog post about this a few years ago, and when I search I still can't find anyone else talking about doing stuff like this. But a couple of people find that article through Google every day, and every now and again someone emails me about it, so I'm not literally the only person interested in this, although I admit it's probably a niche use case. I do have hundreds of users every day exporting large amounts of data from IndexedDB in my video games, and that uses code similar to what I wrote in that blog post.

What would be better is maybe an API equivalent to getAll - a method on IDBObjectStore and IDBIndex that takes an IDBKeyRange and returns a stream of all matching records. And then maybe also an equivalent API for writing data to an object store.
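To illustrate the write direction with what exists today, here is a minimal sketch of a WritableStream that buffers incoming records and writes them in batches, one readwrite transaction per batch. The names putBatch and makeWritableStream are hypothetical, and the sketch assumes the object store uses in-line keys (a keyPath), so put needs no explicit key.

```javascript
// Hypothetical helper: write an array of records in one readwrite
// transaction. Assumes the store has a keyPath (in-line keys).
const putBatch = (db, storeName, records) =>
  new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite");
    const store = tx.objectStore(storeName);
    for (const record of records) {
      store.put(record);
    }
    tx.oncomplete = () => resolve();
    tx.onabort = () => reject(tx.error);
  });

// Hypothetical helper: wrap any `writeBatch(records)` function that returns
// a promise in a WritableStream that flushes every `batchSize` records.
const makeWritableStream = (writeBatch, batchSize = 100) => {
  let buffer = [];

  const flush = async () => {
    if (buffer.length === 0) return;
    const records = buffer;
    buffer = [];
    await writeBatch(records);
  };

  return new WritableStream({
    async write(record) {
      buffer.push(record);
      // Awaiting here applies backpressure until the batch is committed
      if (buffer.length >= batchSize) {
        await flush();
      }
    },
    // Write whatever is left when the producer closes the stream
    close() {
      return flush();
    },
  });
};
```

In a page this might be used as makeWritableStream((records) => putBatch(db, "players", records)), with "players" again a made-up store name. Like the read side, it needs many transactions for a large import, which is the overhead a built-in API could avoid.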

@asutherland
Collaborator

xref #34 on explicit transaction lifetime control.

SteveBeckerMSFT added the TPAC2024 (Topic for discussion at TPAC 2024) label Sep 9, 2024
@SteveBeckerMSFT

TPAC 2024: We discussed streaming large values with IDB reads and writes. We pointed out that this can be accomplished today using File and Blob, which can then be stored in IDB. However, this potentially forces developers to implement a two-phase commit between the large value in File/Blob and the IDB transaction.
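As a rough illustration of the File/Blob route mentioned above, the sketch below collects a stream of text into a Blob and stores that Blob as a single IndexedDB value. The names textStreamToBlob and putBlob are hypothetical, and putBlob assumes a store with out-of-line keys. Whether the browser keeps the Blob in memory or spills it to disk while it is being built is implementation-dependent.

```javascript
// Hypothetical helper: collect a ReadableStream of strings into a Blob.
// Response bodies must be bytes, so the text is encoded first.
const textStreamToBlob = (textStream) =>
  new Response(textStream.pipeThrough(new TextEncoderStream())).blob();

// Hypothetical helper: store a Blob under `key` in one readwrite
// transaction. Assumes the store uses out-of-line keys.
const putBlob = (db, storeName, key, blob) =>
  new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite");
    tx.objectStore(storeName).put(blob, key);
    tx.oncomplete = () => resolve();
    tx.onabort = () => reject(tx.error);
  });
```

Reading the value back with a normal get returns the Blob, and blob.stream() then gives a ReadableStream of bytes. The Blob is fully built before the transaction starts, which is where the coordination problem described above comes from.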

Perhaps there is still an opportunity to improve IndexedDB's API ergonomics when interacting with the Streams API, to reduce the amount of boilerplate code required by the example above. Also, as noted above, providing developers with explicit transaction lifetime control may reduce the number of transactions required when streaming data from IDB.
