Unsorted results prefix performance #8396

macdrevx · 2023-10-13T21:09:35Z

How frequently does the bug occur?

Always

Description

I'm working on optimizing a query of a large dataset where I only need the first N results.

In #4886 (comment), there's the following blurb:

Because you're sorting the Results, Realm can't just terminate the query search after it finds a small number of objects matching the predicate, as that might not be the first match after the sort operation. So all 80,000 objects must be sorted before Realm can even serve you one of them. Skipping the sort operation would drastically improve the performance here.

My use case doesn't require sorting, so I was expecting that I could iterate through the results with a for … in loop, breaking after finding N matches, and that it would take a similar amount of time for realm to find each next result.

What I'm actually seeing is that, even without sorting, accessing the first result takes dramatically longer than accessing each subsequent result. Is this expected?

Stacktrace & log output

…
created 1000000
[0.30182695388793945, 1.4066696166992188e-05, 2.0265579223632812e-06, 0.0, 9.5367431640625e-07, 0.0, 0.0, 0.0, 9.5367431640625e-07, 7.033348083496094e-06, 0.0, 0.0, 0.0, 0.0, 0.0, 9.5367431640625e-07, 4.506111145019531e-05, 0.0, 0.0, 0.0, 9.5367431640625e-07]

Each array entry shows the time it took to get the next element in the for…in loop. Note how much longer it takes to access the first element despite the fact that the results are unsorted.

Can you reproduce the bug?

Always

Reproduction Steps

PrefixPerfDemo.zip

Run attached Xcode project.

Version

10.43.0

What Atlas Services are you using?

Local Database only

Are you using encryption?

No

Platform OS and version(s)

iOS Simulator 16.4

Build environment

Xcode version: 14.3.1
Dependency manager and version: Swift Package Manager (whatever version comes with Xcode 14.3.1)

The text was updated successfully, but these errors were encountered:

tgoyne · 2023-10-13T21:46:55Z

The query is run and the full object set is assembled the first time anything which needs it is access, and we don't do an incremental find. The part which is lazy is creating the accessor object for each object in the Results (which for a non-sorted Results is typically much slower than the actual query), and reading any of the data for those objects which wasn't used in the query. We don't have an offset: parameter on index(matching:) because it'd be a performance trap; objects[objects.index(matching: query)] is a bit faster than objects.filter(query).first, but doing multiple searches is typically dramatically slower than just finding all of them even if you end up discarding most of the results without using them.

There isn't any good reason for Results to not have a .limit() function, but I guess it's just never really come up as something anyone has needed in the time since it was implemented in the query engine.

macdrevx · 2023-10-13T22:27:46Z

Thanks for the quick answer!

macdrevx added the T-Bug label Oct 13, 2023

realm-probot bot added the O-Community label Oct 13, 2023

github-actions bot added Encryption:Off Frequency:Always Repro:Always SDK-Use:Local labels Oct 13, 2023

macdrevx closed this as completed Oct 13, 2023

github-actions bot locked as resolved and limited conversation to collaborators Mar 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unsorted results prefix performance #8396

Unsorted results prefix performance #8396

macdrevx commented Oct 13, 2023 •

edited by sync-by-unito bot

Loading

tgoyne commented Oct 13, 2023

macdrevx commented Oct 13, 2023

Unsorted results prefix performance #8396

Unsorted results prefix performance #8396

Comments

macdrevx commented Oct 13, 2023 • edited by sync-by-unito bot Loading

How frequently does the bug occur?

Description

Stacktrace & log output

Can you reproduce the bug?

Reproduction Steps

Version

What Atlas Services are you using?

Are you using encryption?

Platform OS and version(s)

Build environment

tgoyne commented Oct 13, 2023

macdrevx commented Oct 13, 2023

macdrevx commented Oct 13, 2023 •

edited by sync-by-unito bot

Loading