Add an unstable_merge function to deduplicate PAGE_DATA structures as best as possible #43

Open
chris-oo opened this issue Apr 16, 2024 · 3 comments

Instead of having just a stable merge, offer an unstable merge that attempts to deduplicate PAGE_DATA structures between different compatibility masks. A user could use this unstable_merge to make the IGVM file as small as possible, then perform another pass to generate the launch measurement.
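
For illustration, a minimal Rust sketch of the idea, using simplified stand-in types rather than the igvm crate's real directive headers: identical page data destined for the same GPA has its compatibility masks OR'd together, at the cost of not preserving header order (hence "unstable"). A second pass over the merged file could then compute the launch measurement.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an IGVM PAGE_DATA directive; the real header type
/// carries more fields (data type, VTL, etc.).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PageData {
    gpa: u64,
    flags: u32,
    data: Vec<u8>,
}

/// A page directive paired with the compatibility masks it applies to.
#[derive(Clone, Debug)]
struct Directive {
    compatibility_mask: u32,
    page: PageData,
}

/// Deduplicate identical PAGE_DATA entries across compatibility masks by
/// keying on the page contents and OR-ing the masks together. Output ordering
/// is not preserved, which is what makes this an "unstable" merge.
fn unstable_merge(inputs: Vec<Directive>) -> Vec<Directive> {
    let mut dedup: HashMap<PageData, u32> = HashMap::new();
    for d in inputs {
        let mask = d.compatibility_mask;
        *dedup.entry(d.page).or_insert(0) |= mask;
    }
    dedup
        .into_iter()
        .map(|(page, compatibility_mask)| Directive { compatibility_mask, page })
        .collect()
}

fn main() {
    // Two platforms (mask bits 0x1 and 0x2) describing identical page contents.
    let a = Directive {
        compatibility_mask: 0x1,
        page: PageData { gpa: 0x1000, flags: 0, data: vec![0xAB; 4096] },
    };
    let b = Directive {
        compatibility_mask: 0x2,
        page: PageData { gpa: 0x1000, flags: 0, data: vec![0xAB; 4096] },
    };
    let merged = unstable_merge(vec![a, b]);
    assert_eq!(merged.len(), 1);
    assert_eq!(merged[0].compatibility_mask, 0x3);
}
```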

chris-oo self-assigned this Apr 16, 2024

chris-oo commented Apr 22, 2024

A different approach to this is deduplicating file data, as implemented by #44. While an unstable_merge could still be useful, file data actually accounts for the larger portion of the file size; in a different project, deduplicating it reduced the generated file size by as much as two-thirds or more.
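
For context, file-data deduplication works at serialization time: identical page contents are written to the file once and referenced by offset from multiple headers. A rough sketch under simplified assumptions (not the crate's actual serializer):

```rust
use std::collections::HashMap;

/// Sketch of content-based file data deduplication at serialization time:
/// identical page contents are written once, and every header carrying the
/// same bytes is pointed at the same file offset. Simplified stand-in layout;
/// not the igvm crate's serializer.
fn dedup_file_data(pages: &[Vec<u8>]) -> (Vec<u8>, Vec<u64>) {
    let mut blob: Vec<u8> = Vec::new();                 // concatenated unique page data
    let mut offsets = Vec::with_capacity(pages.len());  // per-header file offset
    let mut seen: HashMap<&[u8], u64> = HashMap::new();

    for page in pages {
        let offset = *seen.entry(page.as_slice()).or_insert_with(|| {
            let off = blob.len() as u64;
            blob.extend_from_slice(page);
            off
        });
        offsets.push(offset);
    }
    (blob, offsets)
}

fn main() {
    let pages = vec![vec![0u8; 4096], vec![1u8; 4096], vec![0u8; 4096]];
    let (blob, offsets) = dedup_file_data(&pages);
    assert_eq!(blob.len(), 2 * 4096); // the repeated page is stored only once
    assert_eq!(offsets, vec![0u64, 4096, 0]);
}
```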

chris-oo commented

For now I'm going to close this because we don't yet need it; the variable headers section is small. We could revisit it in the future.

chris-oo reopened this Apr 30, 2024
chris-oo commented

The current stable merge function is slow, so it's potentially worth looking at unstable_merge again.

chris-oo added a commit that referenced this issue Apr 30, 2024
On large IGVM files with many headers, the merge function can take a
significant amount of time when attempting to merge headers via
compatibility mask. I wasn't able to think of any good way to perform
this merge faster while maintaining stability of the headers.

Offer a `merge_simple` function that merely does the required fixups.
This is significantly faster, and since serialization deduplicates
file_data, the final file isn't much larger, as variable headers are
individually quite small.

In the future we could implement a linear time `unstable_merge` that
attempts to merge equivalent headers via a hashmap.

#43
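
For illustration, the "required fixups" mentioned in the commit above amount to giving the incoming directives compatibility mask bits that don't collide with the existing file's, then concatenating, with no attempt at deduplication. A rough sketch with hypothetical, simplified types (not the actual `merge_simple` signature):

```rust
/// Rough illustration of a fixups-only merge: give the incoming directives a
/// fresh compatibility mask bit and append them, with no deduplication.
/// Simplified stand-in types; not the igvm crate's `merge_simple`.
#[derive(Clone, Debug)]
struct Directive {
    compatibility_mask: u32,
    // ...rest of the header payload elided...
}

fn merge_simple(mut a: Vec<Directive>, b: Vec<Directive>, masks_used_by_a: u32) -> Vec<Directive> {
    // Pick the lowest mask bit `a` does not already use. For brevity this
    // assumes `b`'s directives all share a single compatibility mask.
    let new_bit = 1u32 << (!masks_used_by_a).trailing_zeros();
    a.extend(b.into_iter().map(|mut d| {
        d.compatibility_mask = new_bit;
        d
    }));
    a
}

fn main() {
    let a = vec![Directive { compatibility_mask: 0x1 }; 2];
    let b = vec![Directive { compatibility_mask: 0x1 }];
    let merged = merge_simple(a, b, 0x1);
    // `b`'s directive now uses mask bit 0x2 and simply follows `a`'s headers.
    assert_eq!(merged[2].compatibility_mask, 0x2);
}
```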