Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port benchmarks from rivulet #459

Merged
merged 9 commits into from
Jan 24, 2025
Merged

Port benchmarks from rivulet #459

merged 9 commits into from
Jan 24, 2025

Conversation

mcember
Copy link
Collaborator

@mcember mcember commented Jan 21, 2025

Summary

Port benchmarks. Modify imports and make refactors for changes to schema and other interfaces when we ported internal repo to deltacat

NOTE: currently this diff is pulling in changes from linter, per #458. We should merge in 458 first then rebase this.

@mcember

This comment was marked as outdated.

@mcember mcember closed this Jan 21, 2025
@mcember mcember reopened this Jan 21, 2025
@mcember mcember force-pushed the dev/rivulet/port-benchmarks branch from fb801e9 to fbd238d Compare January 21, 2025 17:29
@mcember mcember requested review from flliver and thesalus January 21, 2025 17:50
deltacat/benchmarking/README.md Outdated Show resolved Hide resolved
Comment on lines +12 to +22
def timed_step(description: str) -> Generator[BenchmarkStep, None, None]:
"""Convenience for computing elapsed time of a block of code as a metric.

:param description: description of the step
:return: a benchmark operation populated with the elapsed time
"""
metric = BenchmarkStep(description)
start_time = time.time()
yield metric
end_time = time.time()
metric.add(BenchmarkMetric("elapsed_time", 1000 * (end_time - start_time), "ms"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an opportunity to replace this with the existing invoke_with_perf_counter https://github.com/ray-project/deltacat/blob/2.0/deltacat/utils/performance.py#L8-L17?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I would like to refactor the whole benchmark framework. I would like to be able to write the tests more like normal client code without wrapping things in BenchmarkEngine/Run/Step objects.

Having said that this CR is more or less a copy-pasta, I don't want to take on scope for refactoring. invoke_with_perf_counter can't be just used as-is because it runs the passed in func whereas benchmark metric is more like a dataclass which gets wrapped in benchmark step

Comment on lines 32 to 39
@contextmanager
def make_tmpdir():
tmpdir = tempfile.mkdtemp()
try:
yield tmpdir
finally:
shutil.rmtree(tmpdir)
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This is also duplicated at: https://github.com/ray-project/deltacat/blob/2.0/deltacat/tests/storage/conftest.py#L9-L16. Can we make this a fixture, and share it between benchmarks and unit tests from one common location?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding talking to an LLM is that it is an anti pattern to import fixtures explicitly, and multiple conftest.py files can only be automatically imported hierarchically, not from sibling directories. So unless we moved /benchmarks under /tests, we can't share fixtures

As a compromise I moved this to test utils and imported that from the benchmarks and test fixture

Comment on lines 30 to 31
ALL = "all"
ALL = "all"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: looks like this got duplicated.

@@ -0,0 +1,7 @@
All the dependences needed to test are in dev-requirements.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: This is going to bug me otherwise.

Suggested change
All the dependences needed to test are in dev-requirements.txt
All the dependencies needed to test are in dev-requirements.txt

@mcember mcember merged commit db034c3 into 2.0 Jan 24, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants