Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing curl binary and update CI #45

Merged
merged 1 commit into from
Aug 30, 2023

Conversation

AndreMiras
Copy link
Contributor

Commit summary

Fix parquet file download error because of a missing curl binary. This was happening on the eth_getBalance test for instance. Note that flood swalled the exception making the error harder to spot.

Update the CI to also test against eth_getBalance which triggers a parquet file download.
Also removes the redundant node2, this was a workaround as flood used to crash with only one node.

Also unpin the patch version of Python and address a small DRY on the Vegeta version.

Motivation

As in version 0.3.1, eth_getBalance fails in a docker environment:

docker run --rm paradigmxyz/flood:latest eth_getBalance node1=https://eth.llamarpc.com --duration 3 --rate 1

Output:

┌───────────────────────────┐
│ Load test: eth_getBalance │
└───────────────────────────┘
- sample rates: [1]
- sample duration: 3
- extra args: None
- output directory: /tmp/tmpsrenaj0e

Gathering node data...
──────────────────────
[WARNING] ctc config file does not exist; use `ctc setup` on command line to generate a config file

       node  │                       url  │               metadata  
    ─────────┼────────────────────────────┼─────────────────────────
      node1  │  https://eth.llamarpc.com  │  llamanodes_web3_proxy  
             │                            │               v1.42.11  


Running load tests...
─────────────────────
[2023-08-30 20:37:24] Starting
[2023-08-30 20:37:24] Running load test for node1
/home/flood/.local/lib/python3.11/site-packages/flood/generators/raw_data_sources/raw_download_utils.py:68: UserWarning: FLOOD_SAMPLES_DIR not set in env, defaulting to CWD
  warnings.warn('FLOOD_SAMPLES_DIR not set in env, defaulting to CWD')
downloading ethereum_contracts_samples__L__v1_0_0.parquet

downloading https://datasets.paradigm.xyz/evm_samples/ethereum_samples__v1_0_0/ethereum_contracts_samples__L__v1_0_0.parquet
2

Solution

Fix parquet file download error because of a missing curl binary. This was happening on the eth_getBalance test for instance. Note that flood swallowed the exception making the error harder to spot.

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

Fix parquet file download error because of a missing curl binary.
This was happening on the `eth_getBalance` test for instance.
Note that flood swallowed the exception making the error harder to spot.

Update the CI to also test against eth_getBalance which triggers a
parquet file download.
Also removes the redundant node2, this was a workaround as flood used to
crash with only one node.

Also unpin the patch version of Python and address a small DRY on the
Vegeta version.
@sslivkoff
Copy link
Member

👍

so just to be clear, there is no actual error with the parquet file right? this new image just embeds the parquet file inside the image for convenience?

@AndreMiras
Copy link
Contributor Author

AndreMiras commented Aug 30, 2023

Not exactly, it's happening while trying to download the parquet file using the pdp utility package here:
https://github.com/paradigmxyz/flood/blob/0.3.1/flood/generators/raw_data_sources/raw_download_utils.py#L58
However pdp relies on curl here https://github.com/paradigmxyz/paradigm-data-portal/blob/0.2.2/pdp/data_utils/file_utils.py#L48-L57
So without that change, running the eth_getBalance test in a container fails with a swallowed FileNotFoundError.
The non-swallowed exception would be something like:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/flood/.local/lib/python3.11/site-packages/pdp/data_utils/file_utils.py", line 57, in download_file
    subprocess.call(['curl', url, '--output', output_path])
  File "/usr/local/lib/python3.11/subprocess.py", line 389, in call
    with Popen(*popenargs, **kwargs) as p:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.11/subprocess.py", line 1917, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'curl'

Edit:
So to clarify, this image doesn't embed the parquet file, that file is still being downloaded by flood when not available.
This is that exact download process that was failing because the curl binary was missing from the container.
This parquet download was introduced in b7c98b2, so we're basically just missing curl

@sslivkoff
Copy link
Member

oh. gotcha. lgtm

@sslivkoff sslivkoff merged commit 03f93c4 into paradigmxyz:main Aug 30, 2023
4 checks passed
Comment on lines +32 to +33
RUN apt-get update && apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bug fix, the rest is just cosmetic

@AndreMiras AndreMiras deleted the feature/fix_missing_curl_dep branch August 30, 2023 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants