
Feat io tests #21

Merged — 39 commits, merged Nov 21, 2023
b73f324 fix tests after refactor (sphamba, Oct 26, 2023)
77e4df6 add gpm_api_test_data as submodule (sphamba, Oct 26, 2023)
0cf01ad add granule data (sphamba, Oct 27, 2023)
1127a87 Merge branch 'main' into fix-tests (sphamba, Oct 30, 2023)
4bb451e fix some download tests (sphamba, Oct 30, 2023)
a441759 add granule data pull in GH action (sphamba, Oct 30, 2023)
c93f62a add hdf5 file locking in GH action tests (sphamba, Oct 30, 2023)
3650cac add tests for new methods of io/checks (sphamba, Oct 30, 2023)
42ccd5c add tests for new methods of io/filter (sphamba, Oct 31, 2023)
5405ca5 add git lfs in GitHub actions (sphamba, Oct 31, 2023)
9921010 add manual lfs pull in GH action (sphamba, Oct 31, 2023)
0f6ed22 add tests for io/info (sphamba, Oct 31, 2023)
d1f3a5e remove unused test data generation scripts (sphamba, Oct 31, 2023)
ea2359d fix tests for io/filter with updated conftest (sphamba, Oct 31, 2023)
7c32bba remove lfs usage for test data (sphamba, Nov 2, 2023)
aba73e5 add tests for io/find local (sphamba, Nov 6, 2023)
b2f30f1 add tests for io/find pps (sphamba, Nov 6, 2023)
4723de7 remove io tests relying on network (sphamba, Nov 6, 2023)
f36a5b8 run io/find tests on all products (sphamba, Nov 6, 2023)
1be2640 temporary fix of io/download test (sphamba, Nov 6, 2023)
94cc5d0 fix test typings for 3.8 (sphamba, Nov 6, 2023)
29c1fb5 fix url slashes in windows for pps (sphamba, Nov 6, 2023)
5d5793d fix pps test on windows (slashes in url) (sphamba, Nov 6, 2023)
2386c9d fix test granule files in python 3.8 (sphamba, Nov 6, 2023)
6dc27b7 add tests for io/find ges_disc (sphamba, Nov 6, 2023)
d6fa57d put kwargs in test function calls (sphamba, Nov 9, 2023)
922ecba lint (sphamba, Nov 9, 2023)
704a82f add test io/filter case over two days (sphamba, Nov 9, 2023)
551ceac split dataset test_granule finalize test (sphamba, Nov 9, 2023)
35237b4 fix: set granule data path relative to root (sphamba, Nov 9, 2023)
253eaac fix caught exception in io/info (sphamba, Nov 14, 2023)
be2b26e add io/find/find_daily_filepath test (sphamba, Nov 14, 2023)
02430b9 add non-failing asserts in tests and split tests (sphamba, Nov 16, 2023)
b7d476e add io/download tests (and pps, ges_disc) (sphamba, Nov 16, 2023)
557b355 add data integrity test on real hdf5 file (sphamba, Nov 16, 2023)
8b46bf5 move test for io/checks to avoid conflicts (sphamba, Nov 20, 2023)
d83acf8 remove io/checks is_empty method (sphamba, Nov 20, 2023)
724e546 fix: remove unused variables and nonexistent returns (sphamba, Nov 21, 2023)
6437e4f lint (sphamba, Nov 21, 2023)
7 changes: 6 additions & 1 deletion .coveragerc
@@ -5,8 +5,13 @@ omit =
*dev*
*docs*
*tutorials*
gpm_api/tests/*
gpm_api/bucket/*
gpm_api/cli/*
gpm_api/encoding/*
gpm_api/etc/*
gpm_api/retrieval/*
gpm_api/tests/*
gpm_api/_version.py

[report]
exclude_lines =
4 changes: 2 additions & 2 deletions gpm_api/io/ges_disc.py
@@ -41,7 +41,7 @@ def _get_href_value(input_string):
return href_value


def _get_gesc_disc_list_path(url):
def _get_ges_disc_list_path(url):
# Retrieve url content
# - If it returns something, means url is correct
wget_output = _get_ges_disc_url_content(url)
@@ -179,7 +179,7 @@ def _get_gesdisc_file_list(url_product_dir, product, date, version, verbose=True
Default is False. Whether to specify when data are not available for a specific date.
"""
try:
filepaths = _get_gesc_disc_list_path(url_product_dir)
filepaths = _get_ges_disc_list_path(url_product_dir)
except Exception as e:
# If the url does not exist, raise an error
if "was not found on the GES DISC server" in str(e):
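The caller above distinguishes a missing remote directory from other failures by inspecting the exception message. A hedged sketch of that pattern, with toy names standing in for the real gpm_api helpers (the URL and helper names here are illustrative, not the actual API):

```python
def _get_ges_disc_list_path(url):
    """Toy stand-in for the GES DISC directory-listing helper."""
    if "missing" in url:
        raise RuntimeError(f"The url {url} was not found on the GES DISC server")
    return [url + "/granule1.HDF5", url + "/granule2.HDF5"]


def get_file_list(url, verbose=True):
    """Treat a missing remote directory as 'no data'; re-raise anything else."""
    try:
        return _get_ges_disc_list_path(url)
    except Exception as e:
        # Only swallow the specific "directory not found" failure.
        if "was not found on the GES DISC server" in str(e):
            if verbose:
                print("No data available for the requested date.")
            return []
        raise


print(get_file_list("https://example.invalid/2019/001"))
print(get_file_list("https://example.invalid/missing", verbose=False))  # → []
```

Matching on the exception message is fragile but common when the underlying tool (here, a wget-based listing) only surfaces errors as text.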
22 changes: 12 additions & 10 deletions gpm_api/tests/test_io/test_checks.py
@@ -1,4 +1,4 @@
#!/usr/bin/env python3

Check notice on line 1 in gpm_api/tests/test_io/test_checks.py — CodeScene Delta Analysis (main): ℹ Getting worse: Code Duplication — introduced similar code in test_check_remote_storage, test_check_storage.
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 17 15:41:14 2023
@@ -240,29 +240,30 @@


def test_check_product_version(
check, # For non-failing asserts
product_info: Dict[str, Any],
versions: List[int],
) -> None:
"""Test check_product_version()"""

# Check valid versions
for product, info in product_info.items():
# Check valid versions
valid_versions = info.get("available_versions", [])

for version in valid_versions:
assert checks.check_product_version(version, product) == version
with check:
assert checks.check_product_version(version, product) == version

# Check last version return if None
for product, info in product_info.items():
# Check last version return if None
last_version = info.get("available_versions", [])[-1]
assert checks.check_product_version(None, product) == last_version
with check:
assert checks.check_product_version(None, product) == last_version

# Check invalid versions
for product, info in product_info.items():
# Check invalid versions
invalid_versions = list(set(versions) - set(info.get("available_versions", [])))

for version in invalid_versions:
with pytest.raises(ValueError):
with check.raises(ValueError):
checks.check_product_version(version, product)


@@ -579,6 +580,7 @@


def test_check_valid_time_request(
check, # For non-failing asserts
product_info: Dict[str, Any],
) -> None:
"""Test check_valid_time_request()"""
@@ -596,14 +598,14 @@
# Check invalid start time
start_time = valid_start_time - datetime.timedelta(days=1)
end_time = valid_start_time + datetime.timedelta(days=1)
with pytest.raises(ValueError):
with check.raises(ValueError):
checks.check_valid_time_request(start_time, end_time, product)

# Check invalid end time
if valid_end_time is not None:
start_time = valid_end_time - datetime.timedelta(days=1)
end_time = valid_end_time + datetime.timedelta(days=1)
with pytest.raises(ValueError):
with check.raises(ValueError):
checks.check_valid_time_request(start_time, end_time, product)


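The hunks above replace hard `assert` and `pytest.raises` calls with a `check` fixture, so one failing case no longer aborts the whole test. A minimal, self-contained sketch of that soft-assertion idea (this illustrates the concept only; it is not the real pytest-check plugin API):

```python
from contextlib import contextmanager


class SoftCheck:
    """Minimal stand-in for a pytest-check-style `check` fixture:
    assertion failures are recorded instead of aborting the test."""

    def __init__(self):
        self.failures = []

    @contextmanager
    def __call__(self):
        # Record an AssertionError instead of letting it propagate.
        try:
            yield
        except AssertionError as e:
            self.failures.append(str(e))

    @contextmanager
    def raises(self, exc_type):
        # Record a failure if the expected exception is NOT raised.
        try:
            yield
        except exc_type:
            pass  # expected exception occurred
        else:
            self.failures.append(f"{exc_type.__name__} not raised")


check = SoftCheck()

with check():
    assert 1 + 1 == 3, "bad arithmetic"  # recorded, execution continues
with check():
    assert "a" in "abc"  # passes silently
with check.raises(ValueError):
    int("not a number")  # raises ValueError, so no failure recorded

print(check.failures)  # → ['bad arithmetic']
```

The payoff in loops like `for version in valid_versions:` is that every bad version is reported in one run, instead of stopping at the first mismatch.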
29 changes: 15 additions & 14 deletions gpm_api/tests/test_io/test_download.py
@@ -163,64 +163,65 @@


def test_download_data(
check, # For non-failing asserts
products: List[str],
product_types: List[str],
remote_filepaths: Dict[str, Dict[str, Any]],
mocker: MockerFixture,
versions: List[str],
):
"""Test download_data function

    This test is somewhat redundant, since it exercises methods that are
    bundled into other functions and must be mocked out in order to
    test this function in isolation. However, it is useful to have a test that
    checks the entire download process.

It may be useful as boilerplate to increase the number of tests here in the
future.
"""

mocker.patch.object(dl, "_download_files", autospec=True, return_value=[])
mocker.patch.object(dl, "_download_daily_data", autospec=True, return_value=([], versions))
mocker.patch.object(dl, "run", autospec=True, return_value=None)
from gpm_api.io import info, pps

mocker.patch.object(
info,
"_get_info_from_filename",
autospec=True,
return_value={
"product": "2A-CLIM",
"product_type": "CLIM",
"start_time": datetime.datetime(2022, 9, 7, 12, 0, 0),
"end_time": datetime.datetime(2022, 9, 7, 13, 0, 0),
"version": "V07A",
"satellite": "GPM",
"granule_id": "2A-CLIM.GPM.GMI.GPROF2021v1.20150301-S121433-E134706.005708.V07A.HDF5",
},
)
mocker.patch.object(
find,
"find_daily_filepaths",
autospec=True,
return_value=(remote_filepaths.keys(), versions),
)

# Assume files pass file integrity check by mocking return as empty
for product in products:
for product_type in product_types:
if product in available_products(product_type=product_type):
start_time = get_product_start_time(product)
if start_time is None:
continue
res = dl.download_archive(
product=product,
start_time=start_time,
end_time=start_time + datetime.timedelta(hours=1),
product_type=product_type,
)

assert res is None # Assume data is downloaded
for product_type in product_types:
for product in available_products(product_type=product_type):
start_time = get_product_start_time(product)
if start_time is None:
continue
res = dl.download_archive(
product=product,
start_time=start_time,
end_time=start_time + datetime.timedelta(hours=1),
product_type=product_type,
)

with check:
assert res is None # Assume data is downloaded

Check notice on line 224 in gpm_api/tests/test_io/test_download.py — CodeScene Delta Analysis (main): ℹ Getting worse: Excess Number of Function Arguments — test_download_data increases from 5 to 6 arguments, threshold = 4.


def test_download_daily_data_private(
@@ -318,7 +319,7 @@
"""Test convert_pps_to_local_filepaths function

Parameters
"""

Check notice on line 322 in gpm_api/tests/test_io/test_download.py — codefactor.io / CodeFactor: unresolved comment '# TODO: WRONG REDO !' (C100).
# TODO: WRONG REDO !
assert dl.get_fpaths_from_fnames(
filepaths=[
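The download tests above stub out network-facing helpers with `mocker.patch.object`. pytest-mock's `mocker` fixture wraps the standard library's `unittest.mock`, so the underlying pattern (with hypothetical class and method names) looks roughly like this:

```python
from unittest import mock


class Downloader:
    """Toy stand-in for a module whose helper would hit the network."""

    def _fetch(self, url):
        raise RuntimeError("no network access in tests")

    def download_archive(self, url):
        # In the real code this would find, filter, download and verify files.
        return self._fetch(url)


dl = Downloader()

# Replace the network-facing helper for the duration of the block;
# autospec keeps the mock's call signature consistent with the original.
with mock.patch.object(
    Downloader, "_fetch", autospec=True, return_value=[]
) as fake_fetch:
    result = dl.download_archive("https://example.invalid/archive.h5")

print(result)                 # → []
print(fake_fetch.call_count)  # → 1
```

With `mocker`, the `with` block disappears: the fixture undoes every patch automatically at test teardown, which is why the PR's tests can patch several helpers back to back.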
197 changes: 108 additions & 89 deletions gpm_api/tests/test_io/test_filter.py
@@ -50,107 +50,126 @@
)


def test_filter_filepaths(
remote_filepaths: Dict[str, Dict[str, Any]],
products: Dict[str, Dict[str, Any]],
mocker: MockerFixture,
) -> None:
class TestFilterFilepaths:
"""Test filter filepaths"""

product = "2A-DPR"

# Test year filtering
# Count and assert 2019 paths
count_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if (
info_dict["year"] == 2019
and info_dict["product"] == product
and info_dict["version"] == 7
):
count_2019 += 1

res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)

assert len(res) == count_2019
def test_year_filtering(

Check notice on line 58 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ✅ No longer an issue: Complex Conditional — test_filter_filepaths no longer has a complex conditional.
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
# Count and assert 2019 paths
count_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if (
info_dict["year"] == 2019
and info_dict["product"] == self.product
and info_dict["version"] == 7

Check notice on line 68 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ℹ New issue: Complex Conditional — TestFilterFilepaths.test_year_filtering has 1 complex conditional with 2 branches, threshold = 2.
):
count_2019 += 1

res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=self.product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)

# Test None filepaths
res = filter.filter_filepaths(
filepaths=None,
product=product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []
assert len(res) == count_2019

Check warning on line 80 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ❌ New issue: Code Duplication — 2 functions with similar structure: TestFilterFilepaths.test_empty_start_time, TestFilterFilepaths.test_year_filtering.

# Test empty filepath list
res = filter.filter_filepaths(
filepaths=[],
product=product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []
def test_none_filepath(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
res = filter.filter_filepaths(
filepaths=None,
product=self.product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []

# Test empty start time
count_until_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if info_dict["year"] == 2019 and info_dict["product"] == product:
count_until_2019 += 1
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=product,
start_time=None,
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
def test_empty_filepath_list(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
res = filter.filter_filepaths(
filepaths=[],
product=self.product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []

assert len(res) == count_until_2019
def test_empty_start_time(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
count_until_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if info_dict["year"] == 2019 and info_dict["product"] == self.product:
count_until_2019 += 1
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=self.product,
start_time=None,
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)

# Test empty end time (Error as time given (datetime.datetime.now())
# requires date to be less than now() in supportive
# function checks.check_start_end_time)
count_from_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if info_dict["year"] >= 2019 and info_dict["product"] == product:
count_from_2019 += 1
assert len(res) == count_until_2019

res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=product,
start_time=datetime.datetime(2019, 1, 1),
end_time=None,
version=7,
)
assert len(res) == count_from_2019
def test_empty_end_time(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
"""Test empty end time (Error as time given (datetime.datetime.now())
requires date to be less than now() in supportive
function checks.check_start_end_time)"""

# Test unmatched version
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=0,
)
assert res == []
count_from_2019 = 0
for remote_filepath, info_dict in remote_filepaths.items():
if info_dict["year"] >= 2019 and info_dict["product"] == self.product:
count_from_2019 += 1

res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=self.product,
start_time=datetime.datetime(2019, 1, 1),
end_time=None,
version=7,
)
assert len(res) == count_from_2019

def test_unmatched_version(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product=self.product,
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=0,
)
assert res == []

# Test unmatched product
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product="1A-GMI",
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []
def test_unmatched_product(
self,
remote_filepaths: Dict[str, Dict[str, Any]],
) -> None:
res = filter.filter_filepaths(
filepaths=list(remote_filepaths.keys()),
product="1A-GMI",
start_time=datetime.datetime(2019, 1, 1),
end_time=datetime.datetime(2019, 12, 31, 23, 59, 59),
version=7,
)
assert res == []

Check notice on line 172 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ✅ No longer an issue: Complex Method — test_filter_filepaths is no longer above the cyclomatic complexity threshold.

Check notice on line 172 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ✅ No longer an issue: Bumpy Road Ahead — test_filter_filepaths is no longer above the threshold for logical blocks with deeply nested code.


def test_filter_by_time(
@@ -217,19 +236,19 @@
end_time=None,
)

# Test granule starting on previous day
count_previous_day = 0
for remote_filepath, info_dict in remote_filepaths.items():
if info_dict["start_time"].day != info_dict["end_time"].day:
count_previous_day += 1

res = filter.filter_by_time(
filepaths=list(remote_filepaths.keys()),
start_time=datetime.datetime(2020, 7, 6, 0, 0, 20),
end_time=datetime.datetime(2020, 7, 6, 0, 0, 30),
)

assert len(res) == count_previous_day

Check notice on line 251 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ℹ New issue: Complex Method — test_filter_by_time has a cyclomatic complexity of 9, threshold = 9.

Check notice on line 251 in gpm_api/tests/test_io/test_filter.py — CodeScene Delta Analysis (main): ℹ Getting worse: Bumpy Road Ahead — test_filter_by_time increases from 3 to 4 logical blocks with deeply nested code, threshold is one block per function.


def test_filter_by_product(
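The refactor above splits one monolithic `test_filter_filepaths` into a `TestFilterFilepaths` class: shared inputs become class attributes and each scenario becomes a small named method, which is what cleared the CodeScene complexity findings. A toy, self-contained version of that structure (`filter_filepaths` here is a stand-in with simplified behavior, not the real gpm_api function):

```python
from typing import Dict, List, Optional


def filter_filepaths(
    filepaths: Optional[Dict[str, dict]], product: str, version: int
) -> List[str]:
    """Toy filter keyed on per-file metadata; None/empty input yields []."""
    if not filepaths:
        return []
    return [
        path
        for path, info in filepaths.items()
        if info["product"] == product and info["version"] == version
    ]


class TestFilterFilepaths:
    """Shared inputs live on the class; each case is a small, named method."""

    product = "2A-DPR"
    paths = {
        "a.HDF5": {"product": "2A-DPR", "version": 7},
        "b.HDF5": {"product": "1A-GMI", "version": 7},
    }

    def test_matching_product(self):
        assert filter_filepaths(self.paths, self.product, 7) == ["a.HDF5"]

    def test_none_filepaths(self):
        assert filter_filepaths(None, self.product, 7) == []

    def test_unmatched_version(self):
        assert filter_filepaths(self.paths, self.product, 0) == []


# pytest would collect these automatically; run them by hand here.
t = TestFilterFilepaths()
t.test_matching_product()
t.test_none_filepaths()
t.test_unmatched_version()
print("all cases passed")  # → all cases passed
```

The trade-off, visible in the CodeScene notes above, is that small per-case methods can duplicate setup; fixtures or parametrization are the usual next step when that duplication grows.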