Missing Data for April/March 2023 #15

Closed
quasiben opened this issue Apr 19, 2023 · 3 comments

@quasiben

  • condastats version: 0.2.1
  • Python version: 3.10
  • Operating System: OSX

Description

The data for March and April of this year (2023) seems to be missing:

What I Did

○ → condastats overall cudf --start_month 2023-01 --data_source rapidsai --monthly
pkg_name  time
cudf      2018-10       93
          2018-11     1038
          2018-12      807
          2019-01     1503
          2019-02     2702
          2019-03     4586
          2019-04     5245
          2019-05     4576
          2019-06     3470
          2019-07     5546
          2019-08     2825
          2019-09     2244
          2019-10     3330
          2019-11     2669
          2019-12     2129
          2020-01     2329
          2020-02     5502
          2020-03     6005
          2020-04     8149
          2020-05     9698
          2020-06     9637
          2020-07    10062
          2020-08    11617
          2020-09    14342
          2020-10    12957
          2020-11    11489
          2020-12    11205
          2021-01    11267
          2021-02    11824
          2021-03    17856
          2021-04    18740
          2021-05    20166
          2021-06    20394
          2021-07    16088
          2021-08    14745
          2021-09    15327
          2021-10    16039
          2021-11    13019
          2021-12    12711
          2022-01    15999
          2022-02    17021
          2022-03    20025
          2022-04    19299
          2022-05    19367
          2022-06    23601
          2022-07    25680
          2022-08    20749
          2022-09    18496
          2022-10    18093
          2022-11    18324
          2022-12    19657
          2023-01    12923
          2023-02    17163
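
A minimal sketch for spotting the gap programmatically, assuming the monthly counts have been loaded into a pandas Series indexed by `YYYY-MM` strings. The `find_missing_months` helper is hypothetical and not part of condastats; the sample values are the last three rows of the output above.

```python
import pandas as pd


def find_missing_months(counts, end_month):
    """Return the YYYY-MM labels absent between the first observed month and end_month."""
    # Interpret the string index as monthly periods.
    observed = pd.PeriodIndex(counts.index, freq="M")
    # Every month that should be present, up to and including end_month.
    expected = pd.period_range(observed.min(), pd.Period(end_month, freq="M"), freq="M")
    return [str(period) for period in expected.difference(observed)]


# Tail of the condastats output above (hand-copied, not fetched).
counts = pd.Series({"2022-12": 19657, "2023-01": 12923, "2023-02": 17163})
print(find_missing_months(counts, "2023-04"))
```

With the report dated April 2023, this would flag both 2023-03 and 2023-04 as absent.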

@sophiamyang
Contributor

@seibert @cappadona any updates on the March data?

@quasiben
Author

Similar to #13, I'm also not seeing any pkg_python info:

In [21]: df = pd.read_parquet(
    ...:     "s3://anaconda-package-data/conda/hourly/2023/02/2023-02-28.parquet",
    ...:     filters=[
    ...:         ("data_source", "in", ("rapidsai", "rapidsai-nightly")),
    ...:     ],
    ...:     storage_options={"anon": True},
    ...: )
df
In [22]: df.head()
Out[22]:
        time       data_source          pkg_name      pkg_version   pkg_platform pkg_python  counts
0 2023-02-28  rapidsai-nightly  libraft-distance        23.04.00a  linux-aarch64                  8
1 2023-02-28          rapidsai         custreamz         22.12.01       linux-64                  2
2 2023-02-28          rapidsai         raft-dask         22.12.01       linux-64                  2
3 2023-02-28          rapidsai         raft-dask         23.02.00       linux-64                 10
4 2023-02-28  rapidsai-nightly      libcumlprims  21.12.00a211111       linux-64                  1
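
A rough way to quantify how much of `pkg_python` is blank, assuming blanks are stored as empty strings, whitespace, or nulls (an assumption; I haven't confirmed how the parquet file encodes them). The `blank_fraction` helper is hypothetical, and the frame below is a hand-built stand-in with two rows copied from the output above rather than the real S3 data.

```python
import pandas as pd


def blank_fraction(df, column="pkg_python"):
    """Fraction of rows where the column is null, empty, or whitespace only."""
    col = df[column]
    blank = col.isna() | (col.astype(str).str.strip() == "")
    return float(blank.mean())


# Stand-in rows copied from the df.head() output above.
sample = pd.DataFrame(
    {
        "data_source": ["rapidsai-nightly", "rapidsai"],
        "pkg_name": ["libraft-distance", "custreamz"],
        "pkg_python": ["", ""],
        "counts": [8, 2],
    }
)
print(blank_fraction(sample))
```

Running the same helper on the full frame returned by `pd.read_parquet` would show whether the column is empty everywhere or only for some packages.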

@sophiamyang
Contributor

Moved the issue to anaconda/anaconda-package-data#39
