
PyYAML 6.0 load() function requires Loader argument #576

Open
aviasd opened this issue Oct 14, 2021 · 12 comments


aviasd commented Oct 14, 2021

When using an updated version of PyYAML (version 6.0) on Google Colab, there is an import problem in some of the Python packages shipped with Colab, such as plotly.express:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-86e89bd44552> in <module>()
----> 1 import plotly.express as px

9 frames
/usr/local/lib/python3.7/dist-packages/plotly/express/__init__.py in <module>()
     13     )
     14 
---> 15 from ._imshow import imshow
     16 from ._chart_types import (  # noqa: F401
     17     scatter,

/usr/local/lib/python3.7/dist-packages/plotly/express/_imshow.py in <module>()
      9 
     10 try:
---> 11     import xarray
     12 
     13     xarray_imported = True

/usr/local/lib/python3.7/dist-packages/xarray/__init__.py in <module>()
      1 import pkg_resources
      2 
----> 3 from . import testing, tutorial, ufuncs
      4 from .backends.api import (
      5     load_dataarray,

/usr/local/lib/python3.7/dist-packages/xarray/tutorial.py in <module>()
     11 import numpy as np
     12 
---> 13 from .backends.api import open_dataset as _open_dataset
     14 from .backends.rasterio_ import open_rasterio as _open_rasterio
     15 from .core.dataarray import DataArray

/usr/local/lib/python3.7/dist-packages/xarray/backends/__init__.py in <module>()
      4 formats. They should not be used directly, but rather through Dataset objects.
      5 
----> 6 from .cfgrib_ import CfGribDataStore
      7 from .common import AbstractDataStore, BackendArray, BackendEntrypoint
      8 from .file_manager import CachingFileManager, DummyFileManager, FileManager

/usr/local/lib/python3.7/dist-packages/xarray/backends/cfgrib_.py in <module>()
     14     _normalize_path,
     15 )
---> 16 from .locks import SerializableLock, ensure_lock
     17 from .store import StoreBackendEntrypoint
     18 

/usr/local/lib/python3.7/dist-packages/xarray/backends/locks.py in <module>()
     11 
     12 try:
---> 13     from dask.distributed import Lock as DistributedLock
     14 except ImportError:
     15     DistributedLock = None

/usr/local/lib/python3.7/dist-packages/dask/distributed.py in <module>()
      1 # flake8: noqa
      2 try:
----> 3     from distributed import *
      4 except ImportError:
      5     msg = (

/usr/local/lib/python3.7/dist-packages/distributed/__init__.py in <module>()
      1 from __future__ import print_function, division, absolute_import
      2 
----> 3 from . import config
      4 from dask.config import config
      5 from .actor import Actor, ActorFuture

/usr/local/lib/python3.7/dist-packages/distributed/config.py in <module>()
     18 
     19 with open(fn) as f:
---> 20     defaults = yaml.load(f)
     21 
     22 dask.config.update_defaults(defaults)

TypeError: load() missing 1 required positional argument: 'Loader'

Reverting to PyYAML version 5.4.1 resolves the problem.
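
For reference, a minimal sketch of what the updated call looks like under PyYAML 6.0 (the file name and the choice of SafeLoader are assumptions; pick the loader your data actually needs):

```python
import yaml

# PyYAML 6.0 removed the implicit default loader, so name one explicitly.
# safe_load() is shorthand for load(..., Loader=yaml.SafeLoader) and is
# sufficient for plain data such as configuration files.
with open("config.yaml") as f:  # hypothetical file name
    defaults = yaml.safe_load(f)
```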


bnx05 commented Oct 14, 2021

I ran into this issue as well and had to revert to 5.4.1 to get our tests working again.

support/frames_parser.py:14: in __init__
    file = yaml.load(open(file_path))
E   TypeError: load() missing 1 required positional argument: 'Loader'


sbesson commented Oct 14, 2021

We got hit by the same error, which is related to #561, included yesterday in the latest major release, PyYAML 6.0.
The plain yaml.load(f) call was deprecated throughout the 5.x line in favor of yaml.load(f, Loader=loader). Either capping PyYAML to 5.x or updating all usages of yaml.load should restore your builds.
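
As a sketch, passing the loader explicitly works on both the 5.x line and 6.0, so updating call sites does not force an immediate version bump (SafeLoader is assumed here; use a more capable loader only if your documents need it):

```python
import yaml

# Explicit Loader argument: accepted by PyYAML 5.x and required by 6.0.
with open("defaults.yaml") as f:  # hypothetical file name
    data = yaml.load(f, Loader=yaml.SafeLoader)
```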

tf-gerrit-replicator pushed a commit to tungstenfabric/tf-deployment-test that referenced this issue Oct 14, 2021
yaml/pyyaml#576

Change-Id: Ic5041bfe42ef4a44939764b4bb43af6f40325e12
nitzmahone (Member) commented Oct 14, 2021

To be clear: this is as designed. Loading without specifying a loader has been deprecated, and has issued loud warnings, for the past three years (since 5.1), due to numerous CVEs filed against the default loader's ability to execute arbitrary(ish) code. Since changing the default to a significantly less capable loader was just as big a breaking change (and one that could cause more problems in the future), it was decided to simply require folks to be specific about the capability they need from the loader going forward.
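
To illustrate the capability difference behind that decision, here is a small sketch (my own example, not from the PyYAML docs) contrasting the safe loader with the unsafe one on a Python-specific tag:

```python
import yaml

doc = "!!python/object/apply:os.getcwd []"  # a Python-specific YAML tag

# SafeLoader refuses tags that construct arbitrary Python objects.
try:
    yaml.safe_load(doc)
except yaml.constructor.ConstructorError as err:
    print("rejected:", err.problem)

# The unsafe loader resolves the tag and actually calls os.getcwd().
print(yaml.unsafe_load(doc))
```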

ingydotnet (Member) commented Oct 14, 2021

Reopening this for a spell until the dust settles.

@ingydotnet ingydotnet reopened this Oct 14, 2021
@perlpunk perlpunk pinned this issue Oct 14, 2021
@ingydotnet ingydotnet changed the title TypeError: load() missing 1 required positional argument: 'Loader' on Google Colab PyYAML 6.0 load() function requires Loader argument Oct 14, 2021
ingydotnet (Member) commented Oct 14, 2021

Retitled the issue and pinned it.


cirodrig commented Oct 15, 2021

The documentation still presents the old usage of load, which is now invalid, in quite a few places. I opened an issue on the documentation's repository.


andy-maier commented Oct 16, 2021

@cirodrig I did not find your documentation issue; can you post a link?

I think the documentation should point out which loader is recommended now that users need to specify one. I guess there is no single recommended loader, so the recommendation should cover which loader fits which use case. I don't find that in the documentation today.
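
Something along the following lines would already help; this is my own rough mapping of use case to loader, not official guidance:

```python
import yaml

# Untrusted or plain-data YAML (configs, API payloads): SafeLoader.
safe = yaml.load("a: 1", Loader=yaml.SafeLoader)

# Trusted YAML using most of the language, but no arbitrary code: FullLoader.
full = yaml.load("a: 1", Loader=yaml.FullLoader)

# Fully trusted YAML that must round-trip Python objects: UnsafeLoader.
risky = yaml.load("a: 1", Loader=yaml.UnsafeLoader)
```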

@cirodrig

@andy-maier I created an issue here yaml/pyyaml.org#15

pranavprakash20 added a commit to pranavprakash20/glusto that referenced this issue Oct 20, 2021
[Problem]
With the latest version of PyYAML [> 5.4.1], `yaml.load(f)` is
deprecated in favour of `yaml.load(f, Loader=loader)`. This causes:
"""
File "/usr/local/lib/python3.6/site-packages/glusto/configurable.py", line 215, in _load_yaml
config = yaml.load(configfd)
TypeError: load() missing 1 required positional argument: 'Loader' """
More discussion on this topic [1]

[1] yaml/pyyaml#576

[Solution]
Pin the version to 5.4.1, as changing the code to accommodate
the latest method changes requires further testing.

Signed-off-by: Pranav <[email protected]>
henryaddison added a commit to henryaddison/zookeeper_tutorial that referenced this issue Oct 26, 2021
PyYAML 6.0 was released recently and the load method now requires a Loader argument (see yaml/pyyaml#561 and yaml/pyyaml#576). We can either provide one like this (I think it's backwards compatible with version 5) or pin the version of PyYAML in requirements.txt.

2sn commented Oct 30, 2021

I find it disappointing, however, that the tutorial examples in the documentation no longer work:
https://pyyaml.org/wiki/PyYAMLDocumentation

>>> import yaml
>>> yaml.load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)
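
For instance, the same snippet runs again once a loader is named (SafeLoader is assumed here, since the document is plain data):

>>> import yaml
>>> yaml.load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """, Loader=yaml.SafeLoader)
['Hesperiidae', 'Papilionidae', 'Apatelodidae', 'Epiplemidae']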


aigarius commented Nov 16, 2021

Yeah, that is not a good solution at all. Breaking 100% of old code is worse than breaking the 0.5% of old code that depends on some insecure functionality of the old default loader. Just make a secure loader the new default. That is a breaking change too, but nowhere near as breaking as this.

The worst case is when several different packages in my downstream dependencies use PyYAML: some use the new functionality and declare PyYAML 6 as a dependency, while others do not pass a loader to load and now just crash.

At the very least, change the PyYAML tutorial to the new syntax and leave that up for three years; maybe then this would be acceptable as a change. A less destructive change is always better than a more destructive one.

sgnn7 added a commit to sgnn7/pptop that referenced this issue Nov 19, 2021
You get an error trying to run this with Python 3.9 and latest pip
install:
```
$ pptop <PID>
load() missing 1 required positional argument: 'Loader'
```

Since the PyYAML library [API changed](yaml/pyyaml#576),
we need to use `safe_load` (which should be the default anyway).
sf-project-io pushed a commit to softwarefactory-project/python-sfmanager that referenced this issue Nov 29, 2021
- Add upload-pypi job in release pipeline
- use zuul-worker-ubi7 for py36 tests
- add pyyaml<6.0.0 in requirements.txt to fix
  yaml/pyyaml#576

Change-Id: I3c03b57a878102e6e5ff6076d1643123bdf374a3

kaivio commented Feb 11, 2022

> To be clear: this is as designed. Loading without specifying a loader has been deprecated, and has issued loud warnings, for the past three years (since 5.1), due to numerous CVEs filed against the default loader's ability to execute arbitrary(ish) code. Since changing the default to a significantly less capable loader was just as big a breaking change (and one that could cause more problems in the future), it was decided to simply require folks to be specific about the capability they need from the loader going forward.

I don't think it's a good design. It's not in line with how people are used to calling the library.

initialed85 added a commit to ftpsolutions/python-third-party-license-file-generator that referenced this issue Mar 1, 2022
Includes fix wherein `yaml.load` requires `Loader` keyword argument.

Ref.: yaml/pyyaml#576 (looks like it's intended and won't be fixed)
sjiang95 pushed a commit to sjiang95/Pytorch1.1.0-cc2.x that referenced this issue May 3, 2022
scottgigante-immunai added a commit to wes-lewis/SingleCellOpenProblems that referenced this issue Jun 14, 2022
scottgigante-immunai added a commit to openproblems-bio/openproblems that referenced this issue Jul 13, 2022
* Require pyyaml==5.4.1 to prevent kopt error

Due to yaml/pyyaml#576

Cerebus commented Aug 3, 2022

Is there a reason why the documentation *still isn't updated a year later*?
