[BUG] structured dataset (constructed by dataframe but not uri) in dataclass will fail in attribute access #6205

Open
Future-Outlier opened this issue Jan 31, 2025 · 1 comment
Labels: bug (Something isn't working), untriaged (This issue has not yet been looked at by the maintainers)

Comments

@Future-Outlier (Member)

Describe the bug

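A workflow that accesses a StructuredDataset attribute of a dataclass fails at compile time when the StructuredDataset was constructed from a dataframe rather than a URI. The following error is shown: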

The above exception was the direct cause of the following exception:

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /Users/future-outlier/code/dev/flytekit/build/PR/pr_3027_sd.py:47 in         │
│ <module>                                                                     │
│                                                                              │
│ ❱ 47 print(wf())                                                             │
│                                                                              │
│ /Users/future-outlier/code/dev/flytekit/flytekit/core/workflow.py:299 in     │
│ __call__                                                                     │
│                                                                              │
│ ❱ 299 │   │   self.compile()                                                 │
│                                                                              │
│ /Users/future-outlier/code/dev/flytekit/flytekit/core/workflow.py:804 in     │
│ compile                                                                      │
│                                                                              │
│ ❱ 804 │   │   │   │   raise FlyteValidationException(                        │
╰──────────────────────────────────────────────────────────────────────────────╯
FlyteValidationException: USER:ValidationError: error=Failed to bind output o0 
for function pr_3027_sd.wf: 'Promise' object has no attribute 'uri', 
cause='Promise' object has no attribute 'uri'
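
The message indicates that something read `.uri` from a `Promise` placeholder rather than from a materialized `StructuredDataset`. The sketch below is a toy illustration of that failure mode in plain Python, with no flytekit dependency. `Promise`, `Dataset`, `serialize`, and the example URI are hypothetical stand-ins; they do not reflect flytekit's internals and are not a confirmed root cause.

```python
from dataclasses import dataclass, fields


class Promise:
    """Toy placeholder for a task output that has not been computed yet.

    Hypothetical stand-in; this is NOT flytekit's Promise class.
    """

    def __init__(self, var: str):
        self.var = var


@dataclass
class Dataset:
    """Toy dataset exposing the `uri` attribute a serializer expects."""

    uri: str


@dataclass
class DC:
    sd: object  # may hold a Dataset or, by mistake, a Promise


def serialize(dc: DC) -> dict:
    """Toy serializer that reads `.uri` from every field of the dataclass."""
    return {f.name: getattr(dc, f.name).uri for f in fields(dc)}


def try_serialize(dc: DC) -> str:
    """Return the serialized form, or the AttributeError text on failure."""
    try:
        return str(serialize(dc))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# A materialized value serializes; a placeholder has no `uri` to read.
ok = try_serialize(DC(sd=Dataset(uri="s3://bucket/data.parquet")))  # made-up URI
bad = try_serialize(DC(sd=Promise("o0")))
print(ok)
print(bad)
```

In this toy, the failure appears only when the dataclass field holds the placeholder, which matches the shape of the error text above.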

Expected behavior

The workflow should compile and run without error, with dc.sd passed to the downstream task.

Additional context to reproduce

from flytekit import task, workflow, ImageSpec
from flytekit.types.structured.structured_dataset import StructuredDataset
import pandas as pd
from dataclasses import dataclass


flytekit_hash = "master"
flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"

custom_image = ImageSpec(
    packages=[flytekit, "pandas", "pyarrow"],
    apt_packages=["git"],
    registry="futureoutlier",
    env={"FLYTE_SDK_LOGGING_LEVEL": 10},
)

@dataclass
class DC:
    sd: StructuredDataset


@task(container_image=custom_image)
def create_pd_sd_dc() -> DC:
    return DC(sd=StructuredDataset(
        dataframe=pd.DataFrame(
            {
                "calories": [420, 380, 390],
                "duration": [50, 40, 45]
            }
        ),
        file_format="parquet"
    ))


@task(container_image=custom_image)
def return_pd_sd(sd: StructuredDataset) -> StructuredDataset:
    return sd


@workflow
def wf() -> DC:
    dc = create_pd_sd_dc()
    sd = return_pd_sd(dc.sd)
    return DC(sd=sd)


if __name__ == "__main__":
    print(wf())

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes

Future-Outlier added the bug and untriaged labels on Jan 31, 2025

@JiangJiaWei1103 (Contributor)

#take

Projects
Status: Backlog
Development

No branches or pull requests

2 participants