daft.lit(val) where val is of type daft.Series has incorrect behavior #3287

Open
kevinzwang opened this issue Nov 13, 2024 · 0 comments
Labels
bug, needs triage

Comments

@kevinzwang
Member

kevinzwang commented Nov 13, 2024

Describe the bug

When a Series is passed to daft.lit, it is treated as a column rather than as a literal: its values are spread across the rows of the table, and the call fails if the length of the Series does not match the length of the table. Instead, the Series should be treated as a single list value and broadcast to every row, consistent with other literal types.

To Reproduce

>>> import daft
>>> df = daft.from_pydict({"foo": [1, 2, 3], "bar": ["a", "b", "c"]})
>>> s = daft.Series.from_pylist(["x", "y", "z"])
>>> df = df.with_column("baz", daft.lit(s))
>>> df.show()
╭───────┬──────┬──────╮
│ foo   ┆ bar  ┆ baz  │
│ ---   ┆ ---  ┆ ---  │
│ Int64 ┆ Utf8 ┆ Utf8 │
╞═══════╪══════╪══════╡
│ 1     ┆ a    ┆ x    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2     ┆ b    ┆ y    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3     ┆ c    ┆ z    │
╰───────┴──────┴──────╯

Expected behavior

╭───────┬──────┬─────────────────╮
│ foo   ┆ bar  ┆ baz             │
│ ---   ┆ ---  ┆ ---             │
│ Int64 ┆ Utf8 ┆ List[Utf8]      │
╞═══════╪══════╪═════════════════╡
│ 1     ┆ a    ┆ ['x', 'y', 'z'] │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2     ┆ b    ┆ ['x', 'y', 'z'] │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3     ┆ c    ┆ ['x', 'y', 'z'] │
╰───────┴──────┴─────────────────╯
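
For reference, the expected broadcast semantics can be modeled in plain Python without Daft. This is only an illustrative sketch: `with_literal_column` is a hypothetical helper, not a Daft API, and the table is modeled as a dict of equal-length lists.

```python
def with_literal_column(table, name, value):
    """Return a copy of `table` with `value` repeated once per row.

    `table` is a plain dict mapping column names to equal-length lists
    (a stand-in for a DataFrame, not a Daft object).
    """
    num_rows = len(next(iter(table.values()))) if table else 0
    # Every row receives the whole value; a list value is NOT spread across rows.
    return {**table, name: [value for _ in range(num_rows)]}


table = {"foo": [1, 2, 3], "bar": ["a", "b", "c"]}
result = with_literal_column(table, "baz", ["x", "y", "z"])
```

Here each row of `baz` holds the full list `['x', 'y', 'z']`, which matches the table above.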

Component(s)

Expressions

Additional context

How we would implement this:

  • We want to remove Series as a dependency of daft-dsl anyway, so to further that goal while fixing this issue, we should create a new LiteralValue::List variant that holds a vec of LiteralValue instead of a Series. We can add conversions to and from Series, where the to_series implementation returns a ListArray series containing the inner vec.
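
A rough model of that design, sketched in plain Python rather than in the Rust that daft-dsl is written in. All names here (`ListLiteral`, `from_values`, `to_series`, `broadcast`) are hypothetical stand-ins chosen for illustration, and a "series" is modeled as an ordinary Python list; none of this is Daft's actual code.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Scalar = Union[None, bool, int, float, str]


@dataclass(frozen=True)
class ListLiteral:
    """Stand-in for the proposed list variant: holds literals, not a series."""
    items: Tuple["LiteralValue", ...]


LiteralValue = Union[Scalar, ListLiteral]


def from_values(values: list) -> ListLiteral:
    """Model of the series-to-literal conversion; nested lists become nested literals."""
    return ListLiteral(
        tuple(from_values(v) if isinstance(v, list) else v for v in values)
    )


def to_python(lit: LiteralValue):
    """Unwrap a literal back into plain Python values."""
    if isinstance(lit, ListLiteral):
        return [to_python(item) for item in lit.items]
    return lit


def to_series(lit: ListLiteral) -> list:
    """Model of to_series: a length-1 list column whose single row is the whole list."""
    return [to_python(lit)]


def broadcast(series: list, num_rows: int) -> list:
    """Repeat a length-1 series so that it matches the table length."""
    if len(series) != 1:
        raise ValueError("only a length-1 series can be broadcast")
    return [series[0] for _ in range(num_rows)]


lit = from_values(["x", "y", "z"])
column = broadcast(to_series(lit), 3)
```

Because the literal always converts to a length-1 list column, it can be broadcast the same way as any other literal, regardless of how many elements the list holds.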
@kevinzwang kevinzwang added bug Something isn't working needs triage labels Nov 13, 2024