Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flyte Decks] support ydata-profiling in python 3.12 #2766

Merged
merged 16 commits into from
Sep 25, 2024

Conversation

Future-Outlier
Copy link
Member

@Future-Outlier Future-Outlier commented Sep 24, 2024

Why are the changes needed?

What changes were proposed in this pull request?

  1. add 2 requirements in flytekit-deck-standard, ydata-profiling and pillow.
  2. remove exclude flytekit-deck-standard in python 3.12 CI

How was this patch tested?

local execution and remote execution

import flytekit
from flytekit import ImageSpec, task, workflow
from flytekit.deck.renderer import MarkdownRenderer
from sklearn.decomposition import PCA
import pandas as pd
from flytekitplugins.deck.renderer import FrameProfilingRenderer, BoxRenderer, ImageRenderer
from flytekit.deck import TopFrameRenderer
from flytekit.types.file import FlyteFile
import plotly.express as px
import plotly
from typing import Annotated

gitsha = "3737792f4ff4981a8110c6280f989bd279005c06"
flytedeck_plugin = f"git+https://github.com/flyteorg/flytekit.git@{gitsha}#subdirectory=plugins/flytekit-deck-standard"

# Define custom image for the task
custom_image = ImageSpec(python_version="3.12",
                        packages=["scikit-learn", "pyarrow", "pandas", "markdown",
                                  "plotly", "ydata_profiling", "pygments", flytedeck_plugin],
                        apt_packages=["git"],
                         registry="localhost:30000")

# Task 1: PCA Plot
@task(enable_deck=True, container_image=custom_image)
def pca_plot():
    iris_df = px.data.iris()
    X = iris_df[["sepal_length", "sepal_width", "petal_length", "petal_width"]]
    pca = PCA(n_components=3)
    components = pca.fit_transform(X)
    total_var = pca.explained_variance_ratio_.sum() * 100
    fig = px.scatter_3d(
        components,
        x=0,
        y=1,
        z=2,
        color=iris_df["species"],
        title=f"Total Explained Variance: {total_var:.2f}%",
        labels={"0": "PC 1", "1": "PC 2", "2": "PC 3"},
    )
    main_deck = flytekit.Deck("pca", MarkdownRenderer().to_html("### Principal Component Analysis"))
    main_deck.append(plotly.io.to_html(fig))

# Task 2: Frame Renderer
@task(enable_deck=True, container_image=custom_image)
def frame_renderer() -> None:
    df = pd.DataFrame(data={"col1": [1, 2], "col2": [3, 4]})
    flytekit.Deck("Frame Renderer", FrameProfilingRenderer().to_html(df=df))

# Task 3: Top Frame Renderer
@task(enable_deck=True, container_image=custom_image)
def top_frame_renderer() -> Annotated[pd.DataFrame, TopFrameRenderer(1)]:
    return pd.DataFrame(data={"col1": [1, 2], "col2": [3, 4]})

# Task 4: Markdown Renderer
@task(enable_deck=True, container_image=custom_image)
def markdown_renderer() -> None:
    flytekit.current_context().default_deck.append(
        MarkdownRenderer().to_html("You can install flytekit using this command: ```import flytekit```")
    )

# Task 5: Box Renderer
@task(enable_deck=True, container_image=custom_image)
def box_renderer() -> None:
    iris_df = px.data.iris()
    flytekit.Deck("Box Plot", BoxRenderer("sepal_length").to_html(iris_df))

# Task 6: Image Renderer
@task(enable_deck=True, container_image=custom_image)
def image_renderer(image: FlyteFile) -> None:
    flytekit.Deck("Image Renderer", ImageRenderer().to_html(image_src=image))

# Combined Workflow
@workflow
def combined_workflow(image: FlyteFile = "https://bit.ly/3KZ95q4") -> None:
    pca_plot()
    frame_renderer()
    top_frame_renderer()
    markdown_renderer()
    box_renderer()
    image_renderer(image=image)

if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner
    import os

    runner = CliRunner()
    path = os.path.realpath(__file__)
    result = runner.invoke(pyflyte.main,
                        ["run", path, "combined_workflow", "--image", "https://bit.ly/3KZ95q4"])
    print("Local Execution: ", result.output)

    result = runner.invoke(pyflyte.main,
                           ["run", "--remote", path, "combined_workflow", "--image", "https://bit.ly/3KZ95q4"])
    print("Remote Execution: ", result.output)

Screenshots

  • local execution
image
  • remote execution
image image

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

@cosmicBboy
Copy link
Contributor

cosmicBboy commented Sep 24, 2024

I don't want to block this PR, but can we make all of the dependencies of this plugin "soft dependencies"? i.e. we import them lazily and complain at runtime if it's not installed. Can make this a separate PR.

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
@cosmicBboy
Copy link
Contributor

Can we update the README to just document the additional requirements you need in order to use each renderer?

@pingsutw
Copy link
Member

Can we update the README to just document the additional requirements you need in order to use each renderer?

+1, we can add a table

@Future-Outlier
Copy link
Member Author

Future-Outlier commented Sep 25, 2024

Can we update the README to just document the additional requirements you need in order to use each renderer?

+1, we can add a table

Hi, @pingsutw @cosmicBboy
please take a look again, thank you all!

image

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Copy link

codecov bot commented Sep 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.96%. Comparing base (5643915) to head (ea675e9).
Report is 4 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2766       +/-   ##
===========================================
+ Coverage   42.96%   78.96%   +35.99%     
===========================================
  Files         215      194       -21     
  Lines       20685    19807      -878     
  Branches     3905     4130      +225     
===========================================
+ Hits         8887    15640     +6753     
+ Misses      11683     3447     -8236     
- Partials      115      720      +605     
Flag Coverage Δ
78.91% <ø> (+35.95%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, thank you

@Future-Outlier Future-Outlier merged commit b6f30e6 into master Sep 25, 2024
31 of 32 checks passed
@fellhorn
Copy link
Contributor

fellhorn commented Oct 1, 2024

Hey @Future-Outlier

A minimal installation of the plugin seems to be broken now.
flyteorg/flyte#5792

Do you think plotly and pandas should be added as hard dependencies again or is there a different workaround? Without these two deps, there will be an ImportError when importing flytekit

@ggydush
Copy link
Collaborator

ggydush commented Oct 1, 2024

+1 to the above!

@Future-Outlier
Copy link
Member Author

Hey @Future-Outlier

A minimal installation of the plugin seems to be broken now. flyteorg/flyte#5792

Do you think plotly and pandas should be added as hard dependencies again or is there a different workaround? Without these two deps, there will be an ImportError when importing flytekit

Fixing, thank you

otarabai pushed a commit to otarabai/flytekit that referenced this pull request Oct 15, 2024
* [Flyte Decks] support ydata-profiling in python 3.12

Signed-off-by: Future-Outlier <[email protected]>

* remove exclude deck standard python3.12 ci

Signed-off-by: Future-Outlier <[email protected]>

* make plugin soft dependencies

Signed-off-by: Future-Outlier <[email protected]>

* add dev-requirements.in

Signed-off-by: Future-Outlier <[email protected]>

* nit

Signed-off-by: Future-Outlier <[email protected]>

* better README with dependenc

Signed-off-by: Future-Outlier <[email protected]>

* add other dependency in dev-requirements.in, this will help setup-global-uv

Signed-off-by: Future-Outlier <[email protected]>

* Trigger CI

Signed-off-by: Future-Outlier <[email protected]>

* Trigger CI

Signed-off-by: Future-Outlier <[email protected]>

* Update dependenct

Signed-off-by: Future-Outlier <[email protected]>

* new dockerfile dev

Signed-off-by: Future-Outlier <[email protected]>

* new dockerfile

Signed-off-by: Future-Outlier <[email protected]>

* new dockerfile

Signed-off-by: Future-Outlier <[email protected]>

* revert back

Signed-off-by: Future-Outlier <[email protected]>

* new dev image

Signed-off-by: Future-Outlier <[email protected]>

---------

Signed-off-by: Future-Outlier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants