Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add options to output registered entity summary #3028

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

paullongtan
Copy link

@paullongtan paullongtan commented Jan 2, 2025

Tracking issue

Closes flyteorg/flyte#3919

Why are the changes needed?

Currently, pyflyte register does not have an option to output registered entities to a file. This PR creates such functionality.

What changes were proposed in this pull request?

  1. Added two options --summary-format and --summary-dir to pyflyte register in flytekit/flytekit/clis/sdk_in_container/register.py.
  • --summary-format: Sets output format for registration summary. Lists registered workflows, tasks, and launch plans. 'json' and 'yaml' supported.
  • --summary-dir: Directory to save registration summary. Uses current working directory if not specified.
  1. Collects registered entity information in flytekit/flytekit/tools/repo.py. Results are then saved to designated directory in selected file format.
  2. Updated the unit test for register.py.

How was this patch tested?

  • Unit testing
  • Built and run some Flytekit test

Setup process

Screenshots

image image image image

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

Enhanced pyflyte register command with JSON/YAML format support and improved console output capabilities. Implemented comprehensive logging control mechanisms including quiet mode and SuppressEcho context manager. Added support for multiple output formats while removing unnecessary logging statements. Test suite updated and refactored for better maintainability while preserving coverage.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 3

Copy link

welcome bot commented Jan 2, 2025

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@flyte-bot
Copy link
Contributor

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - The AI Code Review Agent skipped reviewing this change because it is configured to exclude certain pull requests based on the source/target branch or the pull request status. You can change the settings here, or contact the agent instance creator at [email protected].

@paullongtan paullongtan force-pushed the feat/add_register_output_format branch from 47d8b92 to dda078c Compare January 2, 2025 00:51
@flyte-bot
Copy link
Contributor

Code Review Agent Run Status

  • Limitations and other issues: ❌ Failure - The AI Code Review Agent skipped reviewing this change because it is configured to exclude certain pull requests based on the source/target branch or the pull request status. You can change the settings here, or contact the agent instance creator at [email protected].

@paullongtan paullongtan force-pushed the feat/add_register_output_format branch from ef2743a to 0395846 Compare January 2, 2025 04:57
Copy link
Contributor

@Mecoli1219 Mecoli1219 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice first PR! I think you can add more unit tests.

  1. yaml
  2. testing registration without specifying --summary-dir

else:
summary_dir = os.getcwd()

summary_file = f"registration_summary.{summary_format}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that the other registration may overwrite the summary? I think we could name the file with other unique names (ie. tmp file, py file name, wf name, version number)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! The summary will be overwritten if the registration is rerun.

tests/flytekit/unit/cli/pyflyte/test_register.py Outdated Show resolved Hide resolved

except Exception as e:
if not skip_errors:
raise e
print_registration_status(og_id, success=False)
result["status"] = "failed"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it fails, what will the values of other keys be?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The values of the other keys(id, type, version) are pre-computed before registration. Thus, the values will not be empty even if the registration fails.

The values are from:

result = {
"id": og_id.name,
"type": og_id.resource_type_name(),
"version": og_id.version,
"status": "skipped", # default status
}
where og_id is the id of the entity's template / entity itself.

@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 2, 2025

Code Review Agent Run #9a3edb

Actionable Suggestions - 3
  • flytekit/tools/repo.py - 2
    • Consider enhancing registration result information · Line 341-346
    • Consider using set for format checking · Line 402-404
  • flytekit/clis/sdk_in_container/register.py - 1
    • Consider validating summary format and directory · Line 146-159
Additional Suggestions - 1
  • tests/flytekit/unit/cli/pyflyte/test_register.py - 1
Review Details
  • Files reviewed - 3 · Commit Range: f51be71..9bba92f
    • flytekit/clis/sdk_in_container/register.py
    • flytekit/tools/repo.py
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 2, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Enhanced Registration Output and Logging Control

pyflyte.py - Modified logging configuration and environment variable handling for quiet mode support

register.py - Added new CLI options for summary format and quiet mode in registration command

repo.py - Implemented registration summary generation in JSON/YAML formats with improved logging control

test_register.py - Added comprehensive tests for new registration summary formats and quiet mode

Comment on lines 341 to 346
result = {
"id": og_id.name,
"type": og_id.resource_type_name(),
"version": og_id.version,
"status": "skipped", # default status
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider enhancing registration result information

Consider adding more detailed status information in the result dictionary. The current status field only captures high-level states ('skipped', 'success', 'failed'). Additional fields like error_message and timestamp could provide more context for debugging and monitoring.

Code suggestion
Check the AI-generated fix before applying
Suggested change
result = {
"id": og_id.name,
"type": og_id.resource_type_name(),
"version": og_id.version,
"status": "skipped", # default status
}
result = {
"id": og_id.name,
"type": og_id.resource_type_name(),
"version": og_id.version,
"status": "skipped", # default status
"timestamp": datetime.datetime.now().isoformat(),
"error_message": "",
"details": {}
}

Code Review Run #9a3edb


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines 402 to 404
supported_format = ["json", "yaml"]
if summary_format not in supported_format:
raise ValueError(f"Unsupported file format: {summary_format}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using set for format checking

Consider using a set for supported_format instead of a list since we're only checking membership. This would provide O(1) lookup instead of O(n). Could be defined as supported_format = {'json', 'yaml'}.

Code suggestion
Check the AI-generated fix before applying
Suggested change
supported_format = ["json", "yaml"]
if summary_format not in supported_format:
raise ValueError(f"Unsupported file format: {summary_format}")
if summary_format not in {"json", "yaml"}:
raise ValueError(f"Unsupported file format: {summary_format}")

Code Review Run #9a3edb


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines 146 to 159
"--summary-format",
"-f",
required=False,
type=click.Choice(["json", "yaml"], case_sensitive=False),
default=None,
help="Set output format for registration summary. Lists registered workflows, tasks, and launch plans. 'json' and 'yaml' supported.",
)
@click.option(
"--summary-dir",
required=False,
type=click.Path(dir_okay=True, file_okay=False, writable=True, resolve_path=True),
default=None,
help="Directory to save registration summary. Uses current working directory if not specified.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider validating summary format and directory

Consider adding validation for summary-format and summary-dir options. When summary-format is specified but summary-dir is not, the summary may not be saved correctly. Consider making summary-dir required when summary-format is provided.

Code suggestion
Check the AI-generated fix before applying
Suggested change
"--summary-format",
"-f",
required=False,
type=click.Choice(["json", "yaml"], case_sensitive=False),
default=None,
help="Set output format for registration summary. Lists registered workflows, tasks, and launch plans. 'json' and 'yaml' supported.",
)
@click.option(
"--summary-dir",
required=False,
type=click.Path(dir_okay=True, file_okay=False, writable=True, resolve_path=True),
default=None,
help="Directory to save registration summary. Uses current working directory if not specified.",
)
"--summary-format",
"-f",
required=False,
type=click.Choice(["json", "yaml"], case_sensitive=False),
default=None,
help="Set output format for registration summary. Lists registered workflows, tasks, and launch plans. 'json' and 'yaml' supported.",
callback=validate_summary_options,
)
"--summary-dir",
required=False,
type=click.Path(dir_okay=True, file_okay=False, writable=True, resolve_path=True),
default=None,
help="Directory to save registration summary. Uses current working directory if not specified.",
callback=validate_summary_options,
)

Code Review Run #9a3edb


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Signed-off-by: paullongtan <[email protected]>
…mmary-dir testings to unit test

Signed-off-by: paullongtan <[email protected]>
@paullongtan paullongtan force-pushed the feat/add_register_output_format branch from 2bbe0d1 to db9f744 Compare January 2, 2025 06:39
@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 2, 2025

Code Review Agent Run #a2cf6a

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 9bba92f..db9f744
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@paullongtan
Copy link
Author

Nice first PR! I think you can add more unit tests.

  1. yaml
  2. testing registration without specifying --summary-dir

Thanks!
I have added unit tests for both yaml and registration without specifying --summary-dir.

@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 7, 2025

Code Review Agent Run #5721dc

Actionable Suggestions - 3
  • flytekit/tools/repo.py - 1
    • Consider safer click.secho state management · Line 432-436
  • tests/flytekit/unit/cli/pyflyte/test_register.py - 2
Review Details
  • Files reviewed - 3 · Commit Range: db9f744..a3734e4
    • flytekit/clis/sdk_in_container/register.py
    • flytekit/tools/repo.py
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Comment on lines 432 to 436
with temporary_secho():
if summary_format == "json":
click.secho(json.dumps(all_results, indent=2))
elif summary_format == "yaml":
click.secho(yaml.dump(all_results))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider safer click.secho state management

Consider using a context manager for temporarily restoring click.secho instead of directly manipulating it. The current approach with temporary_secho() could lead to inconsistent state if an exception occurs. A similar issue was also found in flytekit/clis/sdk_in_container/register.py (line 219-220).

Code suggestion
Check the AI-generated fix before applying
 @@ -246,11 +246,12 @@
 -def temporary_secho():
 +@contextmanager
 +def temporary_secho():
      current_secho = click.secho
      try:
          click.secho = original_secho
          yield
      finally:
          click.secho = current_secho

Code Review Run #5721dc


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

["register", "--summary-format", "json", "core5"]
)
assert result.exit_code == 0
summary_data = json.loads(result.output)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding JSON validation check

Consider validating the result.output before parsing it as JSON to handle potential invalid JSON gracefully.

Code suggestion
Check the AI-generated fix before applying
Suggested change
summary_data = json.loads(result.output)
try:
summary_data = json.loads(result.output)
except json.JSONDecodeError as e:
pytest.fail(f"Failed to parse registration summary JSON: {e}")
except Exception as e:
pytest.fail(f"Unexpected error while parsing registration summary: {e}")

Code Review Run #5721dc


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

["register", "--summary-format", "yaml", "core6"]
)
assert result.exit_code == 0
summary_data = yaml.safe_load(result.output)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding YAML parse error handling

Consider adding error handling when parsing YAML from result.output. The yaml.safe_load() could raise yaml.YAMLError if the output is not valid YAML.

Code suggestion
Check the AI-generated fix before applying
Suggested change
summary_data = yaml.safe_load(result.output)
try:
summary_data = yaml.safe_load(result.output)
except yaml.YAMLError as e:
pytest.fail(f"Failed to parse YAML output: {e}")

Code Review Run #5721dc


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@paullongtan paullongtan force-pushed the feat/add_register_output_format branch from a9564f5 to 8a82a78 Compare January 7, 2025 08:04
@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 7, 2025

Code Review Agent Run #93cfad

Actionable Suggestions - 1
  • flytekit/tools/repo.py - 1
    • Consider consolidating duplicate error handling logic · Line 394-401
Review Details
  • Files reviewed - 3 · Commit Range: a3734e4..8a82a78
    • flytekit/clis/sdk_in_container/register.py
    • flytekit/tools/repo.py
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Comment on lines +394 to +401
raise e
if not quiet:
print_registration_status(og_id, success=False)
result["status"] = "failed"
else:
if not quiet:
print_registration_status(og_id, dry_run=True)
except RegistrationSkipped:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider consolidating duplicate error handling logic

Consider consolidating the error handling logic. Currently, there are two separate error handling paths - one for registration failure and one for RegistrationSkipped. Both paths perform similar actions (printing status and setting result status). This could be simplified to reduce code duplication.

Code suggestion
Check the AI-generated fix before applying
 @@ -394,8 +394,12 @@
 -                        print_registration_status(og_id, success=False)
 -                    result["status"] = "failed"
 -            else:
 -                    print_registration_status(og_id, dry_run=True)
 -            except RegistrationSkipped:
 -                if not quiet:
 -                    print_registration_status(og_id, success=False)
 -                result["status"] = "skipped"
 +                        _handle_registration_error(og_id, "failed", quiet)
 +            else:
 +                    print_registration_status(og_id, dry_run=True) 
 +            except RegistrationSkipped:
 +                _handle_registration_error(og_id, "skipped", quiet)
 +
 +def _handle_registration_error(og_id, status, quiet):
 +    if not quiet:
 +        print_registration_status(og_id, success=False)
 +    result["status"] = status

Code Review Run #93cfad


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 8, 2025

Code Review Agent Run #7297c4

Actionable Suggestions - 2
  • flytekit/clis/sdk_in_container/register.py - 1
    • Consider using context manager for logger · Line 37-37
  • tests/flytekit/unit/cli/pyflyte/test_register.py - 1
Review Details
  • Files reviewed - 3 · Commit Range: 8a82a78..05ab275
    • flytekit/clis/sdk_in_container/register.py
    • flytekit/tools/repo.py
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@@ -33,6 +33,9 @@
the root of your project, it finds the first folder that does not have a ``__init__.py`` file.
"""

_original_secho = click.secho
_original_log_level = logger.level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using context manager for logger

Consider using a context manager to handle logger level changes instead of storing the level in a global variable. This would ensure proper cleanup and avoid potential side effects. A similar issue was also found in flytekit/tools/repo.py (line 290).

Code suggestion
Check the AI-generated fix before applying
Suggested change
_original_log_level = logger.level
class LogLevelManager:
def __init__(self):
self.original_level = logger.level
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
logger.level = self.original_level

Code Review Run #7297c4


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines 225 to 249
out = subprocess.run(["git", "init"], capture_output=True)
assert out.returncode == 0
os.makedirs("core6", exist_ok=True)
with open(os.path.join("core6", "sample.py"), "w") as f:
f.write(sample_file_contents)
f.close()

result = runner.invoke(
pyflyte.main,
["register", "--summary-format", "yaml", "core6"]
)
assert result.exit_code == 0
try:
summary_data = yaml.safe_load(result.output)
except yaml.YAMLError as e:
pytest.fail(f"Failed to parse YAML output: {e}")
assert isinstance(summary_data, list)
assert len(summary_data) > 0
for entry in summary_data:
assert "id" in entry
assert "type" in entry
assert "version" in entry
assert "status" in entry

shutil.rmtree("core6")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using finally block for cleanup

Consider moving the shutil.rmtree("core6") cleanup to a finally block to ensure cleanup happens even if assertions fail.

Code suggestion
Check the AI-generated fix before applying
Suggested change
out = subprocess.run(["git", "init"], capture_output=True)
assert out.returncode == 0
os.makedirs("core6", exist_ok=True)
with open(os.path.join("core6", "sample.py"), "w") as f:
f.write(sample_file_contents)
f.close()
result = runner.invoke(
pyflyte.main,
["register", "--summary-format", "yaml", "core6"]
)
assert result.exit_code == 0
try:
summary_data = yaml.safe_load(result.output)
except yaml.YAMLError as e:
pytest.fail(f"Failed to parse YAML output: {e}")
assert isinstance(summary_data, list)
assert len(summary_data) > 0
for entry in summary_data:
assert "id" in entry
assert "type" in entry
assert "version" in entry
assert "status" in entry
shutil.rmtree("core6")
out = subprocess.run(["git", "init"], capture_output=True)
assert out.returncode == 0
os.makedirs("core6", exist_ok=True)
with open(os.path.join("core6", "sample.py"), "w") as f:
f.write(sample_file_contents)
f.close()
try:
result = runner.invoke(
pyflyte.main,
["register", "--summary-format", "yaml", "core6"]
)
assert result.exit_code == 0
try:
summary_data = yaml.safe_load(result.output)
except yaml.YAMLError as e:
pytest.fail(f"Failed to parse YAML output: {e}")
assert isinstance(summary_data, list)
assert len(summary_data) > 0
for entry in summary_data:
assert "id" in entry
assert "type" in entry
assert "version" in entry
assert "status" in entry
finally:
shutil.rmtree("core6")

Code Review Run #7297c4


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Copy link

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 73.07692% with 28 lines in your changes missing coverage. Please review.

Project coverage is 77.25%. Comparing base (edbf992) to head (386eea0).
Report is 14 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/tools/repo.py 73.07% 20 Missing and 8 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3028      +/-   ##
==========================================
- Coverage   78.19%   77.25%   -0.94%     
==========================================
  Files         201      212      +11     
  Lines       21274    22062     +788     
  Branches     2733     2753      +20     
==========================================
+ Hits        16635    17045     +410     
- Misses       3843     4257     +414     
+ Partials      796      760      -36     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: paullongtan <[email protected]>
@paullongtan
Copy link
Author

New updates:

  1. Added option --quiet to pyflyte register to mute all output to CLI
  2. Updated registration summary to output directly to the CLI instead of saving to a file to ensure scalability
  • uses the --quiet flag to ensure proper output format
  • drops option --summary-dir for pyflyte register

Note:

  1. I have temporarily commented out the log for config file overwriting in pyflyte.py and log for packages used in validate_package function in utils.py. This is to prevent these logs from breaking the CLI output when outputting registration summary to the CLI and ensure no CLI outputs when using the --quiet flag for register.

@flyte-bot
Copy link
Contributor

flyte-bot commented Jan 9, 2025

Code Review Agent Run #24c91a

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: 05ab275..4b2b3b9
    • flytekit/clis/sdk_in_container/pyflyte.py
    • flytekit/clis/sdk_in_container/register.py
    • flytekit/clis/sdk_in_container/utils.py
    • tests/flytekit/unit/cli/pyflyte/test_register.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[FlyteCTL Feature] add a --output flag for pyflyte register
3 participants