Sagemakerop #1

Merged
merged 208 commits into from
Aug 8, 2023

Conversation

ellisms
Owner

@ellisms ellisms commented Aug 8, 2023

Created a new hook and operators for Amazon SageMaker Notebook Instances. They allow users to create, start, stop, and delete notebook instances.
Created unit tests for the hook and operators, plus a system test. All tests passed.
Updated the documentation for the new operators.
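
For illustration only, here is a minimal sketch of the four lifecycle calls described above. It is not the code from this PR: `NotebookLifecycle` and `FakeSageMakerClient` are hypothetical names, and the sketch assumes the injected client exposes the boto3 SageMaker client methods `create_notebook_instance`, `start_notebook_instance`, `stop_notebook_instance`, and `delete_notebook_instance`. The instance type and role ARN in the usage lines are placeholder values.

```python
from typing import Any, Dict, List, Tuple


class NotebookLifecycle:
    """Hypothetical thin wrapper; NOT the hook added in this PR."""

    def __init__(self, client: Any) -> None:
        # `client` is assumed to behave like boto3.client("sagemaker").
        self.client = client

    def create(self, name: str, instance_type: str, role_arn: str) -> Any:
        return self.client.create_notebook_instance(
            NotebookInstanceName=name,
            InstanceType=instance_type,
            RoleArn=role_arn,
        )

    def start(self, name: str) -> Any:
        return self.client.start_notebook_instance(NotebookInstanceName=name)

    def stop(self, name: str) -> Any:
        return self.client.stop_notebook_instance(NotebookInstanceName=name)

    def delete(self, name: str) -> Any:
        return self.client.delete_notebook_instance(NotebookInstanceName=name)


class FakeSageMakerClient:
    """In-memory stand-in so the sketch runs without AWS credentials."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def create_notebook_instance(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(("create", kwargs))
        return {"NotebookInstanceArn": "arn:fake:" + kwargs["NotebookInstanceName"]}

    def start_notebook_instance(self, **kwargs: Any) -> None:
        self.calls.append(("start", kwargs))

    def stop_notebook_instance(self, **kwargs: Any) -> None:
        self.calls.append(("stop", kwargs))

    def delete_notebook_instance(self, **kwargs: Any) -> None:
        self.calls.append(("delete", kwargs))


# Usage with the fake client (placeholder instance type and role ARN).
fake = FakeSageMakerClient()
notebooks = NotebookLifecycle(fake)
notebooks.create("demo", "ml.t3.medium", "arn:aws:iam::123456789012:role/demo")
notebooks.start("demo")
notebooks.stop("demo")
notebooks.delete("demo")
```

Injecting the client keeps the sketch testable without network access; with real credentials the same wrapper could presumably be handed a boto3 SageMaker client instead.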


bbovenzi and others added 30 commits July 27, 2023 00:50
* Center to node on task selection

* Use new fitView options from reactflow

* Maintain zoom, remove focusNode fn
* Set openlineage provider as ready to be released
The apache#32873 was merged without committing breeze's hash
accounting for changed dependencies, breaking unit tests of breeze.

This PR brings the hash of breeze back after running `breeze setup version`
manually and committing the changed hash.
…apache#32896)

So far we only applied the PyPI suffix for cross-provider dependencies -
i.e. when one airflow package depended on another one released at
the same time we added the "dev0" suffix to their cross-dependencies
when building them on CI so that they can be installed together
without conflicts (this is because of decisions made with PEP-440,
where version suffixes are in different namespaces than the final
versions, so it is impossible to specify a "final" dependency as
minimal and have the ".dev0" satisfy it only for that single package).

With openlineage provider, we need to release it before Airflow 2.7.0
gets released, and openlineage provider depends on Airflow 2.7.0 so
we need to also handle the situation, where OpenLineage has >= 2.7.0
for Airflow, but the dependency that is used to resolve dependencies
in CI uses 2.7.0dev0. This is done by dynamically manipulating the
dependencies in setup.py based on VERSION_SUFFIX_FOR_PYPI variable.

This variable in CI is set to "dev0" thus all packages built have
"dev0" added as version, with this change if any package has
>= <CURRENT_AIRFLOW_VERSION> specified, it will also be extended
with the same "dev0" suffix.
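
The rule described above can be sketched as follows. This is an illustrative assumption, not the actual `setup.py` code: `apply_version_suffix` is a hypothetical helper, and it only handles a plain `name>=version` requirement with no extra specifiers or markers.

```python
def apply_version_suffix(requirement: str, current_version: str, suffix: str) -> str:
    """Append `suffix` when the requirement is exactly `>= current_version`.

    Hypothetical helper: `suffix` plays the role of VERSION_SUFFIX_FOR_PYPI.
    """
    name, separator, version = requirement.partition(">=")
    if suffix and separator and version.strip() == current_version:
        return f"{name.strip()}>={current_version}{suffix}"
    # Any other requirement (different version, no lower bound) is left alone.
    return requirement


# With the suffix set, the lower bound follows the dev build of Airflow.
with_suffix = apply_version_suffix("apache-airflow>=2.7.0", "2.7.0", "dev0")
# Without a suffix (a final release build), nothing changes.
without_suffix = apply_version_suffix("apache-airflow>=2.7.0", "2.7.0", "")
```

Keeping the function a no-op for an empty suffix means the same build code could serve both CI and final release builds.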
The apache#32767 missed imports for a few kubernetes modules.

This PR adds the missing ones.
…pache#32901)

The 1.28.12 release of appflow mypy typeshed introduces
inconsistency that we have no idea how to fix

youtype/mypy_boto3_builder#209

Limiting it to unblock main
* init new gantt chart

set axes

support task groups

synced scrolling

clean up var names and height/scroll logic

* fix task group queued date

* fix last task appearance

* Reset out-of-sync grid/gantt scroll onSelect

* fix www tests

* Active gantt scroll bar, fix tabs and address pr feedback

* Move checkScrollPosition to a timer

* clean up gantt tooltip
* unwrap Proxy before checking __iter__ in is_container()

Fixes: apache#32804

---------

Co-authored-by: Tzu-ping Chung <[email protected]>
Follow up after apache#32896 - also apply the suffix to ARM images
The support for openlineage was added in apache-airflow-providers-common-sql==1.6.0
…2908)

* Remove old gantt chart and redirect to grid views gantt tab

* Add ts-ignore for some reason
Updated step 4 and 5 in pyenv setup to be inside code blocks
…2927)

After raising youtype/mypy_boto3_builder#209
the maintainer attempted to fix the mypy problem in 1.28.15 but
it does not seem to work for mypy. It seems to be ok for
pyright, but mypy still detects the input and output types
produced by the library as different and incompatible types.

While waiting for a fix, we are limiting the library to < 1.28.0
and we are going to lift the limits to test it manually when the
fix is ready.
The current tests around the different ways of connecting to Azure in the WasbHook are not actually unit testing it.
The tests are basically testing if we could connect and not how we connected.
This refactor improves the tests and appropriately tests how we connect to the BlobServiceClient.
It also removes adding the different connections to the database before testing.
* Base implementation of executor vending cli commands

Update the cli_parser construction (and associated tests) to now get cli
methods from the executor modules.

* Move existing Celery and Kubernetes cli commands to their module

Move the existing cli commands out of the base/core cli module to the
respective executor modules
Small follow-up after apache#29055 - for better diagnostic of potential
future problems, it would be good to re-raise the original import
error, otherwise if the Import error results from some other issue
than Airflow version, we will get quite a bit of head scratching
trying to diagnose some of the resulting aftermath.
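
One way to keep the original import error visible is Python's exception chaining. The sketch below is an assumption about the general pattern, not the code from this change: `OptionalFeatureUnavailable` and `import_optional_feature` are hypothetical names, and versions are modelled as plain tuples.

```python
import importlib
from types import ModuleType
from typing import Tuple

Version = Tuple[int, int, int]


class OptionalFeatureUnavailable(Exception):
    """Hypothetical error for features that need a newer Airflow."""


def import_optional_feature(
    module_name: str, airflow_version: Version, min_version: Version
) -> ModuleType:
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        if airflow_version < min_version:
            # `from err` keeps the original ImportError attached as __cause__,
            # so the traceback still shows what actually failed to import.
            raise OptionalFeatureUnavailable(
                f"{module_name} requires Airflow >= {min_version}"
            ) from err
        # Not a version problem: re-raise the original error untouched.
        raise
```

With this shape, an import failure unrelated to the Airflow version surfaces as the original `ImportError` rather than being masked by a version message.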
Building cache on our CI has been broken since we added the open
lineage provider with the >=2.7.0 limit. This should be handled by
also adding VERSION_SUFFIX_FOR_PYPI to the cache builds (which
is needed anyhow because otherwise the cache would not be really
functional).

We actually also have to add it to ci-image-build automatically
in this case otherwise users won't be able to build their images
locally.
eladkal and others added 28 commits August 7, 2023 09:52
…ndpoint (apache#32705)

* add dag_run_ids and task_ids as filter types for the batch task instance endpoint

* add version notice to the new filters

* Update the released version in OpenAPI spec

---------

Co-authored-by: pierrejeambrun <[email protected]>
With the header change in PR apache#32948, the link to Verify by
contributors in the Status of providers testing issue is broken.
Correct it by pointing it to the right header.
The elasticsearch group is likely to be moved to elasticsearch
provider. Anticipating that (see apache#33135) we need to move it to
pre-2.7 defaults in order to have back-compatibility for providers
that assume default values to be there.
…che#33171)

The dnspython and Flask Application Builder (FAB) have a weird transitive
dependency on openssl. They do not require it - it is an optional
dependency of email validator, which uses dnspython, and dnspython
uses pyopenssl when installed. This is all fine. But when opendns
is used with an older FAB (< 4.1.4) and a NEWER openssl (>=23.0), just
creating the FAB application in flask causes an exception saying that
there is a missing parameter in openssl.

Since openssl is NOT declared as a required dependency in either FAB
or dnspython, installing airflow with constraints while
having a newer version of pyopenssl does not downgrade pyopenssl;
it is just mentioned in a conflict message resulting from conflicts
with other packages already installed (such as cryptography).

This caused a problem when installing an older airflow version in breeze
with the --use-airflow-version switch.

Also - while checking this issue, it turned out that we were not
- by default - using the constraints for the older version when installing
it in breeze (you could do that by explicitly specifying the
constraints). This was because when you use another way to specify
the version (commit hash, tag) it is quite complex to figure out which
constraints to use. However, we can automatically derive constraints
when you specify a version or a branch, which are the most common
scenarios.

This PR fixes the `--use-airflow-version` case for Airflow 2.4 and below
by:

* automatically deriving the right constraints when version or branch
  are used in `--use-airflow-version`

* adding pyopenssl specifically as dependency when
  --use-airflow-version switch is used.

* also a small quality of life improvement - when airflow db migrate
  fails (because the old version of airflow does not support it) - we inform
  the user that it happened and that we are attempting to run the
  legacy airflow db init instead.
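
The first bullet above could look roughly like the sketch below. It is an assumption for illustration, not breeze's actual implementation: `derive_constraints_reference` is a hypothetical function, and it relies on the `constraints-<version>`, `constraints-main`, and `constraints-X-Y` reference names used in the Airflow repository. Anything else (a commit hash, an arbitrary tag) returns `None`, meaning constraints would have to be specified explicitly.

```python
import re
from typing import Optional


def derive_constraints_reference(use_airflow_version: str) -> Optional[str]:
    """Map a version or branch name to a constraints reference, if possible."""
    # Released (or release candidate) version, e.g. "2.4.0" or "2.7.0rc1".
    if re.fullmatch(r"\d+\.\d+\.\d+(rc\d+)?", use_airflow_version):
        return f"constraints-{use_airflow_version}"
    if use_airflow_version == "main":
        return "constraints-main"
    # Release branches, e.g. "v2-6-test" or "v2-6-stable".
    branch = re.fullmatch(r"v(\d+)-(\d+)-(test|stable)", use_airflow_version)
    if branch:
        return f"constraints-{branch.group(1)}-{branch.group(2)}"
    # Commit hashes and other references cannot be mapped automatically.
    return None
```

Returning `None` instead of guessing keeps the harder cases (commit hash, tag) explicit, which matches the commit message's point that those are complex to resolve.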
* Setting overrides with default for flower_url_prefix

* adding test cases
* Fixing typo in Dockerfile

* Fixing typo in Dockerfile
)

The apache#32495 mistakenly replaced the "build" script with a
"release-management build-docs" command, but there is no such
command - there is just "build-docs".
…_JOB_LOG_LINK to DATAPROC_JOB_LINK and add deprecation warning (apache#33189)
…_runs reached its upper limit. (apache#31414)

* feat: select dag_model with row lock

* fix: logging that scheduling was skipped

* fix: remove unused get_dagmodel

* fix: correct log message to more generic word

---------

Co-authored-by: doiken <[email protected]>
Co-authored-by: Tzu-ping Chung <[email protected]>
Co-authored-by: eladkal <[email protected]>
* Replace State by TaskInstanceState in Airflow executors

* change state type in change_state method, KubernetesResultsType and KubernetesWatchType to TaskInstanceState

* Fix change_state annotation in CeleryExecutor

---------

Co-authored-by: Tzu-ping Chung <[email protected]>
@ellisms ellisms merged this pull request into main Aug 8, 2023
5 checks passed
@ellisms ellisms deleted the sagemakerop branch September 20, 2024 16:30