Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make scipy an optional dependency #2062

Open
tylerflex opened this issue Nov 7, 2024 · 10 comments
Open

Make scipy an optional dependency #2062

tylerflex opened this issue Nov 7, 2024 · 10 comments

Comments

@tylerflex
Copy link
Collaborator

tylerflex commented Nov 7, 2024

Motivation: allow a more lightweight base package for validation

Not sure if it's possible:

  1. There are some methods that use scipy (like SimulationData.to_mat) so lazy import should be fine there.
  2. xarray interpolation uses scipy, so we'd need to install different xarray versions depending on the installation?
  3. not sure if scipy comes with autograd already

Either way, we probably want regular pip install tidy3d users to have scipy and interp() support. So not sure if to accomplish this goal of having a "lightweight" tidy3d package if we just need to create a more custom installation script to MC rather than using pyproject.toml?

Thoughts @daquintero @yaugenst @momchil-flex ?

@momchil-flex
Copy link
Collaborator

Motivation: allow a more lightweight base package for validation

Not sure if it's possible:

  1. There are some methods that use scipy (like SimulationData.to_mat) so lazy import should be fine there.
  2. xarray interpolation uses scipy, so we'd need to install different xarray versions depending on the installation?
  3. not sure if scipy comes with autograd already

One other thing that comes to mind is we may want to add it to the trimesh option then. MC basically installs our core packages + trimesh for the validation API to be able to validate all possible simulations, and I remember that some STL validator requires some scipy functionality and it fails if not installed.

Either way, we probably want regular pip install tidy3d users to have scipy and interp() support. So not sure if to accomplish this goal of having a "lightweight" tidy3d package if we just need to create a more custom installation script to MC rather than using pyproject.toml?

That is effectively what this block achieves, albeit in a bit of a dirty way.

tidy3d/pyproject.toml

Lines 41 to 48 in 3f087b5

### NOT CORE
boto3 = "^1.28.0"
requests = "^2.31.0"
pyjwt = "*"
click = "^8.1.0"
responses = "*"
joblib = "*"
### END NOT CORE

But yeah MC basically comments that out for the validation API. So the question to me is if scipy can be moved inside this block. It was moved up I think because of that trimesh issue actually, because the other 3 things that you point out should not be needed for validation. Well actually I think we have a validator for custom data that checks that the data the user has provided can be interpolated (e.g. has no repeated coords), so it may be needed for that too.

@yaugenst-flex
Copy link
Contributor

Not really sure what the minimum requirements for MC are but autograd depends on scipy (as does xarray as @tylerflex pointed out). That seems pretty tricky to get around?
I think autograd only imports scipy when you import autograd.scipy though, so in principle you could pip install autograd --no-deps.

@tylerflex
Copy link
Collaborator Author

Not really sure what the minimum requirements for MC are but autograd depends on scipy (as does xarray as @tylerflex pointed out). That seems pretty tricky to get around?

yea it does, i wonder if we just need to say that we only validate the .to_static() version of the component? and just not install autograd (or use --no-deps)?

@Frank-DingYC
Copy link

I would like to share some background about the problem. We use Web Assembly (More specifically, Pyodide) to import Tidy3D so that we can implement some features, like plot source_time, script objects, show grid, plot permittivity and so on. If we can't limit the core dependencies of Tidy3D, the front-end webpage will suffer from crashing because of out-of-memory issues. For now, @majinge7846 accomplished lots of work to remove Scipy for front-end Tidy3D utilization. However, the problem becomes critical when upgrading Tidy3d to v2.7.6.

Recently, we want to improve the user experience for frequently used functions (Validate/Estimate). We plan to use the same technical solution to implement it. However, we are facing the same problem. Refer to this. Currently, we use lambda to do validation/estimation and pipeline as backup, which will spend lots of time wasted during the simulation.json file transmission. Instead, if we use web assembly to implement validation/estimation, users can get the results immediately.

@majinge7846
Copy link
Contributor

https://github.com/pydata/xarray/blob/9b632fd36edba7c6fcd1eea778ba3d8710c68349/pyproject.toml#L36
image

scipy is an optional deependency of xarray. And I found that it just used for interpolate(scipy.interpolate.interp1d).
If the tidy3d wasm without scipy does't call related functions while do validating. the validation of tidy3d wasm is OK.

webgui can even do a check, if tidy3d wasm error/log validation error, "lack scipy please install first" , this is acceptable, we can use lambda/pipeline as fallback. (Of course, it would be better if numpy or other methods can be used to avoid calling Scipy)

I think webgui validation not need designed to handle all simulation cases, just most simulation cases are OK too. lightweight base package may not be needed.

@yaugenst-flex
Copy link
Contributor

yea it does, i wonder if we just need to say that we only validate the .to_static() version of the component? and just not install autograd (or use --no-deps)?

oh nevermind, scipy is optional in autograd too. i guess that's good news then.

what would be the best way to go here? sounds to me like it would make sense to add something like a minimal install option under [tool.poetry.extras]?

@daquinteroflex
Copy link
Collaborator

daquinteroflex commented Nov 8, 2024

Just catching up on this. Ultimately, this is one of the package refactor aspects we want to implement in 3.0. In the sense that we want to have an essential core tidy3d API and extend the package functionality from there. So this is helpful to appreciate the core functionality and extensible functionality. Like the core functionality probably should just be an static API data-structure validation and then have a clear boundary for further functionality. In the future, we do want to have minimal tidy3d lightweight pacakge that is extended through tidy3d[option] or simply other independent packages that depend on the minimal tidy3d. This should help for separating lightweight memory operations in the web built upon a minimal installation.

This kind of leads to the issue of how to approach this right now (edit to be clear 3.0 is a bit far away still). Unfortunately poetry does not enable us to "minimize" the base package [tool.poetry.dependencies] : it will always install those when pip install tidy3d. --no-deps is also an option since autograd or xarray don't directly depend on it. Based on previously looking into this, tool.poetry.extras will install extra packages on top of the base installation. So we shouldn't really make scipy an optional dependency and install it in an extra because then users will not have it in their base tidy3d installation by default. Unfortunately, I think the kind of hacky solution we've been using of the CORE NOTCORE commented section in the pyproject.toml would enable it not to be directly installed by MC in their current workflow possibly. Possibly we can add trimesh as Momchil suggests, also since this hack is shared in the interim before 3.0.

If the pipeline can be a good fallback for all the other non-web-assembly local validation cases then I think this could be a good way to get it to work before 3.0! We'd just have to check the scipy to be sure they are on-the-fly per-function like Tyler mentioned.

Happy to do a PR for this if this is the approach we want to go for.

@momchil-flex
Copy link
Collaborator

momchil-flex commented Nov 8, 2024

So my thoughts:

  • We should definitely get something working before 3.0 as the latter may take quite a while
  • Yeah I remember why we went with CORE NOTCORE and I think we may just have to continue with that. Basically I don't think we want to require our typical users to do anything else but pip install tidy3d to be able to do most of their work.
  • On the other hand, we do separate some packages out as extras that users may still need, when we've encountered installation issues with such packages in the past. This is why trimesh and gdstk and vtk are separate extras. So I don't think we should put trimesh into CORE if it's then going to make it always installed for the usual installation too, but we could consider somehow defining a CORE (or some other name) set of dependencies which are needed for all validations but nothing more.
  • Finally, whatever we do about this, we need to make sure that we are testing that everything can be validated using only these base dependencies. That's probably the biggest chunk of effort. We also discussed on slack at some point that we may need to refactor our SIM_FULL into a more formal set of tests for all possible components, also probably splitting it into multiple simulations as some components are becoming incompatible to be in the same simulation. I mean, without too much refactoring of the tests, testing the core/validate_only install could potentially also be done by defining a subset of existing tests to run with the limited installation, which should cover quite a lot already.

@daquinteroflex
Copy link
Collaborator

Nice, yeah I agree!

How urgent is this out of curiosity? If it is very urgent for the web UI (sounds a bit?), probably someone else should do this given my current limited timing constraints. Otherwise, think I've got a few ideas how to implement and properly test this with the right environments as you suggest and happy to focus on cracking on this. Or happy to propose a few implementation approaches too as need be.

@majinge7846
Copy link
Contributor

majinge7846 commented Nov 11, 2024

Nice, yeah I agree!

How urgent is this out of curiosity? If it is very urgent for the web UI (sounds a bit?), probably someone else should do this given my current limited timing constraints. Otherwise, think I've got a few ideas how to implement and properly test this with the right environments as you suggest and happy to focus on cracking on this. Or happy to propose a few implementation approaches too as need be.

test for web to usage:
do not install scipy, load/create most simulation (validation) successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants