Balancing Corporate Involvement #3

Open
jcrist opened this issue Dec 4, 2018 · 5 comments


@jcrist
Member

jcrist commented Dec 4, 2018

How do we balance friendliness towards corporate involvement (e.g. sponsored development, patches from corporate users, etc...) and community involvement (remain open and friendly to individual contributors or small groups)?

@mrocklin
Member

Here are some situations that might arise:

  1. A developer from corporation X arrives with a patch for a major feature. They say that corporation X wants them to include a copyright notice in the file(s) that include their contribution.

  2. Corporation X wants to build internal infrastructure on top of Dask, but is concerned that the community might change Dask in ways that would break their application. They're happy to contribute to the project financially if there is some way to guarantee stability of their application. How can they do this?

    Last year the answer was "Open a consulting contract with Anaconda Inc to pay Dask developers to prioritize features or bugfixes that are important to the company". It's not entirely clear what the answer is today.

  3. Corporation X sees Dask as critical to their business and is happy to devote engineering time to the project in certain directions, assuming that they can also influence where the project goes long term. They're not asking to buy complete control, but they are asking to buy access and "a seat at the table".

    There is some balance here between encouraging them to participate while also making sure that they don't dominate the direction of the project.

  4. A smaller organization (perhaps a startup or research lab) uses Dask and considers it critical to their work, is able to spend a bit of maintenance time on the project, but is unable to provide the kind of human-power that a larger organization can provide, and so likely does not get a seat at whatever governance body exists.

  5. A consortium of smaller organizations with similar goals bands together, and speaks with a unified voice. No single institution has enough voice to be very loud, but the larger consortium does.

    This is arguably the case for Pangeo.

There are probably more cases like these. I encourage people to think about the right way to balance these use cases and how the Dask community should react to them.

@mrocklin
Member

mrocklin commented Dec 13, 2018

Thinking of three different concrete cases:

  • CPython doesn't seem to engage corporations much today, despite being critical infrastructure. I've heard people within corporations say "To whom do I write the million dollar check to ensure that Python doesn't change division semantics again?" Core Python is probably at one extreme: entirely community developed, with no explicit corporate interest.

  • TensorFlow is possibly the other extreme. TensorFlow gets tons of support from Google (tens of full-time engineers), but decision making seems to be entirely internal.

  • The Databricks/Apache Spark/IBM experience is also an interesting case study for people to consider. There is a single company, Databricks, that for a long time (maybe still?) employed the majority of Apache Spark committers. Then there was a new company, IBM, that dumped a ton of developer time on the project. I don't know precisely what happened here, but I got the sense that there were frustrations on all sides of this arrangement. Some frustrations that might have been felt include:

    • The community feeling that the Spark project probably wouldn't ever deviate from a path that was profitable to Databricks
    • The community feeling overwhelmed by IBM contributions
    • IBM feeling like they weren't able to effectively move the project despite their investment
    • ... ? (I have zero insight into what actually happened here, I'm just hypothesizing)

@ogrisel
Contributor

ogrisel commented Dec 14, 2018

Note that #1 (NumFocus sponsorship) states the following minimal requirement:

Have a leadership body or team consisting of at least 3 people; these people should not be employed by the same entity or share a common affiliation beyond that of the project.

@guillaumeeb
Member

I feel that the NumFocus minimal requirement is very good for this.

I'm comfortable saying that

"To whom do I write the million dollar check to ensure that Python doesn't change division semantics again"

"corporation X wants them to include a copyright notice in the file(s) that include their contribution"

or

"there is some way to guarantee stability of their application"

will not be feasible.

IMO, the only thing a corporation can get is

"they can also influence where the project goes long term"

with no absolute guarantee. If they have active developers, they can have a seat at the table, but only one per corporation/consortium. We must ensure that leadership is spread over several groups in the future.

@mrocklin
Member

Do other people have thoughts on this issue?

4 participants