Reconsidering the license of reusable libraries #75

ibnesayeed · 2017-11-22T21:34:33Z

The license selection of Data Together repositories (mostly AGPL) is a little troubling for me. The selection of a license is completely at the discretion of the author or the parent organization, but it may affect who can contribute to or use the software in question. I want to express my concerns about Data Together's choice of the AGPL license. There are some very useful libraries being built (such as warc, cdxj, and many others) that could be used in implementing various archival tools and applications. However, I personally prefer releasing my work under a more permissive license such as MIT (which happens to be the most used license on GitHub: 44.69%, in contrast to only 1.05% of projects using AGPL, in 2015). This means that I (and others releasing their code under MIT or other permissive licenses) cannot use these rather useful components in my work and might end up implementing them myself or finding a compatible alternative.

The Open Source Guides have an article, The Legal Side of Open Source, that discusses the general nature of licenses and their suitability. Here is a quote from it about what to consider when deciding on a license:

Do you want your project to be used as a dependency by other projects? Probably best to use the most popular license in your relevant community. For example, MIT is the most popular license for npm libraries.
Do you want your project to appeal to large businesses? A large business will likely want an express patent license from all contributors. In this case, Apache 2.0 has you (and them) covered.
Do you want your project to appeal to contributors who do not want their contributions to be used in closed source software? GPLv3 or (if they also do not wish to contribute to closed source services) AGPLv3 will go over well.

I feel like many of the repositories from Data Together fall under the first category. The Open Source Guides also talk about the possibility of switching licenses and the related consequences. Recently, Facebook relicensed several of its popular libraries, such as React, from BSD + Patents to MIT because the earlier license choice was getting in the way of their adoption.

b5 · 2017-11-22T21:43:38Z

@ibnesayeed brought this up on slack & I asked him to post here.

I think this is an issue that with any luck we'll encounter a bunch as time goes on. If people are bringing it up it's (hopefully) because they're interested in using what we're building. I'm of the opinion that our libraries should be MIT, but our services should be AGPL, but I'm not tied to that position by any means.

cc'ing relevant parties: @titaniumbones @ebarry @patcon @jeffreyliu @ebenp @dcwalk.

ibnesayeed · 2017-11-22T22:46:32Z

@b5: I'm of the opinion that our libraries should be MIT, but our services should be AGPL

This sounds like a very reasonable approach to me.

ebenp · 2017-11-26T15:31:05Z

Sounds good to me!

ebarry · 2017-11-30T17:39:49Z

Noting that the questions posed by the Open Source Guide do not include any language about the commons and share-alike which are the major drivers of a GPL flavored license choice. Consider:

Do you want improvements created on your work to be accessible to you?
Do you feel that all people using your software, even if they only access it over a network (the internet) have the right to access the source?
Do you want to contribute to humanity's knowledge commons?
Do you want to resist enclosure of the commons?

ebarry · 2017-11-30T18:09:59Z

In the past decade, "unrestricted use" has become culturally associated with "most sharing-oriented." There are actually at least two major innovations that have happened to bring this about, and I want to disentangle these.

Hypermodularity (a leap forward in the style of writing code) has lowered barriers to entry much more dramatically than was possible in older, enormously labyrinthine codebases. Because newer hypermodular codebases were also using permissive open source licenses at a greater rate than copyleft codebases, these two innovations became entangled in people's minds but they are distinct.

ibnesayeed · 2017-11-30T18:35:51Z

@ebarry: Noting that the questions posed by the Open Source Guide do not include any language about the commons and share-alike which are the major drivers of a GPL flavored license choice.

I thought CC and CC-SA licenses were used more often with creative work such as pictures, music, and whatnot. On the coding side, popular copyleft licenses are usually variants of GPL. Looking at the intent and purpose of the "Open Source Guides", I am not surprised that they are kept concise and easy for non-lawyers to digest the basics of licensing.

All those questions you put forward should be considered. However, one answer might not fit all situations. Looking at the intent, purpose, and feasibility of various types of work may suggest different answers for end-user software, libraries and packages, services, documentation, data, and artwork etc.

ebarry · 2017-11-30T19:14:48Z

Thanks @ibnesayeed. I want to clarify that the term share-alike is not domain specific, as in only for creative works and not for software. Share-alike means that copies or adaptations of the work will be released under the same or similar license as the original. In this way we can see that GPL is a share-alike license. The terms copyleft, GPL, share-alike all refer to the same principle -- a way of working to build a commons that cannot be enclosed.

ibnesayeed · 2017-11-30T19:32:28Z

Thanks @ebarry, I thought you were talking more specifically about the Creative Commons licenses because you said:

... the Open Source Guide do not include any language about the commons and share-alike which ...

However, they do have brief notes about both permissive and share-alike licenses:

Your project very likely has (or will have) dependencies. For example, if you’re open sourcing a Node.js project, you’ll probably use libraries from the Node Package Manager (npm). Each of those libraries you depend on will have its own open source license. If each of their licenses is “permissive” (gives the public permission to use, modify, and share, without any condition for downstream licensing), you can use any license you want. Common permissive licenses include MIT, Apache 2.0, ISC, and BSD.

On the other hand, if any of your dependencies’ licenses are “strong copyleft” (also gives public same permissions, subject to condition of using the same license downstream), then your project will have to use the same license. Common strong copyleft licenses include GPLv2, GPLv3, and AGPLv3.
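As a rough illustration of the rule in the two quoted paragraphs, here is a minimal Python sketch. The function name and the license sets are hypothetical (the sets simply copy the examples named in the quote), and the sketch deliberately does not model compatibility between different copyleft licenses or anything else a real license review would need.

```python
# Hypothetical sketch of the dependency rule quoted above.
# The license names are the examples listed in the quote, nothing more.
PERMISSIVE = {"MIT", "Apache 2.0", "ISC", "BSD"}
STRONG_COPYLEFT = {"GPLv2", "GPLv3", "AGPLv3"}


def required_project_licenses(dependency_licenses):
    """Return the strong copyleft licenses imposed by the dependencies.

    An empty set means every dependency is permissive, so the project
    may pick any license. More than one entry means the dependencies
    pull in different copyleft licenses, which this sketch does not
    try to reconcile.
    """
    required = set()
    for name in dependency_licenses:
        if name in STRONG_COPYLEFT:
            required.add(name)
        elif name not in PERMISSIVE:
            # Anything outside the quoted examples is out of scope here.
            raise ValueError(f"license not covered by this sketch: {name}")
    return required


if __name__ == "__main__":
    print(required_project_licenses(["MIT", "ISC"]))
    print(required_project_licenses(["MIT", "AGPLv3"]))
```

Under these assumptions, a project that depends only on MIT and ISC libraries gets an empty set back (free choice of license), while adding a single AGPLv3 dependency returns that license as a requirement, which is the situation a permissively licensed project would face when depending on the current Data Together libraries.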

flyingzumwalt · 2017-12-10T16:48:50Z

👍 I think part of the issue here is that we're trying to apply a single policy to very different types of things (code, creative work, and datasets).

for code, permissive licenses (e.g. MIT or BSD) are much better than strong copyleft because they encourage adoption and reuse of the software, which ultimately builds a stronger network of users and contributors.
for non-code contributions, primarily tutorials, documentation, etc. CC-SA is a fine option

For datasets it's more complicated.

For data that is held and/or redistributed by the Data Together project itself, we need some baseline permissions to redistribute, repurpose and reuse the data (example: flickr commons only accepts images and metadata with "no known copyright restrictions")
We shouldn't prevent people from releasing their data under more permissive licenses (e.g. CC-0)
The software we create should encourage open data, but shouldn't necessarily force you to only use open data in the system as long as people's "closed" data doesn't get mixed into the open data everyone is passing around

dcwalk · 2018-02-15T04:58:06Z

Late to this, I know the conversation has moved much further along, but just wanted to document on the ur-thread my thoughts:

I'm only going to address code here as I agree that datasets are quite different and I would look to the work of the open knowledge foundation, p2p foundation, and more around different considerations before I would want to try to grapple with how it operates in the DT context.

I am sympathetic to "permissive" licenses (a name I find to be a bit of a misnomer) in the context of current software development. I don't find the type of communities generated through more permissive OS licenses to necessarily live up to the promise (ie, be actual communities) except in a minority of cases, and I don't think that is isolatable from the implicit conception of knowledge such licenses create or are predicated on. I also don't find the current popularity of licenses to necessarily be a compelling reason.

I think a lot about how I get to participate in the world through the projects I work on, and do not often feel completely comfortable with the failed promise of "business friendly" and "openness" as it has played out within tech projects. I was struck by @ebarry's questions above and would pretty much answer "YES" to all of them all of the time...?

With that in mind, I'm one voice on this and I'm coming from a particular trajectory. @b5 has done the outsize portion of development on this, and I factor that into my thoughts on the next steps. So, I guess I'm saying I'd want us to have a very strong, well identified reason (and strategy) for re-licensing portions of the DT code.

ibnesayeed · 2018-03-13T16:41:35Z

@dcwalk: I am sympathetic to "permissive" licenses (a name I find to be a bit of a misnomer) in the context of current software development. I don't find the type of communities generated through more permissive OS licenses to necessarily live up to the promise (ie, be actual communities) except in a minority of cases, and I don't think that is isolatable from the implicit conception of knowledge such licenses create or are predicated on. I also don't find the current popularity of licenses to necessarily be a compelling reason.

I personally don't think that either permissive or strong copyleft licenses are generally the absolute best or the purest choice (even in the context of just code, not data sets). Unless we envision a world of everything open, where there is no space for proprietary systems to co-exist, a pure copyleft license puts a definite barrier between open source and proprietary software, which may or may not be a good thing depending on what is being discussed. For example, if I am defining a file format or an encoding scheme, I would perhaps like it to be used by as many systems as possible (including closed source ones), both in the spirit of interoperability and openness.

Permissive licenses do not necessarily mean that open source code is any less open or can be closed by an entity. The state of the code that is open remains open; any modifications to that code, or bigger systems using that code, may or may not be open. In general, permissive licensing encourages more contributions, which is quite evident from empirical observations of the proliferation of MIT and similar licenses on GitHub. I agree that popularity does not necessarily translate to quality, but it does not mean otherwise either.

Also, many of the license restrictions relate to the distribution of derivative work and not necessarily to its usage. So a strong copyleft license does not necessarily stop anyone from taking advantage of an existing work and modifying it for their own use without releasing their modifications, as long as they do not distribute it to others in any form (my understanding here might be incorrect).

That said, I do not want to influence the license decision by being too vocal; in all fairness, that decision is the right of the author or the parent organization. However, in the spirit of this thread, I wanted to present my viewpoint. I am a firm believer that these things should be considered on a case-by-case basis. Perhaps GPL was the best license for something like Linux, without which it would not be what it is now, and a permissive license might have been a disaster for it. On the other hand, MIT was perhaps the best license choice for something like jQuery; otherwise it would not have been so ubiquitous.

dcwalk added question in progress labels Feb 15, 2018

dcwalk mentioned this issue Feb 17, 2018

Roundtable: Licensing of Data Together libraries (CDXJ & WARC packages) 2018-Feb-19 datatogether/datatogether#48

Closed

1 task

b5 added ready and removed in progress labels Aug 23, 2018
