diff --git a/contents/english/4-4-property-and-contract.md b/contents/english/4-4-property-and-contract.md index eae46a6d..e46e3cc9 100644 --- a/contents/english/4-4-property-and-contract.md +++ b/contents/english/4-4-property-and-contract.md @@ -28,7 +28,7 @@ The Vasana Team -Most large-scale cooperation in ⿻ societies takes place through the pooling of assets into entities that are considered chartered as corporations, including limited liability partnerships, civic organizations, religious organizations, trade associations, unions, and, of course, for-profit stock corporations. Their legal basis is in contractual arrangement that governs a sharing of assets (real, intellectual, human and financial) in a common undertaking towards a shared purpose. Even the simplest, most common and smallest scale contracts, such as rental agreements, involve the sharing of assets across people. +Most large-scale cooperation in ⿻ societies takes place through the pooling of assets into entities that are considered chartered as corporations, including limited liability partnerships, civic organizations, religious organizations, trade associations, unions, and of course, for-profit stock corporations. Their legal basis is in contractual arrangement that governs a sharing of assets (real, intellectual, human and financial) in a common undertaking towards a shared purpose. Even the simplest, most common and smallest scale contracts, such as rental agreements, involve the sharing of assets across people. A core aim of Lick's original vision for the "Intergalactic Computer Network" was to facilitate the sharing of digital assets, such as computation, storage and data. And, in some ways, such sharing is the heart of today's digital economy, with "the Cloud" providing a vast pool of shared computation and storage and the wide range of information shared online forming the foundation of the generative foundation models (GFMs) that are sweeping the technology industry. 
Yet for all the success of this work, it is confined to limited slices of the digital world and controlled by a small group of highly profitable, for-profit entities based in at most a handful of countries, creating both tremendous waste of opportunity and concentration of power. The dream that the internet could enable broad and horizontal asset sharing has yet to be realized. @@ -41,11 +41,11 @@ As with the other fundamental protocols we have discussed in this part of the bo As highlighted perhaps most dramatically in Kate Crawford's beautifully drawn *Atlas of AI*, the digital world is built on top of the physical world: computer circuits are made from rare metals mined with all the ensuing social challenges, data centers work much like and are often co-located with power plants, data is created by people like the "ghost workers" documented by Mary Gray and Siddarth Suri etc.[^Phys] Any serious account of the digital realm must thus grapple with real property relations. Yet there are crucial assets that emerge from these physical substrates as digital-native abstractions and that are the crucial components of life online. -We will focus on three categories that are most ubiquitous: storage, computation and data. Yet there are many other examples that intersect with these and have many related challenges, including the electromagentic spectrum, code, names and other addresses (e.g. universal record locater/urls), "physical" space in virtual worlds and non-fungible tokens (NFTs). +We will focus on three categories that are most ubiquitous: storage, computation and data. Yet there are many other examples that intersect with these and have many related challenges, including the electromagnetic spectrum, code, names and other addresses (e.g. universal record locator/urls), "physical" space in virtual worlds and non-fungible tokens (NFTs). -Storage, computation and data lie at the core of essentially every online interaction. 
Anything that occurs online persists from one moment to the next only because of the data it depends on being stored somewhere. The occurrences themselves are embodied by computations being performed to determine the outcome of instructions and actions. And the input and output of every operation are data. In this sense, storage acts something roughly like land in the real economy, computation acts something like fuel and data act like human inputs (sometimes called labor) and artifacts people create and reuse (sometimes called capital). +Storage, computation and data lie at the core of essentially every online interaction. Anything that occurs online persists from one moment to the next only because of the data it depends on being stored somewhere. The occurrences themselves are embodied by computations being performed to determine the outcome of instructions and actions. And the input and output of every operation are data. In this sense, storage acts something roughly like land in the real economy, computation acts something like fuel and data acts like human inputs (sometimes called labor) and artifacts people create and reuse (sometimes called capital). -While land, fuel, labor and capital are often treated as homogeneous "commodities", as social theorist Karl Polanyi famously argued this is a simplifying fiction. Storage, computation and especially data are heterogeneous, tied to places, people and cultures and these connections affect both their performance characteristics and the social impacts and meanings of using them in a digital economy and society. While these challenges are significant for fictitious commodities "in real life", in some ways they are even more severe for digital assets and at very least societies have had far less time to jointly adapt economic and social structures around them. These challenges are among the key inhibitors to a functional digital system of sharing, property and contract. 
+While land, fuel, labor and capital are often treated as homogeneous "commodities", as social theorist Karl Polanyi famously argued this is a simplifying fiction. Storage, computation and especially data are heterogeneous, tied to places, people and cultures and these connections affect both their performance characteristics and the social impacts and meanings of using them in a digital economy and society. While these challenges are significant for fictitious commodities "in real life", in some ways they are even more severe for digital assets and at the very least societies have had far less time to jointly adapt economic and social structures around them. These challenges are among the key inhibitors to a functional digital system of sharing, property and contract. @@ -53,7 +53,7 @@ While land, fuel, labor and capital are often treated as homogeneous "commoditie ### The Intergalactic Computer Network -Lick's 1960 "Memorandum For Members and Affiliates of the Intergalactic Computer Network" did not focus on the potential for online socialization or commerce that characterized so much of his contemporary and later writing. Instead, perhaps because of his scientific audience at the time, he focused on the potential for scientists to massively increase their productivity through computer networks by sharing analytic tools, memory and storage, computation and research findings and the promise this might have for related military applications. This was also a natural extension of the "time-sharing" systems that were one of the first projects Lick funded and aimed to allow a semblance of what would become the "personal computing" experience in the era of large mainframe computers by allowing many users to share access to a larger machine's capacity. In this sense, the internet began, above all, as a platform for precisely the sort of large scale computational resource sharing that we focus on in this chapter. 
+Lick's 1960 "Memorandum For Members and Affiliates of the Intergalactic Computer Network" did not focus on the potential for online socialization or commerce that characterized so much of his contemporary and later writing. Instead, perhaps because of his scientific audience at the time, he focused on the potential for scientists to massively increase their productivity through computer networks by sharing analytic tools, memory, storage, computation, research findings and the promise this might have for related military applications. This was also a natural extension of the "time-sharing" systems that were one of the first projects Lick funded and aimed to allow a semblance of what would become the "personal computing" experience in the era of large mainframe computers by allowing many users to share access to a larger machine's capacity. In this sense, the internet began, above all, as a platform for precisely the sort of large scale computational resource sharing that we focus on in this chapter. To appreciate why such an apparently dull topic excited such an (otherwise) expansive mind, it is useful both to look backwards from today at the limits he was trying to overcome and forward to the limits we might, in delivering on his vision, overcome ourselves. During the 1950s and 1960s, the dominant paradigm of computing was large "mainframes" sold primarily by International Business Machines (IBM). These were expensive machines intended to serve the needs of an entire business, university department or other large grouping. To access these machines, users would have to bring programs to a central administrator and they would, infrequently, have a single "high risk" chance to run their desired computation. If it turned out to have a bug, as it often did, they would have to return later, having meticulously and without practical tests attempted to fix these errors. 
At the same time, because preparing programs and managing the machine was so challenging, much of their time was spent idling, waiting for programs to arrive. @@ -67,75 +67,77 @@ What amazing future could we simulate if we could more effectively share our com ### The state of sharing -Studies of the semiconductor industry indicate that several times as many semiconductors are used in personal devices (e.g. PCs, smartphones, smartwatches, video game consoles) as go into cloud infrastructure and data centers.[^Gartner] While there is little systematic study, personal experience indicates that most of these devices are mostly little used most of the day. this suggests that a majority if not a large majority of computation and storage lies fallow at any time, not even accounting for the prevalent waste even in cloud infrastructure. Data are even more extreme; while these are even harder to quantify, the experience of any data scientist suggests that the overwhelming majority of desperately needed data sits in organizational or jurisdictional silos, unable to power collaborative intelligence or the building of GFMs. +Studies of the semiconductor industry indicate that several times as many semiconductors are used in personal devices (e.g. PCs, smartphones, smartwatches, video game consoles) as go into cloud infrastructure and data centers.[^Gartner] While there is little systematic study, personal experience indicates that most of these devices are mostly little used most of the day. This suggests that a majority if not a large majority of computation and storage lies fallow at any time, not even accounting for the prevalent waste even in cloud infrastructure. Data are even more extreme; while these are even harder to quantify, the experience of any data scientist suggests that the overwhelming majority of desperately needed data sits in organizational or jurisdictional silos, unable to power collaborative intelligence or the building of GFMs. 
[^Gartner]: https://www.gartner.com/en/newsroom/press-releases/2023-12-04-gartner-forecasts-worldwide-semiconductor-revenue-to-grow-17-percent-in-2024 Asset sharing may have important implications for values such as national security and the environment. Waste of resources effectively reduces the supply of semiconductors that national security policies have aimed to maximize and, like any waste, increases the demand on environmental resources per unit of output. However, it is important to bear in mind that the sources of energy employed by distributed devices and their efficiency in converting this energy to computation may in some cases be lower than those of cloud providers, making it important to pair improvements to digital asset sharing with the greening of the consumer electrical grid. Perhaps the most important implication of digital asset sharing for security may be increased interdependence between participants in these sharing networks which may bring them into tighter geopolitical alignment, especially given the requisite alignments of privacy and collaboration regulations. -What is most shocking about these figures is perhaps their comparison to physical assets, which one would naturally assume should be harder to share and ensure full utilization of given the difficulty of transportation and physical redeployment. When unemployment rates for workers or vacancy rates for housing rise above single digits, political scandal usually ensues; such waste is omnipresent in the digital world. In short, rates of waste (effective under- and unemployment) of physical assets even close to these would be considered a global crisis. +What is most shocking about these figures is perhaps their comparison to physical assets, which one would naturally assume should be harder to share and ensure full utilization given the difficulty of transportation and physical redeployment. 
When unemployment rates for workers or vacancy rates for housing rise above single digits, political scandal usually ensues; such waste is omnipresent in the digital world. In short, rates of waste (effective under- and unemployment) of physical assets even close to these would be considered a global crisis. -The key reason why this silent crisis is a bit less surprising than the figures suggest is that these purely digital assets are comparatively new. Societies have had thousands if not tens of thousands of years to experiment with various social organizational systems to provide for the needs of the people's within them [^DawnOfEverything]; the origins of of our contemoporay systems of property (rental systems, capital managemnet), labor, and practices that involve the abstract representation of value [^MysteryOfCapital] (with deeds, documents issued to people, supply chain tranactions, money) can be traced to certain social-psycological qualities that arose after 1000 years of cultural practices that banned cousin marriages in Christian Europe and lead to the emergence of people who were free to form new institutions and re-constitute how property was held create new types of institutions democratic institutions that didn't exist before [^Henrich]. There have been decades to figure out how to efficiently rent cars and increasingly harness digital tools to improve the sharing of these assets (e.g. ride and house sharing platforms). Digital assets, especially those in the hands of large groups of non-technical people, date back only a few decades. A crucial task before us, then, is to determine the crucial social and technical barriers to utilizing digital assets with the same effectiveness we have come to expect of physical assets. +The key reason why this silent crisis is a bit less surprising than the figures suggest is that these purely digital assets are comparatively new. 
Societies have had thousands if not tens of thousands of years to experiment with various social organizational systems to provide for the needs of the people within them [^DawnOfEverything]. The origins of our contemporary systems of property (rental systems, capital management), labor, and practices that involve the abstract representation of value [^MysteryOfCapital] (with deeds, documents issued to people, supply chain transactions, money) can be traced to certain social-psychological qualities that arose after 1000 years of cultural practices. The ban on cousin marriages in Christian Europe led to the emergence of people who were free to form new institutions and re-constitute how property was held, which created new types of democratic institutions that didn't exist before [^Henrich]. There have been decades to figure out how to efficiently rent cars and increasingly harness digital tools to improve the sharing of these assets (e.g. ride and house sharing platforms). Digital assets, especially those in the hands of large groups of non-technical people, date back only a few decades. A crucial task before us, then, is to determine the crucial social and technical barriers to utilizing digital assets with the same effectiveness we have come to expect of physical assets. One way to consider what stands in the way of computational asset sharing is to consider the areas where it has been relatively successful and draw out the differences between these domains and those where it has thus far mostly failed. To do so, we will run through the three areas of focus above: storage, computation and data. -The closest framework to an open standard for asset sharing exists in storage, through the Interplanetary File System (IPFS) explicitly modeled on Lick's vision and pioneered by Juan Benet and his Protocol Labs (PL). 
This open protocol allows computers around the world to offer storage to each other peer-to-peer in a fragmented, encrypted and distributed manner that helps ensure redundancy, robustness and data secrecy/integrity for users at reasonable cost to those who provide storage to the system. Prominent services built on the protocol include the use by Taiwan's Ministry of Digital Affairs and other governments facing strong adversaries who may hold leverage over more centralized service providers to ensure the persistence of their data and the storage market created by PL's Filecoin system to allow commercial transactions for IPFS-based storage. +The closest framework to an open standard for asset sharing exists in storage, through the Interplanetary File System (IPFS) explicitly modeled on Lick's vision and pioneered by Juan Benet and his Protocol Labs (PL). This open protocol allows computers around the world to offer storage to each other at a reasonable cost in a peer-to-peer, fragmented, encrypted and distributed manner that helps ensure redundancy, robustness and data secrecy/integrity. Prominent services built on the protocol include the use by Taiwan's Ministry of Digital Affairs and other governments facing strong adversaries who may hold leverage over more centralized service providers. To ensure the persistence of their data and the storage market PL also created the Filecoin system to allow commercial transactions and incentivize users to store as much of the entire network’s data as they can. -Yet even IPFS has been a limited success for "real-time" storage, where files need to be stored so as to allow their rapid access from many places around the world. It thus seems to be the relative simplicity of "deep" storage (think of the equivalent of the "public storage" spaces provided as a commodity service in real life) that has allowed IPFS to survive. 
Even the slightly more complicated challenge of optimizing for latency has been handled overwhelmingly by large corporate "cloud" providers such as Microsoft, Amazon, Google and Salesforce. Most of the digital services familiar to consumers in the developed world (remote storage of personal files across devices, streaming of audio and video content, share documents, etc.) depend on these cloud providers. They are also at the core of most digital businesses today, with 60% of business data being stored in proprietary clouds and the top two proprietary cloud providers (Amazon and Microsoft) capturing almost two-thirds of the market.[^Cloud] +Yet even IPFS has been a limited success for "real-time" storage, where files need to be stored so as to allow their rapid access from many places around the world. It thus seems to be the relative simplicity of "deep" storage (think of the equivalent of the "public storage" spaces provided as a commodity service in real life) that has allowed IPFS to survive. + +The slightly more complicated challenge of optimizing for latency has been handled overwhelmingly by large corporate "cloud" providers such as Microsoft, Amazon, Google and Salesforce. Most of the digital services familiar to consumers in the developed world (remote storage of personal files across devices, streaming of audio and video content, shared documents, etc.) depend on these cloud providers. 
They are also at the core of most digital businesses today, with 60% of business data being stored in proprietary clouds and the top two proprietary cloud providers (Amazon and Microsoft) capturing almost two-thirds of the market.[^Cloud] [^Cloud]: https://explodingtopics.com/blog/cloud-computing-stats, https://www.statista.com/chart/18819/worldwide-market-share-of-leading-cloud-infrastructure-service-providers/ -Yet even beyond the drawbacks of this space being controlled by a small number of for-profit companies, these cloud systems have achieved in many was far less than their pioneers or visionaries like Lick imagined. +Yet even beyond the drawbacks of this space being controlled by a small number of for-profit companies, these cloud systems have achieved, in many ways, far less than the visionaries like Lick imagined. First, heralds of the "cloud era" such as the Microsoft team that helped persuade the company to pursue the opportunity saw many of the gains from the cloud arising from more efficient resource sharing across tenants and applications to ensure full utilization.[^EconoCloud] Yet, in practice, most of the gains from the cloud have come from physical cost savings of data centers co-located with abundant power sources and efficiently maintained, rather than from meaningful cross-tenant resource-sharing as few cloud providers have effectively facilitated this kind of market and few customers have found ways to make sharing resources work for them. -Relatedly but even more dramatically, as emphasized in our statistics above, the cloud has large been built bespoke, in new data centers around the world, even as most available computation and storage remains, severely underutilized, in the pockets and on the laps and desks of personal computer owners around the world. 
Furthermore, these computers are physically closer and often more tightly networked to the consumers of computational resources than are cloud data centers...and yet the "genius" of the cloud system has systematically wasted them. In short, despite its many successes, the cloud has to a large extent involved a reversion to an even more centralized version of the "mainframe" model that proceeded the timesharing work Lick helped support, rather than a realization of its ambitions. +Relatedly but even more dramatically, as emphasized in our statistics above, the cloud has largely been built in new data centers around the world, even as most available computation and storage remains, severely underutilized, in the pockets and on the laps and desks of personal computer owners around the world. Furthermore, these computers are physically closer and often more tightly networked to the consumers of computational resources than the bespoke cloud data centers...and yet the "genius" of the cloud system has systematically wasted them. In short, despite its many successes, the cloud has to a large extent involved a reversion to an even more centralized version of the "mainframe" model that preceded the time-sharing work Lick helped support, rather than a realization of its ambitions. -Yet even these limited success have been far more encompassing than what has been achieved in data sharing. The largest-scale uses of data today are either extremely siloed not just within corporate or institutional boundaries but even highly subdivided by privacy policies within these...or based on the ingestion of publicly available data online without even the awareness, much less consent, of the data creators. The leading example of the later is the still-undisclosed data sets on which the larges generative foundation models were trained. 
The movement to allow data sharing even for clear public interest cases, such as public health or the curing of diseases, has been held out for years under a variety of names and yet has made very little progress either in the private sector or in open standards-based collaborations. +Yet even these limited successes have been far more encouraging than what has been achieved in data sharing. The largest-scale uses of data today are either extremely siloed not just within corporate or institutional boundaries but even highly subdivided by privacy policies within these or, contrastingly, based on the ingestion of publicly available data online without even the awareness, much less consent, of the data creators. The leading example of the latter is the still-undisclosed data sets on which the generative foundation models (GFMs) were trained. The movement to allow data sharing even for clear public interest cases, such as public health or the curing of diseases, has been held out for years under a variety of names and yet has made very little progress either in the private sector or in open standards-based collaborations. This problem is widely recognized and the subject of a variety of campaigns around the world. Examples include the European Union's Gaia-X data federation infrastructure and their Data Governance Act, India's National Data Sharing and Accessibility Policy and Singapore's Data Sharing Agreements framework are just a few examples of attempts to overcome these challenges. ### Impediments to sharing -What lessons can we glean from these failures about the impediments to more effective sharing of digital assets? From the fact that data sharing has failed most spectacularly, and has struggled most with issues around data sharing, a natural hypothesis is that related issues may lie at the core of many of these problems. After all, related challenges recur in all these domains. 
Much of the structure of IPFS and the challenges it faces relate to maintaining data privacy while allowing storage far from the person or organization seeking to maintain this privacy. A central advantage of the cloud providers has been their reputation for maintaining the security and privacy of customer data while allowing those customers to share it across their devices and perform large-scale computations on it. +What lessons can we glean from these failures about the impediments to more effective sharing of digital assets? From the fact that data sharing has failed most spectacularly, and storage sharing has struggled most with issues around data sharing, a natural hypothesis is that related issues may lie at the core of many of these problems. After all, related challenges recur in all these domains. Much of the structure of IPFS and the challenges it faces relate to maintaining data privacy while allowing storage far from the person or organization seeking to maintain this privacy. A central advantage of the cloud providers has been their reputation for maintaining the security and privacy of customer data while allowing those customers to share it across their devices and perform large-scale computations on it. A basic contrast between data and many standard, real-world assets is important in understanding these challenges. Lending and pooling of assets is ubiquitous in the economy as we discussed above. Critical to it is the possibility of decomposing the rights one has to an asset. Legal scholars typically describe three attributes of property: "usus" (the right to use something), "abusus" (the right to alter or dispose of it) and "fructus" (the right to the value it creates). A standard rental contract, for example, transfers to the renter the usus rights, while retaining abusus and fructus for the landlord. 
A corporation grants usus of many assets to employees, abusus only to senior managers and often only with checks and balances and reserves fructus for shareholders. -Achieving this crucial separation is different and arguably more challenging for data. The simplest ways of giving access to usus of data also allow the person granted access the ability to abuse or transfer the data to others (abusus) and the ability for others to gain financial benefit from those data, possibly at the expense of the person sharing it (fructus). Many who chose to publish data online that has now been incorporated into GFMs believed they were sharing information for others to use, but they did not perceive the full implications that sharing would have. Of course norms, laws and cryptography could all potentially play a role in correcting this situation, and we turn to these short, but at present these are all underdeveloped relative to expectations in, for example, corporate governance or housing rentals, impairing the ability of data sharing to thrive. +Achieving this crucial separation is different and arguably more challenging for data. The simplest ways of giving access to usus of data also allow the person granted access the ability to abuse or transfer the data to others (abusus) and the ability for others to gain financial benefit from those data (fructus), possibly at the expense of the person sharing it. Many who chose to publish data online that has now been incorporated into GFMs believed they were sharing information for others to use, but they did not perceive the full implications that sharing would have. Of course norms, laws and cryptography could all potentially play a role in correcting this situation, and we turn to these shortly. At present these are all underdeveloped relative to expectations in, for example, corporate governance or housing rentals, impairing the ability of data sharing to thrive. 
-To make matters more complicated, settling on such a set of standards is, for the reasons we highlighted in the Association chapter, challenged by the other key property of data: that interests in it are rarely if ever usefully understood as mostly individual rights. Data are inherently associational, social and intersectional, making many of the simplest "quick fixes" for this problem (in terms of privacy regulations and cryptrography) so misfitting that they impede progress more than they facilitate it. +To make matters more complicated, settling on such a set of standards is, for the reasons we highlighted in the [Association and ⿻ Publics 4-2 chapter](https://www.plurality.net/v/chapters/4-2/eng/), challenged by the other key property of data: that interests in it are rarely if ever usefully understood as mostly individual rights. Data are inherently associational, social and intersectional, making many of the simplest "quick fixes" for this problem (in terms of privacy regulations and cryptography) so misfitting that they impede progress more than they facilitate it. -Furthermore, even if there were a clear set of solutions to these challenges, there is no straightforward way to implement them directly. In the most simplistic understanding of contracts is that they are commitments between parties described and mutually agreed to in a document and the freedom of contract simply requires these be enforced. The reality is much richer: it is impossible to specify in a contract how to resolve many conflicts that may arise and no one could read and process such a detailed document. Most contractual arrangements are therefore governed primarily by customary expectations, legal precedent, statutes that are consistent with these, etc. In many contexts, contractual provisions that conflict with these evolved principles will not be enforced. 
These norms and legal structures have co-evolved over decades and even centuries to govern canonical relationships like rental and employment, minimizing the role formal, court-based contractual provisions and enforcement have to play. While self-enforcing digital "smart contracts" might thus provide a way to implement such norms smoothly, they cannot substitute for the process of creating a stable social understanding of how data collaboration work, what different parties can expect and when various legal and technical enforcement mechanisms should and will kick in. +Furthermore, even if there were a clear set of solutions to these challenges, there is no straightforward way to implement them directly. The most simplistic understanding of contracts is that they are commitments between parties described and mutually agreed to in a document. The freedom of contract simply requires these be enforced. The reality is much richer: it is impossible to specify in a contract how to resolve many conflicts that may arise and no one could read and process such a detailed document if it were. Most contractual arrangements are therefore governed primarily by customary expectations, legal precedent, statutes that are consistent with these, etc. In many contexts, contractual provisions that conflict with these evolved principles will not be enforced. These norms and legal structures have co-evolved over decades and even centuries to govern canonical relationships like rental and employment, minimizing the role that formal court-based contractual provisions and enforcement have to play. While self-enforcing digital "smart contracts" might thus provide a way to implement such norms smoothly, they cannot substitute for the process of creating a stable social understanding of how data collaboration works, what different parties can expect and when various legal and technical enforcement mechanisms should and will kick in. 
-In fact, contracts are "incomplete" -- they don't specify all the details about the arrangements they govern -- in two ways. First, the contract terms themselves are often susceptible to multiple interpretations, so that a third party like a judge needs to clarify them in the event of a dispute. But second, and even more importantly, they often say nothing about many relevant aspects of the relationship between the parties. For example an employment contract may specify certain cases where they employee may be fired; and it may specify certain duties the employee is to complete. But these bright lines don't ever paint the full picture between an employee and employer. In fact, the employer typically wants the employee to work really hard and think really creatively about how to serve the company -- not just perform the tasks that are specifically defined in detail and have clear instructions and goals. And the employee wants to be treated respectfully, given guidance, and generally invited into a healthy and flourishing professional situation -- not just "not be fired". Contracts typically have nothing to say about these subtle things. +In fact, contracts are "incomplete" -- they don't specify all the details about the arrangements they govern -- in two ways. First, the contract terms themselves are often susceptible to multiple interpretations, requiring a third party like a judge to clarify them in the event of a dispute. But second, and even more importantly, they often say nothing about many relevant aspects of the relationship between the parties. For example, an employment contract may specify certain cases where the employee may be fired; and it may specify certain duties the employee is to complete. But these bright lines don't ever paint the full picture between an employee and employer. 
In fact, the employer typically wants the employee to work really hard and think really creatively about how to serve the company -- not just perform the tasks that are specifically defined in detail and have clear instructions and goals. And the employee wants to be treated respectfully, given guidance, and generally invited into a healthy and flourishing professional situation -- not just "not be fired". Contracts typically have nothing to say about these subtle things. -These "gaps" in what contracts govern are not errors in the system -- they are an unavoidable aspect of how contracts govern rich human arrangements. However, whether they are good or bad depends in large part on the factors discussed in the "Association" chapter. To oversimply just slightly: when contracting parties have a healthy associative relationship, the "gaps" in contract allow for mutual flourishing: they create space for the parties to cooperate flexibly, in an evolving, organic relationship in which both sides are thinking about the welfare of the other. By contrast, when contracting parties lack this kind of associative relationship, the gaps become zones in which the less powerful (or less ruthless) party is often exploited. For example, as the work of economist Samuel Bowles and others has indicated, when employers have the upper hand in the labor market, they do not need to pay more to extract more work from employees. Employers can exploit the "incompleteness" of contract by simply demanding more, if employees are afraid of losing their job. Conversely, where employers are afraid of people quitting, employees can exploit contractual ambiguities by shirking work and pushing up against the limits of acceptable behavior. Whether "incomplete" contracts create spaces for this kind of power exploitation, or space for better and more flexible collaboration, depends fundamentally on whether the parties understand each other as being in a healthy association. 
+These "gaps" in what contracts govern are not errors in the system -- they are an unavoidable aspect of how contracts govern rich human arrangements. However, whether they are good or bad depends in large part on the factors discussed in the "Association" chapter. To oversimplify just slightly: when contracting parties have a healthy associative relationship, the "gaps" in contract allow for mutual flourishing: they create space for the parties to cooperate flexibly, in an evolving, organic relationship in which both sides are thinking about the welfare of the other. By contrast, when contracting parties lack this kind of associative relationship, the gaps become zones in which the less powerful (or less ruthless) party is often exploited. For example, as the work of economist Samuel Bowles and others has indicated, when employers have the upper hand in the labor market, they do not need to pay more to extract more work from employees. Employers can exploit the "incompleteness" of contracts by simply demanding more, if employees are afraid of losing their job. Conversely, where employers are afraid of people quitting, employees can exploit contractual ambiguities by shirking work and pushing up against the limits of acceptable behavior. Whether "incomplete" contracts create spaces for this kind of power exploitation, or space for better and more flexible collaboration, depends fundamentally on whether the parties understand each other as being in a healthy association. -Challenges of this sort surround efforts to build infrastructure for sharing digital assets like data. The basic problem is that information has a near-infinity of possible uses, meaning that heavily "contractualist" approaches that seek to define exactly how parties may use information run into unmanageable complexity. 
Such contracts' zones of "incompleteness" are overwhelmingly vast because it isn't possible even to imagine, let alone catalogue and negotiate over all the possible future uses of information like genetics or geolocation. That means that the most promising possible benefits of data sharing -- which involve taking advantage of new technical affordance to convey information to distant parties all around the world -- are also the most dangerous and ungovernable. The potential market is therefore paralyzed. If we cannot address these problems with conventional contract, our ideal spheres of information sharing will end up matching the shape of our Associations -- meaning we need better maps of our associative connections, and, as discussed elsewhere, better assurances against information leakage even from trusted communities. +Challenges of this sort surround efforts to build infrastructure for sharing digital assets like data. The basic problem is that information has a near-infinity of possible uses, meaning that heavily "contractualist" approaches that seek to define exactly how parties may use information run into unmanageable complexity. Such contracts' zones of "incompleteness" are overwhelmingly vast because it isn't possible even to imagine, let alone catalog and negotiate over all the possible future uses of information like genetics or geolocation. That means that the most promising possible benefits of data sharing -- which involve taking advantage of new technical affordances to convey information to distant parties all around the world -- are also the most dangerous and ungovernable. The potential market is therefore paralyzed. If we cannot address these problems with conventional contracts, our ideal spheres of information sharing will end up matching the shape of our Associations -- meaning we need better maps of our associative connections, and, as discussed elsewhere, better assurances against information leakage even from trusted communities.
Of course, these are far from the only problems besetting digital asset sharing... -But the challenges created by the lack of clear and meaningful standards (both legal and technical) for protecting data while it is shared spill out into almost every aspect of scalable digital cooperation. While no deductive analysis can substitute for the social experimentation and evolution that will be need ted to reach such standards, we can highlight some of the components and efforts that seem likely to address the central tensions above and thus should become important to social exploration if we are going to get past the current barriers to digital asset sharing. +But the challenges created by the lack of clear and meaningful standards (both legal and technical) for protecting data while it is shared spill out into almost every aspect of scalable digital cooperation. While no deductive analysis can substitute for the social experimentation and evolution that will be needed to reach such standards, we can highlight some of the components and efforts that seem likely to address the central tensions above and thus should become important to social exploration if we are going to get past the current barriers to digital asset sharing. ### ⿻ property -The first and simplest issue to address is standards for performance and security for computational asset sharing. When users store their data or entrust a computation to others, they need assurances that their data will not be compromised by a third party and that the computation will be performed according to their expectations, that their data will be retrievable by themselves or their customers with an expected distribution of latency by people in various places etc. 
Currently these sort of guarantees are central to the value propositions of the could providers, but because there are no standards that can easily be met by a broad set of individuals and organizations offer computational services, these powerful companies dominate the market. An analogous example is the introduction of "https", which allowed a range of web hosting services to meet security criteria that give web content consumers confidence that they can access data from that website without being maliciously surveiled. Such standards could naturally be paired with standardized formats for searching, requesting and matching on additional performance and security features. +The first and simplest issue to address is standards for performance and security for computational asset sharing. When users store their data or entrust a computation to others, they need assurances that their data will not be compromised by a third party, that the computation will be performed according to their expectations, and that their data will be retrievable by themselves or their customers in various places with an expected distribution of latency. Currently these sorts of guarantees are central to the value propositions of the cloud providers. Because there are no standards that can easily be met by a broad set of individuals and organizations offering computational services, these powerful companies dominate the market. An analogous example is the introduction of "https", which allowed a range of web hosting services to meet security criteria that give web content consumers confidence that they can access data from that website without being maliciously surveilled. Such standards could naturally be paired with standardized formats for searching, requesting and matching on additional performance and security features.
However, as noted above, the thorniest questions pertain not to performance or third-party attacks, but to the problems at the heart of data collaboration: what should a collaborator Party B with whom Party A shares data or other digital assets learn about Party A's data? While this obviously has no single right answer, setting parameters and expectations in ways that allow participants to benefit from collaboration without frequently undermining their critical interests or those of other people affected by this collaboration is central to making data collaboration feasible and sustainable. Luckily, a number of tools are becoming available that will help provide the technical scaffolding for such relationships. -While we have discussed in the Association chapter, it is worth recalling their relevance here. Secure mutliparty computation (SMPC) and homomorphic encryption allow multiple parties to perform a computation together and create a collective output without each revealing to the others the inputs. While the simplest illustrative examples include calculating an average salary or tallying votes in an election, far more sophisticated possibilities are increasingly within reach, such as training or fine-tuning a GFM. These more ambitious applications have helped create the field of "federated learning" and "data federation", which allow the computations necessary for one of these ambitious applications to be performed locally on a distributed network of personal or organizational computers with the inputs to the model being passed back and forth securely without the underlying training data ever leaving the machine or servers of the respective parties to the communication. In collaboration with open source providers of these tools such as OpenMined, international organizations like the United Nations have increasingly built experimental showcase platforms for data collaboration harnessing these tools. 
An alternative to this distributed approach is to used specialized "confidential computers" that can be verified to perform particular calculations but give no one access to their intermediate outputs. Because these machines are expensive and produced by only a limited range of companies, however, these lend themselves more to control by a trusted central entity than diffuse collaboration. +While we have discussed these tools in the Association chapter, it is worth recalling their relevance here. Secure multiparty computation (SMPC) and homomorphic encryption allow multiple parties to perform a computation together and create a collective output without any party revealing its inputs to the others. While the simplest illustrative examples include calculating an average salary or tallying votes in an election, far more sophisticated possibilities are increasingly within reach, such as training or fine-tuning a GFM. These more ambitious applications have helped create the field of "federated learning" and "data federation", which allow the computations necessary for one of these ambitious applications to be performed locally on a distributed network of personal or organizational computers with the inputs to the model being passed back and forth securely without the underlying training data ever leaving the machine or servers of the respective parties to the communication. In collaboration with open source providers of these tools such as OpenMined, international organizations like the United Nations have increasingly built experimental showcase platforms for data collaboration harnessing these tools. An alternative to this distributed approach is to use specialized "confidential computers" that can be verified to perform particular calculations but give no one access to their intermediate outputs.
Because these machines are expensive and produced by only a limited range of companies, however, these lend themselves more to control by a trusted central entity than diffuse collaboration. -      While these approaches can help achieve a collaboration without unnecessary information being conveyed across collaborators, other tools are needed to address the information contained in the desired outputs (e.g. statistics or models) created by the collaboration. Models may both leak input information (e.g. a model reproduces intimate details of the medical history of a particular person) or may, conversely, obscure the source of information (e.g. reproduce input creative text without attribution, in violation of a license). Both are significant impediments to data collaboration, as collaborators will typically want agency over the use of their data. +While these approaches can help achieve a collaboration without unnecessary information being conveyed across collaborators, other tools are needed to address the information contained in the desired outputs (e.g. statistics or models) created by the collaboration. Models may both leak input information (e.g. a model reproduces intimate details of the medical history of a particular person) or may, conversely, obscure the source of information (e.g. reproduce input creative text without attribution, in violation of a license). Both are significant impediments to data collaboration, as collaborators will typically want agency over the use of their data. -      Tools to address these challenges are more statistical than cryptographic. Differential privacy limits the degree to which input data can be guessed from a collection of output data, using a "privacy budget" to ensure that together disclosures do not reliably reveal inputs. Watermarking can create "signatures" in content showing its origin in ways that are hard to erase, ignore or even in some cases detect. 
"Influence functions" trace the role a particular collection of data play in producing the output of a model, allowing at least partial attribution of the output of an otherwise "black box" model. +Tools to address these challenges are more statistical than cryptographic. Differential privacy limits the degree to which input data can be guessed from a collection of output data, using a "privacy budget" to ensure that together disclosures do not reliably reveal inputs. Watermarking can create "signatures" in content showing its origin in ways that are hard to erase, ignore or even in some cases detect. "Influence functions" trace the role a particular collection of data plays in producing the output of a model, allowing at least partial attribution of the output of an otherwise "black box" model. -      All these techniques have fallen somewhat behind the speed, scale and power of the development of GFMs. For example, differential privacy focuses mostly on the literal statistical recoverability of facts, where as GFMs are often capable of performing "reasoning" as a detective would, inferring for example someone's first school from a constellation of only loosely related facts about later schools, friendships, etc. Similarly, there has been little progress tracing attribution of provenance and value through neural networks. Harnessing the capacity of these models to tackle these technical challenges and deriving technical standard definitions of data protection and attribution, especially as models further progress, will be central to making data collaboration sustainable. +All these techniques have fallen somewhat behind the speed, scale and power of the development of GFMs. 
For example, differential privacy focuses mostly on the literal statistical recoverability of facts, whereas GFMs are often capable of performing "reasoning" as a detective would, inferring for example someone's first school from a constellation of only loosely related facts about later schools, friendships, etc. Similarly, there has been little progress tracing attribution of provenance and value through neural networks. Harnessing the capacity of these models to tackle these technical challenges and deriving technical standard definitions of data protection and attribution, especially as models further progress, will be central to making data collaboration sustainable. Yet many of the challenges to data collaboration are more organizational and social than purely technical. As we noted earlier, interests in data are rarely individual as almost all data are relational. Even beyond this most fundamental point, there are many reasons why organizing data rights and control at the individual level is impractical including: - Social leakage: Even when data do not directly arise from a social interaction, they almost always have social implications. For example, because of the shared genetic structure of relatives, something like a 1% statistical sample of a population allows the identification of any individual from their genetic profile, making the preservation of genetic privacy a profoundly social undertaking. 
@@ -145,7 +147,7 @@ Yet many of the challenges to data collaboration are more organizational and soc Organizations capable of taking on this role of collectively representing the rights and interests of "data subjects"[^whoownsthefuture] have been given a variety of names: data trusts,[^datatrust] collaboratives,[^datacollab] cooperatives,[^datacoops] or, in a whimsical turn of phrase one of the authors suggested, "mediators of individual data" (MIDs).[^dataaslabor] Some of these could quite naturally follow the lines of existing organizations: for example, unions for creative workers representing their content, or Wikipedia representing the collective interest of its volunteer editors and contributors. Others may require new forms of organization: contributors of open source code that is being used to train code-generation models, authors of fan fiction and writers of Reddit pages may need to organize their own forms of collective representation. -Beyond these formal technologies, organizations and standards, broader and more diffuse concepts, expectations and norms will have to develop so to ensure broad understanding of what is at stake in data collaborations, so that contributors feel empowered to strike fair agreements and hold their collaborators accountable. Given the pace of technological change and adaptation in what data collaborations will thus become, these norms will both have to become pervasive and reasonably stable *and* dynamic and adaptive. Achieving this will require practices of education and cultural engagement that keep pace with technical change, as we discuss in the following chapters. +Beyond these formal technologies, organizations and standards, broader and more diffuse concepts, expectations and norms will have to develop so as to ensure broad understanding of what is at stake in data collaborations, so that contributors feel empowered to strike fair agreements and hold their collaborators accountable.
Given the pace of technological change and adaptation in what data collaborations will thus become, these norms will both have to become pervasive and reasonably stable *and* dynamic and adaptive. Achieving this will require practices of education and cultural engagement that keep pace with technical change, as we discuss in the following chapters. Once they develop and spread sufficiently, data collaboration tools, organizations and practices may become sufficiently familiar to be encoded in common sense and legal practice as deeply as "property rights" are, though as we noted they will almost certainly have to take a different form than the standard patterns governing private ownership of land or the organization of a joint-stock corporation. They will, as we noted, need to include many more technical and cryptographic elements, different kinds of social organizations with a greater emphasis on collective governance and fiduciary duties and norms or laws protecting against unilateral disclosure by a member of a MID (analogous to prohibitions against unilateral strike-breaking against unions). These may form into a future version of "property" for the digital world, but one much more attuned to the ⿻ character of data. @@ -161,7 +163,7 @@ We take one step away from the purely digital asset world, and look to two examp Traditionally, entitlements to broadcast on a particular electromagnetic frequency in a particular geographic range have (in many countries including the United States) been assigned or auctioned to operators with licenses being renewed at low cost. This has effectively created a private property-like entitlement based on the idea that users of frequencies will interfere if many are allowed to operate on the same band in the same place and that licensees will steward the band if they have property rights over it.
These assumptions have been tested to the breaking point recently, however, as many digital applications (such as WiFi) can share spectrum and the rapidly changing nature of uses for spectrum (e.g. moving from over-the-air broadcasting to 5G wireless) has dramatically changed interference patterns, requiring reorganization of the spectrum against which legacy license holders can often serve as holdouts. This in turn has led to significant changes to the property system, allowing licensing agencies like America's Federal Communications Commission to relocate holdouts in auctions, and proposals by leaders in the space for even more radical designs that would mix elements of rental and ownership as we discuss in our "Markets" chapter below or leave spectrum unlicensed for specified shared uses. -The evolution of property in name spaces has been even more radical. Traditionally the Internet Corporation for Assigned Names and Numbers (ICANN) allowed registration of domain names at relatively low cost with nominal fees for renewal, similar to the property-like licensing regime for spectrum. While this system has evolved, the more fundamental change has been that today, most people reach website through search engines rather than direct navigation. These engines usually list sites associated with a given name based on a variety of (mostly not publicly disclosed) signals of their relevance to users as well as including some paid advertisements that are auctioned in real-time. While relevance algorithms are something of a black box, a reasonable first mental model for them is the original "PageRank" algorithm of Google founders Sergey Brin and Larry Page, which ranked pages based on their "network centrality", a notion related to the network-based voting systems we will discuss in our chapter on "Voting" below. 
Thus, to a first blush, we can think of the *de facto* property regime of internet name spaces today as being a combination of collective direction towards the interest of browsers (rather than domain owners) combined with a real-time auction for domain owners. Both are a far cry from traditional property systems. +The evolution of property in name spaces has been even more radical. Traditionally the Internet Corporation for Assigned Names and Numbers (ICANN) allowed registration of domain names at relatively low cost with nominal fees for renewal, similar to the property-like licensing regime for spectrum. While this system has evolved, the more fundamental change has been that today, most people reach websites through search engines rather than direct navigation. These engines usually list sites associated with a given name based on a variety of (mostly not publicly disclosed) signals of their relevance to users as well as including some paid advertisements that are auctioned in real-time. While relevance algorithms are something of a black box, a reasonable first mental model for them is the original "PageRank" algorithm of Google founders Sergey Brin and Larry Page, which ranked pages based on their "network centrality", a notion related to the network-based voting systems we will discuss in our chapter on "Voting" below. Thus, at first blush, we can think of the *de facto* property regime of internet name spaces today as a combination of collective direction towards the interest of browsers (rather than domain owners) and a real-time auction for domain owners. Both are a far cry from traditional property systems. This is not, of course, to suggest any of this is ideal and certainly not socially legitimate. These systems have been largely designed far from the public eye, without public understanding, by teams of technocratic engineers and economists. Few even recognize that they operate, much less believe they are appropriate.
On the other hand, they respond to real challenges in creative ways, and the issues they address stretch well beyond the narrow domains to which they have been applied thus far. Addressing holdout problems and spectrum sharing are central to allowing digital development that is broadly demanded by the public and viewed as central to even issues of national security. Similar holdout issues pervade the redevelopment of urban spaces and the building of common infrastructure, and much land currently held as private property could be made into shared spaces like parks (or vice-versa).