Commit

more wordsmithing

llpeterson committed Oct 20, 2021
1 parent f4388c6 commit df0ae01
Showing 3 changed files with 79 additions and 66 deletions.
33 changes: 16 additions & 17 deletions arch.rst
@@ -42,8 +42,8 @@ Platform-as-a-Service (PaaS).

Aether supports this combination by implementing both the RAN and the
user plane of the Mobile Core on-prem, as cloud-native workloads
co-located on the Aether cluster. This is often referred to as local
breakout because it enables direct communication between mobile
co-located on the Aether cluster. This is often referred to as *local
breakout* because it enables direct communication between mobile
devices and edge applications without data traffic leaving the
enterprise. This scenario is depicted in :numref:`Figure %s
<fig-hybrid>`, which does not name the edge applications, but
@@ -62,7 +62,7 @@ example.

The approach includes both edge (on-prem) and centralized (off-prem)
components. This is true for edge apps, which often have a centralized
counterpart running in a commodity cloud. It is also true for the
counterpart running in a commodity cloud. It is also true for the 5G
Mobile Core, where the on-prem User Plane (UP) is paired with a
centralized Control Plane (CP). The central cloud shown in this figure
might be private (i.e., operated by the enterprise), public (i.e.,
@@ -72,9 +72,9 @@ cloud). Also shown in :numref:`Figure %s <fig-hybrid>` is a
centralized *Control and Management Platform*. This represents all the
functionality needed to offer Aether as a managed service, with system
administrators using a portal exported by this platform to operate the
underlying infrastructure and services. The rest of this book is about
everything that goes into implementing that *Control and Management
Platform*.
underlying infrastructure and services within their enterprise. The
rest of this book is about everything that goes into implementing that
*Control and Management Platform*.

2.1 Edge Cloud
--------------
@@ -112,8 +112,8 @@ the SD-Fabric), are deployed as a set of microservices, but details
about the functionality implemented by these containers are otherwise
not critical to this discussion. For our purposes, they are
representative of any cloud native workload. (The interested reader is
referred to our 5G and SDN books for more information about the
internal working of SD-RAN, SD-Core, and SD-Fabric.)
referred to our companion 5G and SDN books for more information about
the internal working of SD-RAN, SD-Core, and SD-Fabric.)

.. _reading_5g:
.. admonition:: Further Reading
@@ -151,8 +151,8 @@ Platform (AMP).
Each SD-Core CP controls one or more SD-Core UPs, as specified by
3GPP, the standards organization responsible for 5G. Exactly how CP
instances (running centrally) are paired with UP instances (running at
the edges) is a configuration-time decision, and depends on the degree
of isolation the enterprise sites require. AMP is responsible for
the edges) is a runtime decision, and depends on the degree of
isolation the enterprise sites require. AMP is responsible for
managing all the centralized and edge subsystems (as introduced in the
next section).

@@ -173,12 +173,12 @@ we started with in :numref:`Figure %s <fig-hw>` of Chapter 1).\ [#]_
This is because, while each ACE site usually corresponds to a physical
cluster built out of bare-metal components, each of the SD-Core CP
subsystems shown in :numref:`Figure %s <fig-aether>` is actually
deployed as a logical Kubernetes cluster on a commodity cloud. The
deployed in a logical Kubernetes cluster on a commodity cloud. The
same is true for AMP. Aether’s centralized components are able to run
in Google Cloud Platform, Microsoft Azure, and Amazon’s AWS. They also
run as an emulated cluster implemented by a system like
KIND—Kubernetes in Docker—making it possible for developers to run
these components on a laptop.
these components on their laptop.

.. [#] Confusingly, Kubernetes adopts generic terminology, such as
“cluster” and “service”, and gives it very specific meaning. In
@@ -190,8 +190,7 @@ these components on a laptop.
potentially thousands of such logical clusters. And as we'll
see in a later chapter, even an ACE edge site sometimes hosts
more than one Kubernetes cluster (e.g., one running production
services and one used for development and testing of new
services).
services and one used for trial deployments of new services).
2.3 Control and Management
--------------------------
@@ -304,7 +303,7 @@ both physical and virtual resources.
2.3.2 Lifecycle Management
~~~~~~~~~~~~~~~~~~~~~~~~~~

Lifecycle Management is the process of integrating fixed, extended,
Lifecycle Management is the process of integrating debugged, extended,
and refactored components (often microservices) into a set of
artifacts (e.g., Docker containers and Helm charts), and subsequently
deploying those artifacts to the operational cloud. It includes a
@@ -368,7 +367,7 @@ the cloud offers to end users. Thus, we can generalize the figure so
Runtime Control mediates access to any of the underlying microservices
(or collections of microservices) the cloud designer wishes to make
publicly accessible, including the rest of AMP! In effect, Runtime
Control implements an abstraction layer, codified with programmatic
Control implements an abstraction layer, codified with a programmatic
API.

Given this mediation role, Runtime Control provides mechanisms to
@@ -434,7 +433,7 @@ operators a way to both read (monitor) and write (control) various
parameters of a running system. Connecting those two subsystems is how
we build closed loop control.
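
To make the read (monitor) and write (control) pairing concrete, here is a minimal closed-loop sketch in Python. The metric name, target value, and proportional gain are invented for illustration and are not part of Aether.

```python
# A minimal closed-loop sketch: monitoring exposes a read path, runtime
# control a write path, and connecting the two closes the loop. The
# metric ("load"), target, and gain are invented for illustration.

def control_loop(read_metric, write_param, target, gain=0.5):
    """One iteration: nudge the controlled parameter toward the target."""
    error = target - read_metric()
    adjustment = gain * error
    write_param(adjustment)
    return adjustment

state = {"load": 80.0}
adjust = control_loop(
    read_metric=lambda: state["load"],
    write_param=lambda delta: state.update(load=state["load"] + delta),
    target=60.0,
)
print(state["load"])  # 70.0: moved halfway toward the target
```

In a real deployment, the read path would presumably be a query against the monitoring subsystem and the write path a call through Runtime Control's API; the loop structure is the same.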

A third example is even more ambiguous. Lifecycle management usually
A third example is even more nebulous. Lifecycle management usually
takes responsibility for *configuring* each component, while runtime
control takes responsibility for *controlling* each component. Where
you draw the line between configuration and control is somewhat
82 changes: 48 additions & 34 deletions intro.rst
@@ -72,6 +72,12 @@ perspective on the problem. We return to the confluence of enterprise,
cloud, and access technologies later in this chapter, but we start by
addressing the terminology challenge.

.. _reading_aether:
.. admonition:: Further Reading

`Aether: 5G-Connected Edge Cloud
<https://opennetworking.org/aether/>`__.

1.1 Terminology
---------------

@@ -107,7 +113,7 @@ terminology.
* **OSS/BSS:** Another Telco acronym (Operations Support System,
Business Support System), referring to the subsystem that
implements both operational logic (OSS) and business logic
(BSS). Usually the top-most component in the overall O&M
(BSS). It is usually the top-most component in the overall O&M
hierarchy.

* **EMS:** Yet another Telco acronym (Element Management System),
@@ -164,34 +170,34 @@ terminology.
* **Continuous Integration / Continuous Deployment (CI/CD):** An
approach to Lifecycle Management in which the path from
development (producing new functionality) to testing, integration,
and ultimately deployment is an automated pipeline. Typically
implies continuously making small incremental changes rather than
performing large disruptive upgrades.
and ultimately deployment is an automated pipeline. CI/CD
typically implies continuously making small incremental changes
rather than performing large disruptive upgrades.

* **DevOps:** An engineering discipline (usually implied by CI/CD)
that balances feature velocity against system stability. It is a
practice typically associated with container-based (also known as
*cloud native*) systems, and typified by *Site Reliability
*cloud native*) systems, as typified by *Site Reliability
Engineering (SRE)* practiced by cloud providers like Google.

* **In-Service Software Upgrade (ISSU):** A requirement that a
component continue running during the deployment of an upgrade,
with minimal disruption to the service delivered to
end-users. Generally implies the ability to incrementally roll-out
(and roll-back) an upgrade, but is specifically a requirement on
individual components (as opposed to the underlying platform used
to manage a set of components).
end-users. ISSU generally implies the ability to incrementally
roll-out (and roll-back) an upgrade, but is specifically a
requirement on individual components (as opposed to the underlying
platform used to manage a set of components).

* **Monitoring & Logging:** Collecting data from system components to aid
in management decisions. This includes diagnosing faults, tuning
performance, doing root cause analysis, performing security audits,
and provisioning additional capacity.

* **Analytics:** A program (often using statistical models) that
produces additional insights (value) from raw data. Can be used to
close a control loop (i.e., auto-reconfigure a system based on
produces additional insights (value) from raw data. It can be used
to close a control loop (i.e., auto-reconfigure a system based on
these insights), but could also be targeted at a human operator
(that subsequently takes some action).
that subsequently takes some action.

Another way to talk about operations is in terms of stages, leading to
a characterization that is common for traditional network devices:
@@ -301,9 +307,9 @@ manageable:
majority of configuration involves initializing software parameters,
which is more readily automated.

* Cloud native implies a set best-practices for addressing many of the
FCAPS requirements, especially as they relate to availability and
performance, both of which are achieved through horizontal
* Cloud native implies a set of best-practices for addressing many of
the FCAPS requirements, especially as they relate to availability
and performance, both of which are achieved through horizontal
scaling. Secure communication is also typically built into cloud RPC
mechanisms.

@@ -319,17 +325,19 @@ monitoring data in a uniform way, and (d) continually integrating and
deploying individual microservices as they evolve over time.

Finally, because a cloud is infinitely programmable, the system being
managed has the potential to change substantially over time.\ [#]_ This
means that the cloud management system must itself be easily extended
to support new features (as well as the refactoring of existing
features). This is accomplished in part by implementing the cloud
management system as a cloud service, but it also points to taking
advantage of declarative specifications of how all the disaggregated
pieces fit together. These specifications can then be used to generate
elements of the management system, rather than having to manually
recode them. This is a subtle issue we will return to in later
chapters, but ultimately, we want to be able to auto-configure the
subsystem responsible for auto-configuring the rest of the system.
managed has the potential to change substantially over time.\ [#]_
This means that the cloud management system must itself be easily
extended to support new features (as well as the refactoring of
existing features). This is accomplished in part by implementing the
cloud management system as a cloud service, which means we will see a
fair amount of recursive dependencies throughout this book. It also
points to taking advantage of declarative specifications of how all
the disaggregated pieces fit together. These specifications can then
be used to generate elements of the management system, rather than
having to manually recode them. This is a subtle issue we will return
to in later chapters, but ultimately, we want to be able to
auto-configure the subsystem responsible for auto-configuring the rest
of the system.
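
The idea of generating elements of the management system from a declarative specification, rather than hand-coding them, can be sketched as follows. This is a toy illustration of the approach, not Aether's actual mechanism; the spec layout and the component and parameter names are invented.

```python
# Toy sketch of generating management-system elements from a declarative
# specification, rather than hand-coding them. The spec layout and the
# component/parameter names are invented for this example.

SPEC = {
    "components": [
        {"name": "sd-core-up", "params": ["mtu", "qos-profile"]},
        {"name": "sd-ran", "params": ["handover-threshold"]},
    ]
}

def generate_config_schema(spec):
    """Flatten the spec into component/parameter paths the management
    system can expose, so adding a component needs no new code."""
    return [
        f"{component['name']}/{param}"
        for component in spec["components"]
        for param in component["params"]
    ]

print(generate_config_schema(SPEC))
# ['sd-core-up/mtu', 'sd-core-up/qos-profile', 'sd-ran/handover-threshold']
```

The point of the exercise: when a new component is added to the spec, the management surface for it is derived automatically instead of being recoded by hand.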

.. [#] For example, compare the two services Amazon offered ten years
ago (EC2 and S3) with the well over 100 services available on
@@ -371,13 +379,19 @@ identifies the technology we assume.
~~~~~~~~~~~~~~~~~~~~~~~

The assumed hardware building blocks are straightforward. We start
with bare-metal servers and switches, built using merchant
silicon. These might, for example, be ARM or x86 processor chips and
with bare-metal servers and switches, built using merchant silicon
chips. These might, for example, be ARM or x86 processor chips and
Tomahawk or Tofino switching chips, respectively. The bare-metal boxes
also include a bootstrap mechanism (e.g., BIOS for servers and ONIE
for switches), and a remote device management interface (e.g., IPMI or
Redfish).

.. _reading_redfish:
.. admonition:: Further Reading

Distributed Management Task Force (DMTF) `Redfish
<https://www.dmtf.org/standards/redfish>`__.

A physical cloud cluster is then constructed with the hardware
building blocks arranged as shown in :numref:`Figure %s <fig-hw>`: one
or more racks of servers connected by a leaf-spine switching
@@ -397,11 +411,11 @@ that software running on the servers controls the switches.
software components, which we describe next. Collectively, all the
hardware and software components shown in the figure form the
*platform*. Where we draw the line between what's *in the platform*
and what runs *on top of the platform* will become clear in later
chapters, but the summary is that different mechanisms will be
responsible for (a) bringing up the platform and prepping it to host
workloads, and (b) managing the various workloads that need to be
deployed on that platform.
and what runs *on top of the platform*, and why it is important, will
become clear in later chapters, but the summary is that different
mechanisms will be responsible for (a) bringing up the platform and
prepping it to host workloads, and (b) managing the various workloads
that need to be deployed on that platform.


1.3.2 Server Virtualization
@@ -415,7 +429,7 @@ resources, all running on the commodity processors in the cluster:
2. Kubernetes instantiates and interconnects containers.

3. Helm charts specify how collections of related containers are
interconnected.
interconnected to build applications.
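
The role a Helm chart plays can be illustrated with a toy analogy: pair a manifest template with default values, and "render" by substituting the values into the template, with per-deployment overrides winning. This is only an analogy sketched in Python using `string.Template`; it is not Helm's actual template engine, and the manifest fields shown are invented.

```python
from string import Template

# Toy analogy for a Helm chart: a manifest template plus default values,
# where rendering substitutes values into the template and per-deployment
# overrides win (loosely like helm's --set flag). Illustration only; this
# is not Helm's actual template engine.

manifest_template = Template("image: $image:$tag\nreplicas: $replicas")
default_values = {"image": "nginx", "tag": "1.25", "replicas": "3"}

def render(template, values, overrides=None):
    """Merge defaults with overrides, then substitute into the template."""
    merged = {**values, **(overrides or {})}
    return template.substitute(merged)

print(render(manifest_template, default_values, {"replicas": "5"}))
# image: nginx:1.25
# replicas: 5
```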

These are all well known and ubiquitous, and so we only summarize them
here. Links to related information for anyone that is not familiar
30 changes: 15 additions & 15 deletions preface.rst
@@ -11,21 +11,21 @@ job of it.
The answer, we believe, is that the cloud is becoming ubiquitous in
another way, as it moves from hundreds of datacenters to tens of
thousands of enterprises. And while it is clear that the commodity
cloud providers will happily manage those edge clusters as a logical
cloud providers are eager to manage those edge clusters as a logical
extension of their datacenters, they do not have a lock on the
know-how for making that happen.

This book lays out a roadmap that a small team of engineers followed
over a course of a year to stand-up and operationalize a hybrid cloud
spanning a dozen enterprises, and hosting a non-trivial cloud native
service (5G connectivity in our case, but that’s just an example). The
team was able to do this by leveraging 20+ open source components,
but selecting those components is just a start. There were dozens of
technical decisions to make along the way, and a few thousand lines of
configuration code to write. We believe this is a repeatable exercise,
which we report in this book. (And the code for those configuration
files is open source, for those that want to pursue the topic in more
detail.)
over the course of a year to stand-up and operationalize a hybrid
cloud that spans a dozen enterprises, and hosts a non-trivial cloud
native service (5G connectivity in our case, but that’s just an
example). The team was able to do this by leveraging 20+ open source
components, but selecting those components is just a start. There were
dozens of technical decisions to make along the way, and a few
thousand lines of configuration code to write. We believe this is a
repeatable exercise, which we report in this book. (And the code for
those configuration files is open source, for those that want to
pursue the topic in more detail.)

Our roadmap may not be the right one for all circumstances, but it
does shine a light on the fundamental challenges and trade-offs
@@ -41,8 +41,8 @@ How to operationalize a computing system is a question that’s as old
as the field of *Operating Systems*. Operationalizing a cloud is just
today’s version of that fundamental problem, which has become all the
more interesting as we move up the stack, from managing *devices* to
managing *services*. The fact that this topic is both timely and
foundational are among the reasons it is worth studying.
managing *services*. That this topic is both timely and foundational
are among the reasons it is worth studying.


Guided Tour of Open Source
@@ -80,11 +80,11 @@ Sunay for his influence on its overall design. Suchitra Vemuri's
insights into testing and quality assurance were also invaluable.

This book is still very much a work-in-progress, and we will happily
acknowledge anyone that provides feedback. Please send us your
acknowledge everyone that provides feedback. Please send us your
comments using the `Issues Link
<https://github.com/SystemsApproach/ops/issues>`__. Also see the
`Wiki <https://github.com/SystemsApproach/ops/wiki>`__ for the TODO
list we're working on.
list we're currently working on.

| Larry Peterson, Scott Baker, Andy Bavier, Zack Williams, and Bruce Davie
| October 2021
