diff --git a/sols.tex b/sols.tex index a0ca275..467808e 100755 --- a/sols.tex +++ b/sols.tex @@ -15,7 +15,7 @@ \usepackage{tabls} %TCIDATA{OutputFilter=latex2.dll} %TCIDATA{Version=5.50.0.2960} -%TCIDATA{LastRevised=Thursday, September 27, 2018 13:02:41} +%TCIDATA{LastRevised=Wednesday, October 03, 2018 01:01:22} %TCIDATA{SuppressPackageManagement} %TCIDATA{} %TCIDATA{} @@ -124,26 +124,130 @@ \section{\label{chp.intro}Introduction} -These notes are meant as a detailed introduction to the basic combinatorics -that underlies the \textquotedblleft explicit\textquotedblright\ part of -abstract algebra (i.e., the theory of determinants, and concrete families of -polynomials). They cover permutations and determinants (from a combinatorial -viewpoint -- no linear algebra is presumed), as well as some basic material on -binomial coefficients and recurrent (Fibonacci-like) sequences. The reader is -assumed to be proficient in high-school mathematics and low-level -\textquotedblleft contest mathematics\textquotedblright, and mature enough to -read combinatorial proofs. - -These notes were originally written for the PRIMES reading project I have -mentored in 2015. The goal of the project was to become familiar with some -fundamentals of algebra and combinatorics (particularly the ones needed to -understand the literature on cluster algebras). +These notes are a detailed introduction to some of the basic objects of +combinatorics and algebra: binomial coefficients, permutations and +determinants (from a combinatorial viewpoint -- no linear algebra is +presumed). To a lesser extent, modular arithmetic and recurrent integer +sequences are treated as well. The reader is assumed to be proficient in +high-school mathematics and low-level \textquotedblleft contest +mathematics\textquotedblright, and mature enough to understand rigorous +mathematical proofs. + +One feature of these notes is their focus on rigorous and detailed proofs. 
+Indeed, so extensive are the details that a reader with experience in
+mathematics will probably be able to skip whole paragraphs of proof without
+losing the thread. (As a consequence of this amount of detail, the notes
+contain far less material than might be expected from their length.) Rigorous
+proofs mean that (with some minor exceptions) no \textquotedblleft
+handwaving\textquotedblright\ is used; all relevant objects are defined in
+mathematical (usually set-theoretical) language, and are manipulated in
+logically well-defined ways. (In particular, some things that are commonly
+taken for granted in the literature -- e.g., the fact that the sum of $n$
+numbers is well-defined without specifying in what order they are being added
+-- are unpacked and proven in a rigorous way.)
+
+These notes are split into several chapters:
+
+\begin{itemize}
+\item Chapter \ref{chp.intro} collects some basic facts and notations that are
+used in later chapters. This chapter is \textbf{not} meant to be read first;
+it is best consulted when needed.
+
+\item Chapter \ref{chp.ind} is an in-depth look at mathematical induction (in
+various forms, including strong and two-sided induction) and several of its
+applications (including basic modular arithmetic, division with remainder,
+Bezout's theorem, some properties of recurrent sequences, the well-definedness
+of compositions of $n$ maps and sums of $n$ numbers, and various properties thereof).
+
+\item Chapter \ref{chp.binom} surveys binomial coefficients and their basic
+properties. Unlike most texts on combinatorics, our treatment of binomial
+coefficients leans to the algebraic side, relying mostly on computation and
+manipulations of sums; but some basics of counting are included.
+
+\item Chapter \ref{chp.recur} treats some more properties of Fibonacci-like
+sequences, including explicit formulas (\`{a} la Binet) for two-term
+recursions of the form $x_{n}=ax_{n-1}+bx_{n-2}$. 
+
+\item Chapter \ref{chp.perm} is concerned with permutations of finite sets.
+The coverage is heavily influenced by the needs of the next chapter (on
+determinants); thus, a great role is played by transpositions and the
+inversions of a permutation.
+
+\item Chapter \ref{chp.det} is a comprehensive introduction to determinants of
+square matrices over a commutative ring\footnote{The notion of a commutative
+ring is defined (and illustrated with several examples) in Section
+\ref{sect.commring}, but I don't delve deeper into abstract algebra.}, from an
+elementary point of view. This is probably the most distinctive feature of
+these notes: I define determinants using Leibniz's formula (i.e., as sums over
+permutations) and prove all their properties (Laplace expansion in one or
+several rows; the Cauchy-Binet, Desnanot-Jacobi and Pl\"{u}cker identities;
+the Vandermonde and Cauchy determinants; and several more) from this vantage
+point, thus treating them as elementary objects unmoored from their
+linear-algebraic origins and applications. No use is made of modules (or
+vector spaces), exterior powers, eigenvalues, or of the \textquotedblleft
+universal coefficients\textquotedblright\ trick\footnote{This refers to the
+standard trick used for proving determinant identities (and other polynomial
+identities), in which the entries of a matrix are replaced by indeterminates
+and one then uses the \textquotedblleft genericity\textquotedblright\ of these
+indeterminates to (e.g.) invert the matrix.}. (This means that all proofs are
+done through combinatorics and manipulation of sums -- a rather restrictive
+requirement!) This is a conscious and (to a large extent) aesthetic choice on
+my part, and I do \textbf{not} consider it the best way to learn about
+determinants; but I do regard it as a road worth charting, and these notes are
+my attempt at doing so.
+\end{itemize}
+
+The notes include numerous exercises of varying difficulty, many of them
+solved. 
The reader should treat exercises and theorems (and propositions,
+lemmas and corollaries) as interchangeable to some extent; it is perfectly
+reasonable to read the solution of an exercise, or conversely, to prove a
+theorem on their own instead of reading its proof.
+
+I have not meant these notes to be a textbook on any particular subject. For
+one thing, their content does not map to any of the standard university
+courses, but rather straddles various subjects:
+
+\begin{itemize}
+\item Much of Chapter \ref{chp.binom} (on binomial coefficients) and Chapter
+\ref{chp.perm} (on permutations) is seen in a typical combinatorics class; but
+my focus is more on the algebraic side and not so much on the combinatorics.
+
+\item Chapter \ref{chp.det} studies determinants far beyond what a usual class
+on linear algebra would do; but it does not include any of the other topics of
+a linear algebra class (such as row reduction, vector spaces, linear maps,
+eigenvectors, tensors or bilinear forms).
+
+\item Being devoted to mathematical induction, Chapter \ref{chp.ind} appears
+to cover the same ground as a typical \textquotedblleft introduction to
+proofs\textquotedblright\ textbook or class (or at least one of its main
+topics). In reality, however, it complements rather than competes with most
+\textquotedblleft introduction to proofs\textquotedblright\ texts I have seen;
+the examples I give are (with a few exceptions) nonstandard, and the focus is
+different.
+
+\item While the notions of rings and groups are defined in Chapter
+\ref{chp.det}, I cannot claim to really be doing any abstract algebra: I am
+merely working \textit{in} rings (i.e., working with matrices over rings),
+rather than working \textit{with} rings. Nevertheless, Chapter \ref{chp.det}
+might help familiarize the reader with these concepts, facilitating proper
+learning of abstract algebra later on. 
+\end{itemize}
+
+All in all, these notes are probably more useful as a repository of detailed
+proofs than as a textbook read cover-to-cover. Indeed, one of my motives in
+writing them was to have a reference for certain folklore results --
+particularly one that could convince people that said results do not require
+any advanced abstract algebra to prove.
+
+These notes began as worksheets for the PRIMES reading project I mentored in
+2015; they have since been greatly expanded with new material (some of it
+originally written for my combinatorics classes, some in response to
+\href{https://math.stackexchange.com/}{math.stackexchange} questions).

The notes are in flux, and probably have their share of misprints. I thank
-Anya Zhang and Karthik Karnik (the two students taking part in the project)
-for finding some errors! Thanks also to the PRIMES project at MIT, which gave
-the impetus for the writing of this notes; and to George Lusztig for the
-sponsorship of my mentoring position in this project.
+Anya Zhang and Karthik Karnik (the two students taking part in the 2015 PRIMES
+project) for finding some errors. Thanks also to the PRIMES project at MIT,
+which gave the impetus for the writing of these notes; and to George Lusztig
+for the sponsorship of my mentoring position in this project.

\subsection{Prerequisites}

@@ -154,18 +258,18 @@ \subsection{Prerequisites}

\item has a good grasp on basic school-level mathematics (integers, rational
numbers, prime numbers, etc.);

-\item has some experience with proofs (mathematical induction, strong
-induction, proof by contradiction, the concept of \textquotedblleft
-WLOG\textquotedblright, etc.) and mathematical notation (functions,
-subscripts, cases, what it means for an object to be \textquotedblleft
-well-defined\textquotedblright, etc.)\footnote{A great introduction into these
-matters (and many others!) is the free book \cite{LeLeMe16} by Lehman,
-Leighton and Meyer. 
(\textbf{Practical note:} As of 2017, this book is still
-undergoing frequent revisions; thus, the version I am citing below might be
-outdated by the time you are reading this. I therefore suggest searching for
-possibly newer versions on the internet. Unfortunately, you will also find
-many older versions, often as the first google hits. Try searching for the
-title of the book along with the current year to find something up-to-date.)
+\item has some experience with proofs (mathematical induction, proof by
+contradiction, the concept of \textquotedblleft WLOG\textquotedblright, etc.)
+and mathematical notation (functions, subscripts, cases, what it means for an
+object to be \textquotedblleft well-defined\textquotedblright,
+etc.)\footnote{A great introduction to these matters (and many others!) is
+the free book \cite{LeLeMe16} by Lehman, Leighton and Meyer.
+(\textbf{Practical note:} As of 2017, this book is still undergoing frequent
+revisions; thus, the version I am citing below might be outdated by the time
+you are reading this. I therefore suggest searching for possibly newer
+versions on the internet. Unfortunately, you will also find many older
+versions, often as the first Google hits. Try searching for the title of the
+book along with the current year to find something up-to-date.) 
\par
Another introduction to proofs and mathematical workmanship is Day's
\cite{Day-proofs} (but beware that the definition of polynomials in
@@ -174,18 +278,16 @@ \subsection{Prerequisites}
especially popular one is Velleman's \cite{Vellem06}.};

\item knows what a polynomial is (at least over $\mathbb{Z}$ and $\mathbb{Q}$)
-and how polynomials differ from polynomial functions\footnote{See Section
-\ref{sect.polynomials-emergency} below for a quick survey of what this means,
-and which sources to consult for the precise definitions.};
+and how polynomials differ from polynomial functions\footnote{This is used
+only in a few sections and exercises, so it is not an indispensable
+requirement. See Section \ref{sect.polynomials-emergency} below for a quick
+survey of polynomials, and which sources to consult for the precise
+definitions.};

-\item knows the basics of modular arithmetic (e.g., if $a\equiv
-b\operatorname{mod}n$ and $c\equiv d\operatorname{mod}n$, then $ac\equiv
-bd\operatorname{mod}n$);
-
-\item is familiar with the summation sign ($\sum$) and the product sign
-($\prod$) and knows how to transform them (e.g., interchanging summations, and
-substituting the index)\footnote{See Section \ref{sect.sums-repetitorium}
-below for a quick overview of the notations that we will need.};
+\item is somewhat familiar with the summation sign ($\sum$) and the product
+sign ($\prod$) and knows how to transform them (e.g., interchanging
+summations, and substituting the index)\footnote{See Section
+\ref{sect.sums-repetitorium} below for a quick overview of the notations that
+we will need.};

\item has some familiarity with matrices (i.e., knows how to add and to
multiply them). 
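The prerequisite about transforming $\sum$ signs can be sanity-checked numerically. The following short sketch (my own illustration in Python, not part of the notes) verifies the two manipulations named there -- interchanging summations and substituting the index -- on a small example:

```python
# Illustration (mine): two standard manipulations of finite sums.

# 1. Interchanging summations: sum_i sum_j a_{i,j} = sum_j sum_i a_{i,j}.
a = {(i, j): 10 * i + j for i in range(3) for j in range(4)}
lhs = sum(sum(a[i, j] for j in range(4)) for i in range(3))
rhs = sum(sum(a[i, j] for i in range(3)) for j in range(4))
assert lhs == rhs

# 2. Substituting the index: if f is a bijection of S onto itself, then
#    sum_{s in S} b_s = sum_{s in S} b_{f(s)}.  Here S = {0,...,4}, f(s) = 4 - s.
b = [s * s for s in range(5)]
assert sum(b[s] for s in range(5)) == sum(b[4 - s] for s in range(5))
```

Both assertions pass because the two sides of each identity are literally rearrangements of the same finite collection of addends.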
@@ -591,30 +693,32 @@ \subsection{\label{sect.jectivity}Injectivity, surjectivity, bijectivity}

Composition of maps is associative: If $X$, $Y$, $Z$ and $W$ are four sets,
and if $c:X\rightarrow Y$, $b:Y\rightarrow Z$ and $a:Z\rightarrow W$ are three
maps, then $\left( a\circ b\right) \circ c=a\circ\left( b\circ c\right) $.
+(This shall be proven in Proposition \ref{prop.ind.gen-ass-maps.fgh} below.)

-More generally, if $X_{1},X_{2},\ldots,X_{k+1}$ are $k+1$ sets for some
-$k\in\mathbb{N}$, and if $f_{i}:X_{i}\rightarrow X_{i+1}$ is a map for each
-$i\in\left\{ 1,2,\ldots,k\right\} $, then the composition $f_{k}\circ
-f_{k-1}\circ\cdots\circ f_{1}$ of all $k$ maps $f_{1},f_{2},\ldots,f_{k}$ is a
+In Section \ref{sect.ind.gen-ass}, we shall prove a more general fact: If
+$X_{1},X_{2},\ldots,X_{k+1}$ are $k+1$ sets for some $k\in\mathbb{N}$, and if
+$f_{i}:X_{i}\rightarrow X_{i+1}$ is a map for each $i\in\left\{
+1,2,\ldots,k\right\} $, then the composition $f_{k}\circ f_{k-1}\circ
+\cdots\circ f_{1}$ of all $k$ maps $f_{1},f_{2},\ldots,f_{k}$ is a
well-defined map from $X_{1}$ to $X_{k+1}$, which sends each element $x\in
X_{1}$ to $f_{k}\left( f_{k-1}\left( f_{k-2}\left( \cdots\left(
-f_{2}\left( f_{1}\left( x\right) \right) \right) \right) \right)
+f_{2}\left( f_{1}\left( x\right) \right) \right) \cdots\right) \right)
\right) \right) $ (in other words, which transforms each element $x\in
X_{1}$ by first applying $f_{1}$, then applying $f_{2}$, then applying
$f_{3}$, and so on); this composition $f_{k}\circ f_{k-1}\circ\cdots\circ
f_{1}$ can also be written as
$f_{k}\circ\left( f_{k-1}\circ\left( f_{k-2}\circ\left(
-\cdots\circ\left( f_{2}\circ f_{1}\right) \right) \right) \right) $ or as
-$\left( \left( \left( \left( f_{k}\circ f_{k-1}\right) \circ
-\cdots\right) \circ f_{3}\right) \circ f_{2}\right) \circ f_{1}$. 
An +\cdots\circ\left( f_{2}\circ f_{1}\right) \cdots\right) \right) \right) $ +or as $\left( \left( \left( \cdots\left( f_{k}\circ f_{k-1}\right) +\circ\cdots\right) \circ f_{3}\right) \circ f_{2}\right) \circ f_{1}$. An important particular case is when $k=0$; in this case, $f_{k}\circ f_{k-1}\circ\cdots\circ f_{1}$ is a composition of $0$ maps. It is defined to be $\operatorname*{id}\nolimits_{X_{1}}$ (the identity map of the set $X_{1}% $), and it is called the \textquotedblleft empty composition of maps $X_{1}\rightarrow X_{1}$\textquotedblright. (The logic behind this definition is that the composition $f_{k}\circ f_{k-1}\circ\cdots\circ f_{1}$ should -transform transforms each element $x\in X_{1}$ by first applying $f_{1}$, then -applying $f_{2}$, then applying $f_{3}$, and so on; but for $k=0$, there are -no maps to apply, and so $x$ just remains unchanged.) +transform each element $x\in X_{1}$ by first applying $f_{1}$, then applying +$f_{2}$, then applying $f_{3}$, and so on; but for $k=0$, there are no maps to +apply, and so $x$ just remains unchanged.) \end{remark} \subsection{\label{sect.sums-repetitorium}Sums and products: a synopsis} @@ -663,7 +767,9 @@ \subsubsection{Definition of $\sum$} value of $\sum_{s\in S}a_{s}$ to depend only on $S$ and on the $a_{s}$ (not on some arbitrarily chosen $t\in S$). However, it is possible to prove that the right hand side of (\ref{eq.sum.def.1}) is actually independent of $t$ (that -is, any two choices of $t$ will lead to the same result). +is, any two choices of $t$ will lead to the same result). See Section +\ref{sect.ind.gen-com} below (and Theorem \ref{thm.ind.gen-com.wd} +\textbf{(a)} in particular) for the proof of this fact. \end{itemize} \textbf{Examples:} @@ -783,6 +889,10 @@ \subsubsection{Definition of $\sum$} summation\textquotedblright, and the second summation sign ($\sum_{t\in T}$) is called the \textquotedblleft inner summation\textquotedblright. 
+\item An expression of the form \textquotedblleft$\sum_{s\in S}a_{s}% +$\textquotedblright\ (where $S$ is a finite set) is called a \textit{finite +sum}. + \item We have required the set $S$ to be finite when defining $\sum_{s\in S}a_{s}$. Of course, this requirement was necessary for our definition, and there is no way to make sense of infinite sums such as $\sum_{s\in\mathbb{Z}% @@ -936,7 +1046,9 @@ \subsubsection{Properties of $\sum$} \sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. \label{eq.sum.split-off}% \end{equation} -(This is precisely the equality (\ref{eq.sum.def.1}).) This formula +(This is precisely the equality (\ref{eq.sum.def.1}) (applied to $n=\left\vert +S\setminus\left\{ t\right\} \right\vert $), because $\left\vert S\right\vert +=\left\vert S\setminus\left\{ t\right\} \right\vert +1$.) This formula (\ref{eq.sum.split-off}) allows us to \textquotedblleft split off\textquotedblright\ an addend from a sum. @@ -969,7 +1081,8 @@ \subsubsection{Properties of $\sum$} \textquotedblleft sub-bunches\textquotedblright\ (one \textquotedblleft sub-bunch\textquotedblright\ consisting of the $a_{s}$ for $s\in X$, and the other consisting of the $a_{s}$ for $s\in Y$), then take the sum of each of -these two sub-bunches, and finally add together the two sums. +these two sub-bunches, and finally add together the two sums. For a rigorous +proof of (\ref{eq.sum.split}), see Theorem \ref{thm.ind.gen-com.split2} below. \textbf{Examples:} @@ -1035,6 +1148,7 @@ \subsubsection{Properties of $\sum$} \begin{equation} \sum_{s\in S}a=\left\vert S\right\vert \cdot a. \label{eq.sum.equal}% \end{equation} +\footnote{This is easy to prove by induction on $\left\vert S\right\vert $.} In other words, if all addends of a sum are equal to one and the same element $a$, then the sum is just the number of its addends times $a$. 
In particular,% \[ @@ -1048,7 +1162,8 @@ \subsubsection{Properties of $\sum$} \sum_{s\in S}\left( a_{s}+b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in S}b_{s}. \label{eq.sum.linear1}% \end{equation} - +For a rigorous proof of this equality, see Theorem +\ref{thm.ind.gen-com.sum(a+b)} below. \textbf{Remark:} Of course, similar rules hold for other forms of summations: If $\mathcal{A}\left( s\right) $ is a logical statement for each $s\in S$, @@ -1071,6 +1186,9 @@ \subsubsection{Properties of $\sum$} \begin{equation} \sum_{s\in S}\lambda a_{s}=\lambda\sum_{s\in S}a_{s}. \label{eq.sum.linear2}% \end{equation} +For a rigorous proof of this equality, see Theorem +\ref{thm.ind.gen-com.sum(la)} below. + Again, similar rules hold for the other types of summation sign. \item \underline{\textbf{Zeroes sum to zero:}} Let $S$ be a finite set. Then,% @@ -1079,6 +1197,9 @@ \subsubsection{Properties of $\sum$} \end{equation} That is, any sum of zeroes is zero. +For a rigorous proof of this equality, see Theorem +\ref{thm.ind.gen-com.sum(0)} below. + \textbf{Remark:} This applies even to infinite sums! Do not be fooled by the infiniteness of a sum: There are no reasonable situations where an infinite sum of zeroes is defined to be anything other than zero. The infinity does not @@ -1087,11 +1208,12 @@ \subsubsection{Properties of $\sum$} \item \underline{\textbf{Dropping zeroes:}} Let $S$ be a finite set. Let $a_{s}$ be an element of $\mathbb{A}$ for each $s\in S$. Let $T$ be a subset of $S$ such that every $s\in T$ satisfies $a_{s}=0$. Then,% -\[ -\sum_{s\in S}a_{s}=\sum_{s\in S\setminus T}a_{s}. -\] +\begin{equation} +\sum_{s\in S}a_{s}=\sum_{s\in S\setminus T}a_{s}. \label{eq.sum.drop0}% +\end{equation} (That is, any addends which are zero can be removed from a sum without -changing the sum's value.) +changing the sum's value.) See Corollary \ref{cor.ind.gen-com.drop0} below for +a proof of (\ref{eq.sum.drop0}). 
\item \underline{\textbf{Renaming the index:}} Let $S$ be a finite set. Let $a_{s}$ be an element of $\mathbb{A}$ for each $s\in S$. Then,% @@ -1108,7 +1230,8 @@ \subsubsection{Properties of $\sum$} \sum_{t\in T}a_{t}=\sum_{s\in S}a_{f\left( s\right) }. \label{eq.sum.subs1}% \end{equation} (The idea here is that the sum $\sum_{s\in S}a_{f\left( s\right) }$ contains -the same addends as the sum $\sum_{t\in T}a_{t}$.) +the same addends as the sum $\sum_{t\in T}a_{t}$.) A rigorous proof of +(\ref{eq.sum.subs1}) can be found in Theorem \ref{thm.ind.gen-com.subst1} below. \textbf{Examples:} @@ -1319,7 +1442,9 @@ \subsubsection{Properties of $\sum$} of all $a_{s}$ for $s\in S$. The right hand side is the same sum, but split in a particular way: First, for each $w\in W$, we sum the $a_{s}$ for all $s\in S$ satisfying $f\left( s\right) =w$, and then we take the sum of all these -\textquotedblleft partial sums\textquotedblright. +\textquotedblleft partial sums\textquotedblright. For a rigorous proof of +(\ref{eq.sum.sheph}), see Theorem \ref{thm.ind.gen-com.shephf} (for the case +when $W$ is finite) and Theorem \ref{thm.ind.gen-com.sheph} (for the general case). \textbf{Examples:} @@ -1904,7 +2029,9 @@ \subsubsection{Definition of $\prod$} a_{s}. \label{eq.prod.def.1}% \end{equation} As for $\sum_{s\in S}a_{s}$, this definition is not obviously legitimate, but -it can be proven to be legitimate nevertheless. +it can be proven to be legitimate nevertheless. (The proof is analogous to the +proof for $\sum_{s\in S}a_{s}$; see Subsection \ref{subsect.ind.gen-com.prods} +for details.) \end{itemize} \textbf{Examples:} @@ -2012,6 +2139,10 @@ \subsubsection{Definition of $\prod$} should use whatever conventions make \textbf{you} safe from ambiguity; either way, you should keep in mind that other authors make different choices. 
+\item An expression of the form \textquotedblleft$\prod_{s\in S}a_{s}% +$\textquotedblright\ (where $S$ is a finite set) is called a \textit{finite +product}. + \item We have required the set $S$ to be finite when defining $\prod_{s\in S}a_{s}$. Such products are not generally defined when $S$ is infinite. However, \textbf{some} infinite products can be made sense of. The simplest @@ -2346,20 +2477,21 @@ \subsection{\label{sect.polynomials-emergency}Polynomials: a precise definition} As I have already mentioned in the above list of prerequisites, the notion of -polynomials (in one and in several indeterminates) will be used in these -notes. Most likely, the reader already has at least a vague understanding of -this notion (e.g., from high school); this vague understanding is probably -sufficient for reading most of these notes. But polynomials are one of the -most important notions in algebra (if not to say in mathematics), and the -reader will likely encounter them over and over; sooner or later, it will -happen that the vague understanding is not sufficient and some subtleties do -matter. For that reason, anyone serious about doing abstract algebra should -know a complete and correct definition of polynomials and have some experience -working with it. I shall not give a complete definition of the most general -notion of polynomials in these notes, but I will comment on some of the -subtleties and define an important special case (that of polynomials in one -variable with rational coefficients) in the present section. A reader is -probably best advised to skip this section on their first read. +polynomials (in one and in several indeterminates) will be occasionally used +in these notes. Most likely, the reader already has at least a vague +understanding of this notion (e.g., from high school); this vague +understanding is probably sufficient for reading most of these notes. 
But +polynomials are one of the most important notions in algebra (if not to say in +mathematics), and the reader will likely encounter them over and over; sooner +or later, it will happen that the vague understanding is not sufficient and +some subtleties do matter. For that reason, anyone serious about doing +abstract algebra should know a complete and correct definition of polynomials +and have some experience working with it. I shall not give a complete +definition of the most general notion of polynomials in these notes, but I +will comment on some of the subtleties and define an important special case +(that of polynomials in one variable with rational coefficients) in the +present section. A reader is probably best advised to skip this section on +their first read. It is not easy to find a good (formal and sufficiently general) treatment of polynomials in textbooks. Various authors tend to skimp on subtleties and @@ -2391,474 +2523,12885 @@ \subsection{\label{sect.polynomials-emergency}Polynomials: a precise notion of a \textit{commutative ring}, which is not difficult but somewhat abstract (I shall introduce it below in Section \ref{sect.commring}). -Let me give a brief survey of the notion of univariate polynomials (i.e., -polynomials in one variable). I shall define them as sequences. For the sake -of simplicity, I shall only talk of polynomials with rational coefficients. -Similarly, one can define polynomials with integer coefficients, with real -coefficients, or with complex coefficients; of course, one then has to replace -each \textquotedblleft$\mathbb{Q}$\textquotedblright\ by a \textquotedblleft% -$\mathbb{Z}$\textquotedblright, an \textquotedblleft$\mathbb{R}$% -\textquotedblright\ or a \textquotedblleft$\mathbb{C}$\textquotedblright. +Let me give a brief survey of the notion of univariate polynomials (i.e., +polynomials in one variable). I shall define them as sequences. 
For the sake
+of simplicity, I shall only talk of polynomials with rational coefficients.
+Similarly, one can define polynomials with integer coefficients, with real
+coefficients, or with complex coefficients; of course, one then has to replace
+each \textquotedblleft$\mathbb{Q}$\textquotedblright\ by a \textquotedblleft%
+$\mathbb{Z}$\textquotedblright, an \textquotedblleft$\mathbb{R}$%
+\textquotedblright\ or a \textquotedblleft$\mathbb{C}$\textquotedblright.
+
+The rough idea behind the definition of a polynomial is that a polynomial with
+rational coefficients should be a \textquotedblleft formal
+expression\textquotedblright\ which is built out of rational numbers, an
+\textquotedblleft indeterminate\textquotedblright\ $X$ as well as addition,
+subtraction and multiplication signs, such as $X^{4}-27X+\dfrac{3}{2}$ or
+$-X^{3}+2X+1$ or $\dfrac{1}{3}\left( X-3\right) \cdot X^{2}$ or
+$X^{4}+7X^{3}\left( X-2\right) $ or $-15$. We have not explicitly allowed
+powers, but we understand $X^{n}$ to mean the product $\underbrace{XX\cdots
+X}_{n\text{ times}}$ (or $1$ when $n=0$). Notice that division is not allowed,
+so we cannot get $\dfrac{X}{X+1}$ (but we can get $\dfrac{3}{2}X$, because
+$\dfrac{3}{2}$ is a rational number). Notice also that a polynomial can be a
+single rational number, since we never said that $X$ must necessarily be used;
+for instance, $-15$ and $0$ are polynomials.
+
+This is, of course, not a valid definition. One problem with it is that it
+does not explain what a \textquotedblleft formal
+expression\textquotedblright\ is. For starters, we want an expression that is
+well-defined -- i.e., one into which we can substitute a rational number for
+$X$ and obtain a valid term. For example, $X-+\cdot5$ is not well-defined, so
+it does not fit our bill; neither is the \textquotedblleft empty
+expression\textquotedblright. Furthermore, when do we want two
+\textquotedblleft formal expressions\textquotedblright\ to be viewed
+as one and the same polynomial? 
Do we want to equate $X\left( X+2\right) $
+with $X^{2}+2X$? Do we want to equate $0X^{3}+2X+1$ with $2X+1$? The answer
+is \textquotedblleft yes\textquotedblright\ both times, but a general rule is
+not easy to give if we keep talking of \textquotedblleft formal
+expressions\textquotedblright.
+
+We \textit{could} define two polynomials $p\left( X\right) $ and $q\left(
+X\right) $ to be equal if and only if, for every number $\alpha\in\mathbb{Q}%
+$, the values $p\left( \alpha\right) $ and $q\left( \alpha\right) $
+(obtained by substituting $\alpha$ for $X$ in $p$ and in $q$, respectively)
+are equal. This would be tantamount to treating polynomials as
+\textit{functions}: it would mean that we identify a polynomial $p\left(
+X\right) $ with the function $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha\mapsto
+p\left( \alpha\right) $. Such a definition would work well as long as we
+did only rather basic things with it\footnote{And some authors, such as
+Axler in \cite[Chapter 4]{Axler}, do use this definition.}, but as soon as we
+tried to go deeper, we would encounter technical issues which would make
+it inadequate and painful\footnote{Here are the three most important among
+these issues:
+\par
+\begin{itemize}
+\item One of the strengths of polynomials is that we can evaluate them not
+only at numbers, but also at many other things, e.g., at square matrices:
+Evaluating the polynomial $X^{2}-3X$ at the square matrix $\left(
+\begin{array}
+[c]{cc}%
+1 & 3\\
+-1 & 2
+\end{array}
+\right) $ gives $\left(
+\begin{array}
+[c]{cc}%
+1 & 3\\
+-1 & 2
+\end{array}
+\right) ^{2}-3\left(
+\begin{array}
+[c]{cc}%
+1 & 3\\
+-1 & 2
+\end{array}
+\right) =\left(
+\begin{array}
+[c]{cc}%
+-5 & 0\\
+0 & -5
+\end{array}
+\right) $. However, a function must have a well-defined domain, and does not
+make sense outside of this domain. 
So, if the polynomial $X^{2}-3X$ is +regarded as the function $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha +\mapsto\alpha^{2}-3\alpha$, then it makes no sense to evaluate this polynomial +at the matrix $\left( +\begin{array} +[c]{cc}% +1 & 3\\ +-1 & 2 +\end{array} +\right) $, just because this matrix does not lie in the domain $\mathbb{Q}$ +of the function. We could, of course, extend the domain of the function to +(say) the set of square matrices over $\mathbb{Q}$, but then we would still +have the same problem with other things that we want to evaluate polynomials +at. At some point we want to be able to evaluate polynomials at functions and +at other polynomials, and if we would try to achieve this by extending the +domain, we would have to do this over and over, because each time we extend +the domain, we get even more polynomials to evaluate our polynomials at; thus, +the definition would be eternally \textquotedblleft hunting its own +tail\textquotedblright! (We could resolve this difficulty by defining +polynomials as \textit{natural transformations} in the sense of category +theory. I do not want to even go into this definition here, as it would take +several pages to properly introduce. At this point, it is not worth the +hassle.) +\par +\item Let $p\left( X\right) $ be a polynomial with real coefficients. Then, +it should be obvious that $p\left( X\right) $ can also be viewed as a +polynomial with complex coefficients: For instance, if $p\left( X\right) $ +was defined as $3X+\dfrac{7}{2}X\left( X-1\right) $, then we can view the +numbers $3$, $\dfrac{7}{2}$ and $-1$ appearing in its definition as complex +numbers, and thus get a polynomial with complex coefficients. But wait! 
What
+if two polynomials $p\left( X\right) $ and $q\left( X\right) $ are equal
+when viewed as polynomials with real coefficients, but when viewed as
+polynomials with complex coefficients become distinct (because when we view
+them as polynomials with complex coefficients, their domains become extended,
+and a new complex $\alpha$ might perhaps no longer satisfy $p\left(
+\alpha\right) =q\left( \alpha\right) $)? This does not actually happen,
+but ruling this out is not obvious if you regard polynomials as functions.
+\par
+\item (This requires some familiarity with finite fields:) Treating
+polynomials as functions works reasonably well for polynomials with integer,
+rational, real and complex coefficients (as long as one is not too demanding).
+But we will eventually want to consider polynomials with coefficients in an
+arbitrary commutative ring $\mathbb{K}$. An example of a commutative ring
+$\mathbb{K}$ is the finite field $\mathbb{F}_{p}$ with $p$ elements, where $p$
+is a prime. (This finite field $\mathbb{F}_{p}$ is better known as the ring of
+integers modulo $p$.) If we define polynomials with coefficients in
+$\mathbb{F}_{p}$ as functions $\mathbb{F}_{p}\rightarrow\mathbb{F}_{p}$, then
+we really run into problems; for example, the polynomials $X$ and $X^{p}$ over
+this field become identical as functions!
+\end{itemize}
+}. Also, if we equated polynomials with the functions they describe, then we
+would waste the word \textquotedblleft polynomial\textquotedblright\ on a
+concept (a function described by a polynomial) that already has a word for it
+(namely, \textit{polynomial function}). 
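The finite-field pitfall mentioned in the footnote above is easy to witness computationally. Here is a quick sketch (my own illustration, not part of the notes) checking that $X$ and $X^{p}$, although distinct as polynomials, define one and the same function on $\mathbb{F}_{p}$:

```python
# Illustration (mine): over the finite field F_p, the two DISTINCT polynomials
# X and X^p define the same function F_p -> F_p.  This is Fermat's little
# theorem in action: x^p is congruent to x modulo p for every integer x.
p = 7  # any prime works here

values_of_X = [x % p for x in range(p)]           # the function alpha |-> alpha
values_of_X_to_p = [pow(x, p, p) for x in range(p)]  # the function alpha |-> alpha^p

assert values_of_X == values_of_X_to_p  # identical as functions on F_p
```

So a definition of "polynomial" as "function" would be forced to conflate $X$ and $X^{p}$ over $\mathbb{F}_{p}$, which is exactly why the sequence-based definition below avoids it.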
+ +The preceding paragraphs should have convinced you that it is worth defining +\textquotedblleft polynomials\textquotedblright\ in a way that, on the one +hand, conveys the concept that they are more \textquotedblleft formal +expressions\textquotedblright\ than \textquotedblleft +functions\textquotedblright, but on the other hand, is less nebulous than +\textquotedblleft formal expression\textquotedblright. Here is one such definition: + +\begin{definition} +\label{def.polynomial-univar}\textbf{(a)} A \textit{univariate polynomial with +rational coefficients} means a sequence $\left( p_{0},p_{1},p_{2}% +,\ldots\right) \in\mathbb{Q}^{\infty}$ of elements of $\mathbb{Q}$ such that% +\begin{equation} +\text{all but finitely many }k\in\mathbb{N}\text{ satisfy }p_{k}=0. +\label{eq.def.polynomial-univar.finite}% +\end{equation} +Here, the phrase \textquotedblleft all but finitely many $k\in\mathbb{N}$ +satisfy $p_{k}=0$\textquotedblright\ means \textquotedblleft there exists some +finite subset $J$ of $\mathbb{N}$ such that every $k\in\mathbb{N}\setminus J$ +satisfies $p_{k}=0$\textquotedblright. (See Definition \ref{def.allbutfin} for +the general definition of \textquotedblleft all but finitely +many\textquotedblright, and Section \ref{sect.infperm} for some practice with +this concept.) More concretely, the condition +(\ref{eq.def.polynomial-univar.finite}) can be rewritten as follows: The +sequence $\left( p_{0},p_{1},p_{2},\ldots\right) $ contains only zeroes from +some point on (i.e., there exists some $N\in\mathbb{N}$ such that +$p_{N}=p_{N+1}=p_{N+2}=\cdots=0$). + +For the remainder of this definition, \textquotedblleft univariate polynomial +with rational coefficients\textquotedblright\ will be abbreviated as +\textquotedblleft polynomial\textquotedblright. 
+ +For example, the sequences $\left( 0,0,0,\ldots\right) $, $\left( +1,3,5,0,0,0,\ldots\right) $, $\left( 4,0,-\dfrac{2}{3},5,0,0,0,\ldots +\right) $, $\left( 0,-1,\dfrac{1}{2},0,0,0,\ldots\right) $ (where the +\textquotedblleft$\ldots$\textquotedblright\ stand for infinitely many zeroes) +are polynomials, but the sequence $\left( 1,1,1,\ldots\right) $ (where the +\textquotedblleft$\ldots$\textquotedblright\ stands for infinitely many $1$'s) +is not (since it does not satisfy (\ref{eq.def.polynomial-univar.finite})). + +So we have defined a polynomial as an infinite sequence of rational numbers +with a certain property. So far, this does not seem to reflect any intuition +of polynomials as \textquotedblleft formal expressions\textquotedblright. +However, we shall soon (namely, in Definition \ref{def.polynomial-univar} +\textbf{(j)}) identify the polynomial $\left( p_{0},p_{1},p_{2}% +,\ldots\right) \in\mathbb{Q}^{\infty}$ with the \textquotedblleft formal +expression\textquotedblright\ $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ (this is an +infinite sum, but due to (\ref{eq.def.polynomial-univar.finite}) all but its +first few terms are $0$ and thus can be neglected). For instance, the +polynomial $\left( 1,3,5,0,0,0,\ldots\right) $ will be identified with the +\textquotedblleft formal expression\textquotedblright\ $1+3X+5X^{2}% ++0X^{3}+0X^{4}+0X^{5}+\cdots=1+3X+5X^{2}$. Of course, we cannot do this +identification right now, since we do not have a reasonable definition of $X$. + +\textbf{(b)} We let $\mathbb{Q}\left[ X\right] $ denote the set of all +univariate polynomials with rational coefficients. Given a polynomial +$p=\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $, +we denote the numbers $p_{0},p_{1},p_{2},\ldots$ as the \textit{coefficients} +of $p$. More precisely, for every $i\in\mathbb{N}$, we shall refer to $p_{i}$ +as the $i$\textit{-th coefficient} of $p$. 
(Do not forget that we are counting +from $0$ here: any polynomial \textquotedblleft begins\textquotedblright\ with +its $0$-th coefficient.) The $0$-th coefficient of $p$ is also known as the +\textit{constant term} of $p$. + +Instead of \textquotedblleft the $i$-th coefficient of $p$\textquotedblright, +we often also say \textquotedblleft the \textit{coefficient before }$X^{i}% +$\textit{ of }$p$\textquotedblright\ or \textquotedblleft the +\textit{coefficient of }$X^{i}$ \textit{in }$p$\textquotedblright. + +Thus, any polynomial $p\in\mathbb{Q}\left[ X\right] $ is the sequence of its coefficients. + +\textbf{(c)} We denote the polynomial $\left( 0,0,0,\ldots\right) +\in\mathbb{Q}\left[ X\right] $ by $\mathbf{0}$. We will also write $0$ for +it when no confusion with the number $0$ is possible. The polynomial +$\mathbf{0}$ is called the \textit{zero polynomial}. A polynomial +$p\in\mathbb{Q}\left[ X\right] $ is said to be \textit{nonzero} if +$p\neq\mathbf{0}$. + +\textbf{(d)} We denote the polynomial $\left( 1,0,0,0,\ldots\right) +\in\mathbb{Q}\left[ X\right] $ by $\mathbf{1}$. We will also write $1$ for +it when no confusion with the number $1$ is possible. + +\textbf{(e)} For any $\lambda\in\mathbb{Q}$, we denote the polynomial $\left( +\lambda,0,0,0,\ldots\right) \in\mathbb{Q}\left[ X\right] $ by +$\operatorname*{const}\lambda$. We call it the \textit{constant polynomial +with value }$\lambda$. It is often useful to identify $\lambda\in\mathbb{Q}$ +with $\operatorname*{const}\lambda\in\mathbb{Q}\left[ X\right] $. Notice +that $\mathbf{0}=\operatorname*{const}0$ and $\mathbf{1}=\operatorname*{const}% +1$. + +\textbf{(f)} Now, let us define the sum, the difference and the product of two +polynomials. Indeed, let $a=\left( a_{0},a_{1},a_{2},\ldots\right) +\in\mathbb{Q}\left[ X\right] $ and $b=\left( b_{0},b_{1},b_{2}% +,\ldots\right) \in\mathbb{Q}\left[ X\right] $ be two polynomials. 
Then, we +define three polynomials $a+b$, $a-b$ and $a\cdot b$ in $\mathbb{Q}\left[ +X\right] $ by% +\begin{align*} +a+b & =\left( a_{0}+b_{0},a_{1}+b_{1},a_{2}+b_{2},\ldots\right) ;\\ +a-b & =\left( a_{0}-b_{0},a_{1}-b_{1},a_{2}-b_{2},\ldots\right) ;\\ +a\cdot b & =\left( c_{0},c_{1},c_{2},\ldots\right) , +\end{align*} +where% +\[ +c_{k}=\sum_{i=0}^{k}a_{i}b_{k-i}\ \ \ \ \ \ \ \ \ \ \text{for every }% +k\in\mathbb{N}. +\] +We call $a+b$ the \textit{sum} of $a$ and $b$; we call $a-b$ the +\textit{difference} of $a$ and $b$; we call $a\cdot b$ the \textit{product} of +$a$ and $b$. We abbreviate $a\cdot b$ by $ab$. + +For example,% +\begin{align*} +\left( 1,2,2,0,0,\ldots\right) +\left( 3,0,-1,0,0,0,\ldots\right) & +=\left( 4,2,1,0,0,0,\ldots\right) ;\\ +\left( 1,2,2,0,0,\ldots\right) -\left( 3,0,-1,0,0,0,\ldots\right) & +=\left( -2,2,3,0,0,0,\ldots\right) ;\\ +\left( 1,2,2,0,0,\ldots\right) \cdot\left( 3,0,-1,0,0,0,\ldots\right) & +=\left( 3,6,5,-2,-2,0,0,0,\ldots\right) . +\end{align*} + + +The definition of $a+b$ essentially says that \textquotedblleft polynomials +are added coefficientwise\textquotedblright\ (i.e., in order to obtain the sum +of two polynomials $a$ and $b$, it suffices to add each coefficient of $a$ to +the corresponding coefficient of $b$). Similarly, the definition of $a-b$ says +the same thing about subtraction. The definition of $a\cdot b$ is more +surprising. However, it loses its mystique when we identify the polynomials +$a$ and $b$ with the \textquotedblleft formal expressions\textquotedblright% +\ $a_{0}+a_{1}X+a_{2}X^{2}+\cdots$ and $b_{0}+b_{1}X+b_{2}X^{2}+\cdots$ +(although, at this point, we do not know what these expressions really mean); +indeed, it simply says that +\[ +\left( a_{0}+a_{1}X+a_{2}X^{2}+\cdots\right) \left( b_{0}+b_{1}X+b_{2}% +X^{2}+\cdots\right) =c_{0}+c_{1}X+c_{2}X^{2}+\cdots, +\] +where $c_{k}=\sum_{i=0}^{k}a_{i}b_{k-i}$ for every $k\in\mathbb{N}$. 
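In computational terms, $a+b$ is the entrywise sum of the coefficient sequences, while $a\cdot b$ is their \textit{convolution}. The following short Python sketch (an illustration of ours, with each polynomial truncated to the finite list of its possibly-nonzero coefficients) reproduces the examples above:

```python
# Sketch (not from the text): a polynomial as a finite list of coefficients,
# where index i holds the coefficient of X^i (trailing zeroes omitted).
def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))  # pad the shorter list with zeroes
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    # c_k = sum over i of a_i * b_{k-i}  (the convolution rule above)
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# The worked examples from the text:
assert poly_add([1, 2, 2], [3, 0, -1]) == [4, 2, 1]
assert poly_add([1, 2, 2], [-3, 0, 1]) == [-2, 2, 3]  # a - b = a + (-b)
assert poly_mul([1, 2, 2], [3, 0, -1]) == [3, 6, 5, -2, -2]
```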
This is
+precisely what one would expect, because if you expand
+$\left( a_{0}+a_{1}X+a_{2}X^{2}+\cdots\right) \left( b_{0}+b_{1}X+b_{2}X^{2}+\cdots\right) $
+using the distributive law and collect equal powers of $X$, then you get
+precisely $c_{0}+c_{1}X+c_{2}X^{2}+\cdots$. Thus, the definition of $a\cdot b$
+has been tailored to make the distributive law hold.
+
+(By the way, why is $a\cdot b$ a polynomial? That is, why does it satisfy
+(\ref{eq.def.polynomial-univar.finite})? The proof is easy: If $a_{i}=0$ for
+all $i>N$ and $b_{j}=0$ for all $j>M$, then the formula for $c_{k}$ shows
+that $c_{k}=0$ for all $k>N+M$.)
+
+Addition, subtraction and multiplication of polynomials satisfy some of the
+same rules as addition, subtraction and multiplication of numbers. For
+example, the commutative laws $a+b=b+a$ and $ab=ba$ are valid for polynomials
+just as they are for numbers; the same holds for the associative laws
+$\left( a+b\right) +c=a+\left( b+c\right) $ and $\left( ab\right)
+c=a\left( bc\right) $ and the distributive laws $\left( a+b\right)
+c=ac+bc$ and $a\left( b+c\right) =ab+ac$.
+
+The set $\mathbb{Q}\left[ X\right] $, endowed with the operations $+$ and
+$\cdot$ just defined, and with the elements $\mathbf{0}$ and $\mathbf{1}$, is
+a commutative ring (where we are using the notations of Definition
+\ref{def.commring}). It is called the \textit{(univariate) polynomial ring
+over }$\mathbb{Q}$.
+
+\textbf{(g)} Let $a=\left( a_{0},a_{1},a_{2},\ldots\right) \in
+\mathbb{Q}\left[ X\right] $ and $\lambda\in\mathbb{Q}$. Then, $\lambda a$
+denotes the polynomial $\left( \lambda a_{0},\lambda a_{1},\lambda
+a_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $. (This equals the
+polynomial $\left( \operatorname*{const}\lambda\right) \cdot a$; thus,
+identifying $\lambda$ with $\operatorname*{const}\lambda$ does not cause any
+inconsistencies here.)
+
+\textbf{(h)} If $p=\left( p_{0},p_{1},p_{2},\ldots\right)
+\in\mathbb{Q}\left[ X\right] $ is a nonzero polynomial, then the
+\textit{degree} of $p$ is defined to be the maximum $i\in\mathbb{N}$
+satisfying $p_{i}\neq0$.
If +$p\in\mathbb{Q}\left[ X\right] $ is the zero polynomial, then the degree of +$p$ is defined to be $-\infty$. (Here, $-\infty$ is just a fancy symbol, not a +number.) For example, $\deg\left( 1,4,0,-1,0,0,0,\ldots\right) =3$. + +\textbf{(i)} If $a=\left( a_{0},a_{1},a_{2},\ldots\right) \in\mathbb{Q}% +\left[ X\right] $ and $n\in\mathbb{N}$, then a polynomial $a^{n}% +\in\mathbb{Q}\left[ X\right] $ is defined to be the product +$\underbrace{aa\cdots a}_{n\text{ times}}$. (This is understood to be +$\mathbf{1}$ when $n=0$. In general, an empty product of polynomials is always +understood to be $\mathbf{1}$.) + +\textbf{(j)} We let $X$ denote the polynomial $\left( 0,1,0,0,0,\ldots +\right) \in\mathbb{Q}\left[ X\right] $. (This is the polynomial whose +$1$-st coefficient is $1$ and whose other coefficients are $0$.) This +polynomial is called the \textit{indeterminate} of $\mathbb{Q}\left[ +X\right] $. It is easy to see that, for any $n\in\mathbb{N}$, we have% +\[ +X^{n}=\left( \underbrace{0,0,\ldots,0}_{n\text{ zeroes}},1,0,0,0,\ldots +\right) . +\] + + +This polynomial $X$ finally provides an answer to the questions +\textquotedblleft what is an indeterminate\textquotedblright\ and +\textquotedblleft what is a formal expression\textquotedblright. Namely, let +$\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $ be +any polynomial. Then, the sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ is well-defined +(it is an infinite sum, but due to (\ref{eq.def.polynomial-univar.finite}) it +has only finitely many nonzero addends), and it is easy to see that this sum +equals $\left( p_{0},p_{1},p_{2},\ldots\right) $. Thus, +\[ +\left( p_{0},p_{1},p_{2},\ldots\right) =p_{0}+p_{1}X+p_{2}X^{2}% ++\cdots\ \ \ \ \ \ \ \ \ \ \text{for every }\left( p_{0},p_{1},p_{2}% +,\ldots\right) \in\mathbb{Q}\left[ X\right] . 
+\] +This finally allows us to write a polynomial $\left( p_{0},p_{1},p_{2}% +,\ldots\right) $ as a sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ while remaining +honest; the sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ is no longer a +\textquotedblleft formal expression\textquotedblright\ of unclear meaning, nor +a function, but it is just an alternative way to write the sequence $\left( +p_{0},p_{1},p_{2},\ldots\right) $. So, at last, our notion of a polynomial +resembles the intuitive notion of a polynomial! + +Of course, we can write polynomials as finite sums as well. Indeed, if +$\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $ is +a polynomial and $N$ is a nonnegative integer such that every $n>N$ satisfies +$p_{n}=0$, then% +\[ +\left( p_{0},p_{1},p_{2},\ldots\right) =p_{0}+p_{1}X+p_{2}X^{2}+\cdots +=p_{0}+p_{1}X+\cdots+p_{N}X^{N}% +\] +(because addends can be discarded when they are $0$). For example, $\left( +4,1,0,0,0,\ldots\right) =4+1X=4+X$ and $\left( \dfrac{1}{2},0,\dfrac{1}% +{3},0,0,0,\ldots\right) =\dfrac{1}{2}+0X+\dfrac{1}{3}X^{2}=\dfrac{1}% +{2}+\dfrac{1}{3}X^{2}$. + +\textbf{(k)} For our definition of polynomials to be fully compatible with our +intuition, we are missing only one more thing: a way to evaluate a polynomial +at a number, or some other object (e.g., another polynomial or a function). +This is easy: Let $p=\left( p_{0},p_{1},p_{2},\ldots\right) \in +\mathbb{Q}\left[ X\right] $ be a polynomial, and let $\alpha\in\mathbb{Q}$. +Then, $p\left( \alpha\right) $ means the number $p_{0}+p_{1}\alpha ++p_{2}\alpha^{2}+\cdots\in\mathbb{Q}$. (Again, the infinite sum $p_{0}% ++p_{1}\alpha+p_{2}\alpha^{2}+\cdots$ makes sense because of +(\ref{eq.def.polynomial-univar.finite}).) 
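Since all but finitely many coefficients vanish, this evaluation is a finite sum and is straightforward to compute. Here is a minimal Python sketch of ours (the name \texttt{poly\_eval} is hypothetical; \texttt{Fraction} stands in for $\mathbb{Q}$):

```python
# Sketch (not from the text): evaluating p = (p_0, p_1, p_2, ...) at alpha,
# i.e., computing p(alpha) = p_0 + p_1*alpha + p_2*alpha^2 + ... (a finite sum).
from fractions import Fraction

def poly_eval(coeffs, alpha):
    """Evaluate the polynomial with (finite) coefficient list coeffs at alpha."""
    return sum(c * alpha ** i for i, c in enumerate(coeffs))

p = [1, -2, 0, 3]  # the polynomial 1 - 2X + 3X^3 from the example below
assert poly_eval(p, 0) == 1              # p(0) is the constant term
assert poly_eval(p, 1) == 1 - 2 + 0 + 3  # p(1) is the sum of all coefficients
assert poly_eval(p, Fraction(1, 2)) == Fraction(3, 8)
```

(For long coefficient lists one would use Horner's rule instead, but the direct sum matches the definition.)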
Similarly, we can define $p\left(
+\alpha\right) $ when $\alpha\in\mathbb{R}$ (but in this case, $p\left(
+\alpha\right) $ will be an element of $\mathbb{R}$) or when $\alpha
+\in\mathbb{C}$ (in this case, $p\left( \alpha\right) \in\mathbb{C}$) or when
+$\alpha$ is a square matrix with rational entries (in this case, $p\left(
+\alpha\right) $ will also be such a matrix) or when $\alpha$ is another
+polynomial (in this case, $p\left( \alpha\right) $ is such a polynomial as well).
+
+For example, if $p=\left( 1,-2,0,3,0,0,0,\ldots\right) =1-2X+3X^{3}$, then
+$p\left( \alpha\right) =1-2\alpha+3\alpha^{3}$ for every $\alpha$.
+
+The map $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha\mapsto p\left(
+\alpha\right) $ is called the \textit{polynomial function described by }$p$.
+As we said above, this function is not $p$, and it is not a good idea to
+equate it with $p$.
+
+If $\alpha$ is a number (or a square matrix, or another polynomial), then
+$p\left( \alpha\right) $ is called the result of \textit{evaluating }$p$
+\textit{at }$X=\alpha$ (or, simply, evaluating $p$ at $\alpha$), or the result
+of \textit{substituting }$\alpha$\textit{ for }$X$\textit{ in }$p$. This
+notation, of course, reminds us of functions; nevertheless (as we have already
+said a few times), $p$ is \textbf{not a function}.
+
+Probably the three simplest cases of evaluation are the following:
+
+\begin{itemize}
+\item We have $p\left( 0\right) =p_{0}+p_{1}0^{1}+p_{2}0^{2}+\cdots=p_{0}$.
+In other words, evaluating $p$ at $X=0$ yields the constant term of $p$.
+
+\item We have $p\left( 1\right) =p_{0}+p_{1}1^{1}+p_{2}1^{2}+\cdots
+=p_{0}+p_{1}+p_{2}+\cdots$. In other words, evaluating $p$ at $X=1$ yields the
+sum of all coefficients of $p$.
+
+\item We have $p\left( X\right) =p_{0}+p_{1}X^{1}+p_{2}X^{2}+\cdots
+=p_{0}+p_{1}X+p_{2}X^{2}+\cdots=p$. In other words, evaluating $p$ at $X=X$
+yields $p$ itself. This allows us to write $p\left( X\right) $ for $p$.
Many +authors do so, just in order to stress that $p$ is a polynomial and that the +indeterminate is called $X$. It should be kept in mind that $X$ is \textbf{not +a variable} (just as $p$ is \textbf{not a function}); it is the (fixed!) +sequence $\left( 0,1,0,0,0,\ldots\right) \in\mathbb{Q}\left[ X\right] $ +which serves as the indeterminate for polynomials in $\mathbb{Q}\left[ +X\right] $. +\end{itemize} + +\textbf{(l)} Often, one wants (or is required) to give an indeterminate a name +other than $X$. (For instance, instead of polynomials with rational +coefficients, we could be considering polynomials whose coefficients +themselves are polynomials in $\mathbb{Q}\left[ X\right] $; and then, we +would not be allowed to use the letter $X$ for the \textquotedblleft +new\textquotedblright\ indeterminate anymore, as it already means the +indeterminate of $\mathbb{Q}\left[ X\right] $ !) This can be done, and the +rules are the following: Any letter (that does not already have a meaning) can +be used to denote the indeterminate; but then, the set of all polynomials has +to be renamed as $\mathbb{Q}\left[ \eta\right] $, where $\eta$ is this +letter. For instance, if we want to denote the indeterminate as $x$, then we +have to denote the set by $\mathbb{Q}\left[ x\right] $. + +It is furthermore convenient to regard the sets $\mathbb{Q}\left[ +\eta\right] $ for different letters $\eta$ as distinct. Thus, for example, +the polynomial $3X^{2}+1$ is not the same as the polynomial $3Y^{2}+1$. (The +reason for doing so is that one sometimes wishes to view both of these +polynomials as polynomials in the two variables $X$ and $Y$.) 
Formally +speaking, this means that we should define a polynomial in $\mathbb{Q}\left[ +\eta\right] $ to be not just a sequence $\left( p_{0},p_{1},p_{2}% +,\ldots\right) $ of rational numbers, but actually a pair $\left( \left( +p_{0},p_{1},p_{2},\ldots\right) ,\text{\textquotedblleft}\eta +\text{\textquotedblright}\right) $ of a sequence of rational numbers and the +letter $\eta$. (Here, \textquotedblleft$\eta$\textquotedblright\ really means +the letter $\eta$, not the sequence $\left( 0,1,0,0,0,\ldots\right) $.) This +is, of course, a very technical point which is of little relevance to most of +mathematics; it becomes important when one tries to implement polynomials in a +programming language. + +\textbf{(m)} As already explained, we can replace $\mathbb{Q}$ by $\mathbb{Z}% +$, $\mathbb{R}$, $\mathbb{C}$ or any other commutative ring $\mathbb{K}$ in +the above definition. (See Definition \ref{def.commring} for the definition of +a commutative ring.) When $\mathbb{Q}$ is replaced by a commutative ring +$\mathbb{K}$, the notion of \textquotedblleft univariate polynomials with +rational coefficients\textquotedblright\ becomes \textquotedblleft univariate +polynomials with coefficients in $\mathbb{K}$\textquotedblright\ (also known +as \textquotedblleft univariate polynomials over $\mathbb{K}$% +\textquotedblright), and the set of such polynomials is denoted by +$\mathbb{K}\left[ X\right] $ rather than $\mathbb{Q}\left[ X\right] $. +\end{definition} + +So much for univariate polynomials. + +Polynomials in multiple variables are (in my opinion) treated the best in +\cite[Chapter II, \S 3]{Lang02}, where they are introduced as elements of a +monoid ring. However, this treatment is rather abstract and uses a good deal +of algebraic language\footnote{Also, the book \cite{Lang02} is notorious for +its unpolished writing; it is best read with Bergman's companion +\cite{Bergman-Lang} at hand.}. 
The treatments in \cite[\S 4.5]{Walker87}, in
+\cite[Chapter A-3]{Rotman15} and in \cite[Chapter IV, \S 4]{BirkMac} use the
+above-mentioned recursive shortcut, which (in my opinion) makes them
+inferior. A neat (and rather elementary) treatment of polynomials in $n$
+variables (for finite $n$) can be found in \cite[Chapter III, \S
+5]{Hungerford-03} and in \cite[\S 8]{AmaEsc05}; it generalizes the viewpoint
+we used in Definition \ref{def.polynomial-univar} for univariate polynomials
+above\footnote{You are reading correctly: The analysis textbook
+\cite{AmaEsc05} is one of the few sources I am aware of that define the
+(algebraic!) notion of polynomials precisely and well.}.
+
+\section{\label{chp.ind}A closer look at induction}
+
+In this chapter, we shall recall several versions of the \textit{induction
+principle} (the principle of mathematical induction) and provide examples of
+their use. We assume that the reader is at least somewhat familiar with
+mathematical induction\footnote{If not, introductions can be found in
+\cite[Chapter 5]{LeLeMe16}, \cite{Day-proofs}, \cite[Chapter 6]{Vellem06},
+\cite[Chapter 10]{Hammac15}, \cite{Vorobi02} and various other sources.}; we
+shall present some nonstandard examples of its use (including a proof of the
+legitimacy of the definition of a sum $\sum_{s\in S}a_{s}$ given in Section
+\ref{sect.sums-repetitorium}).
+
+\subsection{\label{sect.ind.IP0}Standard induction}
+
+\subsubsection{The Principle of Mathematical Induction}
+
+We first recall the classical principle of mathematical
+induction\footnote{Keep in mind that $\mathbb{N}$ means the set $\left\{
+0,1,2,\ldots\right\} $ for us.}:
+
+\begin{theorem}
+\label{thm.ind.IP0}For each $n\in\mathbb{N}$, let $\mathcal{A}\left(
+n\right) $ be a logical statement.
+
+Assume the following:
+
+\begin{statement}
+\textit{Assumption 1:} The statement $\mathcal{A}\left( 0\right) $ holds.
+\end{statement}
+
+\begin{statement}
+\textit{Assumption 2:} If $m\in\mathbb{N}$ is such that $\mathcal{A}\left(
+m\right) $ holds, then $\mathcal{A}\left( m+1\right) $ also holds.
+\end{statement}
+
+Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{N}$.
+\end{theorem}
+
+Theorem \ref{thm.ind.IP0} is commonly taken to be one of the axioms of
+mathematics (the \textquotedblleft axiom of induction\textquotedblright), or
+(in type theory) regarded as part of the definition of $\mathbb{N}$.
+Intuitively, Theorem \ref{thm.ind.IP0} should be obvious: For example, if you
+want to prove (under the assumptions of Theorem \ref{thm.ind.IP0}) that
+$\mathcal{A}\left( 4\right) $ holds, you can argue as follows:
+
+\begin{itemize}
+\item By Assumption 1, the statement $\mathcal{A}\left( 0\right) $ holds.
+
+\item Thus, by Assumption 2 (applied to $m=0$), the statement
+$\mathcal{A}\left( 1\right) $ holds.
+
+\item Thus, by Assumption 2 (applied to $m=1$), the statement
+$\mathcal{A}\left( 2\right) $ holds.
+
+\item Thus, by Assumption 2 (applied to $m=2$), the statement
+$\mathcal{A}\left( 3\right) $ holds.
+
+\item Thus, by Assumption 2 (applied to $m=3$), the statement
+$\mathcal{A}\left( 4\right) $ holds.
+\end{itemize}
+
+A similar (but longer) argument shows that the statement
+$\mathcal{A}\left( 5\right) $ holds. Likewise, you can show that the
+statement $\mathcal{A}\left( 15\right) $ holds, if you have the patience to
+apply Assumption 2 a total of $15$ times. It is thus not surprising that
+$\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{N}$; but if you
+don't assume Theorem \ref{thm.ind.IP0} as an axiom, you would need to write
+down a different proof for each value of $n$ (a proof that becomes longer as
+$n$ becomes larger), and thus would never reach the general result (i.e.,
+that $\mathcal{A}\left( n\right) $ holds for \textbf{each} $n\in\mathbb{N}%
+$), because you cannot write down infinitely many proofs.
What Theorem \ref{thm.ind.IP0} does is, roughly
+speaking, to apply Assumption 2 for you as many times as is needed for each
+$n\in\mathbb{N}$.
+
+(Authors of textbooks like to visualize Theorem \ref{thm.ind.IP0} by
+envisioning an infinite sequence of dominos (numbered $0,1,2,\ldots$) placed
+in a row, sufficiently close to each other that if domino $m$ falls, then
+domino $m+1$ will also fall. Now, assume that you kick domino $0$ over. What
+Theorem \ref{thm.ind.IP0} then says is that each domino will fall. See, e.g.,
+\cite[Chapter 10]{Hammac15} for a detailed explanation of this metaphor. Here
+is another metaphor for Theorem \ref{thm.ind.IP0}: Assume that there is a
+virus that infects nonnegative integers. Once it has infected some
+$m\in\mathbb{N}$, it will soon spread to $m+1$ as well. Now, assume that $0$
+gets infected. Then, Theorem \ref{thm.ind.IP0} says that each $n\in\mathbb{N}$
+will eventually be infected.)
+
+Theorem \ref{thm.ind.IP0} is called the \textit{principle of induction} or
+\textit{principle of complete induction} or \textit{principle of mathematical
+induction}, and we shall also call it the \textit{principle of standard
+induction} in order to distinguish it from several variant \textquotedblleft
+principles of induction\textquotedblright\ that we will see later. Proofs that
+use this principle are called \textit{proofs by induction} or
+\textit{induction proofs}. Usually, in such proofs, we don't explicitly cite
+Theorem \ref{thm.ind.IP0}, but instead say certain words that signal that
+Theorem \ref{thm.ind.IP0} is being applied and that (ideally) also indicate
+what statements $\mathcal{A}\left( n\right) $ it is being applied
+to\footnote{We will explain this in Convention \ref{conv.ind.IP0lang}
+below.}. However, for our very first example of a proof by induction, we are
+going to use Theorem \ref{thm.ind.IP0} explicitly.
We shall show the following fact: + +\begin{proposition} +\label{prop.ind.ari-geo}Let $q$ and $d$ be two real numbers such that $q\neq +1$. Let $\left( a_{0},a_{1},a_{2},\ldots\right) $ be a sequence of real +numbers. Assume that% +\begin{equation} +a_{n+1}=qa_{n}+d\ \ \ \ \ \ \ \ \ \ \text{for each }n\in\mathbb{N}. +\label{eq.prop.ind.ari-geo.ass}% +\end{equation} +Then, +\begin{equation} +a_{n}=q^{n}a_{0}+\dfrac{q^{n}-1}{q-1}d\ \ \ \ \ \ \ \ \ \ \text{for each }% +n\in\mathbb{N}. \label{eq.prop.ind.ari-geo.claim}% +\end{equation} + +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.ari-geo}.]For each $n\in\mathbb{N}$, we +let $\mathcal{A}\left( n\right) $ be the statement \newline$\left( +a_{n}=q^{n}a_{0}+\dfrac{q^{n}-1}{q-1}d\right) $. Thus, our goal is to prove +the statement $\mathcal{A}\left( n\right) $ for each $n\in\mathbb{N}$. + +We first notice that the statement $\mathcal{A}\left( 0\right) $ +holds\footnote{\textit{Proof.} This is easy to verify: We have $q^{0}=1$, thus +$q^{0}-1=0$, and therefore $\dfrac{q^{0}-1}{q-1}=\dfrac{0}{q-1}=0$. Now,% +\[ +\underbrace{q^{0}}_{=1}a_{0}+\underbrace{\dfrac{q^{0}-1}{q-1}}_{=0}% +d=1a_{0}+0d=a_{0}, +\] +so that $a_{0}=q^{0}a_{0}+\dfrac{q^{0}-1}{q-1}d$. But this is precisely the +statement $\mathcal{A}\left( 0\right) $ (since $\mathcal{A}\left( 0\right) +$ is defined to be the statement $\left( a_{0}=q^{0}a_{0}+\dfrac{q^{0}% +-1}{q-1}d\right) $). Hence, the statement $\mathcal{A}\left( 0\right) $ +holds.}. + +Now, we claim that +\begin{equation} +\text{if }m\in\mathbb{N}\text{ is such that }\mathcal{A}\left( m\right) +\text{ holds, then }\mathcal{A}\left( m+1\right) \text{ also holds.} +\label{pf.prop.ind.ari-geo.step}% +\end{equation} + + +[\textit{Proof of (\ref{pf.prop.ind.ari-geo.step}):} Let $m\in\mathbb{N}$ be +such that $\mathcal{A}\left( m\right) $ holds. We must show that +$\mathcal{A}\left( m+1\right) $ also holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds. 
In other words, +$a_{m}=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d$ holds\footnote{because $\mathcal{A}% +\left( m\right) $ is defined to be the statement $\left( a_{m}=q^{m}% +a_{0}+\dfrac{q^{m}-1}{q-1}d\right) $}. Now, (\ref{eq.prop.ind.ari-geo.ass}) +(applied to $n=m$) yields% +\begin{align*} +a_{m+1} & =q\underbrace{a_{m}}_{=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}% +d}+d=q\left( q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d\right) +d\\ +& =\underbrace{qq^{m}}_{=q^{m+1}}a_{0}+\underbrace{q\cdot\dfrac{q^{m}-1}% +{q-1}d+d}_{=\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) d}\\ +& =q^{m+1}a_{0}+\underbrace{\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) +}_{\substack{=\dfrac{q\left( q^{m}-1\right) +\left( q-1\right) }% +{q-1}=\dfrac{q^{m+1}-1}{q-1}\\\text{(since }q\left( q^{m}-1\right) +\left( +q-1\right) =qq^{m}-q+q-1=qq^{m}-1=q^{m+1}-1\text{)}}}d\\ +& =q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d. +\end{align*} +So we have shown that $a_{m+1}=q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d$. But this +is precisely the statement $\mathcal{A}\left( m+1\right) $% +\ \ \ \ \footnote{because $\mathcal{A}\left( m+1\right) $ is defined to be +the statement $\left( a_{m+1}=q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d\right) $% +}. Thus, the statement $\mathcal{A}\left( m+1\right) $ holds. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\mathbb{N}$ is +such that $\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( +m+1\right) $ also holds. This proves (\ref{pf.prop.ind.ari-geo.step}).] + +Now, both assumptions of Theorem \ref{thm.ind.IP0} are satisfied (indeed, +Assumption 1 holds because the statement $\mathcal{A}\left( 0\right) $ +holds, whereas Assumption 2 holds because of (\ref{pf.prop.ind.ari-geo.step}% +)). Thus, Theorem \ref{thm.ind.IP0} shows that $\mathcal{A}\left( n\right) $ +holds for each $n\in\mathbb{N}$. 
In other words, $a_{n}=q^{n}a_{0}% ++\dfrac{q^{n}-1}{q-1}d$ holds for each $n\in\mathbb{N}$ (since $\mathcal{A}% +\left( n\right) $ is the statement $\left( a_{n}=q^{n}a_{0}+\dfrac{q^{n}% +-1}{q-1}d\right) $). This proves Proposition \ref{prop.ind.ari-geo}. +\end{proof} + +\subsubsection{Conventions for writing induction proofs} + +Now, let us introduce some standard language that is commonly used in proofs +by induction: + +\begin{convention} +\label{conv.ind.IP0lang}For each $n\in\mathbb{N}$, let $\mathcal{A}\left( +n\right) $ be a logical statement. Assume that you want to prove that +$\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{N}$. + +Theorem \ref{thm.ind.IP0} offers the following strategy for proving this: +First show that Assumption 1 of Theorem \ref{thm.ind.IP0} is satisfied; then, +show that Assumption 2 of Theorem \ref{thm.ind.IP0} is satisfied; then, +Theorem \ref{thm.ind.IP0} automatically completes your proof. + +A proof that follows this strategy is called a \textit{proof by induction on +}$n$ (or \textit{proof by induction over }$n$) or (less precisely) an +\textit{inductive proof}. When you follow this strategy, you say that you are +\textit{inducting on }$n$ (or \textit{over }$n$). The proof that Assumption 1 +is satisfied is called the \textit{induction base} (or \textit{base case}) of +the proof. The proof that Assumption 2 is satisfied is called the +\textit{induction step} of the proof. + +In order to prove that Assumption 2 is satisfied, you will usually want to fix +an $m\in\mathbb{N}$ such that $\mathcal{A}\left( m\right) $ holds, and then +prove that $\mathcal{A}\left( m+1\right) $ holds. In other words, you will +usually want to fix $m\in\mathbb{N}$, assume that $\mathcal{A}\left( +m\right) $ holds, and then prove that $\mathcal{A}\left( m+1\right) $ +holds. When doing so, it is common to refer to the assumption that +$\mathcal{A}\left( m\right) $ holds as the \textit{induction hypothesis} (or +\textit{induction assumption}). 
+\end{convention} + +Using this language, we can rewrite our above proof of Proposition +\ref{prop.ind.ari-geo} as follows: + +\begin{proof} +[Proof of Proposition \ref{prop.ind.ari-geo} (second version).]For each +$n\in\mathbb{N}$, we let $\mathcal{A}\left( n\right) $ be the statement +$\left( a_{n}=q^{n}a_{0}+\dfrac{q^{n}-1}{q-1}d\right) $. Thus, our goal is +to prove the statement $\mathcal{A}\left( n\right) $ for each $n\in +\mathbb{N}$. + +We shall prove this by induction on $n$: + +\textit{Induction base:} We have $q^{0}=1$, thus $q^{0}-1=0$, and therefore +$\dfrac{q^{0}-1}{q-1}=\dfrac{0}{q-1}=0$. Now,% +\[ +\underbrace{q^{0}}_{=1}a_{0}+\underbrace{\dfrac{q^{0}-1}{q-1}}_{=0}% +d=1a_{0}+0d=a_{0}, +\] +so that $a_{0}=q^{0}a_{0}+\dfrac{q^{0}-1}{q-1}d$. But this is precisely the +statement $\mathcal{A}\left( 0\right) $ (since $\mathcal{A}\left( 0\right) +$ is defined to be the statement $\left( a_{0}=q^{0}a_{0}+\dfrac{q^{0}% +-1}{q-1}d\right) $). Hence, the statement $\mathcal{A}\left( 0\right) $ +holds. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that $\mathcal{A}\left( +m\right) $ holds. We must show that $\mathcal{A}\left( m+1\right) $ also holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds (this is our +induction hypothesis). In other words, $a_{m}=q^{m}a_{0}+\dfrac{q^{m}-1}% +{q-1}d$ holds\footnote{because $\mathcal{A}\left( m\right) $ is defined to +be the statement $\left( a_{m}=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d\right) $}. 
+Now, (\ref{eq.prop.ind.ari-geo.ass}) (applied to $n=m$) yields% +\begin{align*} +a_{m+1} & =q\underbrace{a_{m}}_{=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}% +d}+d=q\left( q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d\right) +d\\ +& =\underbrace{qq^{m}}_{=q^{m+1}}a_{0}+\underbrace{q\cdot\dfrac{q^{m}-1}% +{q-1}d+d}_{=\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) d}\\ +& =q^{m+1}a_{0}+\underbrace{\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) +}_{\substack{=\dfrac{q\left( q^{m}-1\right) +\left( q-1\right) }% +{q-1}=\dfrac{q^{m+1}-1}{q-1}\\\text{(since }q\left( q^{m}-1\right) +\left( +q-1\right) =qq^{m}-q+q-1=qq^{m}-1=q^{m+1}-1\text{)}}}d\\ +& =q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d. +\end{align*} +So we have shown that $a_{m+1}=q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d$. But this +is precisely the statement $\mathcal{A}\left( m+1\right) $% +\ \ \ \ \footnote{because $\mathcal{A}\left( m+1\right) $ is defined to be +the statement $\left( a_{m+1}=q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d\right) $% +}. Thus, the statement $\mathcal{A}\left( m+1\right) $ holds. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\mathbb{N}$ is +such that $\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( +m+1\right) $ also holds. This completes the induction step. + +Thus, we have completed both the induction base and the induction step. Hence, +by induction, we conclude that $\mathcal{A}\left( n\right) $ holds for each +$n\in\mathbb{N}$. This proves Proposition \ref{prop.ind.ari-geo}. +\end{proof} + +The proof we just gave still has a lot of \textquotedblleft +boilerplate\textquotedblright\ text. For example, we have explicitly defined +the statement $\mathcal{A}\left( n\right) $, but it is not really necessary, +since it is clear what this statement should be (viz., it should be the claim +we are proving, without the \textquotedblleft for each $n\in\mathbb{N}% +$\textquotedblright\ part). Allowing ourselves some imprecision, we could say +this statement is simply (\ref{eq.prop.ind.ari-geo.claim}). 
(This is a bit
+imprecise, because (\ref{eq.prop.ind.ari-geo.claim}) contains the words
+\textquotedblleft for each $n\in\mathbb{N}$\textquotedblright, but it should
+be clear that we don't mean to include these words, since there can be no
+\textquotedblleft for each $n\in\mathbb{N}$\textquotedblright\ in the
+statement $\mathcal{A}\left( n\right) $.) Furthermore, we don't need to
+write the sentence
+
+\begin{quote}
+\textquotedblleft Thus, we have completed both the induction base and the
+induction step\textquotedblright
+\end{quote}
+
+\noindent before we declare our inductive proof to be finished; it is clear
+enough that we have completed them. We can also remove the following two sentences:
+
+\begin{quote}
+\textquotedblleft Now, forget that we fixed $m$. We thus have shown that if
+$m\in\mathbb{N}$ is such that $\mathcal{A}\left( m\right) $ holds, then
+$\mathcal{A}\left( m+1\right) $ also holds.\textquotedblright
+\end{quote}
+
+\noindent In fact, these sentences merely say that we have completed the
+induction step; they carry no other information (since the induction step
+always consists in fixing $m\in\mathbb{N}$ such that $\mathcal{A}\left(
+m\right) $ holds, and proving that $\mathcal{A}\left( m+1\right) $ also
+holds). So once we say that the induction step is completed, we don't need
+these sentences anymore.
+
+So we can shorten our proof above a bit further:
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.ari-geo} (third version).]We shall prove
+(\ref{eq.prop.ind.ari-geo.claim}) by induction on $n$:
+
+\textit{Induction base:} We have $q^{0}=1$, thus $q^{0}-1=0$, and therefore
+$\dfrac{q^{0}-1}{q-1}=\dfrac{0}{q-1}=0$. Now,%
+\[
+\underbrace{q^{0}}_{=1}a_{0}+\underbrace{\dfrac{q^{0}-1}{q-1}}_{=0}%
+d=1a_{0}+0d=a_{0},
+\]
+so that $a_{0}=q^{0}a_{0}+\dfrac{q^{0}-1}{q-1}d$. 
In other words, +(\ref{eq.prop.ind.ari-geo.claim}) holds for $n=0$.\ \ \ \ \footnote{Note that +the statement \textquotedblleft(\ref{eq.prop.ind.ari-geo.claim}) holds for +$n=0$\textquotedblright\ (which we just proved) is precisely the statement +$\mathcal{A}\left( 0\right) $ in the previous two versions of our proof.} +This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{eq.prop.ind.ari-geo.claim}) holds for $n=m$.\ \ \ \ \footnote{Note that +the statement \textquotedblleft(\ref{eq.prop.ind.ari-geo.claim}) holds for +$n=m$\textquotedblright\ (which we just assumed) is precisely the statement +$\mathcal{A}\left( m\right) $ in the previous two versions of our proof.} We +must show that (\ref{eq.prop.ind.ari-geo.claim}) holds for $n=m+1$% +.\ \ \ \ \footnote{Note that this statement \textquotedblleft% +(\ref{eq.prop.ind.ari-geo.claim}) holds for $n=m+1$\textquotedblright\ is +precisely the statement $\mathcal{A}\left( m+1\right) $ in the previous two +versions of our proof.} + +We have assumed that (\ref{eq.prop.ind.ari-geo.claim}) holds for $n=m$. In +other words, $a_{m}=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d$ holds. Now, +(\ref{eq.prop.ind.ari-geo.ass}) (applied to $n=m$) yields% +\begin{align*} +a_{m+1} & =q\underbrace{a_{m}}_{=q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}% +d}+d=q\left( q^{m}a_{0}+\dfrac{q^{m}-1}{q-1}d\right) +d\\ +& =\underbrace{qq^{m}}_{=q^{m+1}}a_{0}+\underbrace{q\cdot\dfrac{q^{m}-1}% +{q-1}d+d}_{=\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) d}\\ +& =q^{m+1}a_{0}+\underbrace{\left( q\cdot\dfrac{q^{m}-1}{q-1}+1\right) +}_{\substack{=\dfrac{q\left( q^{m}-1\right) +\left( q-1\right) }% +{q-1}=\dfrac{q^{m+1}-1}{q-1}\\\text{(since }q\left( q^{m}-1\right) +\left( +q-1\right) =qq^{m}-q+q-1=qq^{m}-1=q^{m+1}-1\text{)}}}d\\ +& =q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d. +\end{align*} +So we have shown that $a_{m+1}=q^{m+1}a_{0}+\dfrac{q^{m+1}-1}{q-1}d$. In other +words, (\ref{eq.prop.ind.ari-geo.claim}) holds for $n=m+1$. 
This completes the
+induction step. Hence, (\ref{eq.prop.ind.ari-geo.claim}) is proven by
+induction. This proves Proposition \ref{prop.ind.ari-geo}.
+\end{proof}
+
+\subsection{Examples from modular arithmetic}
+
+\subsubsection{Divisibility of integers}
+
+We shall soon give some more examples of inductive proofs, including some that
+involve slightly new tactics. These examples come from the realm of
+\textit{modular arithmetic}, which is the study of congruences modulo
+integers. Before we come to these examples, we will introduce the definition
+of such congruences. But first, let us recall the definition of divisibility:
+
+\begin{definition}
+\label{def.divisibility}Let $u$ and $v$ be two integers. Then, we say that $u$
+\textit{divides} $v$ if and only if there exists an integer $w$ such that
+$v=uw$. Instead of saying \textquotedblleft$u$ divides $v$\textquotedblright,
+we can also say \textquotedblleft$v$ is \textit{divisible by }$u$%
+\textquotedblright\ or \textquotedblleft$v$ is a \textit{multiple} of
+$u$\textquotedblright\ or \textquotedblleft$u$ is a \textit{divisor} of
+$v$\textquotedblright\ or \textquotedblleft$u\mid v$\textquotedblright.
+\end{definition}
+
+Thus, two integers $u$ and $v$ satisfy $u\mid v$ if and only if there is some
+$w\in\mathbb{Z}$ such that $v=uw$. For example, $1\mid v$ holds for every
+integer $v$ (since $v=1v$), whereas $0\mid v$ holds only for $v=0$ (since
+$v=0w$ is equivalent to $v=0$). An integer $v$ satisfies $2\mid v$ if and only
+if $v$ is even.
+
+Definition \ref{def.divisibility} is fairly common in the modern literature
+(e.g., it is used in \cite{Day-proofs}, \cite{LeLeMe16}, \cite{Mulhol16} and
+\cite{Rotman15}), but there are also some books that define these notations
+differently. 
For example, in \cite{GKP}, the notation \textquotedblleft$u$ +divides $v$\textquotedblright\ is defined differently (it requires not only +the existence of an integer $w$ such that $v=uw$, but also that $u$ is +positive), whereas the notation \textquotedblleft$v$ is a multiple of +$u$\textquotedblright\ is defined as it is here (i.e., it just means that +there exists an integer $w$ such that $v=uw$); thus, these two notations are +not mutually interchangeable in \cite{GKP}. + +Let us first prove some basic properties of divisibility: + +\begin{proposition} +\label{prop.div.trans}Let $a$, $b$ and $c$ be three integers such that $a\mid +b$ and $b\mid c$. Then, $a\mid c$. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.div.trans}.]We have $a\mid b$. In other words, +there exists an integer $w$ such that $b=aw$ (by the definition of +\textquotedblleft divides\textquotedblright). Consider this $w$, and denote it +by $k$. Thus, $k$ is an integer such that $b=ak$. + +We have $b\mid c$. In other words, there exists an integer $w$ such that +$c=bw$ (by the definition of \textquotedblleft divides\textquotedblright). +Consider this $w$, and denote it by $j$. Thus, $j$ is an integer such that +$c=bj$. + +Now, $c=\underbrace{b}_{=ak}j=akj$. Hence, there exists an integer $w$ such +that $c=aw$ (namely, $w=kj$). In other words, $a$ divides $c$ (by the +definition of \textquotedblleft divides\textquotedblright). In other words, +$a\mid c$. This proves Proposition \ref{prop.div.trans}. +\end{proof} + +\begin{proposition} +\label{prop.div.acbc}Let $a$, $b$ and $c$ be three integers such that $a\mid +b$. Then, $ac\mid bc$. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.div.acbc}.]We have $a\mid b$. In other words, +there exists an integer $w$ such that $b=aw$ (by the definition of +\textquotedblleft divides\textquotedblright). Consider this $w$, and denote it +by $k$. Thus, $k$ is an integer such that $b=ak$. 
Hence, $\underbrace{b}% +_{=ak}c=akc=ack$. Thus, there exists an integer $w$ such that $bc=acw$ +(namely, $w=k$). In other words, $ac$ divides $bc$ (by the definition of +\textquotedblleft divides\textquotedblright). In other words, $ac\mid bc$. +This proves Proposition \ref{prop.div.acbc}. +\end{proof} + +\begin{proposition} +\label{prop.div.ax+by}Let $a$, $b$, $g$, $x$ and $y$ be integers such that +$g=ax+by$. Let $d$ be an integer such that $d\mid a$ and $d\mid b$. Then, +$d\mid g$. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.div.ax+by}.]We have $d\mid a$. In other words, +there exists an integer $w$ such that $a=dw$ (by the definition of +\textquotedblleft divides\textquotedblright). Consider this $w$, and denote it +by $p$. Thus, $p$ is an integer and satisfies $a=dp$. + +\begin{vershort} +Similarly, there is an integer $q$ such that $b=dq$. Consider this $q$. +\end{vershort} + +\begin{verlong} +We have $d\mid b$. In other words, there exists an integer $w$ such that +$b=dw$ (by the definition of \textquotedblleft divides\textquotedblright). +Consider this $w$, and denote it by $q$. Thus, $q$ is an integer and satisfies +$b=dq$. +\end{verlong} + +Now, $g=\underbrace{a}_{=dp}x+\underbrace{b}_{=dq}y=dpx+dqy=d\left( +px+qy\right) $. Hence, there exists an integer $w$ such that $g=dw$ (namely, +$w=px+qy$). In other words, $d\mid g$ (by the definition of \textquotedblleft +divides\textquotedblright). This proves Proposition \ref{prop.div.ax+by}. +\end{proof} + +It is easy to characterize divisibility in terms of fractions: + +\begin{proposition} +\label{prop.div.frac}Let $a$ and $b$ be two integers such that $a\neq0$. Then, +$a\mid b$ if and only if $b/a$ is an integer. 
+
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.div.frac}.]We first claim the following
+logical implication\footnote{A \textit{logical implication} (or, for short, an
+\textit{implication}) is a logical statement of the form \textquotedblleft if
+$\mathcal{A}$, then $\mathcal{B}$\textquotedblright\ (where $\mathcal{A}$ and
+$\mathcal{B}$ are two statements).}:%
+\begin{equation}
+\left( a\mid b\right) \ \Longrightarrow\ \left( b/a\text{ is an
+integer}\right) . \label{pf.prop.div.frac.1}%
+\end{equation}
+
+
+[\textit{Proof of (\ref{pf.prop.div.frac.1}):} Assume that $a\mid b$. In other
+words, there exists an integer $w$ such that $b=aw$ (by the definition of
+\textquotedblleft divides\textquotedblright). Consider this $w$. Now, dividing
+the equality $b=aw$ by $a$, we obtain $b/a=w$ (since $a\neq0$). Hence, $b/a$
+is an integer (since $w$ is an integer). This proves the implication
+(\ref{pf.prop.div.frac.1}).]
+
+Next, we claim the following logical implication:%
+\begin{equation}
+\left( b/a\text{ is an integer}\right) \ \Longrightarrow\ \left( a\mid
+b\right) . \label{pf.prop.div.frac.2}%
+\end{equation}
+
+
+[\textit{Proof of (\ref{pf.prop.div.frac.2}):} Assume that $b/a$ is an
+integer. Let $k$ denote this integer. Thus, $b/a=k$, so that $b=ak$. Hence,
+there exists an integer $w$ such that $b=aw$ (namely, $w=k$). In other words,
+$a$ divides $b$ (by the definition of \textquotedblleft
+divides\textquotedblright). In other words, $a\mid b$. This proves the
+implication (\ref{pf.prop.div.frac.2}).]
+
+Combining the implications (\ref{pf.prop.div.frac.1}) and
+(\ref{pf.prop.div.frac.2}), we obtain the equivalence $\left( a\mid b\right)
+\ \Longleftrightarrow\ \left( b/a\text{ is an integer}\right) $. In other
+words, $a\mid b$ if and only if $b/a$ is an integer. This proves Proposition
+\ref{prop.div.frac}. 
+\end{proof} + +\subsubsection{Definition of congruences} + +We can now define congruences: + +\begin{definition} +\label{def.mod.equiv}Let $a$, $b$ and $n$ be three integers. Then, we say that +$a$\textit{ is congruent to }$b$\textit{ modulo }$n$ if and only if $n\mid +a-b$. We shall use the notation \textquotedblleft$a\equiv b\operatorname{mod}% +n$\textquotedblright\ for \textquotedblleft$a$ is congruent to $b$ modulo +$n$\textquotedblright. Relations of the form \textquotedblleft$a\equiv +b\operatorname{mod}n$\textquotedblright\ (for integers $a$, $b$ and $n$) are +called \textit{congruences modulo }$n$. +\end{definition} + +Thus, three integers $a$, $b$ and $n$ satisfy $a\equiv b\operatorname{mod}n$ +if and only if $n\mid a-b$. + +Hence, in particular: + +\begin{itemize} +\item Any two integers $a$ and $b$ satisfy $a\equiv b\operatorname{mod}1$. +(Indeed, any two integers $a$ and $b$ satisfy $a-b=1\left( a-b\right) $, +thus $1\mid a-b$, thus $a\equiv b\operatorname{mod}1$.) + +\item Two integers $a$ and $b$ satisfy $a\equiv b\operatorname{mod}0$ if and +only if $a=b$. (Indeed, $a\equiv b\operatorname{mod}0$ is equivalent to $0\mid +a-b$, which in turn is equivalent to $a-b=0$, which in turn is equivalent to +$a=b$.) + +\item Two integers $a$ and $b$ satisfy $a\equiv b\operatorname{mod}2$ if and +only if they have the same parity (i.e., they are either both odd or both +even). This is not obvious at this point yet, but follows easily from +Proposition \ref{prop.ind.quo-rem.odd} further below. +\end{itemize} + +We have% +\[ +4\equiv10\operatorname{mod}3\ \ \ \ \ \ \ \ \ \ \text{and}% +\ \ \ \ \ \ \ \ \ \ 5\equiv-35\operatorname{mod}4. +\] + + +Note that Day, in \cite{Day-proofs}, writes \textquotedblleft$a\equiv_{n}% +b$\textquotedblright\ instead of \textquotedblleft$a\equiv b\operatorname{mod}% +n$\textquotedblright. 
Also, other authors (particularly of older texts) write +\textquotedblleft$a\equiv b\pmod{n}$\textquotedblright\ instead of +\textquotedblleft$a\equiv b\operatorname{mod}n$\textquotedblright. + +Let us next introduce notations for the negations of the statements +\textquotedblleft$u\mid v$\textquotedblright\ and \textquotedblleft$a\equiv +b\operatorname{mod}n$\textquotedblright: + +\begin{definition} +\textbf{(a)} If $u$ and $v$ are two integers, then the notation +\textquotedblleft$u\nmid v$\textquotedblright\ shall mean \textquotedblleft +not $u\mid v$\textquotedblright\ (that is, \textquotedblleft$u$ does not +divide $v$\textquotedblright). + +\textbf{(b)} If $a$, $b$ and $n$ are three integers, then the notation +\textquotedblleft$a\not \equiv b\operatorname{mod}n$\textquotedblright\ shall +mean \textquotedblleft not $a\equiv b\operatorname{mod}n$\textquotedblright% +\ (that is, \textquotedblleft$a$ is not congruent to $b$ modulo $n$% +\textquotedblright). +\end{definition} + +Thus, three integers $a$, $b$ and $n$ satisfy $a\not \equiv +b\operatorname{mod}n$ if and only if $n\nmid a-b$. For example, $1\not \equiv +-1\operatorname{mod}3$, since $3\nmid1-\left( -1\right) $. + +\subsubsection{Congruence basics} + +Let us now state some of the basic laws of congruences (so far, not needing +induction to prove): + +\begin{proposition} +\label{prop.mod.0}Let $a$ and $n$ be integers. Then: + +\textbf{(a)} We have $a\equiv0\operatorname{mod}n$ if and only if $n\mid a$. + +\textbf{(b)} Let $b$ be an integer. Then, $a\equiv b\operatorname{mod}n$ if +and only if $a\equiv b\operatorname{mod}\left( -n\right) $. + +\textbf{(c)} Let $m$ and $b$ be integers such that $m\mid n$. If $a\equiv +b\operatorname{mod}n$, then $a\equiv b\operatorname{mod}m$. 
+\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.mod.0}.]\textbf{(a)} We have the following +chain of logical equivalences:% +\begin{align*} +& \ \left( a\equiv0\operatorname{mod}n\right) \\ +& \Longleftrightarrow\ \left( a\text{ is congruent to }0\text{ modulo +}n\right) \\ +& \ \ \ \ \ \ \ \ \ \ \left( \text{since \textquotedblleft}a\equiv +0\operatorname{mod}n\text{\textquotedblright\ is just a notation for +\textquotedblleft}a\text{ is congruent to }0\text{ modulo }% +n\text{\textquotedblright}\right) \\ +& \Longleftrightarrow\ \left( n\mid\underbrace{a-0}_{=a}\right) +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of \textquotedblleft +congruent\textquotedblright}\right) \\ +& \Longleftrightarrow\ \left( n\mid a\right) . +\end{align*} +Thus, we have $a\equiv0\operatorname{mod}n$ if and only if $n\mid a$. This +proves Proposition \ref{prop.mod.0} \textbf{(a)}. + +\textbf{(b)} Let us first assume that $a\equiv b\operatorname{mod}n$. Thus, +$a$ is congruent to $b$ modulo $n$. In other words, $n\mid a-b$ (by the +definition of \textquotedblleft congruent\textquotedblright). In other words, +$n$ divides $a-b$. In other words, there exists an integer $w$ such that +$a-b=nw$ (by the definition of \textquotedblleft divides\textquotedblright). +Consider this $w$, and denote it by $k$. Thus, $k$ is an integer such that +$a-b=nk$. + +Thus, $a-b=nk=\left( -n\right) \left( -k\right) $. Hence, there exists an +integer $w$ such that $a-b=\left( -n\right) w$ (namely, $w=-k$). In other +words, $-n$ divides $a-b$ (by the definition of \textquotedblleft +divides\textquotedblright). In other words, $-n\mid a-b$. In other words, $a$ +is congruent to $b$ modulo $-n$ (by the definition of \textquotedblleft +congruent\textquotedblright). In other words, $a\equiv b\operatorname{mod}% +\left( -n\right) $. + +Now, forget that we assumed that $a\equiv b\operatorname{mod}n$. 
We thus have +shown that% +\begin{equation} +\text{if }a\equiv b\operatorname{mod}n\text{, then }a\equiv +b\operatorname{mod}\left( -n\right) . \label{pf.prop.mod.0.b.1}% +\end{equation} +The same argument (applied to $-n$ instead of $n$) shows that% +\[ +\text{if }a\equiv b\operatorname{mod}\left( -n\right) \text{, then }a\equiv +b\operatorname{mod}\left( -\left( -n\right) \right) . +\] +Since $-\left( -n\right) =n$, this rewrites as follows:% +\[ +\text{if }a\equiv b\operatorname{mod}\left( -n\right) \text{, then }a\equiv +b\operatorname{mod}n. +\] +Combining this implication with (\ref{pf.prop.mod.0.b.1}), we conclude that +$a\equiv b\operatorname{mod}n$ if and only if $a\equiv b\operatorname{mod}% +\left( -n\right) $. This proves Proposition \ref{prop.mod.0} \textbf{(b)}. + +\textbf{(c)} Assume that $a\equiv b\operatorname{mod}n$. Thus, $a$ is +congruent to $b$ modulo $n$. In other words, $n\mid a-b$ (by the definition of +\textquotedblleft congruent\textquotedblright). Hence, Proposition +\ref{prop.div.trans} (applied to $m$, $n$ and $a-b$ instead of $a$, $b$ and +$c$) yields $m\mid a-b$ (since $m\mid n$). In other words, $a$ is congruent to +$b$ modulo $m$ (by the definition of \textquotedblleft +congruent\textquotedblright). Thus, $a\equiv b\operatorname{mod}m$. This +proves Proposition \ref{prop.mod.0} \textbf{(c)}. +\end{proof} + +\begin{proposition} +\label{prop.mod.transi}Let $n$ be an integer. + +\textbf{(a)} For any integer $a$, we have $a\equiv a\operatorname{mod}n$. + +\textbf{(b)} For any integers $a$ and $b$ satisfying $a\equiv +b\operatorname{mod}n$, we have $b\equiv a\operatorname{mod}n$. + +\textbf{(c)} For any integers $a$, $b$ and $c$ satisfying $a\equiv +b\operatorname{mod}n$ and $b\equiv c\operatorname{mod}n$, we have $a\equiv +c\operatorname{mod}n$. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.mod.transi}.]\textbf{(a)} Let $a$ be an +integer. Then, $a-a=0=n\cdot0$. 
Hence, there exists an integer $w$ such that +$a-a=nw$ (namely, $w=0$). In other words, $n$ divides $a-a$ (by the definition +of \textquotedblleft divides\textquotedblright). In other words, $n\mid a-a$. +In other words, $a$ is congruent to $a$ modulo $n$ (by the definition of +\textquotedblleft congruent\textquotedblright). In other words, $a\equiv +a\operatorname{mod}n$. This proves Proposition \ref{prop.mod.transi} +\textbf{(a)}. + +\textbf{(b)} Let $a$ and $b$ be two integers satisfying $a\equiv +b\operatorname{mod}n$. Thus, $a$ is congruent to $b$ modulo $n$ (since +$a\equiv b\operatorname{mod}n$). In other words, $n\mid a-b$ (by the +definition of \textquotedblleft congruent\textquotedblright). In other words, +$n$ divides $a-b$. In other words, there exists an integer $w$ such that +$a-b=nw$ (by the definition of \textquotedblleft divides\textquotedblright). +Consider this $w$, and denote it by $q$. Thus, $q$ is an integer such that +$a-b=nq$. Now, $b-a=-\underbrace{\left( a-b\right) }_{=nq}=-nq=n\left( +-q\right) $. Hence, there exists an integer $w$ such that $b-a=nw$ (namely, +$w=-q$). In other words, $n$ divides $b-a$ (by the definition of +\textquotedblleft divides\textquotedblright). In other words, $n\mid b-a$. In +other words, $b$ is congruent to $a$ modulo $n$ (by the definition of +\textquotedblleft congruent\textquotedblright). In other words, $b\equiv +a\operatorname{mod}n$. This proves Proposition \ref{prop.mod.transi} +\textbf{(b)}. + +\textbf{(c)} Let $a$, $b$ and $c$ be three integers satisfying $a\equiv +b\operatorname{mod}n$ and $b\equiv c\operatorname{mod}n$. + +\begin{vershort} +Just as in the above proof of Proposition \ref{prop.mod.transi} \textbf{(b)}, +we can use the assumption $a\equiv b\operatorname{mod}n$ to construct an +integer $q$ such that $a-b=nq$. Similarly, we can use the assumption $b\equiv +c\operatorname{mod}n$ to construct an integer $r$ such that $b-c=nr$. Consider +these $q$ and $r$. 
+\end{vershort} + +\begin{verlong} +From $a\equiv b\operatorname{mod}n$, we conclude that $a$ is congruent to $b$ +modulo $n$. In other words, $n\mid a-b$ (by the definition of +\textquotedblleft congruent\textquotedblright). In other words, $n$ divides +$a-b$. In other words, there exists an integer $w$ such that $a-b=nw$ (by the +definition of \textquotedblleft divides\textquotedblright). Consider this $w$, +and denote it by $q$. Thus, $q$ is an integer such that $a-b=nq$. + +From $b\equiv c\operatorname{mod}n$, we conclude that $b$ is congruent to $c$ +modulo $n$. In other words, $n\mid b-c$ (by the definition of +\textquotedblleft congruent\textquotedblright). In other words, $n$ divides +$b-c$. In other words, there exists an integer $w$ such that $b-c=nw$ (by the +definition of \textquotedblleft divides\textquotedblright). Consider this $w$, +and denote it by $r$. Thus, $r$ is an integer such that $b-c=nr$. +\end{verlong} + +Now,% +\[ +a-c=\underbrace{\left( a-b\right) }_{=nq}+\underbrace{\left( b-c\right) +}_{=nr}=nq+nr=n\left( q+r\right) . +\] +Hence, there exists an integer $w$ such that $a-c=nw$ (namely, $w=q+r$). In +other words, $n$ divides $a-c$ (by the definition of \textquotedblleft +divides\textquotedblright). In other words, $n\mid a-c$. In other words, $a$ +is congruent to $c$ modulo $n$ (by the definition of \textquotedblleft +congruent\textquotedblright). In other words, $a\equiv c\operatorname{mod}n$. +This proves Proposition \ref{prop.mod.transi} \textbf{(c)}. +\end{proof} + +Simple as they are, the three parts of Proposition \ref{prop.mod.transi} have +names: Proposition \ref{prop.mod.transi} \textbf{(a)} is called the +\textit{reflexivity of congruence (modulo }$n$\textit{)}; Proposition +\ref{prop.mod.transi} \textbf{(b)} is called the \textit{symmetry of +congruence (modulo }$n$\textit{)}; Proposition \ref{prop.mod.transi} +\textbf{(c)} is called the \textit{transitivity of congruence (modulo }% +$n$\textit{)}. 
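Readers who like to experiment may spot-check the three laws numerically. The following short Python script (an informal illustration only, not part of the formal development; the helper names `divides` and `congruent` are ours) encodes Definition \ref{def.divisibility} and Definition \ref{def.mod.equiv} directly and tests reflexivity, symmetry and transitivity on a small range of integers, including negative and zero moduli:

```python
def divides(u, v):
    # "u | v" means: there exists an integer w with v = u*w.
    # Note that 0 | v holds only for v = 0 (as observed above).
    return v == 0 if u == 0 else v % u == 0

def congruent(a, b, n):
    # "a is congruent to b modulo n" means: n | a - b.
    return divides(n, a - b)

rng = range(-6, 7)
for n in rng:
    for a in rng:
        assert congruent(a, a, n)                      # reflexivity
        for b in rng:
            if congruent(a, b, n):
                assert congruent(b, a, n)              # symmetry
                for c in rng:
                    if congruent(b, c, n):
                        assert congruent(a, c, n)      # transitivity
print("reflexivity, symmetry and transitivity hold on the sample range")
```

Such a check is no substitute for the proofs above, of course; it merely illustrates the definitions on concrete numbers.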
+ +Proposition \ref{prop.mod.transi} \textbf{(b)} allows the following definition: + +\begin{definition} +Let $n$, $a$ and $b$ be three integers. Then, we say that $a$ \textit{and }$b$ +\textit{are congruent modulo }$n$ if and only if $a\equiv b\operatorname{mod}% +n$. Proposition \ref{prop.mod.transi} \textbf{(b)} shows that $a$ and $b$ +actually play equal roles in this relation (i.e., the statement +\textquotedblleft$a$ and $b$ are congruent modulo $n$\textquotedblright\ is +equivalent to \textquotedblleft$b$ and $a$ are congruent modulo $n$% +\textquotedblright). +\end{definition} + +\begin{proposition} +\label{prop.mod.n=0}Let $n$ be an integer. Then, $n\equiv0\operatorname{mod}n$. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.mod.n=0}.]We have $n=n\cdot1$. Thus, there +exists an integer $w$ such that $n=nw$ (namely, $w=1$). Therefore, $n\mid n$ +(by the definition of \textquotedblleft divides\textquotedblright). +Proposition \ref{prop.mod.0} \textbf{(a)} (applied to $a=n$) shows that we +have $n\equiv0\operatorname{mod}n$ if and only if $n\mid n$. Hence, we have +$n\equiv0\operatorname{mod}n$ (since $n\mid n$). This proves Proposition +\ref{prop.mod.n=0}. +\end{proof} + +\subsubsection{Chains of congruences} + +Proposition \ref{prop.mod.transi} shows that congruences (modulo $n$) behave +like equalities -- in that we can turn them around (since Proposition +\ref{prop.mod.transi} \textbf{(b)} says that $a\equiv b\operatorname{mod}n$ +implies $b\equiv a\operatorname{mod}n$) and we can chain them together (by +Proposition \ref{prop.mod.transi} \textbf{(c)}) and in that every integer is +congruent to itself (by Proposition \ref{prop.mod.transi} \textbf{(a)}). 
This +leads to the following notation: + +\begin{definition} +If $a_{1},a_{2},\ldots,a_{k}$ and $n$ are integers, then the statement +\textquotedblleft$a_{1}\equiv a_{2}\equiv\cdots\equiv a_{k}\operatorname{mod}% +n$\textquotedblright\ shall mean that +\[ +\left( a_{i}\equiv a_{i+1}\operatorname{mod}n\text{ holds for each }% +i\in\left\{ 1,2,\ldots,k-1\right\} \right) . +\] +Such a statement is called a \textit{chain of congruences modulo }$n$ (or, +less precisely, a \textit{chain of congruences}). We shall refer to the +integers $a_{1},a_{2},\ldots,a_{k}$ (but not $n$) as the \textit{members} of +this chain. +\end{definition} + +For example, the chain $a\equiv b\equiv c\equiv d\operatorname{mod}n$ (for +five integers $a,b,c,d,n$) means that $a\equiv b\operatorname{mod}n$ and +$b\equiv c\operatorname{mod}n$ and $c\equiv d\operatorname{mod}n$. + +The usefulness of such chains lies in the following fact: + +\begin{proposition} +\label{prop.mod.chain}Let $a_{1},a_{2},\ldots,a_{k}$ and $n$ be integers such +that $a_{1}\equiv a_{2}\equiv\cdots\equiv a_{k}\operatorname{mod}n$. Let $u$ +and $v$ be two elements of $\left\{ 1,2,\ldots,k\right\} $. Then,% +\[ +a_{u}\equiv a_{v}\operatorname{mod}n. +\] + +\end{proposition} + +In other words, any two members of a chain of congruences modulo $n$ are +congruent to each other modulo $n$. Thus, chains of congruences are like +chains of equalities: From any chain of congruences modulo $n$ with $k$ +members, you can extract $k^{2}$ congruences modulo $n$ by picking any two +members of the chain. + +\begin{example} +Proposition \ref{prop.mod.chain} shows (among other things) that if +$a,b,c,d,e,n$ are integers such that $a\equiv b\equiv c\equiv d\equiv +e\operatorname{mod}n$, then $a\equiv d\operatorname{mod}n$ and $b\equiv +d\operatorname{mod}n$ and $e\equiv b\operatorname{mod}n$ (and various other congruences). 
+\end{example} + +Unsurprisingly, Proposition \ref{prop.mod.chain} can be proven by induction, +although not in an immediately obvious manner: We cannot directly prove it by +induction on $n$, on $k$, on $u$ or on $v$. Instead, we will first introduce +an auxiliary statement (the statement (\ref{pf.prop.mod.chain.Ai}) in the +following proof) which will be tailored to an inductive proof. This is a +commonly used tactic, and particularly helpful to us now as we only have the +most basic form of the principle of induction available. (Soon, we will see +more versions of that principle, which will obviate the need for some of the tailoring.) + +\begin{proof} +[Proof of Proposition \ref{prop.mod.chain}.]By assumption, we have +$a_{1}\equiv a_{2}\equiv\cdots\equiv a_{k}\operatorname{mod}n$. In other +words, +\begin{equation} +\left( a_{i}\equiv a_{i+1}\operatorname{mod}n\text{ holds for each }% +i\in\left\{ 1,2,\ldots,k-1\right\} \right) \label{pf.prop.mod.chain.ass}% +\end{equation} +(since this is what \textquotedblleft$a_{1}\equiv a_{2}\equiv\cdots\equiv +a_{k}\operatorname{mod}n$\textquotedblright\ means). + +Fix $p\in\left\{ 1,2,\ldots,k\right\} $. For each $i\in\mathbb{N}$, we let +$\mathcal{A}\left( i\right) $ be the statement% +\begin{equation} +\left( \text{if }p+i\in\left\{ 1,2,\ldots,k\right\} \text{, then }% +a_{p}\equiv a_{p+i}\operatorname{mod}n\right) . \label{pf.prop.mod.chain.Ai}% +\end{equation} + + +We shall prove that this statement $\mathcal{A}\left( i\right) $ holds for +each $i\in\mathbb{N}$. 
+ +In fact, let us prove this by induction on $i$:\ \ \ \ \footnote{Thus, the +letter \textquotedblleft$i$\textquotedblright\ plays the role of the +\textquotedblleft$n$\textquotedblright\ in Theorem \ref{thm.ind.IP0} (since we +are already using \textquotedblleft$n$\textquotedblright\ for a different +thing).} + +\textit{Induction base:} The statement $\mathcal{A}\left( 0\right) $ +holds\footnote{\textit{Proof.} Proposition \ref{prop.mod.transi} \textbf{(a)} +(applied to $a=a_{p}$) yields $a_{p}\equiv a_{p}\operatorname{mod}n$. In view +of $p=p+0$, this rewrites as $a_{p}\equiv a_{p+0}\operatorname{mod}n$. Hence, +$\left( \text{if }p+0\in\left\{ 1,2,\ldots,k\right\} \text{, then }% +a_{p}\equiv a_{p+0}\operatorname{mod}n\right) $. But this is precisely the +statement $\mathcal{A}\left( 0\right) $. Hence, the statement $\mathcal{A}% +\left( 0\right) $ holds.}. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that $\mathcal{A}\left( +m\right) $ holds. We must show that $\mathcal{A}\left( m+1\right) $ holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds. In other words,% +\begin{equation} +\left( \text{if }p+m\in\left\{ 1,2,\ldots,k\right\} \text{, then }% +a_{p}\equiv a_{p+m}\operatorname{mod}n\right) . +\label{pf.prop.mod.chain.Ai.IH}% +\end{equation} + + +Next, let us assume that $p+\left( m+1\right) \in\left\{ 1,2,\ldots +,k\right\} $. Thus, $p+\left( m+1\right) \leq k$, so that $p+m+1=p+\left( +m+1\right) \leq k$ and therefore $p+m\leq k-1$. Also, $p\in\left\{ +1,2,\ldots,k\right\} $, so that $p\geq1$ and thus $\underbrace{p}_{\geq +1}+\underbrace{m}_{\geq0}\geq1+0=1$. Combining this with $p+m\leq k-1$, we +obtain $p+m\in\left\{ 1,2,\ldots,k-1\right\} \subseteq\left\{ +1,2,\ldots,k\right\} $. Hence, (\ref{pf.prop.mod.chain.Ai.IH}) shows that +$a_{p}\equiv a_{p+m}\operatorname{mod}n$. 
But (\ref{pf.prop.mod.chain.ass}) +(applied to $p+m$ instead of $i$) yields $a_{p+m}\equiv a_{\left( p+m\right) ++1}\operatorname{mod}n$ (since $p+m\in\left\{ 1,2,\ldots,k-1\right\} $). + +So we know that $a_{p}\equiv a_{p+m}\operatorname{mod}n$ and $a_{p+m}\equiv +a_{\left( p+m\right) +1}\operatorname{mod}n$. Hence, Proposition +\ref{prop.mod.transi} \textbf{(c)} (applied to $a=a_{p}$, $b=a_{p+m}$ and +$c=a_{\left( p+m\right) +1}$) yields $a_{p}\equiv a_{\left( p+m\right) ++1}\operatorname{mod}n$. Since $\left( p+m\right) +1=p+\left( m+1\right) +$, this rewrites as $a_{p}\equiv a_{p+\left( m+1\right) }\operatorname{mod}% +n$. + +Now, forget that we assumed that $p+\left( m+1\right) \in\left\{ +1,2,\ldots,k\right\} $. We thus have shown that% +\[ +\left( \text{if }p+\left( m+1\right) \in\left\{ 1,2,\ldots,k\right\} +\text{, then }a_{p}\equiv a_{p+\left( m+1\right) }\operatorname{mod}% +n\right) . +\] +But this is precisely the statement $\mathcal{A}\left( m+1\right) $. Thus, +$\mathcal{A}\left( m+1\right) $ holds. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\mathbb{N}$ is +such that $\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( +m+1\right) $ also holds. This completes the induction step. + +Thus, we have completed both the induction base and the induction step. Hence, +by induction, we conclude that $\mathcal{A}\left( i\right) $ holds for each +$i\in\mathbb{N}$. In other words, (\ref{pf.prop.mod.chain.Ai}) holds for each +$i\in\mathbb{N}$. + +We are not done yet, since our goal is to prove Proposition +\ref{prop.mod.chain}, not merely to prove $\mathcal{A}\left( i\right) $. But +this is now easy. + +First, let us forget that we fixed $p$. Thus, we have shown that +(\ref{pf.prop.mod.chain.Ai}) holds for each $p\in\left\{ 1,2,\ldots +,k\right\} $ and $i\in\mathbb{N}$. + +But we have either $u\leq v$ or $u>v$. In other words, we are in one of the +following two cases: + +\textit{Case 1:} We have $u\leq v$. 
+ +\textit{Case 2:} We have $u>v$. + +Let us first consider Case 1. In this case, we have $u\leq v$. Thus, +$v-u\geq0$, so that $v-u\in\mathbb{N}$. But recall that +(\ref{pf.prop.mod.chain.Ai}) holds for each $p\in\left\{ 1,2,\ldots +,k\right\} $ and $i\in\mathbb{N}$. Applying this to $p=u$ and $i=v-u$, we +conclude that (\ref{pf.prop.mod.chain.Ai}) holds for $p=u$ and $i=v-u$ (since +$u\in\left\{ 1,2,\ldots,k\right\} $ and $v-u\in\mathbb{N}$). In other words,% +\[ +\left( \text{if }u+\left( v-u\right) \in\left\{ 1,2,\ldots,k\right\} +\text{, then }a_{u}\equiv a_{u+\left( v-u\right) }\operatorname{mod}% +n\right) . +\] +Since $u+\left( v-u\right) =v$, this rewrites as% +\[ +\left( \text{if }v\in\left\{ 1,2,\ldots,k\right\} \text{, then }a_{u}\equiv +a_{v}\operatorname{mod}n\right) . +\] +Since $v\in\left\{ 1,2,\ldots,k\right\} $ holds (by assumption), we conclude +that $a_{u}\equiv a_{v}\operatorname{mod}n$. Thus, Proposition +\ref{prop.mod.chain} is proven in Case 1. + +Let us now consider Case 2. In this case, we have $u>v$. Thus, $u-v>0$, so +that $u-v\in\mathbb{N}$. But recall that (\ref{pf.prop.mod.chain.Ai}) holds +for each $p\in\left\{ 1,2,\ldots,k\right\} $ and $i\in\mathbb{N}$. Applying +this to $p=v$ and $i=u-v$, we conclude that (\ref{pf.prop.mod.chain.Ai}) holds +for $p=v$ and $i=u-v$ (since $v\in\left\{ 1,2,\ldots,k\right\} $ and +$u-v\in\mathbb{N}$). In other words,% +\[ +\left( \text{if }v+\left( u-v\right) \in\left\{ 1,2,\ldots,k\right\} +\text{, then }a_{v}\equiv a_{v+\left( u-v\right) }\operatorname{mod}% +n\right) . +\] +Since $v+\left( u-v\right) =u$, this rewrites as% +\[ +\left( \text{if }u\in\left\{ 1,2,\ldots,k\right\} \text{, then }a_{v}\equiv +a_{u}\operatorname{mod}n\right) . +\] +Since $u\in\left\{ 1,2,\ldots,k\right\} $ holds (by assumption), we conclude +that $a_{v}\equiv a_{u}\operatorname{mod}n$. 
Therefore, Proposition +\ref{prop.mod.transi} \textbf{(b)} (applied to $a=a_{v}$ and $b=a_{u}$) yields +that $a_{u}\equiv a_{v}\operatorname{mod}n$. Thus, Proposition +\ref{prop.mod.chain} is proven in Case 2. + +Hence, Proposition \ref{prop.mod.chain} is proven in both Cases 1 and 2. Since +these two Cases cover all possibilities, we thus conclude that Proposition +\ref{prop.mod.chain} always holds. +\end{proof} + +\subsubsection{Chains of inequalities (a digression)} + +Much of the above proof of Proposition \ref{prop.mod.chain} was unremarkable +and straightforward reasoning -- but this proof is nevertheless fundamental +and important. More or less the same argument can be used to show the +following fact about chains of inequalities: + +\begin{proposition} +\label{prop.mod.chain-ineq}Let $a_{1},a_{2},\ldots,a_{k}$ be integers such +that $a_{1}\leq a_{2}\leq\cdots\leq a_{k}$. (Recall that the statement +\textquotedblleft$a_{1}\leq a_{2}\leq\cdots\leq a_{k}$\textquotedblright% +\ means that $\left( a_{i}\leq a_{i+1}\text{ holds for each }i\in\left\{ +1,2,\ldots,k-1\right\} \right) $.) Let $u$ and $v$ be two elements of +$\left\{ 1,2,\ldots,k\right\} $ such that $u\leq v$. Then,% +\[ +a_{u}\leq a_{v}. +\] + +\end{proposition} + +Proposition \ref{prop.mod.chain-ineq} is similar to Proposition +\ref{prop.mod.chain}, with the congruences replaced by inequalities; but note +that the condition \textquotedblleft$u\leq v$\textquotedblright\ is now +required. Make sure you understand where you need this condition when adapting +the proof of Proposition \ref{prop.mod.chain} to Proposition +\ref{prop.mod.chain-ineq}! 
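Before moving on, here is a quick numerical sanity check (an illustration only, separate from the formal development; the helper function names below are our own). It confirms on a sample chain that adjacent congruences modulo $n$ force all members of the chain to be pairwise congruent, which is exactly what the chain proposition above asserts:

```python
# Illustration only: spot-check the "chains of congruences" proposition.
# If a_1 = a_2 = ... = a_k (mod n) holds link by link (adjacent terms),
# then a_u = a_v (mod n) for ANY two positions u and v.

def is_chain_mod(a, n):
    """Check the adjacent congruences a_i = a_{i+1} (mod n)."""
    return all((a[i] - a[i + 1]) % n == 0 for i in range(len(a) - 1))

def all_pairs_congruent(a, n):
    """Check a_u = a_v (mod n) for every pair of positions u, v."""
    return all((a[u] - a[v]) % n == 0
               for u in range(len(a)) for v in range(len(a)))

chain = [3, 10, 24, -4, 17]  # consecutive differences: 7, 14, -28, 21
assert is_chain_mod(chain, 7)         # adjacent terms are congruent mod 7
assert all_pairs_congruent(chain, 7)  # hence ALL pairs are congruent mod 7
```

Note that Python's `%` operator returns a nonnegative remainder whenever the modulus is positive, so `(a - b) % n == 0` faithfully tests the congruence $a\equiv b\operatorname{mod}n$ even for negative integers.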
+
+For future use, let us prove a corollary of Proposition
+\ref{prop.mod.chain-ineq} which essentially observes that the inequality sign
+in $a_{u}\leq a_{v}$ can be made strict if there is any strict inequality sign
+between $a_{u}$ and $a_{v}$ in the chain $a_{1}\leq a_{2}\leq\cdots\leq a_{k}$:
+
+\begin{corollary}
+\label{cor.mod.chain-ineq2}Let $a_{1},a_{2},\ldots,a_{k}$ be integers such
+that $a_{1}\leq a_{2}\leq\cdots\leq a_{k}$. Let $u$ and $v$ be two elements of
+$\left\{ 1,2,\ldots,k\right\} $ such that $u\leq v$. Let $p\in\left\{
+u,u+1,\ldots,v-1\right\} $ be such that $a_{p}<a_{p+1}$. Then,%
+\[
+a_{u}<a_{v}.
+\]
+\end{corollary}
+
+\begin{proof}
+[Proof of Theorem \ref{thm.rec-seq.somos-simple} \textbf{(c)}.]Let $u$ and $v$
+be two nonnegative integers satisfying $u\mid v$. We must prove that
+$a_{u}\mid a_{v}$. If $v=0$, then this is true, since Theorem
+\ref{thm.rec-seq.somos-simple} \textbf{(b)} (applied to $n=u$ and $w=0$)
+yields $a_{u}\mid a_{u\cdot0}=a_{0}=a_{v}$. Hence, for the rest of this proof,
+we can WLOG assume that we don't have $v=0$. Assume this.
+
+Thus, we don't have $v=0$. Hence, $v\neq0$, so that $v>0$ (since $v$ is nonnegative).
+
+But $u$ divides $v$ (since $u\mid v$). In other words, there exists an integer
+$w$ such that $v=uw$. Consider this $w$. If we had $w<0$, then we would have
+$uw\leq0$ (since $u$ is nonnegative), which would contradict $uw=v>0$. Hence,
+we cannot have $w<0$. Thus, we must have $w\geq0$. Therefore, $w\in\mathbb{N}%
+$. Hence, Theorem \ref{thm.rec-seq.somos-simple} \textbf{(b)} (applied to
+$n=u$) yields $a_{u}\mid a_{uw}$. In view of $v=uw$, this rewrites as
+$a_{u}\mid a_{v}$. This proves Theorem \ref{thm.rec-seq.somos-simple}
+\textbf{(c)}.
+\end{proof}
+
+Applying Theorem \ref{thm.rec-seq.somos-simple} \textbf{(c)} to $q=2$ and
+$r=1$, we obtain the observation about divisibility made in Example
+\ref{exa.rec-seq.1}.
+
+\subsubsection{The Fibonacci sequence and a generalization}
+
+Another example of a recursively defined sequence is the famous Fibonacci sequence:
+
+\begin{example}
+\label{exa.rec-seq.fib}The
+\textit{\href{https://en.wikipedia.org/wiki/Fibonacci_number}{Fibonacci
+sequence}} is the sequence $\left( f_{0},f_{1},f_{2},\ldots\right) $ of
+integers which is defined recursively by%
+\begin{align*}
+f_{0} & =0,\ \ \ \ \ \ \ \ \ \ f_{1}=1,\ \ \ \ \ \ \ \ \ \ \text{and}\\
+f_{n} & =f_{n-1}+f_{n-2}\ \ \ \ \ \ \ \ \ \ \text{for all }n\geq2.
+\end{align*}
+Let us compute its first few entries:%
+\begin{align*}
+f_{0} & =0;\\
+f_{1} & =1;\\
+f_{2} & =\underbrace{f_{1}}_{=1}+\underbrace{f_{0}}_{=0}=1+0=1;\\
+f_{3} & =\underbrace{f_{2}}_{=1}+\underbrace{f_{1}}_{=1}=1+1=2;\\
+f_{4} & =\underbrace{f_{3}}_{=2}+\underbrace{f_{2}}_{=1}=2+1=3;\\
+f_{5} & =\underbrace{f_{4}}_{=3}+\underbrace{f_{3}}_{=2}=3+2=5;\\
+f_{6} & =\underbrace{f_{5}}_{=5}+\underbrace{f_{4}}_{=3}=5+3=8.
+\end{align*}
+Again, we observe (as in Example \ref{exa.rec-seq.1}) that $f_{2}\mid f_{6}$
+and $f_{3}\mid f_{6}$, which suggests that we might have $f_{u}\mid f_{v}$
+whenever $u$ and $v$ are two nonnegative integers satisfying $u\mid v$.
+
+Some further experimentation may suggest that the equality $f_{n+m+1}%
+=f_{n}f_{m}+f_{n+1}f_{m+1}$ holds for all $n\in\mathbb{N}$ and $m\in
+\mathbb{N}$.
+
+Both of these conjectures will be shown in the following theorem, in greater generality.
+\end{example}
+
+\begin{theorem}
+\label{thm.rec-seq.fibx}Fix some $a\in\mathbb{Z}$ and $b\in\mathbb{Z}$. Let
+$\left( x_{0},x_{1},x_{2},\ldots\right) $ be a sequence of integers defined
+recursively by%
+\begin{align*}
+x_{0} & =0,\ \ \ \ \ \ \ \ \ \ x_{1}=1,\ \ \ \ \ \ \ \ \ \ \text{and}\\
+x_{n} & =ax_{n-1}+bx_{n-2}\ \ \ \ \ \ \ \ \ \ \text{for each }n\geq2.
+\end{align*}
+(Note that if $a=1$ and $b=1$, then this sequence $\left( x_{0},x_{1}%
+,x_{2},\ldots\right) $ is precisely the Fibonacci sequence $\left(
+f_{0},f_{1},f_{2},\ldots\right) $ from Example \ref{exa.rec-seq.fib}. If
+$a=0$, then our sequence $\left( x_{0},x_{1},x_{2},\ldots\right) $
+is the sequence $\left( 0,1,0,b,0,b^{2},0,b^{3},\ldots\right) $ that
+alternates between $0$'s and powers of $b$. The reader can easily work out
+further examples.)
+
+\textbf{(a)} We have $x_{n+m+1}=bx_{n}x_{m}+x_{n+1}x_{m+1}$ for all
+$n\in\mathbb{N}$ and $m\in\mathbb{N}$.
+
+\textbf{(b)} For any $n\in\mathbb{N}$ and $w\in\mathbb{N}$, we have $x_{n}\mid
+x_{nw}$.
+ +\textbf{(c)} If $u$ and $v$ are two nonnegative integers satisfying $u\mid v$, +then $x_{u}\mid x_{v}$. +\end{theorem} + +Before we prove this theorem, let us discuss how \textbf{not} to prove it: + +\begin{remark} +\label{rmk.ind.abstract}The proof of Theorem \ref{thm.rec-seq.fibx} +\textbf{(a)} below illustrates an important aspect of induction proofs: +Namely, when devising an induction proof, we often have not only a choice of +what variable to induct on (e.g., we could try proving Theorem +\ref{thm.rec-seq.fibx} \textbf{(a)} by induction on $n$ or by induction on +$m$), but also a choice of whether to leave the other variables fixed. For +example, let us try to prove Theorem \ref{thm.rec-seq.fibx} \textbf{(a)} by +induction on $n$ while leaving the variable $m$ fixed. That is, we fix some +$m\in\mathbb{N}$, and we define $\mathcal{A}\left( n\right) $ (for each +$n\in\mathbb{N}$) to be the following statement:% +\[ +\left( x_{n+m+1}=bx_{n}x_{m}+x_{n+1}x_{m+1}\right) . +\] +Then, it is easy to check that $\mathcal{A}\left( 0\right) $ holds, so the +induction base is complete. For the induction step, we fix some $k\in +\mathbb{N}$. (This $k$ serves the role of the \textquotedblleft$m$% +\textquotedblright\ in Theorem \ref{thm.ind.IP0}, but we cannot call it $m$ +here since $m$ already stands for a fixed number.) We assume that +$\mathcal{A}\left( k\right) $ holds, and we intend to prove $\mathcal{A}% +\left( k+1\right) $. + +Our induction hypothesis says that $\mathcal{A}\left( k\right) $ holds; in +other words, we have $x_{k+m+1}=bx_{k}x_{m}+x_{k+1}x_{m+1}$. We want to prove +$\mathcal{A}\left( k+1\right) $; in other words, we want to prove that +$x_{\left( k+1\right) +m+1}=bx_{k+1}x_{m}+x_{\left( k+1\right) +1}x_{m+1}$. + +A short moment of deliberation shows that we cannot do this (at least not with +our current knowledge). There is no direct way of deriving $\mathcal{A}\left( +k+1\right) $ from $\mathcal{A}\left( k\right) $. 
\textbf{However}, if we +knew that the statement $\mathcal{A}\left( k\right) $ holds +\textquotedblleft for $m+1$ instead of $m$\textquotedblright\ (that is, if we +knew that $x_{k+\left( m+1\right) +1}=bx_{k}x_{m+1}+x_{k+1}x_{\left( +m+1\right) +1}$), then we could derive $\mathcal{A}\left( k+1\right) $. But +we cannot just \textquotedblleft apply $\mathcal{A}\left( k\right) $ to +$m+1$ instead of $m$\textquotedblright; after all, $m$ is a fixed number, so +we cannot have it take different values in $\mathcal{A}\left( k\right) $ and +in $\mathcal{A}\left( k+1\right) $. + +So we are at an impasse. We got into this impasse by fixing $m$. So let us try +\textbf{not} fixing $m\in\mathbb{N}$ right away, but instead defining +$\mathcal{A}\left( n\right) $ (for each $n\in\mathbb{N}$) to be the +following statement:% +\[ +\left( x_{n+m+1}=bx_{n}x_{m}+x_{n+1}x_{m+1}\text{ for all }m\in +\mathbb{N}\right) . +\] +Thus, $\mathcal{A}\left( n\right) $ is not a statement about a specific +integer $m$ any more, but rather a statement about all nonnegative integers +$m$. This allows us to apply $\mathcal{A}\left( k\right) $ to $m+1$ instead +of $m$ in the induction step. (We can still fix $m\in\mathbb{N}$ +\textbf{during the induction step}; this doesn't prevent us from applying +$\mathcal{A}\left( k\right) $ to $m+1$ instead of $m$, since $\mathcal{A}% +\left( k\right) $ has been formulated before $m$ was fixed.) This way, we +arrive at the following proof: +\end{remark} + +\begin{proof} +[Proof of Theorem \ref{thm.rec-seq.fibx}.]\textbf{(a)} We claim that for each +$n\in\mathbb{N}$, we have% +\begin{equation} +\left( x_{n+m+1}=bx_{n}x_{m}+x_{n+1}x_{m+1}\text{ for all }m\in +\mathbb{N}\right) . 
\label{pf.thm.rec-seq.fibx.a.claim}% +\end{equation} + + +Indeed, let us prove (\ref{pf.thm.rec-seq.fibx.a.claim}) by induction on $n$: + +\textit{Induction base:} We have $x_{0+m+1}=bx_{0}x_{m}+x_{0+1}x_{m+1}$ for +all $m\in\mathbb{N}$\ \ \ \ \footnote{\textit{Proof.} Let $m\in\mathbb{N}$. +Then, $x_{0+m+1}=x_{m+1}$. Comparing this with $b\underbrace{x_{0}}_{=0}% +x_{m}+\underbrace{x_{0+1}}_{=x_{1}=1}x_{m+1}=b0x_{m}+1x_{m+1}=x_{m+1}$, we +obtain $x_{0+m+1}=bx_{0}x_{m}+x_{0+1}x_{m+1}$, qed.}. In other words, +(\ref{pf.thm.rec-seq.fibx.a.claim}) holds for $n=0$. This completes the +induction base. + +\textit{Induction step:} Let $k\in\mathbb{N}$. Assume that +(\ref{pf.thm.rec-seq.fibx.a.claim}) holds for $n=k$. We must prove that +(\ref{pf.thm.rec-seq.fibx.a.claim}) holds for $n=k+1$. + +We have assumed that (\ref{pf.thm.rec-seq.fibx.a.claim}) holds for $n=k$. In +other words, we have% +\begin{equation} +\left( x_{k+m+1}=bx_{k}x_{m}+x_{k+1}x_{m+1}\text{ for all }m\in +\mathbb{N}\right) . \label{pf.thm.rec-seq.fibx.a.claim.pf.IH}% +\end{equation} + + +Now, let $m\in\mathbb{N}$. We have $m+2\geq2$; thus, the recursive definition +of the sequence $\left( x_{0},x_{1},x_{2},\ldots\right) $ yields% +\begin{equation} +x_{m+2}=a\underbrace{x_{\left( m+2\right) -1}}_{=x_{m+1}}% ++b\underbrace{x_{\left( m+2\right) -2}}_{=x_{m}}=ax_{m+1}+bx_{m}. +\label{pf.thm.rec-seq.fibx.a.claim.pf.1}% +\end{equation} +The same argument (with $m$ replaced by $k$) yields% +\begin{equation} +x_{k+2}=ax_{k+1}+bx_{k}. \label{pf.thm.rec-seq.fibx.a.claim.pf.2}% +\end{equation} + + +But we can apply (\ref{pf.thm.rec-seq.fibx.a.claim.pf.IH}) to $m+1$ instead of +$m$. 
Thus, we obtain% +\begin{align*} +x_{k+\left( m+1\right) +1} & =bx_{k}x_{m+1}+x_{k+1}\underbrace{x_{\left( +m+1\right) +1}}_{\substack{=x_{m+2}=ax_{m+1}+bx_{m}\\\text{(by +(\ref{pf.thm.rec-seq.fibx.a.claim.pf.1}))}}}\\ +& =bx_{k}x_{m+1}+\underbrace{x_{k+1}\left( ax_{m+1}+bx_{m}\right) +}_{=ax_{k+1}x_{m+1}+bx_{k+1}x_{m}}=\underbrace{bx_{k}x_{m+1}+ax_{k+1}x_{m+1}% +}_{=\left( ax_{k+1}+bx_{k}\right) x_{m+1}}+bx_{k+1}x_{m}\\ +& =\underbrace{\left( ax_{k+1}+bx_{k}\right) }_{\substack{=x_{k+2}% +\\\text{(by (\ref{pf.thm.rec-seq.fibx.a.claim.pf.2}))}}}x_{m+1}+bx_{k+1}% +x_{m}=x_{k+2}x_{m+1}+bx_{k+1}x_{m}\\ +& =bx_{k+1}x_{m}+\underbrace{x_{k+2}}_{=x_{\left( k+1\right) +1}}% +x_{m+1}=bx_{k+1}x_{m}+x_{\left( k+1\right) +1}x_{m+1}. +\end{align*} +In view of $k+\left( m+1\right) +1=\left( k+1\right) +m+1$, this rewrites +as +\[ +x_{\left( k+1\right) +m+1}=bx_{k+1}x_{m}+x_{\left( k+1\right) +1}x_{m+1}. +\] + + +Now, forget that we fixed $m$. We thus have shown that $x_{\left( k+1\right) ++m+1}=bx_{k+1}x_{m}+x_{\left( k+1\right) +1}x_{m+1}$ for all $m\in +\mathbb{N}$. In other words, (\ref{pf.thm.rec-seq.fibx.a.claim}) holds for +$n=k+1$. This completes the induction step. Thus, +(\ref{pf.thm.rec-seq.fibx.a.claim}) is proven. + +Hence, Theorem \ref{thm.rec-seq.fibx} \textbf{(a)} holds. + +\textbf{(b)} Fix $n\in\mathbb{N}$. We claim that% +\begin{equation} +x_{n}\mid x_{nw}\ \ \ \ \ \ \ \ \ \ \text{for each }w\in\mathbb{N}. +\label{pf.thm.rec-seq.fibx.b.claim}% +\end{equation} + + +Indeed, let us prove (\ref{pf.thm.rec-seq.fibx.b.claim}) by induction on $w$: + +\textit{Induction base:} We have $x_{n\cdot0}=x_{0}=0=0x_{n}$ and thus +$x_{n}\mid x_{n\cdot0}$. In other words, (\ref{pf.thm.rec-seq.fibx.b.claim}) +holds for $w=0$. This completes the induction base. + +\textit{Induction step:} Let $k\in\mathbb{N}$. Assume that +(\ref{pf.thm.rec-seq.fibx.b.claim}) holds for $w=k$. We must now prove that +(\ref{pf.thm.rec-seq.fibx.b.claim}) holds for $w=k+1$. 
In other words, we must +prove that $x_{n}\mid x_{n\left( k+1\right) }$. + +If $n=0$, then this is true\footnote{\textit{Proof.} Let us assume that $n=0$. +Then, $x_{n\left( k+1\right) }=x_{0\left( k+1\right) }=x_{0}=0=0x_{n}$, +and thus $x_{n}\mid x_{n\left( k+1\right) }$, qed.}. Hence, for the rest of +this proof, we can WLOG assume that we don't have $n=0$. Assume this. + +We have assumed that (\ref{pf.thm.rec-seq.fibx.b.claim}) holds for $w=k$. In +other words, we have $x_{n}\mid x_{nk}$. In other words, $x_{nk}% +\equiv0\operatorname{mod}x_{n}$.\ \ \ \ \footnote{Here, again, we have used +Proposition \ref{prop.mod.0} \textbf{(a)} (applied to $x_{nk}$ and $x_{n}$ +instead of $a$ and $n$). This argument is simple enough that we will leave it +unsaid in the future.} Likewise, from $x_{n}\mid x_{n}$, we obtain +$x_{n}\equiv0\operatorname{mod}x_{n}$. + +We have $n\in\mathbb{N}$ but $n\neq0$ (since we don't have $n=0$). Hence, $n$ +is a positive integer. Thus, $n-1\in\mathbb{N}$. Therefore, Theorem +\ref{thm.rec-seq.fibx} \textbf{(a)} (applied to $nk$ and $n-1$ instead of $n$ +and $m$) yields +\[ +x_{nk+\left( n-1\right) +1}=bx_{nk}x_{n-1}+x_{nk+1}x_{\left( n-1\right) ++1}. +\] +In view of $nk+\left( n-1\right) +1=n\left( k+1\right) $, this rewrites as% +\[ +x_{n\left( k+1\right) }=b\underbrace{x_{nk}}_{\equiv0\operatorname{mod}% +x_{n}}x_{n-1}+x_{nk+1}\underbrace{x_{\left( n-1\right) +1}}_{=x_{n}% +\equiv0\operatorname{mod}x_{n}}\equiv b0x_{n-1}+x_{nk+1}0=0\operatorname{mod}% +x_{n}. +\] +\footnote{We have used substitutivity for congruences in this computation. +Here is, again, a way to rewrite it without this use: +\par +We have $x_{n\left( k+1\right) }=bx_{nk}x_{n-1}+x_{nk+1}x_{\left( +n-1\right) +1}$. But $b\equiv b\operatorname{mod}x_{n}$ (by Proposition +\ref{prop.mod.transi} \textbf{(a)}) and $x_{n-1}\equiv x_{n-1}% +\operatorname{mod}x_{n}$ (for the same reason) and $x_{nk+1}\equiv +x_{nk+1}\operatorname{mod}x_{n}$ (for the same reason). 
Now, Proposition +\ref{prop.mod.+-*} \textbf{(c)} (applied to $b$, $b$, $x_{nk}$, $0$ and +$x_{n}$ instead of $a$, $b$, $c$, $d$ and $n$) yields $bx_{nk}\equiv +b0\operatorname{mod}x_{n}$ (since $b\equiv b\operatorname{mod}x_{n}$ and +$x_{nk}\equiv0\operatorname{mod}x_{n}$). Hence, Proposition \ref{prop.mod.+-*} +\textbf{(c)} (applied to $bx_{nk}$, $b0$, $x_{n-1}$, $x_{n-1}$ and $x_{n}$ +instead of $a$, $b$, $c$, $d$ and $n$) yields $bx_{nk}x_{n-1}\equiv +b0x_{n-1}\operatorname{mod}x_{n}$ (since $bx_{nk}\equiv b0\operatorname{mod}% +x_{n}$ and $x_{n-1}\equiv x_{n-1}\operatorname{mod}x_{n}$). Also, +$x_{nk+1}x_{\left( n-1\right) +1}\equiv x_{nk+1}x_{\left( n-1\right) ++1}\operatorname{mod}x_{n}$ (by Proposition \ref{prop.mod.transi} +\textbf{(a)}). Hence, Proposition \ref{prop.mod.+-*} \textbf{(a)} (applied to +$bx_{nk}x_{n-1}$, $b0x_{n-1}$, $x_{nk+1}x_{\left( n-1\right) +1}$, +$x_{nk+1}x_{\left( n-1\right) +1}$ and $x_{n}$ instead of $a$, $b$, $c$, $d$ +and $n$) yields +\[ +bx_{nk}x_{n-1}+x_{nk+1}x_{\left( n-1\right) +1}\equiv b0x_{n-1}% ++x_{nk+1}x_{\left( n-1\right) +1}\operatorname{mod}x_{n}% +\] +(since $bx_{nk}x_{n-1}\equiv b0x_{n-1}\operatorname{mod}x_{n}$ and +$x_{nk+1}x_{\left( n-1\right) +1}\equiv x_{nk+1}x_{\left( n-1\right) ++1}\operatorname{mod}x_{n}$). +\par +Also, Proposition \ref{prop.mod.+-*} \textbf{(c)} (applied to $x_{nk+1}$, +$x_{nk+1}$, $x_{\left( n-1\right) +1}$, $0$ and $x_{n}$ instead of $a$, $b$, +$c$, $d$ and $n$) yields $x_{nk+1}x_{\left( n-1\right) +1}\equiv +x_{nk+1}0\operatorname{mod}x_{n}$ (since $x_{nk+1}\equiv x_{nk+1}% +\operatorname{mod}x_{n}$ and $x_{\left( n-1\right) +1}=x_{n}\equiv +0\operatorname{mod}x_{n}$). Furthermore, $b0x_{n-1}\equiv b0x_{n-1}% +\operatorname{mod}x_{n}$ (by Proposition \ref{prop.mod.transi} \textbf{(a)}). 
+Finally, Proposition \ref{prop.mod.+-*} \textbf{(a)} (applied to $b0x_{n-1}$, +$b0x_{n-1}$, $x_{nk+1}x_{\left( n-1\right) +1}$, $x_{nk+1}0$ and $x_{n}$ +instead of $a$, $b$, $c$, $d$ and $n$) yields +\[ +b0x_{n-1}+x_{nk+1}x_{\left( n-1\right) +1}\equiv b0x_{n-1}+x_{nk+1}% +0\operatorname{mod}x_{n}% +\] +(since $b0x_{n-1}\equiv b0x_{n-1}\operatorname{mod}x_{n}$ and $x_{nk+1}% +x_{\left( n-1\right) +1}\equiv x_{nk+1}0\operatorname{mod}x_{n}$). Thus,% +\begin{align*} +x_{n\left( k+1\right) } & =bx_{nk}x_{n-1}+x_{nk+1}x_{\left( n-1\right) ++1}\equiv b0x_{n-1}+x_{nk+1}x_{\left( n-1\right) +1}\\ +& \equiv b0x_{n-1}+x_{nk+1}0=0\operatorname{mod}x_{n}. +\end{align*} +So we have proven that $x_{n\left( k+1\right) }\equiv0\operatorname{mod}% +x_{n}$.} Thus, we have shown that $x_{n\left( k+1\right) }\equiv +0\operatorname{mod}x_{n}$. In other words, $x_{n}\mid x_{n\left( k+1\right) +}$ (again, this follows from Proposition \ref{prop.mod.0} \textbf{(a)}). In +other words, (\ref{pf.thm.rec-seq.fibx.b.claim}) holds for $w=k+1$. This +completes the induction step. Hence, (\ref{pf.thm.rec-seq.fibx.b.claim}) is +proven by induction. + +This proves Theorem \ref{thm.rec-seq.fibx} \textbf{(b)}. + +\begin{vershort} +\textbf{(c)} Theorem \ref{thm.rec-seq.fibx} \textbf{(c)} can be derived from +Theorem \ref{thm.rec-seq.fibx} \textbf{(b)} in the same way as Theorem +\ref{thm.rec-seq.somos-simple} \textbf{(c)} was derived from Theorem +\ref{thm.rec-seq.somos-simple} \textbf{(b)}. +\end{vershort} + +\begin{verlong} +\textbf{(c)} Let $u$ and $v$ be two nonnegative integers satisfying $u\mid v$. +We must prove that $x_{u}\mid x_{v}$. If $v=0$, then this is obvious (because +if $v=0$, then $x_{v}=x_{0}=0=0x_{u}$ and therefore $x_{u}\mid x_{v}$). Hence, +for the rest of this proof, we can WLOG assume that we don't have $v=0$. +Assume this. + +Thus, we don't have $v=0$. Hence, $v\neq0$, so that $v>0$ (since $v$ is nonnegative). + +But $u$ divides $v$ (since $u\mid v$). 
In other words, there exists an integer +$w$ such that $v=uw$. Consider this $w$. If we had $w<0$, then we would have +$uw\leq0$ (since $u$ is nonnegative), which would contradict $uw=v>0$. Hence, +we cannot have $w<0$. Thus, we must have $w\geq0$. Therefore, $w\in\mathbb{N}% +$. Hence, Theorem \ref{thm.rec-seq.fibx} \textbf{(b)} (applied to $n=u$) +yields $x_{u}\mid x_{uw}$. In view of $v=uw$, this rewrites as $x_{u}\mid +x_{v}$. This proves Theorem \ref{thm.rec-seq.fibx} \textbf{(c)}. +\end{verlong} +\end{proof} + +Applying Theorem \ref{thm.rec-seq.fibx} \textbf{(a)} to $a=1$ and $b=1$, we +obtain the equality $f_{n+m+1}=f_{n}f_{m}+f_{n+1}f_{m+1}$ noticed in Example +\ref{exa.rec-seq.fib}. Applying Theorem \ref{thm.rec-seq.fibx} \textbf{(c)} to +$a=1$ and $b=1$, we obtain the observation about divisibility made in Example +\ref{exa.rec-seq.fib}. + +Note that part \textbf{(a)} of Theorem \ref{thm.rec-seq.fibx} still works if +$a$ and $b$ are real numbers (instead of being integers). But of course, in +this case, $\left( x_{0},x_{1},x_{2},\ldots\right) $ will be merely a +sequence of real numbers (rather than a sequence of integers), and thus parts +\textbf{(b)} and \textbf{(c)} of Theorem \ref{thm.rec-seq.fibx} will no longer +make sense (since divisibility is only defined for integers). + +\subsection{\label{sect.ind.trinum}The sum of the first $n$ positive integers} + +We now come to one of the most classical examples of a proof by induction: +Namely, we shall prove the fact that for each $n\in\mathbb{N}$, the sum of the +first $n$ positive integers (that is, the sum $1+2+\cdots+n$) equals +$\dfrac{n\left( n+1\right) }{2}$. However, there is a catch here, which is +easy to overlook if one isn't trying to be completely rigorous: We don't +really know yet whether there is such a thing as \textquotedblleft the sum of +the first $n$ positive integers\textquotedblright! 
To be more precise, we have +introduced the $\sum$ sign in Section \ref{sect.sums-repetitorium}, which +would allow us to define the sum of the first $n$ positive integers (as +$\sum_{i=1}^{n}i$); but our definition of the $\sum$ sign relied on a fact +which we have not proved yet (namely, the fact that the right hand side of +(\ref{eq.sum.def.1}) does not depend on the choice of $t$). We shall prove +this fact later (Theorem \ref{thm.ind.gen-com.wd} \textbf{(a)}), but for now +we prefer not to use it. Instead, let us replace the notion of +\textquotedblleft the sum of the first $n$ positive integers\textquotedblright% +\ by a recursively defined sequence: + +\begin{proposition} +\label{prop.rec-seq.triangular}Let $\left( t_{0},t_{1},t_{2},\ldots\right) $ +be a sequence of integers defined recursively by% +\begin{align*} +t_{0} & =0,\ \ \ \ \ \ \ \ \ \ \text{and}\\ +t_{n} & =t_{n-1}+n\ \ \ \ \ \ \ \ \ \ \text{for each }n\geq1. +\end{align*} +Then,% +\begin{equation} +t_{n}=\dfrac{n\left( n+1\right) }{2}\ \ \ \ \ \ \ \ \ \ \text{for each }% +n\in\mathbb{N}. \label{eq.prop.rec-seq.triangular.claim}% +\end{equation} + +\end{proposition} + +The sequence $\left( t_{0},t_{1},t_{2},\ldots\right) $ defined in +Proposition \ref{prop.rec-seq.triangular} is known as the \textit{sequence of +triangular numbers}. Its definition shows that% +\begin{align*} +t_{0} & =0;\\ +t_{1} & =\underbrace{t_{0}}_{=0}+1=0+1=1;\\ +t_{2} & =\underbrace{t_{1}}_{=1}+2=1+2;\\ +t_{3} & =\underbrace{t_{2}}_{=1+2}+3=\left( 1+2\right) +3;\\ +t_{4} & =\underbrace{t_{3}}_{=\left( 1+2\right) +3}+4=\left( \left( +1+2\right) +3\right) +4;\\ +t_{5} & =\underbrace{t_{4}}_{=\left( \left( 1+2\right) +3\right) ++4}+5=\left( \left( \left( 1+2\right) +3\right) +4\right) +5 +\end{align*} +\footnote{Note that we write \textquotedblleft$\left( \left( \left( +1+2\right) +3\right) +4\right) +5$\textquotedblright\ and not +\textquotedblleft$1+2+3+4+5$\textquotedblright. 
The reason for this is that we +haven't proven yet that the expression \textquotedblleft$1+2+3+4+5$% +\textquotedblright\ is well-defined. (This expression \textbf{is} +well-defined, but this will only be clear once we have proven Theorem +\ref{thm.ind.gen-com.wd} \textbf{(a)} below.)} and so on; this explains why it +makes sense to think of $t_{n}$ as the sum of the first $n$ positive integers. +(This is legitimate even when $n=0$, because the sum of the first $0$ positive +integers is an empty sum, and an empty sum is always defined to be equal to +$0$.) Once we have convinced ourselves that \textquotedblleft the sum of the +first $n$ positive integers\textquotedblright\ is a well-defined concept, it +will be easy to see (by induction) that $t_{n}$ \textbf{is} the sum of the +first $n$ positive integers whenever $n\in\mathbb{N}$. Therefore, Proposition +\ref{prop.rec-seq.triangular} will tell us that the sum of the first $n$ +positive integers equals $\dfrac{n\left( n+1\right) }{2}$ whenever +$n\in\mathbb{N}$. + +For now, let us prove Proposition \ref{prop.rec-seq.triangular}: + +\begin{proof} +[Proof of Proposition \ref{prop.rec-seq.triangular}.]We shall prove +(\ref{eq.prop.rec-seq.triangular.claim}) by induction on $n$: + +\textit{Induction base:} Comparing $t_{0}=0$ with $\dfrac{0\left( 0+1\right) +}{2}=0$, we obtain $t_{0}=\dfrac{0\left( 0+1\right) }{2}$. In other words, +(\ref{eq.prop.rec-seq.triangular.claim}) holds for $n=0$. This completes the +induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{eq.prop.rec-seq.triangular.claim}) holds for $n=m$. We must prove that +(\ref{eq.prop.rec-seq.triangular.claim}) holds for $n=m+1$. + +We have assumed that (\ref{eq.prop.rec-seq.triangular.claim}) holds for $n=m$. +In other words, we have $t_{m}=\dfrac{m\left( m+1\right) }{2}$. + +Recall that $t_{n}=t_{n-1}+n$ for each $n\geq1$. 
Applying this to $n=m+1$, we +obtain% +\begin{align*} +t_{m+1} & =\underbrace{t_{\left( m+1\right) -1}}_{=t_{m}=\dfrac{m\left( +m+1\right) }{2}}+\left( m+1\right) =\dfrac{m\left( m+1\right) }% +{2}+\left( m+1\right) \\ +& =\dfrac{m\left( m+1\right) +2\left( m+1\right) }{2}=\dfrac{\left( +m+2\right) \left( m+1\right) }{2}=\dfrac{\left( m+1\right) \left( +m+2\right) }{2}\\ +& =\dfrac{\left( m+1\right) \left( \left( m+1\right) +1\right) }{2}% +\end{align*} +(since $m+2=\left( m+1\right) +1$). In other words, +(\ref{eq.prop.rec-seq.triangular.claim}) holds for $n=m+1$. This completes the +induction step. Hence, (\ref{eq.prop.rec-seq.triangular.claim}) is proven by +induction. This proves Proposition \ref{prop.rec-seq.triangular}. +\end{proof} + +\subsection{\label{sect.ind.max}Induction on a derived quantity: maxima of +sets} + +\subsubsection{Defining maxima} + +We have so far been applying the Induction Principle in fairly obvious ways: +With the exception of our proof of Proposition \ref{prop.mod.chain}, we have +mostly been doing induction on a variable ($n$ or $k$ or $i$) that already +appeared in the claim that we were proving. But sometimes, it is worth doing +induction on a variable that does \textbf{not} explicitly appear in this claim +(which, formally speaking, means that we introduce a new variable to do +induction on). For example, the claim might be saying \textquotedblleft Each +nonempty finite set $S$ of integers has a largest element\textquotedblright, +and we prove it by induction on $\left\vert S\right\vert -1$. This means that +instead of directly proving the claim itself, we rather prove the equivalent +claim \textquotedblleft For each $n\in\mathbb{N}$, each nonempty finite set +$S$ of integers satisfying $\left\vert S\right\vert -1=n$ has a largest +element\textquotedblright\ by induction on $n$. We shall show this proof in +more detail below (see Theorem \ref{thm.ind.max}). First, we prepare by +discussing largest elements of sets in general. 
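The plan just outlined can be mirrored computationally (a sketch only, with our own function name; it is an illustration of the induction on $\left\vert S\right\vert -1$, not the formal proof): the recursion below finds a largest element of a nonempty finite set of integers by peeling off one element, which is exactly the descent from $\left\vert S\right\vert -1=m+1$ to $m$ that the induction will perform.

```python
# Illustration of the induction on |S| - 1 (a sketch, not the proof):
# every nonempty finite set S of integers has a largest element.

def maximum(S):
    """Return the maximum of a nonempty finite set S of integers."""
    S = set(S)
    t = next(iter(S))      # there exists some t in S, since S is nonempty
    if len(S) == 1:
        return t           # base case: a 1-element set {x} has maximum x
    m = maximum(S - {t})   # induction hypothesis: |S \ {t}| - 1 is smaller
    return m if m >= t else t  # a maximum of (S \ {t}) union {t} = S

assert maximum({2, 4, 5}) == 5
assert maximum({0, -1, -2}) == 0
```

The last line of the function corresponds to combining the maxima of the two sets $S\setminus\left\{ t\right\} $ and $\left\{ t\right\} $, as the formal argument below will do.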
+
+\begin{definition}
+Let $S$ be a set of integers (or rational numbers, or real numbers). A
+\textit{maximum} of $S$ is defined to be an element $s\in S$ that satisfies%
+\[
+\left( s\geq t\text{ for each }t\in S\right) .
+\]
+In other words, a maximum of $S$ is defined to be an element of $S$ which is
+greater than or equal to each element of $S$.
+
+(The plural of the word \textquotedblleft maximum\textquotedblright\ is
+\textquotedblleft maxima\textquotedblright.)
+\end{definition}
+
+\begin{example}
+The set $\left\{ 2,4,5\right\} $ has exactly one maximum: namely, $5$.
+
+The set $\mathbb{N}=\left\{ 0,1,2,\ldots\right\} $ has no maximum: If $k$
+were a maximum of $\mathbb{N}$, then we would have $k\geq k+1$, which is absurd.
+
+The set $\left\{ 0,-1,-2,\ldots\right\} $ has a maximum: namely, $0$.
+
+The set $\varnothing$ has no maximum, since a maximum would have to be an
+element of $\varnothing$.
+\end{example}
+
+In Theorem \ref{thm.ind.max}, we shall soon show that every nonempty finite
+set of integers has a maximum. First, we prove that a maximum is unique if it exists:
+
+\begin{proposition}
+\label{prop.ind.max-uni}Let $S$ be a set of integers (or rational numbers, or
+real numbers). Then, $S$ has \textbf{at most one} maximum.
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.max-uni}.]Let $s_{1}$ and $s_{2}$ be two
+maxima of $S$. We shall show that $s_{1}=s_{2}$.
+
+Indeed, $s_{1}$ is a maximum of $S$. In other words, $s_{1}$ is an element
+$s\in S$ that satisfies $\left( s\geq t\text{ for each }t\in S\right) $ (by
+the definition of a maximum). In other words, $s_{1}$ is an element of $S$ and
+satisfies%
+\begin{equation}
+\left( s_{1}\geq t\text{ for each }t\in S\right) .
+\label{pf.prop.ind.max-uni.1}%
+\end{equation}
+The same argument (applied to $s_{2}$ instead of $s_{1}$) shows that $s_{2}$
+is an element of $S$ and satisfies%
+\begin{equation}
+\left( s_{2}\geq t\text{ for each }t\in S\right) .
+\label{pf.prop.ind.max-uni.2}% +\end{equation} + + +Now, $s_{1}$ is an element of $S$. Hence, (\ref{pf.prop.ind.max-uni.2}) +(applied to $t=s_{1}$) yields $s_{2}\geq s_{1}$. But the same argument (with +the roles of $s_{1}$ and $s_{2}$ interchanged) shows that $s_{1}\geq s_{2}$. +Combining this with $s_{2}\geq s_{1}$, we obtain $s_{1}=s_{2}$. + +Now, forget that we fixed $s_{1}$ and $s_{2}$. We thus have shown that if +$s_{1}$ and $s_{2}$ are two maxima of $S$, then $s_{1}=s_{2}$. In other words, +any two maxima of $S$ are equal. In other words, $S$ has \textbf{at most one} +maximum. This proves Proposition \ref{prop.ind.max-uni}. +\end{proof} + +\begin{definition} +Let $S$ be a set of integers (or rational numbers, or real numbers). +Proposition \ref{prop.ind.max-uni} shows that $S$ has \textbf{at most one} +maximum. Thus, if $S$ has a maximum, then this maximum is the unique maximum +of $S$; we shall thus call it \textit{the maximum} of $S$ or \textit{the +largest element} of $S$. We shall denote this maximum by $\max S$. +\end{definition} + +Thus, if $S$ is a set of integers (or rational numbers, or real numbers) that +has a maximum, then this maximum $\max S$ satisfies +\begin{equation} +\max S\in S \label{eq.ind.max.def-max.1}% +\end{equation} +and% +\begin{equation} +\left( \max S\geq t\text{ for each }t\in S\right) +\label{eq.ind.max.def-max.2}% +\end{equation} +(because of the definition of a maximum). + +Let us next show two simple facts: + +\begin{lemma} +\label{lem.ind.max-1el}Let $x$ be an integer (or rational number, or real +number). Then, the set $\left\{ x\right\} $ has a maximum, namely $x$. +\end{lemma} + +\begin{proof} +[Proof of Lemma \ref{lem.ind.max-1el}.]Clearly, $x\geq x$. Thus, $x\geq t$ for +each $t\in\left\{ x\right\} $ (because the only $t\in\left\{ x\right\} $ +is $x$). 
In other words, $x$ is an element $s\in\left\{ x\right\} $ that +satisfies \newline$\left( s\geq t\text{ for each }t\in\left\{ x\right\} +\right) $ (since $x\in\left\{ x\right\} $). + +But recall that a maximum of $\left\{ x\right\} $ means an element +$s\in\left\{ x\right\} $ that satisfies $\left( s\geq t\text{ for each +}t\in\left\{ x\right\} \right) $ (by the definition of a maximum). Hence, +$x$ is a maximum of $\left\{ x\right\} $ (since $x$ is such an element). +Thus, the set $\left\{ x\right\} $ has a maximum, namely $x$. This proves +Lemma \ref{lem.ind.max-1el}. +\end{proof} + +\begin{proposition} +\label{prop.ind.max-PuQ}Let $P$ and $Q$ be two sets of integers (or rational +numbers, or real numbers). Assume that $P$ has a maximum, and assume that $Q$ +has a maximum. Then, the set $P\cup Q$ has a maximum. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.max-PuQ}.]We know that $P$ has a maximum; +it is denoted by $\max P$. We also know that $Q$ has a maximum; it is denoted +by $\max Q$. The sets $P$ and $Q$ play symmetric roles in Proposition +\ref{prop.ind.max-PuQ} (since $P\cup Q=Q\cup P$). Thus, we can WLOG assume +that $\max P\geq\max Q$ (since otherwise, we can simply swap $P$ with $Q$, +without altering the meaning of Proposition \ref{prop.ind.max-PuQ}). Assume this. + +Now, (\ref{eq.ind.max.def-max.1}) (applied to $S=P$) shows that $\max P\in +P\subseteq P\cup Q$. Furthermore, we claim that% +\begin{equation} +\left( \max P\geq t\text{ for each }t\in P\cup Q\right) . +\label{pf.prop.ind.max-PuQ.1}% +\end{equation} + + +[\textit{Proof of (\ref{pf.prop.ind.max-PuQ.1}):} Let $t\in P\cup Q$. We must +show that $\max P\geq t$. + +We have $t\in P\cup Q$. In other words, $t\in P$ or $t\in Q$. Hence, we are in +one of the following two cases: + +\textit{Case 1:} We have $t\in P$. + +\textit{Case 2:} We have $t\in Q$. + +(These two cases might have overlap, but there is nothing wrong about this.) + +Let us first consider Case 1. 
In this case, we have $t\in P$. Hence, +(\ref{eq.ind.max.def-max.2}) (applied to $S=P$) yields $\max P\geq t$. Hence, +$\max P\geq t$ is proven in Case 1. + +Let us next consider Case 2. In this case, we have $t\in Q$. Hence, +(\ref{eq.ind.max.def-max.2}) (applied to $S=Q$) yields $\max Q\geq t$. Hence, +$\max P\geq\max Q\geq t$. Thus, $\max P\geq t$ is proven in Case 2. + +We have now proven $\max P\geq t$ in each of the two Cases 1 and 2. Since +these two Cases cover all possibilities, we thus conclude that $\max P\geq t$ +always holds. This proves (\ref{pf.prop.ind.max-PuQ.1}).] + +Now, $\max P$ is an element $s\in P\cup Q$ that satisfies $\left( s\geq +t\text{ for each }t\in P\cup Q\right) $ (since $\max P\in P\cup Q$ and +$\left( \max P\geq t\text{ for each }t\in P\cup Q\right) $). + +But recall that a maximum of $P\cup Q$ means an element $s\in P\cup Q$ that +satisfies $\left( s\geq t\text{ for each }t\in P\cup Q\right) $ (by the +definition of a maximum). Hence, $\max P$ is a maximum of $P\cup Q$ (since +$\max P$ is such an element). Thus, the set $P\cup Q$ has a maximum. This +proves Proposition \ref{prop.ind.max-PuQ}. +\end{proof} + +\subsubsection{Nonempty finite sets of integers have maxima} + +\begin{theorem} +\label{thm.ind.max}Let $S$ be a nonempty finite set of integers. Then, $S$ has +a maximum. +\end{theorem} + +\begin{proof} +[First proof of Theorem \ref{thm.ind.max}.]First of all, let us forget that we +fixed $S$. So we want to prove that if $S$ is a nonempty finite set of +integers, then $S$ has a maximum. + +For each $n\in\mathbb{N}$, we let $\mathcal{A}\left( n\right) $ be the +statement% +\[ +\left( +\begin{array} +[c]{c}% +\text{if }S\text{ is a nonempty finite set of integers satisfying }\left\vert +S\right\vert -1=n\text{,}\\ +\text{then }S\text{ has a maximum}% +\end{array} +\right) . +\] + + +We claim that $\mathcal{A}\left( n\right) $ holds for all $n\in\mathbb{N}$. 
+ +Indeed, let us prove this by induction on $n$: + +\textit{Induction base:} If $S$ is a nonempty finite set of integers +satisfying $\left\vert S\right\vert -1=0$, then $S$ has a +maximum\footnote{\textit{Proof.} Let $S$ be a nonempty finite set of integers +satisfying $\left\vert S\right\vert -1=0$. We must show that $S$ has a +maximum. +\par +Indeed, $\left\vert S\right\vert =1$ (since $\left\vert S\right\vert -1=0$). +In other words, $S$ is a $1$-element set. In other words, $S=\left\{ +x\right\} $ for some integer $x$. Consider this $x$. Lemma +\ref{lem.ind.max-1el} shows that the set $\left\{ x\right\} $ has a maximum. +In other words, the set $S$ has a maximum (since $S=\left\{ x\right\} $). +This completes our proof.}. But this is exactly the statement $\mathcal{A}% +\left( 0\right) $. Hence, $\mathcal{A}\left( 0\right) $ holds. This +completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that $\mathcal{A}\left( +m\right) $ holds. We shall now show that $\mathcal{A}\left( m+1\right) $ holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds. In other words,% +\begin{equation} +\left( +\begin{array} +[c]{c}% +\text{if }S\text{ is a nonempty finite set of integers satisfying }\left\vert +S\right\vert -1=m\text{,}\\ +\text{then }S\text{ has a maximum}% +\end{array} +\right) \label{pf.thm.ind.max.IH}% +\end{equation} +(because this is what the statement $\mathcal{A}\left( m\right) $ says). + +Now, let $S$ be a nonempty finite set of integers satisfying $\left\vert +S\right\vert -1=m+1$. There exists some $t\in S$ (since $S$ is nonempty). +Consider this $t$. We have $\left( S\setminus\left\{ t\right\} \right) +\cup\left\{ t\right\} =S\cup\left\{ t\right\} =S$ (since $t\in S$). + +From $t\in S$, we obtain $\left\vert S\setminus\left\{ t\right\} \right\vert +=\left\vert S\right\vert -1=m+1>m\geq0$ (since $m\in\mathbb{N}$). Hence, the +set $S\setminus\left\{ t\right\} $ is nonempty. 
Furthermore, this set +$S\setminus\left\{ t\right\} $ is finite (since $S$ is finite) and satisfies +$\left\vert S\setminus\left\{ t\right\} \right\vert -1=m$ (since $\left\vert +S\setminus\left\{ t\right\} \right\vert =m+1$). Hence, +(\ref{pf.thm.ind.max.IH}) (applied to $S\setminus\left\{ t\right\} $ instead +of $S$) shows that $S\setminus\left\{ t\right\} $ has a maximum. Also, Lemma +\ref{lem.ind.max-1el} (applied to $x=t$) shows that the set $\left\{ +t\right\} $ has a maximum, namely $t$. Hence, Proposition +\ref{prop.ind.max-PuQ} (applied to $P=S\setminus\left\{ t\right\} $ and +$Q=\left\{ t\right\} $) shows that the set $\left( S\setminus\left\{ +t\right\} \right) \cup\left\{ t\right\} $ has a maximum. Since $\left( +S\setminus\left\{ t\right\} \right) \cup\left\{ t\right\} =S$, this +rewrites as follows: The set $S$ has a maximum. + +Now, forget that we fixed $S$. We thus have shown that if $S$ is a nonempty +finite set of integers satisfying $\left\vert S\right\vert -1=m+1$, then $S$ +has a maximum. But this is precisely the statement $\mathcal{A}\left( +m+1\right) $. Hence, we have shown that $\mathcal{A}\left( m+1\right) $ +holds. This completes the induction step. + +Thus, we have proven (by induction) that $\mathcal{A}\left( n\right) $ holds +for all $n\in\mathbb{N}$. In other words, for all $n\in\mathbb{N}$, the +following holds:% +\begin{equation} +\left( +\begin{array} +[c]{c}% +\text{if }S\text{ is a nonempty finite set of integers satisfying }\left\vert +S\right\vert -1=n\text{,}\\ +\text{then }S\text{ has a maximum}% +\end{array} +\right) \label{pf.thm.ind.max.AT}% +\end{equation} +(because this is what $\mathcal{A}\left( n\right) $ says). + +Now, let $S$ be a nonempty finite set of integers. We shall prove that $S$ has +a maximum. + +Indeed, $\left\vert S\right\vert \in\mathbb{N}$ (since $S$ is finite) and +$\left\vert S\right\vert >0$ (since $S$ is nonempty); hence, $\left\vert +S\right\vert \geq1$. 
Thus, $\left\vert S\right\vert -1\geq0$, so that +$\left\vert S\right\vert -1\in\mathbb{N}$. Hence, we can define $n\in +\mathbb{N}$ by $n=\left\vert S\right\vert -1$. Consider this $n$. Thus, +$\left\vert S\right\vert -1=n$. Hence, (\ref{pf.thm.ind.max.AT}) shows that +$S$ has a maximum. This proves Theorem \ref{thm.ind.max}. +\end{proof} + +\subsubsection{Conventions for writing induction proofs on derived quantities} + +Let us take a closer look at the proof we just gave. The definition of the +statement $\mathcal{A}\left( n\right) $ was not exactly unmotivated: This +statement simply says that Theorem \ref{thm.ind.max} holds under the condition +that $\left\vert S\right\vert -1=n$. Thus, by introducing $\mathcal{A}\left( +n\right) $, we have \textquotedblleft sliced\textquotedblright\ Theorem +\ref{thm.ind.max} into a sequence of statements $\mathcal{A}\left( 0\right) +,\mathcal{A}\left( 1\right) ,\mathcal{A}\left( 2\right) ,\ldots$, which +then allowed us to prove these statements by induction on $n$ even though no +\textquotedblleft$n$\textquotedblright\ appeared in Theorem \ref{thm.ind.max} +itself. This kind of strategy applies to various other problems. Again, we +don't need to explicitly define the statement $\mathcal{A}\left( n\right) $ +if it is simply saying that the claim we are trying to prove (in our case, +Theorem \ref{thm.ind.max}) holds under the condition that $\left\vert +S\right\vert -1=n$; we can just say that we are doing \textquotedblleft +induction on $\left\vert S\right\vert -1$\textquotedblright. More generally: + +\begin{convention} +\label{conv.ind.IP0der}Let $\mathcal{B}$ be a logical statement that involves +some variables $v_{1},v_{2},v_{3},\ldots$. (For example, $\mathcal{B}$ can be +the statement of Theorem \ref{thm.ind.max}; then, there is only one variable, +namely $S$.) 
+ +Let $q$ be some expression (involving the variables $v_{1},v_{2},v_{3},\ldots$ +or some of them) that has the property that whenever the variables +$v_{1},v_{2},v_{3},\ldots$ satisfy the assumptions of $\mathcal{B}$, the +expression $q$ evaluates to some nonnegative integer. (For example, if +$\mathcal{B}$ is the statement of Theorem \ref{thm.ind.max}, then $q$ can be +the expression $\left\vert S\right\vert -1$, because it is easily seen that if +$S$ is a nonempty finite set of integers, then $\left\vert S\right\vert -1$ is +a nonnegative integer.) + +Assume that you want to prove the statement $\mathcal{B}$. Then, you can +proceed as follows: For each $n\in\mathbb{N}$, define $\mathcal{A}\left( +n\right) $ to be the statement saying that\footnotemark% +\[ +\left( \text{the statement }\mathcal{B}\text{ holds under the condition that +}q=n\right) . +\] +Then, prove $\mathcal{A}\left( n\right) $ by induction on $n$. Thus: + +\begin{itemize} +\item The \textit{induction base} consists in proving that the statement +$\mathcal{B}$ holds under the condition that $q=0$. + +\item The \textit{induction step} consists in fixing $m\in\mathbb{N}$, and +showing that if the statement $\mathcal{B}$ holds under the condition that +$q=m$, then the statement $\mathcal{B}$ holds under the condition that $q=m+1$. +\end{itemize} + +Once this induction proof is finished, it immediately follows that the +statement $\mathcal{B}$ always holds. + +This strategy of proof is called \textquotedblleft induction on $q$% +\textquotedblright\ (or \textquotedblleft induction over $q$\textquotedblright% +). Once you have specified what $q$ is, you don't need to explicitly define +$\mathcal{A}\left( n\right) $, nor do you ever need to mention $n$. 
+\end{convention} + +\footnotetext{We assume that no variable named \textquotedblleft% +$n$\textquotedblright\ appears in the statement $\mathcal{B}$; otherwise, we +need a different letter for our new variable in order to avoid confusion.}% +Using this convention, we can rewrite our above proof of Theorem +\ref{thm.ind.max} as follows: + +\begin{proof} +[First proof of Theorem \ref{thm.ind.max} (second version).]It is easy to see +that $\left\vert S\right\vert -1\in\mathbb{N}$% +\ \ \ \ \footnote{\textit{Proof.} We have $\left\vert S\right\vert +\in\mathbb{N}$ (since $S$ is finite) and $\left\vert S\right\vert >0$ (since +$S$ is nonempty); hence, $\left\vert S\right\vert \geq1$. Thus, $\left\vert +S\right\vert -1\in\mathbb{N}$, qed.}. Hence, we can apply induction on +$\left\vert S\right\vert -1$ to prove Theorem \ref{thm.ind.max}: + +\textit{Induction base:} Theorem \ref{thm.ind.max} holds under the condition +that $\left\vert S\right\vert -1=0$\ \ \ \ \footnote{\textit{Proof.} Let $S$ +be as in Theorem \ref{thm.ind.max}, and assume that $\left\vert S\right\vert +-1=0$. We must show that the claim of Theorem \ref{thm.ind.max} holds. +\par +Indeed, $\left\vert S\right\vert =1$ (since $\left\vert S\right\vert -1=0$). +In other words, $S$ is a $1$-element set. In other words, $S=\left\{ +x\right\} $ for some integer $x$. Consider this $x$. Lemma +\ref{lem.ind.max-1el} shows that the set $\left\{ x\right\} $ has a maximum. +In other words, the set $S$ has a maximum (since $S=\left\{ x\right\} $). In +other words, the claim of Theorem \ref{thm.ind.max} holds. This completes our +proof.}. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem +\ref{thm.ind.max} holds under the condition that $\left\vert S\right\vert +-1=m$. We shall now show that Theorem \ref{thm.ind.max} holds under the +condition that $\left\vert S\right\vert -1=m+1$. 
+ +We have assumed that Theorem \ref{thm.ind.max} holds under the condition that +$\left\vert S\right\vert -1=m$. In other words,% +\begin{equation} +\left( +\begin{array} +[c]{c}% +\text{if }S\text{ is a nonempty finite set of integers satisfying }\left\vert +S\right\vert -1=m\text{,}\\ +\text{then }S\text{ has a maximum}% +\end{array} +\right) . \label{pf.thm.ind.max.ver2.1}% +\end{equation} + + +Now, let $S$ be a nonempty finite set of integers satisfying $\left\vert +S\right\vert -1=m+1$. There exists some $t\in S$ (since $S$ is nonempty). +Consider this $t$. We have $\left( S\setminus\left\{ t\right\} \right) +\cup\left\{ t\right\} =S\cup\left\{ t\right\} =S$ (since $t\in S$). + +From $t\in S$, we obtain $\left\vert S\setminus\left\{ t\right\} \right\vert +=\left\vert S\right\vert -1=m+1>m\geq0$ (since $m\in\mathbb{N}$). Hence, the +set $S\setminus\left\{ t\right\} $ is nonempty. Furthermore, this set +$S\setminus\left\{ t\right\} $ is finite (since $S$ is finite) and satisfies +$\left\vert S\setminus\left\{ t\right\} \right\vert -1=m$ (since $\left\vert +S\setminus\left\{ t\right\} \right\vert =m+1$). Hence, +(\ref{pf.thm.ind.max.ver2.1}) (applied to $S\setminus\left\{ t\right\} $ +instead of $S$) shows that $S\setminus\left\{ t\right\} $ has a maximum. +Also, Lemma \ref{lem.ind.max-1el} (applied to $x=t$) shows that the set +$\left\{ t\right\} $ has a maximum, namely $t$. Hence, Proposition +\ref{prop.ind.max-PuQ} (applied to $P=S\setminus\left\{ t\right\} $ and +$Q=\left\{ t\right\} $) shows that the set $\left( S\setminus\left\{ +t\right\} \right) \cup\left\{ t\right\} $ has a maximum. Since $\left( +S\setminus\left\{ t\right\} \right) \cup\left\{ t\right\} =S$, this +rewrites as follows: The set $S$ has a maximum. + +Now, forget that we fixed $S$. We thus have shown that if $S$ is a nonempty +finite set of integers satisfying $\left\vert S\right\vert -1=m+1$, then $S$ +has a maximum. 
In other words, Theorem \ref{thm.ind.max} holds under the +condition that $\left\vert S\right\vert -1=m+1$. This completes the induction +step. Thus, the induction proof of Theorem \ref{thm.ind.max} is complete. +\end{proof} + +We could have shortened this proof even further if we didn't explicitly state +(\ref{pf.thm.ind.max.ver2.1}), but rather (instead of applying +(\ref{pf.thm.ind.max.ver2.1})) said that \textquotedblleft we can apply +Theorem \ref{thm.ind.max} to $S\setminus\left\{ t\right\} $ instead of +$S$\textquotedblright. + +Let us stress again that, in order to prove Theorem \ref{thm.ind.max} by +induction on $\left\vert S\right\vert -1$, we had to check that $\left\vert +S\right\vert -1\in\mathbb{N}$ whenever $S$ satisfies the assumptions of +Theorem \ref{thm.ind.max}.\footnote{In our first version of the above proof, +we checked this at the end; in the second version, we checked it at the +beginning of the proof.} This check was necessary. For example, if we had +instead tried to proceed by induction on $\left\vert S\right\vert -2$, then we +would only have proven Theorem \ref{thm.ind.max} under the condition that +$\left\vert S\right\vert -2\in\mathbb{N}$; but this condition isn't always +satisfied (indeed, it misses the case when $S$ is a $1$-element set). + +\subsubsection{Vacuous truth and induction bases} + +Can we also prove Theorem \ref{thm.ind.max} by induction on $\left\vert +S\right\vert $ (instead of $\left\vert S\right\vert -1$)? This seems a bit +strange, since $\left\vert S\right\vert $ can never be $0$ in Theorem +\ref{thm.ind.max} (because $S$ is required to be nonempty), so that the +induction base would be talking about a situation that never occurs. However, +there is nothing wrong about it, and we already do talk about such situations +oftentimes (for example, every time we make a proof by contradiction). 
The +following concept from basic logic explains this: + +\begin{convention} +\label{conv.logic.vacuous} \textbf{(a)} A logical statement of the form +\textquotedblleft if $\mathcal{A}$, then $\mathcal{B}$\textquotedblright% +\ (where $\mathcal{A}$ and $\mathcal{B}$ are two statements) is said to be +\textit{vacuously true} if $\mathcal{A}$ does not hold. For example, the +statement \textquotedblleft if $0=1$, then every set is +empty\textquotedblright\ is vacuously true, because $0=1$ is false. The +statement \textquotedblleft if $0=1$, then $1=1$\textquotedblright\ is also +vacuously true, although its truth can also be seen as a consequence of the +fact that $1=1$ is true. + +By the laws of logic, a vacuously true statement is always true! This may +sound counterintuitive, but actually makes perfect sense: A statement +\textquotedblleft if $\mathcal{A}$, then $\mathcal{B}$\textquotedblright\ only +says anything about situations where $\mathcal{A}$ holds. If $\mathcal{A}$ +never holds, then it therefore says nothing. And when you are saying nothing, +you are certainly not lying. + +The principle that a vacuously true statement always holds is known as +\textquotedblleft\textit{ex falso quodlibet}\textquotedblright\ (literal +translation: \textquotedblleft from the false, anything\textquotedblright) or +\textquotedblleft% +\href{https://en.wikipedia.org/wiki/Principle_of_explosion}{\textit{principle +of explosion}}\textquotedblright. It can be restated as follows: From a false +statement, any statement follows. + +\textbf{(b)} Now, let $X$ be a set, and let $\mathcal{A}\left( x\right) $ +and $\mathcal{B}\left( x\right) $ be two statements defined for each $x\in +X$. A statement of the form \textquotedblleft for each $x\in X$ satisfying +$\mathcal{A}\left( x\right) $, we have $\mathcal{B}\left( x\right) +$\textquotedblright\ will automatically hold if there exists no $x\in X$ +satisfying $\mathcal{A}\left( x\right) $. 
(Indeed, this statement can be +rewritten as \textquotedblleft for each $x\in X$, we have $\left( \text{if +}\mathcal{A}\left( x\right) \text{, then }\mathcal{B}\left( x\right) +\right) $\textquotedblright; but this holds because the statement +\textquotedblleft if $\mathcal{A}\left( x\right) $, then $\mathcal{B}\left( +x\right) $\textquotedblright\ is vacuously true for each $x\in X$.) Such a +statement will also be called \textit{vacuously true}. + +For example, the statement \textquotedblleft if $n\in\mathbb{N}$ is both odd +and even, then $n=n+1$\textquotedblright\ is vacuously true, since no +$n\in\mathbb{N}$ can be both odd and even at the same time. + +\textbf{(c)} Now, let $X$ be the empty set (that is, $X=\varnothing$), and let +$\mathcal{B}\left( x\right) $ be a statement defined for each $x\in X$. +Then, a statement of the form \textquotedblleft for each $x\in X$, we have +$\mathcal{B}\left( x\right) $\textquotedblright\ will automatically hold. +(Indeed, this statement can be rewritten as \textquotedblleft for each $x\in +X$, we have $\left( \text{if }x\in X\text{, then }\mathcal{B}\left( +x\right) \right) $\textquotedblright; but this holds because the statement +\textquotedblleft if $x\in X$, then $\mathcal{B}\left( x\right) +$\textquotedblright\ is vacuously true for each $x\in X$, since its premise +($x\in X$) is false.) Again, such a statement is said to be \textit{vacuously +true}. + +For example, the statement \textquotedblleft for each $x\in\varnothing$, we +have $x\neq x$\textquotedblright\ is vacuously true (because there exists no +$x\in\varnothing$). +\end{convention} + +Thus, if we try to prove Theorem \ref{thm.ind.max} by induction on $\left\vert +S\right\vert $, then the induction base becomes vacuously true. However, the +induction step becomes more complicated, since we can no longer argue that +$S\setminus\left\{ t\right\} $ is nonempty, but instead have to account for +the case when $S\setminus\left\{ t\right\} $ is empty as well. 
So we gain
+and we lose at the same time. Here is what this proof looks like:
+
+\begin{proof}
+[Second proof of Theorem \ref{thm.ind.max}.]Clearly, $\left\vert S\right\vert
+\in\mathbb{N}$ (since $S$ is a finite set). Hence, we can apply induction on
+$\left\vert S\right\vert $ to prove Theorem \ref{thm.ind.max}:
+
+\textit{Induction base:} Theorem \ref{thm.ind.max} holds under the condition
+that $\left\vert S\right\vert =0$\ \ \ \ \footnote{\textit{Proof.} Let $S$ be
+as in Theorem \ref{thm.ind.max}, and assume that $\left\vert S\right\vert =0$.
+We must show that the claim of Theorem \ref{thm.ind.max} holds.
+\par
+Indeed, $\left\vert S\right\vert =0$, so that $S$ is the empty set. This
+contradicts the assumption that $S$ be nonempty. From this contradiction, we
+conclude that everything holds (by the \textquotedblleft ex falso
+quodlibet\textquotedblright\ principle). Thus, in particular, the claim of
+Theorem \ref{thm.ind.max} holds. This completes our proof.}. This completes
+the induction base.
+
+\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem
+\ref{thm.ind.max} holds under the condition that $\left\vert S\right\vert =m$.
+We shall now show that Theorem \ref{thm.ind.max} holds under the condition
+that $\left\vert S\right\vert =m+1$.
+
+We have assumed that Theorem \ref{thm.ind.max} holds under the condition that
+$\left\vert S\right\vert =m$. In other words,%
+\begin{equation}
+\left(
+\begin{array}
+[c]{c}%
+\text{if }S\text{ is a nonempty finite set of integers satisfying }\left\vert
+S\right\vert =m\text{,}\\
+\text{then }S\text{ has a maximum}%
+\end{array}
+\right) . \label{pf.thm.ind.max.ver3.1}%
+\end{equation}
+
+
+Now, let $S$ be a nonempty finite set of integers satisfying $\left\vert
+S\right\vert =m+1$. We want to prove that $S$ has a maximum.
+
+There exists some $t\in S$ (since $S$ is nonempty). Consider this $t$. 
We have +$\left( S\setminus\left\{ t\right\} \right) \cup\left\{ t\right\} +=S\cup\left\{ t\right\} =S$ (since $t\in S$). Lemma \ref{lem.ind.max-1el} +(applied to $x=t$) shows that the set $\left\{ t\right\} $ has a maximum, +namely $t$. + +We are in one of the following two cases: + +\textit{Case 1:} We have $S\setminus\left\{ t\right\} =\varnothing$. + +\textit{Case 2:} We have $S\setminus\left\{ t\right\} \neq\varnothing$. + +Let us first consider Case 1. In this case, we have $S\setminus\left\{ +t\right\} =\varnothing$. Hence, $S\subseteq\left\{ t\right\} $. Thus, +either $S=\varnothing$ or $S=\left\{ t\right\} $ (since the only subsets of +$\left\{ t\right\} $ are $\varnothing$ and $\left\{ t\right\} $). Since +$S=\varnothing$ is impossible (because $S$ is nonempty), we thus have +$S=\left\{ t\right\} $. But the set $\left\{ t\right\} $ has a maximum. In +view of $S=\left\{ t\right\} $, this rewrites as follows: The set $S$ has a +maximum. Thus, our goal (to prove that $S$ has a maximum) is achieved in Case 1. + +Let us now consider Case 2. In this case, we have $S\setminus\left\{ +t\right\} \neq\varnothing$. Hence, the set $S\setminus\left\{ t\right\} $ +is nonempty. From $t\in S$, we obtain $\left\vert S\setminus\left\{ +t\right\} \right\vert =\left\vert S\right\vert -1=m$ (since $\left\vert +S\right\vert =m+1$). Furthermore, the set $S\setminus\left\{ t\right\} $ is +finite (since $S$ is finite). Hence, (\ref{pf.thm.ind.max.ver3.1}) (applied to +$S\setminus\left\{ t\right\} $ instead of $S$) shows that $S\setminus +\left\{ t\right\} $ has a maximum. Also, recall that the set $\left\{ +t\right\} $ has a maximum. Hence, Proposition \ref{prop.ind.max-PuQ} (applied +to $P=S\setminus\left\{ t\right\} $ and $Q=\left\{ t\right\} $) shows that +the set $\left( S\setminus\left\{ t\right\} \right) \cup\left\{ +t\right\} $ has a maximum. Since $\left( S\setminus\left\{ t\right\} +\right) \cup\left\{ t\right\} =S$, this rewrites as follows: The set $S$ +has a maximum. 
Hence, our goal (to prove that $S$ has a maximum) is achieved +in Case 2. + +We have now proven that $S$ has a maximum in each of the two Cases 1 and 2. +Therefore, $S$ always has a maximum (since Cases 1 and 2 cover all possibilities). + +Now, forget that we fixed $S$. We thus have shown that if $S$ is a nonempty +finite set of integers satisfying $\left\vert S\right\vert =m+1$, then $S$ has +a maximum. In other words, Theorem \ref{thm.ind.max} holds under the condition +that $\left\vert S\right\vert =m+1$. This completes the induction step. Thus, +the induction proof of Theorem \ref{thm.ind.max} is complete. +\end{proof} + +\subsubsection{Further results on maxima and minima} + +We can replace \textquotedblleft integers\textquotedblright\ by +\textquotedblleft rational numbers\textquotedblright\ or \textquotedblleft +real numbers\textquotedblright\ in Theorem \ref{thm.ind.max}; all the proofs +given above still apply then. Thus, we obtain the following: + +\begin{theorem} +\label{thm.ind.max2}Let $S$ be a nonempty finite set of integers (or rational +numbers, or real numbers). Then, $S$ has a maximum. +\end{theorem} + +Hence, if $S$ is a nonempty finite set of integers (or rational numbers, or +real numbers), then $\max S$ is well-defined (because Theorem +\ref{thm.ind.max2} shows that $S$ has a maximum, and Proposition +\ref{prop.ind.max-uni} shows that this maximum is unique). + +Moreover, just as we have defined maxima (i.e., largest elements) of sets, we +can define minima (i.e., smallest elements) of sets, and prove similar results +about them: + +\begin{definition} +Let $S$ be a set of integers (or rational numbers, or real numbers). A +\textit{minimum} of $S$ is defined to be an element $s\in S$ that satisfies% +\[ +\left( s\leq t\text{ for each }t\in S\right) . +\] +In other words, a minimum of $S$ is defined to be an element of $S$ which is +less or equal to each element of $S$. 
+ +(The plural of the word \textquotedblleft minimum\textquotedblright\ is +\textquotedblleft minima\textquotedblright.) +\end{definition} + +\begin{example} +The set $\left\{ 2,4,5\right\} $ has exactly one minimum: namely, $2$. + +The set $\mathbb{N}=\left\{ 0,1,2,\ldots\right\} $ has exactly one minimum: +namely, $0$. + +The set $\left\{ 0,-1,-2,\ldots\right\} $ has no minimum: If $k$ was a +minimum of this set, then we would have $k\leq k-1$, which is absurd. + +The set $\varnothing$ has no minimum, since a minimum would have to be an +element of $\varnothing$. +\end{example} + +The analogue of Proposition \ref{prop.ind.max-uni} for minima instead of +maxima looks exactly as one would expect it: + +\begin{proposition} +\label{prop.ind.min-uni}Let $S$ be a set of integers (or rational numbers, or +real numbers). Then, $S$ has \textbf{at most one} minimum. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.min-uni}.]To obtain a proof of Proposition +\ref{prop.ind.min-uni}, it suffices to replace every \textquotedblleft$\geq +$\textquotedblright\ sign by a \textquotedblleft$\leq$\textquotedblright\ sign +(and every word \textquotedblleft maximum\textquotedblright\ by +\textquotedblleft minimum\textquotedblright) in the proof of Proposition +\ref{prop.ind.max-uni} given above. +\end{proof} + +\begin{definition} +Let $S$ be a set of integers (or rational numbers, or real numbers). +Proposition \ref{prop.ind.min-uni} shows that $S$ has \textbf{at most one} +minimum. Thus, if $S$ has a minimum, then this minimum is the unique minimum +of $S$; we shall thus call it \textit{the minimum} of $S$ or \textit{the +smallest element} of $S$. We shall denote this minimum by $\min S$. +\end{definition} + +The analogue of Theorem \ref{thm.ind.max2} is the following: + +\begin{theorem} +\label{thm.ind.min2}Let $S$ be a nonempty finite set of integers (or rational +numbers, or real numbers). Then, $S$ has a minimum. 
+\end{theorem}
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.min2}.]To obtain a proof of Theorem
+\ref{thm.ind.min2}, it suffices to replace every \textquotedblleft$\geq
+$\textquotedblright\ sign by a \textquotedblleft$\leq$\textquotedblright\ sign
+(and every word \textquotedblleft maximum\textquotedblright\ by
+\textquotedblleft minimum\textquotedblright) in the proof of Theorem
+\ref{thm.ind.max2} given above (and also in the proofs of all the auxiliary
+results that were used in said proof).\footnote{To be technically precise: not
+every \textquotedblleft$\geq$\textquotedblright\ sign, of course. The
+\textquotedblleft$\geq$\textquotedblright\ sign in \textquotedblleft$m\geq
+0$\textquotedblright\ should stay unchanged.}
+\end{proof}
+
+Alternatively, Theorem \ref{thm.ind.min2} can be obtained from Theorem
+\ref{thm.ind.max2} by applying the latter theorem to the set $\left\{
+-s\ \mid\ s\in S\right\} $. In fact, it is easy to see that a number $x$ is
+the minimum of $S$ if and only if $-x$ is the maximum of the set $\left\{
+-s\ \mid\ s\in S\right\} $. We leave the details of this simple argument to
+the reader.
+
+We should also mention that Theorem \ref{thm.ind.min2} holds \textbf{without}
+requiring that $S$ be finite, if we instead require that $S$ consist of
+nonnegative integers:
+
+\begin{theorem}
+\label{thm.ind.max3}Let $S$ be a nonempty set of nonnegative integers. Then,
+$S$ has a minimum.
+\end{theorem}
+
+But $S$ does not necessarily have a maximum in this situation; the
+nonnegativity requirement has \textquotedblleft broken the
+symmetry\textquotedblright\ between maxima and minima.
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.max3}.]The set $S$ is nonempty. Thus, there
+exists some $p\in S$. Consider this $p$.
+
+We have $p\in S\subseteq\mathbb{N}$ (since $S$ is a set of nonnegative
+integers). Thus, $p\in\left\{ 0,1,\ldots,p\right\} $. Combining this with
+$p\in S$, we obtain $p\in\left\{ 0,1,\ldots,p\right\} \cap S$. 
Hence, the +set $\left\{ 0,1,\ldots,p\right\} \cap S$ contains the element $p$, and thus +is nonempty. Moreover, this set $\left\{ 0,1,\ldots,p\right\} \cap S$ is a +subset of the finite set $\left\{ 0,1,\ldots,p\right\} $, and thus is finite. + +Now we know that $\left\{ 0,1,\ldots,p\right\} \cap S$ is a nonempty finite +set of integers. Hence, Theorem \ref{thm.ind.min2} (applied to $\left\{ +0,1,\ldots,p\right\} \cap S$ instead of $S$) shows that the set $\left\{ +0,1,\ldots,p\right\} \cap S$ has a minimum. Denote this minimum by $m$. + +Hence, $m$ is a minimum of the set $\left\{ 0,1,\ldots,p\right\} \cap S$. In +other words, $m$ is an element $s\in\left\{ 0,1,\ldots,p\right\} \cap S$ +that satisfies% +\[ +\left( s\leq t\text{ for each }t\in\left\{ 0,1,\ldots,p\right\} \cap +S\right) +\] +(by the definition of a minimum). In other words, $m$ is an element of +$\left\{ 0,1,\ldots,p\right\} \cap S$ and satisfies% +\begin{equation} +\left( m\leq t\text{ for each }t\in\left\{ 0,1,\ldots,p\right\} \cap +S\right) . \label{pf.thm.ind.max3.m1}% +\end{equation} + + +Hence, $m\in\left\{ 0,1,\ldots,p\right\} \cap S\subseteq\left\{ +0,1,\ldots,p\right\} $, so that $m\leq p$. + +Furthermore, $m\in\left\{ 0,1,\ldots,p\right\} \cap S\subseteq S$. Moreover, +we have% +\begin{equation} +\left( m\leq t\text{ for each }t\in S\right) . \label{pf.thm.ind.max3.m2}% +\end{equation} + + +\begin{vershort} +[\textit{Proof of (\ref{pf.thm.ind.max3.m2}):} Let $t\in S$. We must prove +that $m\leq t$. + +If $t\in\left\{ 0,1,\ldots,p\right\} \cap S$, then this follows from +(\ref{pf.thm.ind.max3.m1}). Hence, for the rest of this proof, we can WLOG +assume that we don't have $t\in\left\{ 0,1,\ldots,p\right\} \cap S$. Assume +this. Thus, $t\notin\left\{ 0,1,\ldots,p\right\} \cap S$. Combining $t\in S$ +with $t\notin\left\{ 0,1,\ldots,p\right\} \cap S$, we obtain +\[ +t\in S\setminus\left( \left\{ 0,1,\ldots,p\right\} \cap S\right) +=S\setminus\left\{ 0,1,\ldots,p\right\} . 
+\] +Hence, $t\notin\left\{ 0,1,\ldots,p\right\} $, so that $t>p$ (since +$t\in\mathbb{N}$). Therefore, $t\geq p\geq m$ (since $m\leq p$), so that +$m\leq t$. This completes the proof of (\ref{pf.thm.ind.max3.m2}).] +\end{vershort} + +\begin{verlong} +[\textit{Proof of (\ref{pf.thm.ind.max3.m2}):} Let $t\in S$. We must prove +that $m\leq t$. + +If $t\in\left\{ 0,1,\ldots,p\right\} \cap S$, then this follows from +(\ref{pf.thm.ind.max3.m1}). Hence, for the rest of this proof, we can WLOG +assume that we don't have $t\in\left\{ 0,1,\ldots,p\right\} \cap S$. Assume +this. Thus, $t\notin\left\{ 0,1,\ldots,p\right\} \cap S$ (since we don't +have $t\in\left\{ 0,1,\ldots,p\right\} \cap S$). Combining $t\in S$ with +$t\notin\left\{ 0,1,\ldots,p\right\} \cap S$, we obtain +\begin{align*} +t & \in S\setminus\left( \left\{ 0,1,\ldots,p\right\} \cap S\right) +=\underbrace{S}_{\subseteq\mathbb{N}}\setminus\left\{ 0,1,\ldots,p\right\} \\ +& \subseteq\mathbb{N}\setminus\left\{ 0,1,\ldots,p\right\} =\left\{ +p+1,p+2,p+3,\ldots\right\} . +\end{align*} +Hence, $t\geq p+1\geq p\geq m$ (since $m\leq p$), so that $m\leq t$. This +completes the proof of (\ref{pf.thm.ind.max3.m2}).] +\end{verlong} + +Now, we know that $m$ is an element of $S$ (since $m\in S$) and satisfies +\newline$\left( m\leq t\text{ for each }t\in S\right) $ (by +(\ref{pf.thm.ind.max3.m2})). In other words, $m$ is an $s\in S$ that satisfies +\newline$\left( s\leq t\text{ for each }t\in S\right) $. In other words, $m$ +is a minimum of $S$ (by the definition of a minimum). Thus, $S$ has a minimum +(namely, $m$). This proves Theorem \ref{thm.ind.max3}. +\end{proof} + +\subsection{Increasing lists of finite sets} + +We shall next study (again using induction) another basic feature of finite sets. + +We recall that \textquotedblleft list\textquotedblright\ is just a synonym for +\textquotedblleft tuple\textquotedblright; i.e., a list is a $k$-tuple for +some $k\in\mathbb{N}$. 
Note that tuples and lists are always understood to be finite.
+
+\begin{definition}
+\label{def.ind.inclist0}Let $S$ be a set of integers. An \textit{increasing
+list} of $S$ shall mean a list $\left( s_{1},s_{2},\ldots,s_{k}\right) $ of
+elements of $S$ such that $S=\left\{ s_{1},s_{2},\ldots,s_{k}\right\} $ and
+$s_{1}<s_{2}<\cdots<s_{k}$.
+\end{definition}
+
+For example, $\left( 2,4,7\right) $ is an increasing list of the set
+$\left\{ 2,4,7\right\} $, whereas $\left( 4,2,7\right) $ and $\left(
+2,4\right) $ are not.
+
+The main result of this subsection is the following:
+
+\begin{theorem}
+\label{thm.ind.inclist.unex}Let $S$ be a finite set of integers. Then, $S$
+has exactly one increasing list.
+\end{theorem}
+
+Before we prove this theorem, let us show some simpler facts:
+
+\begin{proposition}
+\label{prop.ind.inclist.size}Let $S$ be a finite set of integers, and let
+$\left( s_{1},s_{2},\ldots,s_{k}\right) $ be an increasing list of $S$.
+Then, $k=\left\vert S\right\vert $.
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.inclist.size}.]The $k$ elements
+$s_{1},s_{2},\ldots,s_{k}$ are distinct (since $s_{1}<s_{2}<\cdots<s_{k}$).
+Hence, $\left\vert \left\{ s_{1},s_{2},\ldots,s_{k}\right\} \right\vert
+=k$. In view of $S=\left\{ s_{1},s_{2},\ldots,s_{k}\right\} $, this
+rewrites as $\left\vert S\right\vert =k$. This proves Proposition
+\ref{prop.ind.inclist.size}.
+\end{proof}
+
+\begin{proposition}
+\label{prop.ind.inclist.empty}The empty list $\left( {}\right) $ is the
+only increasing list of the empty set $\varnothing$.
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.inclist.empty}.]The empty list is an
+increasing list of $\varnothing$ (since a chain of $0$ entries contains no
+inequalities). Conversely, every increasing list of $\varnothing$ has length
+$\left\vert \varnothing\right\vert =0$ (by Proposition
+\ref{prop.ind.inclist.size}) and thus is the empty list. This proves
+Proposition \ref{prop.ind.inclist.empty}.
+\end{proof}
+
+\begin{proposition}
+\label{prop.ind.inclist.nonempty1}Let $S$ be a nonempty finite set of
+integers, and let $\left( s_{1},s_{2},\ldots,s_{k}\right) $ be an
+increasing list of $S$. Let $m$ be an element of $S$ such that each $t\in S$
+satisfies $t\leq m$. Then:
+
+\textbf{(a)} We have $k\geq1$ and $s_{k}=m$.
+
+\textbf{(b)} The list $\left( s_{1},s_{2},\ldots,s_{k-1}\right) $ is an
+increasing list of the set $S\setminus\left\{ m\right\} $.
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.inclist.nonempty1}.]\textbf{(a)} We have
+$S=\left\{ s_{1},s_{2},\ldots,s_{k}\right\} $ (since $\left( s_{1}%
+,s_{2},\ldots,s_{k}\right) $ is an increasing list of $S$). Hence, $k>0$
+(since $S$ is nonempty). Thus, $k\geq1$ (since $k$ is an integer). Therefore,
+$s_{k}$ is well-defined. Clearly, $k\in\left\{ 1,2,\ldots,k\right\} $
+(since $k\geq1$), so that $s_{k}\in\left\{ s_{1},s_{2},\ldots,s_{k}\right\}
+=S$.
+
+We have $s_{1}<s_{2}<\cdots<s_{k}$, so that each $i\in\left\{ 1,2,\ldots
+,k\right\} $ satisfies $s_{i}\leq s_{k}$. In other words, each element of
+$\left\{ s_{1},s_{2},\ldots,s_{k}\right\} =S$ is $\leq s_{k}$. Applying
+this to the element $m$ of $S$, we obtain $m\leq s_{k}$. On the other hand,
+$s_{k}\in S$, so that $s_{k}\leq m$ (by the assumption on $m$). Combining
+$s_{k}\leq m$ with $m\leq s_{k}$, we obtain $s_{k}=m$. This proves
+Proposition \ref{prop.ind.inclist.nonempty1} \textbf{(a)}.
+
+\textbf{(b)} Each $i\in\left\{ 1,2,\ldots,k-1\right\} $ satisfies
+$s_{i}<s_{k}=m$ and thus $s_{i}\neq m$, hence $s_{i}\in S\setminus\left\{
+m\right\} $. Moreover, the $k$ elements $s_{1},s_{2},\ldots,s_{k}$ are
+distinct (since $s_{1}<s_{2}<\cdots<s_{k}$); thus, from $s_{k}=m$, we obtain
+$S\setminus\left\{ m\right\} =\left\{ s_{1},s_{2},\ldots,s_{k}\right\}
+\setminus\left\{ s_{k}\right\} =\left\{ s_{1},s_{2},\ldots,s_{k-1}%
+\right\} $. Combining this with $s_{1}<s_{2}<\cdots<s_{k-1}$, we conclude
+that $\left( s_{1},s_{2},\ldots,s_{k-1}\right) $ is an increasing list of
+$S\setminus\left\{ m\right\} $. This proves Proposition
+\ref{prop.ind.inclist.nonempty1} \textbf{(b)}.
+\end{proof}
+
+We can now prove Theorem \ref{thm.ind.inclist.unex}:
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.inclist.unex}.]For each $g\in\mathbb{N}$, let
+$\mathcal{A}\left( g\right) $ be the statement%
+\[
+\left( \text{every finite set }S\text{ of integers satisfying }\left\vert
+S\right\vert =g\text{ has exactly one increasing list}\right) .
+\]
+We shall prove $\mathcal{A}\left( g\right) $ for each $g\in\mathbb{N}$ by
+induction on $g$:
+
+\textit{Induction base:} The only finite set $S$ of integers satisfying
+$\left\vert S\right\vert =0$ is the empty set $\varnothing$, and it has
+exactly one increasing list (by Proposition \ref{prop.ind.inclist.empty}).
+Hence, $\mathcal{A}\left( 0\right) $ holds. This completes the induction
+base.
+
+\textit{Induction step:} Let $g\in\mathbb{N}$. Assume that $\mathcal{A}%
+\left( g\right) $ holds; that is, assume that%
+\begin{equation}
+\left( \text{every finite set }S\text{ of integers satisfying }\left\vert
+S\right\vert =g\text{ has exactly one increasing list}\right) .
+\label{pf.thm.ind.inclist.unex.IH}%
+\end{equation}
+We must prove that $\mathcal{A}\left( g+1\right) $ holds, i.e., that every
+finite set $S$ of integers satisfying $\left\vert S\right\vert =g+1$ has
+exactly one increasing list.
+
+So let $S$ be a finite set of integers satisfying $\left\vert S\right\vert
+=g+1$. The set $S$ is nonempty (since $\left\vert S\right\vert
+=g+1>g\geq0$). Thus,
+$S$ has a maximum (by Theorem \ref{thm.ind.max}). Hence, $\max S$ is
+well-defined. Set $m=\max S$. Thus, $m=\max S\in S$ (by
+(\ref{eq.ind.max.def-max.1})). Therefore, $\left\vert S\setminus\left\{
+m\right\} \right\vert =\left\vert S\right\vert -1=g$ (since $\left\vert
+S\right\vert =g+1$). Hence, (\ref{pf.thm.ind.inclist.unex.IH}) (applied to
+$S\setminus\left\{ m\right\} $ instead of $S$) shows that $S\setminus
+\left\{ m\right\} $ has exactly one increasing list. Let $\left(
+t_{1},t_{2},\ldots,t_{j}\right) $ be this list. We extend this list to a
+$\left( j+1\right) $-tuple $\left( t_{1},t_{2},\ldots,t_{j+1}\right) $ by
+setting $t_{j+1}=m$.
+
+We have defined $\left( t_{1},t_{2},\ldots,t_{j}\right) $ as an increasing
+list of the set $S\setminus\left\{ m\right\} $. In other words, $\left(
+t_{1},t_{2},\ldots,t_{j}\right) $ is a list of elements of $S\setminus
+\left\{ m\right\} $ such that $S\setminus\left\{ m\right\} =\left\{
+t_{1},t_{2},\ldots,t_{j}\right\} $ and $t_{1}<t_{2}<\cdots<t_{j}$.
+
+We claim that $t_{1}<t_{2}<\cdots<t_{j+1}$. If $j=0$, then this is obvious
+(since a chain of $j+1=1$ entries contains no inequalities). Hence, for the
+proof of this claim, we WLOG assume that we don't have $j=0$. Thus,
+$j\neq0$, so that $j>0$ and thus $j\geq1$ (since $j$ is an
+integer). Hence, $t_{j}$ is well-defined. We have $j\in\left\{ 1,2,\ldots
+,j\right\} $ (since $j\geq1$) and thus $t_{j}\in\left\{ t_{1},t_{2}%
+,\ldots,t_{j}\right\} =S\setminus\left\{ m\right\} \subseteq S$. Hence,
+(\ref{eq.ind.max.def-max.2}) (applied to $t=t_{j}$) yields $\max S\geq t_{j}$.
+Hence, $t_{j}\leq\max S=m$. Moreover, $t_{j}\notin\left\{ m\right\} $ (since
+$t_{j}\in S\setminus\left\{ m\right\} $); in other words, $t_{j}\neq m$.
+Combining this with $t_{j}\leq m$, we obtain $t_{j}<m=t_{j+1}$. Combining the
+chain $t_{1}<t_{2}<\cdots<t_{j}$ with this inequality, we obtain $t_{1}%
+<t_{2}<\cdots<t_{j+1}$. This proves our claim.
+
+Furthermore, combining $S\setminus\left\{ m\right\} =\left\{ t_{1}%
+,t_{2},\ldots,t_{j}\right\} $ with $m\in S$, we obtain%
+\[
+S=\left( S\setminus\left\{ m\right\} \right) \cup\left\{ m\right\}
+=\left\{ t_{1},t_{2},\ldots,t_{j}\right\} \cup\left\{ t_{j+1}\right\}
+=\left\{ t_{1},t_{2},\ldots,t_{j+1}\right\}
+\]
+(since $t_{j+1}=m$). Hence, $\left( t_{1},t_{2},\ldots,t_{j+1}\right) $ is
+an increasing list of $S$. Thus, $S$ has at least one increasing list.
+
+It remains to show that $S$ has at most one increasing list. Indeed, let
+$\left( s_{1},s_{2},\ldots,s_{k}\right) $ be any increasing list of $S$.
+Each $t\in S$ satisfies $t\leq\max S=m$ (by (\ref{eq.ind.max.def-max.2})).
+Hence, Proposition \ref{prop.ind.inclist.nonempty1} \textbf{(a)} yields
+$k\geq1$ and $s_{k}=m$, whereas Proposition \ref{prop.ind.inclist.nonempty1}
+\textbf{(b)} shows that $\left( s_{1},s_{2},\ldots,s_{k-1}\right) $ is an
+increasing list of $S\setminus\left\{ m\right\} $. But $S\setminus\left\{
+m\right\} $ has exactly one increasing list, namely $\left( t_{1}%
+,t_{2},\ldots,t_{j}\right) $; thus, $\left( s_{1},s_{2},\ldots
+,s_{k-1}\right) =\left( t_{1},t_{2},\ldots,t_{j}\right) $. Combining this
+with $s_{k}=m=t_{j+1}$, we obtain $\left( s_{1},s_{2},\ldots,s_{k}\right)
+=\left( t_{1},t_{2},\ldots,t_{j+1}\right) $. Hence, every increasing list
+of $S$ equals $\left( t_{1},t_{2},\ldots,t_{j+1}\right) $; therefore, $S$
+has at most one increasing list.
+
+We thus have shown that $S$ has exactly one increasing list. In other words,
+$\mathcal{A}\left( g+1\right) $ holds. This completes the induction step.
+Hence, $\mathcal{A}\left( g\right) $ is proven for each $g\in\mathbb{N}$ by
+induction; this proves Theorem \ref{thm.ind.inclist.unex}.
+\end{proof}
+
+\begin{definition}
+\label{def.ind.inclist}Let $S$ be a finite set of integers. Theorem
+\ref{thm.ind.inclist.unex} shows that $S$ has exactly one increasing list. We
+shall call this list \textit{the increasing list} of $S$. Roughly speaking,
+it is the list of all elements of $S$ in increasing order, beginning with the
+smallest element of $S$ and proceeding from lowest to highest.
+\end{definition}
+
+\begin{remark}
+\textbf{(a)} The notion of an increasing list, as well as all the results
+above, applies just as well to finite sets of rational numbers or of real
+numbers (instead of finite sets of integers); the proofs carry over
+unchanged.
+
+\textbf{(b)} If we replace the \textquotedblleft$<$\textquotedblright\ signs
+in Definition \ref{def.ind.inclist0} by \textquotedblleft$>$%
+\textquotedblright\ signs, then we obtain the notion of a
+\textit{decreasing list} of $S$. There are straightforward analogues of
+Theorem \ref{thm.ind.inclist.unex}, Proposition \ref{prop.ind.inclist.size},
+Proposition \ref{prop.ind.inclist.empty} and Proposition
+\ref{prop.ind.inclist.nonempty1} for decreasing lists (where, of course, the
+analogue of Proposition \ref{prop.ind.inclist.nonempty1} uses $\min S$ instead
+of $\max S$). Thus, we can state an analogue of Definition
+\ref{def.ind.inclist} as well. In this analogue, the word \textquotedblleft
+increasing\textquotedblright\ is replaced by \textquotedblleft
+decreasing\textquotedblright\ everywhere, the word \textquotedblleft
+smallest\textquotedblright\ is replaced by \textquotedblleft
+largest\textquotedblright, and the word \textquotedblleft
+lowest\textquotedblright\ is replaced by \textquotedblleft
+highest\textquotedblright.
+
+\textbf{(c)} That said, the decreasing list and the increasing list are
+closely related: If $S$ is a finite set of integers (or rational numbers, or
+real numbers), and if $\left( s_{1},s_{2},\ldots,s_{k}\right) $ is the
+increasing list of $S$, then $\left( s_{k},s_{k-1},\ldots,s_{1}\right) $ is
+the decreasing list of $S$. (The proof is very simple.)
+
+\textbf{(d)} Let $S$ be a nonempty finite set of integers (or rational
+numbers, or real numbers), and let $\left( s_{1},s_{2},\ldots,s_{k}\right) $
+be the increasing list of $S$. Proposition \ref{prop.ind.inclist.nonempty1}
+\textbf{(a)} (applied to $m=\max S$) shows that $k\geq1$ and $s_{k}=\max S$. A
+similar argument can be used to show that $s_{1}=\min S$. Thus, the increasing
+list of $S$ begins with the smallest element of $S$ and ends with the largest
+element of $S$ (as one would expect).
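+
+For instance, if $S=\left\{ 2,5,6\right\} $, then the increasing list of $S$
+is $\left( 2,5,6\right) $, while the decreasing list of $S$ is $\left(
+6,5,2\right) $; the former begins with $\min S=2$ and ends with $\max S=6$.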
+\end{remark} + +\subsection{Induction with shifted base} + +\subsubsection{Induction starting at $g$} + +All the induction proofs we have done so far were applications of Theorem +\ref{thm.ind.IP0} (even though we have often written them up in ways that hide +the exact statements $\mathcal{A}\left( n\right) $ to which Theorem +\ref{thm.ind.IP0} is being applied). We are soon going to see several other +\textquotedblleft induction principles\textquotedblright\ which can also be +used to make proofs. Unlike Theorem \ref{thm.ind.IP0}, these other principles +need not be taken on trust; instead, they can themselves be proven using +Theorem \ref{thm.ind.IP0}. Thus, they merely offer convenience, not new +logical opportunities. + +Our first such \textquotedblleft alternative induction +principle\textquotedblright\ is Theorem \ref{thm.ind.IPg} below. First, we +introduce a simple notation: + +\begin{definition} +Let $g\in\mathbb{Z}$. Then, $\mathbb{Z}_{\geq g}$ denotes the set $\left\{ +g,g+1,g+2,\ldots\right\} $; this is the set of all integers that are $\geq g$. +\end{definition} + +For example, $\mathbb{Z}_{\geq0}=\left\{ 0,1,2,\ldots\right\} =\mathbb{N}$ +is the set of all nonnegative integers, whereas $\mathbb{Z}_{\geq1}=\left\{ +1,2,3,\ldots\right\} $ is the set of all positive integers. + +Now, we state our first \textquotedblleft alternative induction +principle\textquotedblright: + +\begin{theorem} +\label{thm.ind.IPg}Let $g\in\mathbb{Z}$. For each $n\in\mathbb{Z}_{\geq g}$, +let $\mathcal{A}\left( n\right) $ be a logical statement. + +Assume the following: + +\begin{statement} +\textit{Assumption 1:} The statement $\mathcal{A}\left( g\right) $ holds. +\end{statement} + +\begin{statement} +\textit{Assumption 2:} If $m\in\mathbb{Z}_{\geq g}$ is such that +$\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( m+1\right) $ +also holds. +\end{statement} + +Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{Z}_{\geq g}$. 
+\end{theorem} + +Again, Theorem \ref{thm.ind.IPg} is intuitively clear: For example, if you +have $g=4$, and you want to prove (under the assumptions of Theorem +\ref{thm.ind.IPg}) that $\mathcal{A}\left( 8\right) $ holds, you can argue +as follows: + +\begin{itemize} +\item By Assumption 1, the statement $\mathcal{A}\left( 4\right) $ holds. + +\item Thus, by Assumption 2 (applied to $m=4$), the statement $\mathcal{A}% +\left( 5\right) $ holds. + +\item Thus, by Assumption 2 (applied to $m=5$), the statement $\mathcal{A}% +\left( 6\right) $ holds. + +\item Thus, by Assumption 2 (applied to $m=6$), the statement $\mathcal{A}% +\left( 7\right) $ holds. + +\item Thus, by Assumption 2 (applied to $m=7$), the statement $\mathcal{A}% +\left( 8\right) $ holds. +\end{itemize} + +A similar (but longer) argument shows that the statement $\mathcal{A}\left( +9\right) $ holds; likewise, $\mathcal{A}\left( n\right) $ can be shown to +hold for each $n\in\mathbb{Z}_{\geq g}$ by means of an argument that takes +$n-g+1$ steps. + +Theorem \ref{thm.ind.IPg} generalizes Theorem \ref{thm.ind.IP0}. Indeed, +Theorem \ref{thm.ind.IP0} is the particular case of Theorem \ref{thm.ind.IPg} +for $g=0$ (since $\mathbb{Z}_{\geq0}=\mathbb{N}$). However, Theorem +\ref{thm.ind.IPg} can also be derived from Theorem \ref{thm.ind.IP0}. In order +to do this, we essentially need to \textquotedblleft shift\textquotedblright% +\ the index $n$ in Theorem \ref{thm.ind.IPg} down by $g$ -- that is, we need +to rename our sequence $\left( \mathcal{A}\left( g\right) ,\mathcal{A}% +\left( g+1\right) ,\mathcal{A}\left( g+2\right) ,\ldots\right) $ of +statements as $\left( \mathcal{B}\left( 0\right) ,\mathcal{B}\left( +1\right) ,\mathcal{B}\left( 2\right) ,\ldots\right) $, and apply Theorem +\ref{thm.ind.IP0} to $\mathcal{B}\left( n\right) $ instead of $\mathcal{A}% +\left( n\right) $. 
In order to make this renaming procedure rigorous, let us +first restate Theorem \ref{thm.ind.IP0} as follows: + +\begin{corollary} +\label{cor.ind.IP0.renamed}For each $n\in\mathbb{N}$, let $\mathcal{B}\left( +n\right) $ be a logical statement. + +Assume the following: + +\begin{statement} +\textit{Assumption A:} The statement $\mathcal{B}\left( 0\right) $ holds. +\end{statement} + +\begin{statement} +\textit{Assumption B:} If $p\in\mathbb{N}$ is such that $\mathcal{B}\left( +p\right) $ holds, then $\mathcal{B}\left( p+1\right) $ also holds. +\end{statement} + +Then, $\mathcal{B}\left( n\right) $ holds for each $n\in\mathbb{N}$. +\end{corollary} + +\begin{proof} +[Proof of Corollary \ref{cor.ind.IP0.renamed}.]Corollary +\ref{cor.ind.IP0.renamed} is exactly Theorem \ref{thm.ind.IP0}, except that +some names have been changed: + +\begin{itemize} +\item The statements $\mathcal{A}\left( n\right) $ have been renamed as +$\mathcal{B}\left( n\right) $. + +\item Assumption 1 and Assumption 2 have been renamed as Assumption A and +Assumption B. + +\item The variable $m$ in Assumption B has been renamed as $p$. +\end{itemize} + +Thus, Corollary \ref{cor.ind.IP0.renamed} holds (since Theorem +\ref{thm.ind.IP0} holds). +\end{proof} + +Let us now derive Theorem \ref{thm.ind.IPg} from Theorem \ref{thm.ind.IP0}: + +\begin{proof} +[Proof of Theorem \ref{thm.ind.IPg}.]For any $n\in\mathbb{N}$, we have +$n+g\in\mathbb{Z}_{\geq g}$\ \ \ \ \footnote{\textit{Proof.} Let +$n\in\mathbb{N}$. Thus, $n\geq0$, so that $\underbrace{n}_{\geq0}+g\geq0+g=g$. +Hence, $n+g$ is an integer $\geq g$. In other words, $n+g\in\mathbb{Z}_{\geq +g}$ (since $\mathbb{Z}_{\geq g}$ is the set of all integers that are $\geq +g$). Qed.}. Hence, for each $n\in\mathbb{N}$, we can define a logical +statement $\mathcal{B}\left( n\right) $ by% +\[ +\mathcal{B}\left( n\right) =\mathcal{A}\left( n+g\right) . +\] +Consider this $\mathcal{B}\left( n\right) $. 
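+
+(For instance, $\mathcal{B}\left( 0\right) =\mathcal{A}\left( 0+g\right)
+=\mathcal{A}\left( g\right) $ and $\mathcal{B}\left( 1\right)
+=\mathcal{A}\left( 1+g\right) =\mathcal{A}\left( g+1\right) $; thus, the
+statements $\mathcal{B}\left( 0\right) ,\mathcal{B}\left( 1\right)
+,\mathcal{B}\left( 2\right) ,\ldots$ are precisely the statements
+$\mathcal{A}\left( g\right) ,\mathcal{A}\left( g+1\right) ,\mathcal{A}%
+\left( g+2\right) ,\ldots$ under new names.)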
+ +Now, let us consider the Assumptions A and B from Corollary +\ref{cor.ind.IP0.renamed}. We claim that both of these assumptions are satisfied. + +Indeed, the statement $\mathcal{A}\left( g\right) $ holds (by Assumption 1). +But the definition of the statement $\mathcal{B}\left( 0\right) $ shows that +$\mathcal{B}\left( 0\right) =\mathcal{A}\left( 0+g\right) =\mathcal{A}% +\left( g\right) $. Hence, the statement $\mathcal{B}\left( 0\right) $ +holds (since the statement $\mathcal{A}\left( g\right) $ holds). In other +words, Assumption A is satisfied. + +Now, we shall show that Assumption B is satisfied. Indeed, let $p\in +\mathbb{N}$ be such that $\mathcal{B}\left( p\right) $ holds. The definition +of the statement $\mathcal{B}\left( p\right) $ shows that $\mathcal{B}% +\left( p\right) =\mathcal{A}\left( p+g\right) $. Hence, the statement +$\mathcal{A}\left( p+g\right) $ holds (since $\mathcal{B}\left( p\right) $ holds). + +Also, $p\in\mathbb{N}$, so that $p\geq0$ and thus $p+g\geq g$. In other words, +$p+g\in\mathbb{Z}_{\geq g}$ (since $\mathbb{Z}_{\geq g}$ is the set of all +integers that are $\geq g$). + +Recall that Assumption 2 holds. In other words, if $m\in\mathbb{Z}_{\geq g}$ +is such that $\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( +m+1\right) $ also holds. Applying this to $m=p+g$, we conclude that +$\mathcal{A}\left( \left( p+g\right) +1\right) $ holds (since +$\mathcal{A}\left( p+g\right) $ holds). + +But the definition of $\mathcal{B}\left( p+1\right) $ yields $\mathcal{B}% +\left( p+1\right) =\mathcal{A}\left( \underbrace{p+1+g}_{=\left( +p+g\right) +1}\right) =\mathcal{A}\left( \left( p+g\right) +1\right) $. +Hence, the statement $\mathcal{B}\left( p+1\right) $ holds (since the +statement $\mathcal{A}\left( \left( p+g\right) +1\right) $ holds). + +Now, forget that we fixed $p$. We thus have shown that if $p\in\mathbb{N}$ is +such that $\mathcal{B}\left( p\right) $ holds, then $\mathcal{B}\left( +p+1\right) $ also holds. 
In other words, Assumption B is satisfied. + +We now know that both Assumption A and Assumption B are satisfied. Hence, +Corollary \ref{cor.ind.IP0.renamed} shows that +\begin{equation} +\mathcal{B}\left( n\right) \text{ holds for each }n\in\mathbb{N}. +\label{pf.thm.ind.IPg.at}% +\end{equation} + + +Now, let $n\in\mathbb{Z}_{\geq g}$. Thus, $n$ is an integer such that $n\geq +g$ (by the definition of $\mathbb{Z}_{\geq g}$). Hence, $n-g\geq0$, so that +$n-g\in\mathbb{N}$. Thus, (\ref{pf.thm.ind.IPg.at}) (applied to $n-g$ instead +of $n$) yields that $\mathcal{B}\left( n-g\right) $ holds. But the +definition of $\mathcal{B}\left( n-g\right) $ yields $\mathcal{B}\left( +n-g\right) =\mathcal{A}\left( \underbrace{\left( n-g\right) +g}% +_{=n}\right) =\mathcal{A}\left( n\right) $. Hence, the statement +$\mathcal{A}\left( n\right) $ holds (since $\mathcal{B}\left( n-g\right) $ holds). + +Now, forget that we fixed $n$. We thus have shown that $\mathcal{A}\left( +n\right) $ holds for each $n\in\mathbb{Z}_{\geq g}$. This proves Theorem +\ref{thm.ind.IPg}. +\end{proof} + +Theorem \ref{thm.ind.IPg} is called the \textit{principle of induction +starting at }$g$, and proofs that use it are usually called \textit{proofs by +induction} or \textit{induction proofs}. As with the standard induction +principle (Theorem \ref{thm.ind.IP0}), we don't usually explicitly cite +Theorem \ref{thm.ind.IPg}, but instead say certain words that signal that it +is being applied and that (ideally) also indicate what integer $g$ and what +statements $\mathcal{A}\left( n\right) $ it is being applied to\footnote{We +will explain this in Convention \ref{conv.ind.IPglang} below.}. However, for +our very first example of the use of Theorem \ref{thm.ind.IPg}, we are going +to reference it explicitly: + +\begin{proposition} +\label{prop.mod.binom01}Let $a$ and $b$ be integers. 
Then, every positive +integer $n$ satisfies% +\begin{equation} +\left( a+b\right) ^{n}\equiv a^{n}+na^{n-1}b\operatorname{mod}b^{2}. +\label{eq.prop.mod.binom01.claim}% +\end{equation} + +\end{proposition} + +Note that we have chosen not to allow $n=0$ in Proposition +\ref{prop.mod.binom01}, because it is not clear what \textquotedblleft% +$a^{n-1}$\textquotedblright\ would mean when $n=0$ and $a=0$. (Recall that +$0^{0-1}=0^{-1}$ is not defined!) In truth, it is easy to convince oneself +that this is not a serious hindrance, since the expression \textquotedblleft% +$na^{n-1}$\textquotedblright\ has a meaningful interpretation even when its +sub-expression \textquotedblleft$a^{n-1}$\textquotedblright\ does not (one +just has to interpret it as $0$ when $n=0$, without regard to whether +\textquotedblleft$a^{n-1}$\textquotedblright\ is well-defined). Nevertheless, +we prefer to rule out the case of $n=0$ by requiring $n$ to be positive, in +order to avoid having to discuss such questions of interpretation. (Of course, +this also gives us an excuse to apply Theorem \ref{thm.ind.IPg} instead of the +old Theorem \ref{thm.ind.IP0}.) + +\begin{proof} +[Proof of Proposition \ref{prop.mod.binom01}.]For each $n\in\mathbb{Z}_{\geq +1}$, we let $\mathcal{A}\left( n\right) $ be the statement% +\[ +\left( \left( a+b\right) ^{n}\equiv a^{n}+na^{n-1}b\operatorname{mod}% +b^{2}\right) . +\] +Our next goal is to prove the statement $\mathcal{A}\left( n\right) $ for +each $n\in\mathbb{Z}_{\geq1}$. + +We first notice that the statement $\mathcal{A}\left( 1\right) $ +holds\footnote{\textit{Proof.} We have $\left( a+b\right) ^{1}=a+b$. +Comparing this with $\underbrace{a^{1}}_{=a}+1\underbrace{a^{1-1}}_{=a^{0}% +=1}b=a+b$, we obtain $\left( a+b\right) ^{1}=a^{1}+1a^{1-1}b$. Hence, +$\left( a+b\right) ^{1}\equiv a^{1}+1a^{1-1}b\operatorname{mod}b^{2}$. 
But +this is precisely the statement $\mathcal{A}\left( 1\right) $ (since +$\mathcal{A}\left( 1\right) $ is defined to be the statement $\left( +\left( a+b\right) ^{1}\equiv a^{1}+1a^{1-1}b\operatorname{mod}b^{2}\right) +$). Hence, the statement $\mathcal{A}\left( 1\right) $ holds.}. + +Now, we claim that +\begin{equation} +\text{if }m\in\mathbb{Z}_{\geq1}\text{ is such that }\mathcal{A}\left( +m\right) \text{ holds, then }\mathcal{A}\left( m+1\right) \text{ also +holds.} \label{pf.prop.mod.binom01.step}% +\end{equation} + + +[\textit{Proof of (\ref{pf.prop.mod.binom01.step}):} Let $m\in\mathbb{Z}% +_{\geq1}$ be such that $\mathcal{A}\left( m\right) $ holds. We must show +that $\mathcal{A}\left( m+1\right) $ also holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds. In other words, +$\left( a+b\right) ^{m}\equiv a^{m}+ma^{m-1}b\operatorname{mod}b^{2}$ +holds\footnote{because $\mathcal{A}\left( m\right) $ is defined to be the +statement $\left( \left( a+b\right) ^{m}\equiv a^{m}+ma^{m-1}% +b\operatorname{mod}b^{2}\right) $}. Now,% +\begin{align*} +\left( a+b\right) ^{m+1} & =\underbrace{\left( a+b\right) ^{m}}_{\equiv +a^{m}+ma^{m-1}b\operatorname{mod}b^{2}}\left( a+b\right) \\ +& \equiv\left( a^{m}+ma^{m-1}b\right) \left( a+b\right) \\ +& =\underbrace{a^{m}a}_{=a^{m+1}}+a^{m}b+m\underbrace{a^{m-1}ba}% +_{\substack{=a^{m-1}ab=a^{m}b\\\text{(since }a^{m-1}a=a^{m}\text{)}% +}}+\underbrace{ma^{m-1}bb}_{\substack{=ma^{m-1}b^{2}\equiv0\operatorname{mod}% +b^{2}\\\text{(since }b^{2}\mid ma^{m-1}b^{2}\text{)}}}\\ +& \equiv a^{m+1}+\underbrace{a^{m}b+ma^{m}b}_{=\left( m+1\right) a^{m}% +b}+0\\ +& =a^{m+1}+\left( m+1\right) \underbrace{a^{m}}_{\substack{=a^{\left( +m+1\right) -1}\\\text{(since }m=\left( m+1\right) -1\text{)}}% +}b=a^{m+1}+\left( m+1\right) a^{\left( m+1\right) -1}b\operatorname{mod}% +b^{2}. +\end{align*} + + +So we have shown that $\left( a+b\right) ^{m+1}\equiv a^{m+1}+\left( +m+1\right) a^{\left( m+1\right) -1}b\operatorname{mod}b^{2}$. 
But this is +precisely the statement $\mathcal{A}\left( m+1\right) $% +\ \ \ \ \footnote{because $\mathcal{A}\left( m+1\right) $ is defined to be +the statement $\left( \left( a+b\right) ^{m+1}\equiv a^{m+1}+\left( +m+1\right) a^{\left( m+1\right) -1}b\operatorname{mod}b^{2}\right) $}. +Thus, the statement $\mathcal{A}\left( m+1\right) $ holds. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\mathbb{Z}% +_{\geq1}$ is such that $\mathcal{A}\left( m\right) $ holds, then +$\mathcal{A}\left( m+1\right) $ also holds. This proves +(\ref{pf.prop.mod.binom01.step}).] + +Now, both assumptions of Theorem \ref{thm.ind.IPg} (applied to $g=1$) are +satisfied (indeed, Assumption 1 is satisfied because the statement +$\mathcal{A}\left( 1\right) $ holds, whereas Assumption 2 is satisfied +because of (\ref{pf.prop.mod.binom01.step})). Thus, Theorem \ref{thm.ind.IPg} +(applied to $g=1$) shows that $\mathcal{A}\left( n\right) $ holds for each +$n\in\mathbb{Z}_{\geq1}$. In other words, $\left( a+b\right) ^{n}\equiv +a^{n}+na^{n-1}b\operatorname{mod}b^{2}$ holds for each $n\in\mathbb{Z}_{\geq +1}$ (since $\mathcal{A}\left( n\right) $ is the statement $\left( \left( +a+b\right) ^{n}\equiv a^{n}+na^{n-1}b\operatorname{mod}b^{2}\right) $). In +other words, $\left( a+b\right) ^{n}\equiv a^{n}+na^{n-1}b\operatorname{mod}% +b^{2}$ holds for each positive integer $n$ (because the positive integers are +exactly the $n\in\mathbb{Z}_{\geq1}$). This proves Proposition +\ref{prop.mod.binom01}. +\end{proof} + +\subsubsection{Conventions for writing proofs by induction starting at $g$} + +Now, let us introduce some standard language that is commonly used in proofs +by induction starting at $g$: + +\begin{convention} +\label{conv.ind.IPglang}Let $g\in\mathbb{Z}$. For each $n\in\mathbb{Z}_{\geq +g}$, let $\mathcal{A}\left( n\right) $ be a logical statement. Assume that +you want to prove that $\mathcal{A}\left( n\right) $ holds for each +$n\in\mathbb{Z}_{\geq g}$. 
+ +Theorem \ref{thm.ind.IPg} offers the following strategy for proving this: +First show that Assumption 1 of Theorem \ref{thm.ind.IPg} is satisfied; then, +show that Assumption 2 of Theorem \ref{thm.ind.IPg} is satisfied; then, +Theorem \ref{thm.ind.IPg} automatically completes your proof. + +A proof that follows this strategy is called a \textit{proof by induction on +}$n$ (or \textit{proof by induction over }$n$) \textit{starting at }$g$ or +(less precisely) an \textit{inductive proof}. Most of the time, the words +\textquotedblleft starting at $g$\textquotedblright\ are omitted, since they +merely repeat what is clear from the context anyway: For example, if you make +a claim about all integers $n\geq3$, and you say that you are proving it by +induction on $n$, then it is clear that you are using induction on $n$ +starting at $3$. (And if this isn't clear from the claim, then the induction +base will make it clear.) + +The proof that Assumption 1 is satisfied is called the \textit{induction base} +(or \textit{base case}) of the proof. The proof that Assumption 2 is satisfied +is called the \textit{induction step} of the proof. + +In order to prove that Assumption 2 is satisfied, you will usually want to fix +an $m\in\mathbb{Z}_{\geq g}$ such that $\mathcal{A}\left( m\right) $ holds, +and then prove that $\mathcal{A}\left( m+1\right) $ holds. In other words, +you will usually want to fix $m\in\mathbb{Z}_{\geq g}$, assume that +$\mathcal{A}\left( m\right) $ holds, and then prove that $\mathcal{A}\left( +m+1\right) $ holds. When doing so, it is common to refer to the assumption +that $\mathcal{A}\left( m\right) $ holds as the \textit{induction +hypothesis} (or \textit{induction assumption}). +\end{convention} + +Unsurprisingly, this language parallels the language introduced in Convention +\ref{conv.ind.IP0lang} for proofs by \textquotedblleft +standard\textquotedblright\ induction. 
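+
+For example, here is how this language would be used on a simple claim: To
+prove that every integer $n\geq4$ satisfies $2^{n}\geq n^{2}$, we would
+proceed by induction on $n$ starting at $4$. The \textit{induction base}
+checks the claim for $n=4$ (indeed, $2^{4}=16\geq16=4^{2}$). The
+\textit{induction step} fixes an integer $m\geq4$ for which $2^{m}\geq m^{2}$
+holds (this assumption is the \textit{induction hypothesis}), and derives
+$2^{m+1}=2\cdot2^{m}\geq2m^{2}\geq\left( m+1\right) ^{2}$ from it (the last
+inequality here follows from $2m^{2}-\left( m+1\right) ^{2}=\left(
+m-1\right) ^{2}-2\geq3^{2}-2>0$, which relies on $m-1\geq3$).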
+ +Again, we can shorten our inductive proofs by omitting some sentences that +convey no information. In particular, we can leave out the explicit definition +of the statement $\mathcal{A}\left( n\right) $ when this statement is +precisely the claim that we are proving (without the \textquotedblleft for +each $n\in\mathbb{N}$\textquotedblright\ part). Thus, we can rewrite our above +proof of Proposition \ref{prop.mod.binom01} as follows: + +\begin{proof} +[Proof of Proposition \ref{prop.mod.binom01} (second version).]We must prove +(\ref{eq.prop.mod.binom01.claim}) for every positive integer $n$. In other +words, we must prove (\ref{eq.prop.mod.binom01.claim}) for every +$n\in\mathbb{Z}_{\geq1}$ (since the positive integers are precisely the +$n\in\mathbb{Z}_{\geq1}$). We shall prove this by induction on $n$ starting at +$1$: + +\textit{Induction base:} We have $\left( a+b\right) ^{1}=a+b$. Comparing +this with $\underbrace{a^{1}}_{=a}+1\underbrace{a^{1-1}}_{=a^{0}=1}b=a+b$, we +obtain $\left( a+b\right) ^{1}=a^{1}+1a^{1-1}b$. Hence, $\left( a+b\right) +^{1}\equiv a^{1}+1a^{1-1}b\operatorname{mod}b^{2}$. In other words, +(\ref{eq.prop.mod.binom01.claim}) holds for $n=1$. This completes the +induction base. + +\textit{Induction step:} Let $m\in\mathbb{Z}_{\geq1}$. Assume that +(\ref{eq.prop.mod.binom01.claim}) holds for $n=m$. We must show that +(\ref{eq.prop.mod.binom01.claim}) also holds for $n=m+1$. + +We have assumed that (\ref{eq.prop.mod.binom01.claim}) holds for $n=m$. In +other words, $\left( a+b\right) ^{m}\equiv a^{m}+ma^{m-1}b\operatorname{mod}% +b^{2}$ holds. 
Now,% +\begin{align*} +\left( a+b\right) ^{m+1} & =\underbrace{\left( a+b\right) ^{m}}_{\equiv +a^{m}+ma^{m-1}b\operatorname{mod}b^{2}}\left( a+b\right) \\ +& \equiv\left( a^{m}+ma^{m-1}b\right) \left( a+b\right) \\ +& =\underbrace{a^{m}a}_{=a^{m+1}}+a^{m}b+m\underbrace{a^{m-1}ba}% +_{\substack{=a^{m-1}ab=a^{m}b\\\text{(since }a^{m-1}a=a^{m}\text{)}% +}}+\underbrace{ma^{m-1}bb}_{\substack{=ma^{m-1}b^{2}\equiv0\operatorname{mod}% +b^{2}\\\text{(since }b^{2}\mid ma^{m-1}b^{2}\text{)}}}\\ +& \equiv a^{m+1}+\underbrace{a^{m}b+ma^{m}b}_{=\left( m+1\right) a^{m}% +b}+0\\ +& =a^{m+1}+\left( m+1\right) \underbrace{a^{m}}_{\substack{=a^{\left( +m+1\right) -1}\\\text{(since }m=\left( m+1\right) -1\text{)}}% +}b=a^{m+1}+\left( m+1\right) a^{\left( m+1\right) -1}b\operatorname{mod}% +b^{2}. +\end{align*} + + +So we have shown that $\left( a+b\right) ^{m+1}\equiv a^{m+1}+\left( +m+1\right) a^{\left( m+1\right) -1}b\operatorname{mod}b^{2}$. In other +words, (\ref{eq.prop.mod.binom01.claim}) holds for $n=m+1$. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\mathbb{Z}% +_{\geq1}$ is such that (\ref{eq.prop.mod.binom01.claim}) holds for $n=m$, then +(\ref{eq.prop.mod.binom01.claim}) also holds for $n=m+1$. This completes the +induction step. Hence, (\ref{eq.prop.mod.binom01.claim}) is proven by +induction. This proves Proposition \ref{prop.mod.binom01}. +\end{proof} + +Proposition \ref{prop.mod.binom01} can also be seen as a consequence of the +binomial formula (Proposition \ref{prop.binom.binomial} further below). + +\subsubsection{More properties of congruences} + +Let us use this occasion to show two corollaries of Proposition +\ref{prop.mod.binom01}: + +\begin{corollary} +\label{cor.mod.lte1}Let $a$, $b$ and $n$ be three integers such that $a\equiv +b\operatorname{mod}n$. Let $d\in\mathbb{N}$ be such that $d\mid n$. Then, +$a^{d}\equiv b^{d}\operatorname{mod}nd$. 
+\end{corollary} + +\begin{proof} +[Proof of Corollary \ref{cor.mod.lte1}.]We have $a\equiv b\operatorname{mod}% +n$. In other words, $a$ is congruent to $b$ modulo $n$. In other words, $n\mid +a-b$ (by the definition of \textquotedblleft congruent\textquotedblright). In +other words, there exists an integer $w$ such that $a-b=nw$. Consider this +$w$. From $a-b=nw$, we obtain $a=b+nw$. Also, $d\mid n$, thus $dn\mid nn$ (by +Proposition \ref{prop.div.acbc}, applied to $d$, $n$ and $n$ instead of $a$, +$b$ and $c$). On the other hand, $nn\mid\left( nw\right) ^{2}$ (since +$\left( nw\right) ^{2}=nwnw=nnww$). Hence, Proposition \ref{prop.div.trans} +(applied to $dn$, $nn$ and $\left( nw\right) ^{2}$ instead of $a$, $b$ and +$c$) yields $dn\mid\left( nw\right) ^{2}$ (since $dn\mid nn$ and +$nn\mid\left( nw\right) ^{2}$). In other words, $nd\mid\left( nw\right) +^{2}$ (since $dn=nd$). + +Next, we claim that% +\begin{equation} +nd\mid a^{d}-b^{d}. \label{pf.cor.mod.lte1.1}% +\end{equation} + + +[\textit{Proof of (\ref{pf.cor.mod.lte1.1}):} If $d=0$, then +(\ref{pf.cor.mod.lte1.1}) holds (because if $d=0$, then $a^{d}-b^{d}% +=\underbrace{a^{0}}_{=1}-\underbrace{b^{0}}_{=1}=1-1=0=0nd$, and thus $nd\mid +a^{d}-b^{d}$). Hence, for the rest of this proof of (\ref{pf.cor.mod.lte1.1}), +we WLOG assume that we don't have $d=0$. Thus, $d\neq0$. Hence, $d$ is a +positive integer (since $d\in\mathbb{N}$). Thus, Proposition +\ref{prop.mod.binom01} (applied to $d$, $b$ and $nw$ instead of $n$, $a$ and +$b$) yields +\[ +\left( b+nw\right) ^{d}\equiv b^{d}+db^{d-1}nw\operatorname{mod}\left( +nw\right) ^{2}. +\] +In view of $a=b+nw$, this rewrites as% +\[ +a^{d}\equiv b^{d}+db^{d-1}nw\operatorname{mod}\left( nw\right) ^{2}. 
+\] +Hence, Proposition \ref{prop.mod.0} \textbf{(c)} (applied to $a^{d}$, +$b^{d}+db^{d-1}nw$, $\left( nw\right) ^{2}$ and $nd$ instead of $a$, $b$, +$n$ and $m$) yields +\[ +a^{d}\equiv b^{d}+db^{d-1}nw\operatorname{mod}nd +\] +(since $nd\mid\left( nw\right) ^{2}$). Hence,% +\[ +a^{d}\equiv b^{d}+\underbrace{db^{d-1}nw}_{\substack{=ndb^{d-1}w\equiv +0\operatorname{mod}nd\\\text{(since }nd\mid ndb^{d-1}w\text{)}}}\equiv +b^{d}+0=b^{d}\operatorname{mod}nd. +\] +In other words, $nd\mid a^{d}-b^{d}$. This proves (\ref{pf.cor.mod.lte1.1}).] + +From (\ref{pf.cor.mod.lte1.1}), we immediately obtain $a^{d}\equiv +b^{d}\operatorname{mod}nd$ (by the definition of \textquotedblleft +congruent\textquotedblright). This proves Corollary \ref{cor.mod.lte1}. +\end{proof} + +For the next corollary, we need a convention: + +\begin{convention} +Let $a$, $b$ and $c$ be three integers. Then, the expression \textquotedblleft% +$a^{b^{c}}$\textquotedblright\ shall always be interpreted as +\textquotedblleft$a^{\left( b^{c}\right) }$\textquotedblright, never as +\textquotedblleft$\left( a^{b}\right) ^{c}$\textquotedblright. +\end{convention} + +Thus, for example, \textquotedblleft$3^{3^{3}}$\textquotedblright\ means +$3^{\left( 3^{3}\right) }=3^{27}=\allowbreak7625\,\allowbreak597\,484\,987$, +not $\left( 3^{3}\right) ^{3}=27^{3}=\allowbreak19\,683$. The reason for +this convention is that $\left( a^{b}\right) ^{c}$ can be simplified to +$a^{bc}$ and thus there is little use in having yet another notation for it. +Of course, this convention applies not only to integers, but to any other +numbers $a,b,c$. + +We can now state the following fact, which is sometimes known as +\textquotedblleft lifting-the-exponent lemma\textquotedblright: + +\begin{corollary} +\label{cor.mod.lte2}Let $n\in\mathbb{N}$. Let $a$ and $b$ be two integers such +that $a\equiv b\operatorname{mod}n$. Let $k\in\mathbb{N}$. Then, +\begin{equation} +a^{n^{k}}\equiv b^{n^{k}}\operatorname{mod}n^{k+1}. 
+\label{eq.cor.mod.lte2.claim}% +\end{equation} + +\end{corollary} + +We shall give two \textbf{different} proofs of Corollary \ref{cor.mod.lte2} by +induction on $k$, to illustrate once again the point (previously made in +Remark \ref{rmk.ind.abstract}) that we have a choice of what precise statement +we are proving by induction. In the first proof, the statement will be the +congruence (\ref{eq.cor.mod.lte2.claim}) for three \textbf{fixed} integers +$a$, $b$ and $n$, whereas in the second proof, it will be the statement% +\[ +\left( a^{n^{k}}\equiv b^{n^{k}}\operatorname{mod}n^{k+1}\text{ for +\textbf{all} integers }a\text{ and }b\text{ and \textbf{all} }n\in +\mathbb{N}\text{ satisfying }a\equiv b\operatorname{mod}n\right) . +\] + + +\begin{proof} +[First proof of Corollary \ref{cor.mod.lte2}.]Forget that we fixed $k$. We +thus must prove (\ref{eq.cor.mod.lte2.claim}) for each $k\in\mathbb{N}$. + +We shall prove this by induction on $k$: + +\textit{Induction base:} We have $n^{0}=1$ and thus $a^{n^{0}}=a^{1}=a$. +Similarly, $b^{n^{0}}=b$. Thus, $a^{n^{0}}=a\equiv b=b^{n^{0}}% +\operatorname{mod}n$. In other words, $a^{n^{0}}\equiv b^{n^{0}}% +\operatorname{mod}n^{0+1}$ (since $n^{0+1}=n^{1}=n$). In other words, +(\ref{eq.cor.mod.lte2.claim}) holds for $k=0$. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{eq.cor.mod.lte2.claim}) holds for $k=m$. We must prove that +(\ref{eq.cor.mod.lte2.claim}) holds for $k=m+1$. + +We have $n^{m+1}=nn^{m}$. Hence, $n\mid n^{m+1}$. + +We have assumed that (\ref{eq.cor.mod.lte2.claim}) holds for $k=m$. In other +words, we have% +\[ +a^{n^{m}}\equiv b^{n^{m}}\operatorname{mod}n^{m+1}. +\] +Hence, Corollary \ref{cor.mod.lte1} (applied to $a^{n^{m}}$, $b^{n^{m}}$, +$n^{m+1}$ and $n$ instead of $a$, $b$, $n$ and $d$) yields% +\[ +\left( a^{n^{m}}\right) ^{n}\equiv\left( b^{n^{m}}\right) ^{n}% +\operatorname{mod}n^{m+1}n. 
+\] +Now, $n^{m+1}=n^{m}n$, so that +\[ +a^{n^{m+1}}=a^{n^{m}n}=\left( a^{n^{m}}\right) ^{n}\equiv\left( b^{n^{m}% +}\right) ^{n}=b^{n^{m}n}=b^{n^{m+1}}\operatorname{mod}n^{m+1}n +\] +(since $n^{m}n=n^{m+1}$). In view of $n^{m+1}n=n^{\left( m+1\right) +1}$, +this rewrites as% +\[ +a^{n^{m+1}}\equiv b^{n^{m+1}}\operatorname{mod}n^{\left( m+1\right) +1}. +\] +In other words, (\ref{eq.cor.mod.lte2.claim}) holds for $k=m+1$. This +completes the induction step. Thus, (\ref{eq.cor.mod.lte2.claim}) is proven by +induction. Hence, Corollary \ref{cor.mod.lte2} holds. +\end{proof} + +\begin{proof} +[Second proof of Corollary \ref{cor.mod.lte2}.]Forget that we fixed $a$, $b$, +$n$ and $k$. We thus must prove +\begin{equation} +\left( a^{n^{k}}\equiv b^{n^{k}}\operatorname{mod}n^{k+1}\text{ for all +integers }a\text{ and }b\text{ and all }n\in\mathbb{N}\text{ satisfying +}a\equiv b\operatorname{mod}n\right) \label{pf.cor.mod.lte2.pf2.goal}% +\end{equation} +for all $k\in\mathbb{N}$. + +We shall prove this by induction on $k$: + +\textit{Induction base:} Let $n\in\mathbb{N}$. Let $a$ and $b$ be two integers +such that $a\equiv b\operatorname{mod}n$. We have $n^{0}=1$ and thus +$a^{n^{0}}=a^{1}=a$. Similarly, $b^{n^{0}}=b$. Thus, $a^{n^{0}}=a\equiv +b=b^{n^{0}}\operatorname{mod}n$. In other words, $a^{n^{0}}\equiv b^{n^{0}% +}\operatorname{mod}n^{0+1}$ (since $n^{0+1}=n^{1}=n$). + +Now, forget that we fixed $n$, $a$ and $b$. We thus have proven that +$a^{n^{0}}\equiv b^{n^{0}}\operatorname{mod}n^{0+1}$ for all integers $a$ and +$b$ and all $n\in\mathbb{N}$ satisfying $a\equiv b\operatorname{mod}n$. In +other words, (\ref{pf.cor.mod.lte2.pf2.goal}) holds for $k=0$. This completes +the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{pf.cor.mod.lte2.pf2.goal}) holds for $k=m$. We must prove that +(\ref{pf.cor.mod.lte2.pf2.goal}) holds for $k=m+1$. + +Let $n\in\mathbb{N}$. Let $a$ and $b$ be two integers such that $a\equiv +b\operatorname{mod}n$. 
Now, +\begin{align*} +\left( n^{2}\right) ^{m+1} & =n^{2\left( m+1\right) }=n^{\left( +m+2\right) +m}\ \ \ \ \ \ \ \ \ \ \left( \text{since }2\left( m+1\right) +=\left( m+2\right) +m\right) \\ +& =n^{m+2}n^{m}, +\end{align*} +so that $n^{m+2}\mid\left( n^{2}\right) ^{m+1}$. + +We have $n\mid n$. Hence, Corollary \ref{cor.mod.lte1} (applied to $d=n$) +yields $a^{n}\equiv b^{n}\operatorname{mod}nn$. In other words, $a^{n}\equiv +b^{n}\operatorname{mod}n^{2}$ (since $nn=n^{2}$). + +We have assumed that (\ref{pf.cor.mod.lte2.pf2.goal}) holds for $k=m$. Hence, +we can apply (\ref{pf.cor.mod.lte2.pf2.goal}) to $a^{n}$, $b^{n}$, $n^{2}$ and +$m$ instead of $a$, $b$, $n$ and $k$ (since $a^{n}\equiv b^{n}% +\operatorname{mod}n^{2}$). We thus conclude that% +\[ +\left( a^{n}\right) ^{n^{m}}\equiv\left( b^{n}\right) ^{n^{m}% +}\operatorname{mod}\left( n^{2}\right) ^{m+1}. +\] +Now, $n^{m+1}=nn^{m}$, so that +\[ +a^{n^{m+1}}=a^{nn^{m}}=\left( a^{n}\right) ^{n^{m}}\equiv\left( +b^{n}\right) ^{n^{m}}=b^{nn^{m}}=b^{n^{m+1}}\operatorname{mod}\left( +n^{2}\right) ^{m+1}% +\] +(since $nn^{m}=n^{m+1}$). Hence, Proposition \ref{prop.mod.0} \textbf{(c)} +(applied to $a^{n^{m+1}}$, $b^{n^{m+1}}$, $\left( n^{2}\right) ^{m+1}$ and +$n^{m+2}$ instead of $a$, $b$, $n$ and $m$) yields $a^{n^{m+1}}\equiv +b^{n^{m+1}}\operatorname{mod}n^{m+2}$ (since $n^{m+2}\mid\left( n^{2}\right) +^{m+1}$). In view of $m+2=\left( m+1\right) +1$, this rewrites as% +\[ +a^{n^{m+1}}\equiv b^{n^{m+1}}\operatorname{mod}n^{\left( m+1\right) +1}. +\] + + +Now, forget that we fixed $n$, $a$ and $b$. We thus have proven that +\newline$a^{n^{m+1}}\equiv b^{n^{m+1}}\operatorname{mod}n^{\left( m+1\right) ++1}$ for all integers $a$ and $b$ and all $n\in\mathbb{N}$ satisfying $a\equiv +b\operatorname{mod}n$. In other words, (\ref{pf.cor.mod.lte2.pf2.goal}) holds +for $k=m+1$. This completes the induction step. Thus, +(\ref{pf.cor.mod.lte2.pf2.goal}) is proven by induction. 
Hence, Corollary
+\ref{cor.mod.lte2} is proven again.
+\end{proof}
+
+\subsection{\label{sect.ind.SIP}Strong induction}
+
+\subsubsection{The strong induction principle}
+
+We shall now show another \textquotedblleft alternative induction
+principle\textquotedblright, which is known as the \textit{strong induction
+principle} because it feels stronger than Theorem \ref{thm.ind.IP0} (in the
+sense that it appears to get the same conclusion from weaker assumptions).
+Just like Theorem \ref{thm.ind.IPg}, this principle is not a new axiom, but
+rather a consequence of the standard induction principle; we shall soon deduce
+it from Theorem \ref{thm.ind.IPg}.
+
+\begin{theorem}
+\label{thm.ind.SIP}Let $g\in\mathbb{Z}$. For each $n\in\mathbb{Z}_{\geq g}$,
+let $\mathcal{A}\left( n\right) $ be a logical statement.
+
+Assume the following:
+
+\begin{statement}
+\textit{Assumption 1:} If $m\in\mathbb{Z}_{\geq g}$ is such that%
+\[
+\left( \mathcal{A}\left( n\right) \text{ holds for every }n\in
+\mathbb{Z}_{\geq g}\text{ satisfying }n<m\right) ,
+\]
+then $\mathcal{A}\left( m\right) $ also holds.
+\end{statement}
+
+Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{Z}_{\geq g}$.
+\end{theorem}
+
+As an illustration, let us prove that the sequence $\left( t_{0}%
+,t_{1},t_{2},\ldots\right) $ from Proposition \ref{prop.ind.LP1} satisfies
+$t_{n}\in\mathbb{N}$ for each $n\in\mathbb{N}$. We shall prove this by strong
+induction on $n$:
+
+Let $m\in\mathbb{N}$. Assume that $t_{n}\in\mathbb{N}$ holds for every
+$n\in\mathbb{N}$ satisfying $n<m$. We must now show that $t_{m}\in\mathbb{N}$.
+
+We are in one of the following five cases:
+
+\textit{Case 1:} We have $m=0$.
+
+\textit{Case 2:} We have $m=1$.
+
+\textit{Case 3:} We have $m=2$.
+
+\textit{Case 4:} We have $m=3$.
+
+\textit{Case 5:} We have $m>3$.
+
+Let us first consider Case 1. 
In this case, we have $m=0$. Thus, $t_{m}%
+=t_{0}=1\in\mathbb{N}$. Hence, $t_{m}\in\mathbb{N}$ is proven in Case 1.
+
+Similarly, we can prove $t_{m}\in\mathbb{N}$ in Case 2 (using $t_{1}=1$) and
+in Case 3 (using $t_{2}=1$) and in Case 4 (using $t_{3}=2$). It thus remains
+to prove $t_{m}\in\mathbb{N}$ in Case 5.
+
+So let us consider Case 5. In this case, we have $m>3$. Thus, $m\geq4$ (since
+$m$ is an integer), so that $m-2\geq4-2=2$. Thus, $m-2$ is an integer that is
+$\geq2$. In other words, $m-2\in\mathbb{Z}_{\geq2}$. Hence, Proposition
+\ref{prop.ind.LP1} \textbf{(a)} (applied to $n=m-2$) yields $t_{\left(
+m-2\right) +2}=4t_{m-2}-t_{\left( m-2\right) -2}$. In view of $\left(
+m-2\right) +2=m$ and $\left( m-2\right) -2=m-4$, this rewrites as
+$t_{m}=4t_{m-2}-t_{m-4}$.
+
+\begin{vershort}
+But $m\geq4$, so that $m-4\in\mathbb{N}$, and $m-4<m$. Hence, the induction
+hypothesis yields $t_{m-4}\in\mathbb{N}$. Similarly, $t_{m-2}\in\mathbb{N}$.
+\end{vershort}
+
+We are in one of the following five cases:
+
+\textit{Case 1:} We have $m=0$.
+
+\textit{Case 2:} We have $m=1$.
+
+\textit{Case 3:} We have $m=2$.
+
+\textit{Case 4:} We have $m=3$.
+
+\textit{Case 5:} We have $m>3$.
+
+Let us first consider Case 1. In this case, we have $b_{m}%
+=b_{0}=1\in\mathbb{N}$. Hence, $b_{m}\in\mathbb{N}$ is proven in Case 1.
+
+Similarly, we can prove $b_{m}\in\mathbb{N}$ in Case 2 (using $b_{1}=1$) and
+in Case 3 (using $b_{2}=2$) and in Case 4 (using $b_{3}=2^{r}+1$). It thus
+remains to prove $b_{m}\in\mathbb{N}$ in Case 5.
+
+So let us consider Case 5. In this case, we have $m>3$. Thus, $m\geq4$ (since
+$m$ is an integer), so that $m-2\geq4-2=2$. Hence, Observation 1 (applied to
+$n=m-2$) yields $b_{\left( m-2\right) +2}=b_{\left( m-2\right)
+-2}b_{\left( m-2\right) +1}^{r}-b_{m-2}^{r-1}H\left( b_{m-2}^{r}\right) $.
+In view of $\left( m-2\right) +2=m$ and $\left( m-2\right) -2=m-4$ and
+$\left( m-2\right) +1=m-1$, this rewrites as
+\begin{equation}
+b_{m}=b_{m-4}b_{m-1}^{r}-b_{m-2}^{r-1}H\left( b_{m-2}^{r}\right) .
+\label{pf.prop.ind.LP2.a.bm=}%
+\end{equation}
+
+
+But $m-2\in\mathbb{N}$ (since $m\geq4\geq2$) and $m-2<m$. 
Thus, $u+v>u+0=u$, so that $u<u+v$.
+
+\subsection{\label{sect.ind.interval}Induction in an interval}
+
+\subsubsection{The induction principle for intervals}
+
+So far, our induction principles were tailored to proving a statement
+$\mathcal{A}\left( n\right) $ for \textbf{all} integers $n\geq g$.
+Sometimes, we instead want to prove such a statement only for all integers
+$n$ in an interval $\left\{ g,g+1,\ldots,h\right\} $. Let us first agree on
+what this interval means when $g>h$:
+
+\begin{convention}
+Let $g$ and $h$ be two integers. If $g>h$, then the set $\left\{
+g,g+1,\ldots,h\right\} $ is understood to be the empty set.
+\end{convention}
+
+Thus, for example, $\left\{ 2,3,\ldots,1\right\} =\varnothing$ and $\left\{
+2,3,\ldots,0\right\} =\varnothing$ and $\left\{ 5,6,\ldots,-100\right\}
+=\varnothing$. (But $\left\{ 5,6,\ldots,5\right\} =\left\{ 5\right\} $ and
+$\left\{ 5,6,\ldots,6\right\} =\left\{ 5,6\right\} $.)
+
+We now state our induction principle for intervals:
+
+\begin{theorem}
+\label{thm.ind.IPgh}Let $g\in\mathbb{Z}$ and $h\in\mathbb{Z}$. For each
+$n\in\left\{ g,g+1,\ldots,h\right\} $, let $\mathcal{A}\left( n\right) $
+be a logical statement.
+
+Assume the following:
+
+\begin{statement}
+\textit{Assumption 1:} If $g\leq h$, then the statement $\mathcal{A}\left(
+g\right) $ holds.
+\end{statement}
+
+\begin{statement}
+\textit{Assumption 2:} If $m\in\left\{ g,g+1,\ldots,h-1\right\} $ is such
+that $\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left(
+m+1\right) $ also holds.
+\end{statement}
+
+Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\left\{
+g,g+1,\ldots,h\right\} $.
+\end{theorem}
+
+Theorem \ref{thm.ind.IPgh} is, in a sense, the closest one can get to Theorem
+\ref{thm.ind.IPg} when having only finitely many statements $\mathcal{A}%
+\left( g\right) ,\mathcal{A}\left( g+1\right) ,\ldots,\mathcal{A}\left(
+h\right) $ instead of an infinite sequence of statements $\mathcal{A}\left(
+g\right) ,\mathcal{A}\left( g+1\right) ,\mathcal{A}\left( g+2\right)
+,\ldots$. It is easy to derive Theorem \ref{thm.ind.IPgh} from Corollary
+\ref{cor.ind.IPg.renamed}:
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.IPgh}.]For each $n\in\mathbb{Z}_{\geq g}$, we
+define $\mathcal{B}\left( n\right) $ to be the logical statement%
+\[
+\left( \text{if }n\in\left\{ g,g+1,\ldots,h\right\} \text{, then
+}\mathcal{A}\left( n\right) \text{ holds}\right) .
+\]
+
+
+Now, let us consider the Assumptions A and B from Corollary
+\ref{cor.ind.IPg.renamed}. 
We claim that both of these assumptions are satisfied. + +Assumption 1 says that if $g\leq h$, then the statement $\mathcal{A}\left( +g\right) $ holds. Thus, $\mathcal{B}\left( g\right) $ +holds\footnote{\textit{Proof.} Assume that $g\in\left\{ g,g+1,\ldots +,h\right\} $. Thus, $g\leq h$. But Assumption 1 says that if $g\leq h$, then +the statement $\mathcal{A}\left( g\right) $ holds. Hence, the statement +$\mathcal{A}\left( g\right) $ holds (since $g\leq h$). +\par +Now, forget that we assumed that $g\in\left\{ g,g+1,\ldots,h\right\} $. We +thus have proven that if $g\in\left\{ g,g+1,\ldots,h\right\} $, then +$\mathcal{A}\left( g\right) $ holds. In other words, $\mathcal{B}\left( +g\right) $ holds (because the statement $\mathcal{B}\left( g\right) $ is +defined as $\left( \text{if }g\in\left\{ g,g+1,\ldots,h\right\} \text{, +then }\mathcal{A}\left( g\right) \text{ holds}\right) $). Qed.}. In other +words, Assumption A is satisfied. + +Next, we shall prove that Assumption B is satisfied. Indeed, let +$p\in\mathbb{Z}_{\geq g}$ be such that $\mathcal{B}\left( p\right) $ holds. +We shall now show that $\mathcal{B}\left( p+1\right) $ also holds. + +Indeed, assume that $p+1\in\left\{ g,g+1,\ldots,h\right\} $. Thus, $p+1\leq +h$, so that $p\leq p+1\leq h$. Combining this with $p\geq g$ (since +$p\in\mathbb{Z}_{\geq g}$), we conclude that $p\in\left\{ g,g+1,\ldots +,h\right\} $ (since $p$ is an integer). But we have assumed that +$\mathcal{B}\left( p\right) $ holds. In other words,% +\[ +\text{if }p\in\left\{ g,g+1,\ldots,h\right\} \text{, then }\mathcal{A}% +\left( p\right) \text{ holds}% +\] +(because the statement $\mathcal{B}\left( p\right) $ is defined as $\left( +\text{if }p\in\left\{ g,g+1,\ldots,h\right\} \text{, then }\mathcal{A}% +\left( p\right) \text{ holds}\right) $). Thus, $\mathcal{A}\left( +p\right) $ holds (since we have $p\in\left\{ g,g+1,\ldots,h\right\} $). +Also, from $p+1\leq h$, we obtain $p\leq h-1$. 
Combining this with $p\geq g$, +we find $p\in\left\{ g,g+1,\ldots,h-1\right\} $. Thus, we know that +$p\in\left\{ g,g+1,\ldots,h-1\right\} $ is such that $\mathcal{A}\left( +p\right) $ holds. Hence, Assumption 2 (applied to $m=p$) shows that +$\mathcal{A}\left( p+1\right) $ also holds. + +Now, forget that we assumed that $p+1\in\left\{ g,g+1,\ldots,h\right\} $. We +thus have proven that if $p+1\in\left\{ g,g+1,\ldots,h\right\} $, then +$\mathcal{A}\left( p+1\right) $ holds. In other words, $\mathcal{B}\left( +p+1\right) $ holds (since the statement $\mathcal{B}\left( p+1\right) $ is +defined as \newline$\left( \text{if }p+1\in\left\{ g,g+1,\ldots,h\right\} +\text{, then }\mathcal{A}\left( p+1\right) \text{ holds}\right) $). + +Now, forget that we fixed $p$. We thus have proven that if $p\in +\mathbb{Z}_{\geq g}$ is such that $\mathcal{B}\left( p\right) $ holds, then +$\mathcal{B}\left( p+1\right) $ also holds. In other words, Assumption B is satisfied. + +We now know that both Assumption A and Assumption B are satisfied. Hence, +Corollary \ref{cor.ind.IPg.renamed} shows that +\begin{equation} +\mathcal{B}\left( n\right) \text{ holds for each }n\in\mathbb{Z}_{\geq g}. +\label{pf.thm.ind.IPgh.at}% +\end{equation} + + +Now, let $n\in\left\{ g,g+1,\ldots,h\right\} $. Thus, $n\geq g$, so that +$n\in\mathbb{Z}_{\geq g}$. Hence, (\ref{pf.thm.ind.IPgh.at}) shows that +$\mathcal{B}\left( n\right) $ holds. In other words, +\[ +\text{if }n\in\left\{ g,g+1,\ldots,h\right\} \text{, then }\mathcal{A}% +\left( n\right) \text{ holds}% +\] +(since the statement $\mathcal{B}\left( n\right) $ was defined as $\left( +\text{if }n\in\left\{ g,g+1,\ldots,h\right\} \text{, then }\mathcal{A}% +\left( n\right) \text{ holds}\right) $). Thus, $\mathcal{A}\left( +n\right) $ holds (since we have $n\in\left\{ g,g+1,\ldots,h\right\} $). + +Now, forget that we fixed $n$. We thus have shown that $\mathcal{A}\left( +n\right) $ holds for each $n\in\left\{ g,g+1,\ldots,h\right\} $. 
This +proves Theorem \ref{thm.ind.IPgh}. +\end{proof} + +Theorem \ref{thm.ind.IPgh} is called the \textit{principle of induction +starting at }$g$\textit{ and ending at }$h$, and proofs that use it are +usually called \textit{proofs by induction} or \textit{induction proofs}. As +with all the other induction principles seen so far, we don't usually +explicitly cite Theorem \ref{thm.ind.IPgh}, but instead say certain words that +signal that it is being applied and that (ideally) also indicate what integers +$g$ and $h$ and what statements $\mathcal{A}\left( n\right) $ it is being +applied to\footnote{We will explain this in Convention \ref{conv.ind.IPghlang} +below.}. However, we shall reference it explicitly in our very first example +of the use of Theorem \ref{thm.ind.IPgh}: + +\begin{proposition} +\label{prop.ind.dc}Let $g$ and $h$ be integers such that $g\leq h$. Let +$b_{g},b_{g+1},\ldots,b_{h}$ be any $h-g+1$ nonzero integers. Assume that +$b_{g}\geq0$. Assume further that% +\begin{equation} +\left\vert b_{i+1}-b_{i}\right\vert \leq1\ \ \ \ \ \ \ \ \ \ \text{for every +}i\in\left\{ g,g+1,\ldots,h-1\right\} . \label{eq.prop.ind.dc.ass}% +\end{equation} +Then, $b_{n}>0$ for each $n\in\left\{ g,g+1,\ldots,h\right\} $. +\end{proposition} + +Proposition \ref{prop.ind.dc} is often called the \textquotedblleft% +\textit{discrete intermediate value theorem}\textquotedblright\ or the +\textquotedblleft\textit{discrete continuity principle}\textquotedblright. Its +intuitive meaning is that if a finite list of nonzero integers starts with a +nonnegative integer, and every further entry of this list differs from its +preceding entry by at most $1$, then all entries of this list must be +positive. An example of such a list is $\left( +2,3,3,2,3,4,4,3,2,3,2,3,2,1\right) $. 
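The hypotheses and the conclusion of Proposition \ref{prop.ind.dc} are easy to check by machine on the example list above. The following is a small illustrative sketch (in Python; the function name and helper are our own additions, not part of the notes):

```python
# Check the hypotheses of the "discrete intermediate value theorem"
# (Proposition prop.ind.dc) on the example list from the text, and confirm
# that its conclusion (all entries positive) indeed holds for that list.
# Illustrative sketch only; names are ad hoc.

def satisfies_hypotheses(b):
    """All entries nonzero, first entry >= 0, consecutive entries differ by at most 1."""
    return (all(x != 0 for x in b)
            and b[0] >= 0
            and all(abs(b[i + 1] - b[i]) <= 1 for i in range(len(b) - 1)))

example = [2, 3, 3, 2, 3, 4, 4, 3, 2, 3, 2, 3, 2, 1]

assert satisfies_hypotheses(example)
# Proposition prop.ind.dc then guarantees that every entry is positive:
assert all(x > 0 for x in example)
```

A list such as `[1, 0, 1]` (which contains a zero) or `[1, 3]` (which jumps by more than 1) fails the hypothesis check, so the proposition makes no claim about it.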
Notice that Proposition +\ref{prop.ind.dc} is, again, rather obvious from an intuitive perspective: It +just says that it isn't possible to go from a nonnegative integer to a +negative integer by steps of $1$ without ever stepping at $0$. The rigorous +proof of Proposition \ref{prop.ind.dc} is not much harder -- but because it is +a statement about elements of $\left\{ g,g+1,\ldots,h\right\} $, it +naturally relies on Theorem \ref{thm.ind.IPgh}: + +\begin{proof} +[Proof of Proposition \ref{prop.ind.dc}.]For each $n\in\left\{ g,g+1,\ldots +,h\right\} $, we let $\mathcal{A}\left( n\right) $ be the statement +$\left( b_{n}>0\right) $. + +Our next goal is to prove the statement $\mathcal{A}\left( n\right) $ for +each $n\in\left\{ g,g+1,\ldots,h\right\} $. + +All the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by +assumption). Thus, in particular, $b_{g}$ is nonzero. In other words, +$b_{g}\neq0$. Combining this with $b_{g}\geq0$, we obtain $b_{g}>0$. In other +words, the statement $\mathcal{A}\left( g\right) $ holds (since this +statement $\mathcal{A}\left( g\right) $ is defined to be $\left( +b_{g}>0\right) $). Hence, +\begin{equation} +\text{if }g\leq h\text{, then the statement }\mathcal{A}\left( g\right) +\text{ holds.} \label{pf.prop.ind.dc.base}% +\end{equation} + + +Now, we claim that +\begin{equation} +\text{if }m\in\left\{ g,g+1,\ldots,h-1\right\} \text{ is such that +}\mathcal{A}\left( m\right) \text{ holds, then }\mathcal{A}\left( +m+1\right) \text{ also holds.} \label{pf.prop.ind.dc.step}% +\end{equation} + + +[\textit{Proof of (\ref{pf.prop.ind.dc.step}):} Let $m\in\left\{ +g,g+1,\ldots,h-1\right\} $ be such that $\mathcal{A}\left( m\right) $ +holds. We must show that $\mathcal{A}\left( m+1\right) $ also holds. + +We have assumed that $\mathcal{A}\left( m\right) $ holds. In other words, +$b_{m}>0$ holds (since $\mathcal{A}\left( m\right) $ is defined to be the +statement $\left( b_{m}>0\right) $). 
Now, (\ref{eq.prop.ind.dc.ass}) +(applied to $i=m$) yields $\left\vert b_{m+1}-b_{m}\right\vert \leq1$. But it +is well-known (and easy to see) that every integer $x$ satisfies +$-x\leq\left\vert x\right\vert $. Applying this to $x=b_{m+1}-b_{m}$, we +obtain $-\left( b_{m+1}-b_{m}\right) \leq\left\vert b_{m+1}-b_{m}\right\vert +\leq1$. In other words, $1\geq-\left( b_{m+1}-b_{m}\right) =b_{m}-b_{m+1}$. +In other words, $1+b_{m+1}\geq b_{m}$. Hence, $1+b_{m+1}\geq b_{m}>0$, so that +$1+b_{m+1}\geq1$ (since $1+b_{m+1}$ is an integer). In other words, +$b_{m+1}\geq0$. + +But all the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by +assumption). Thus, in particular, $b_{m+1}$ is nonzero. In other words, +$b_{m+1}\neq0$. Combining this with $b_{m+1}\geq0$, we obtain $b_{m+1}>0$. But +this is precisely the statement $\mathcal{A}\left( m+1\right) $ (because +$\mathcal{A}\left( m+1\right) $ is defined to be the statement $\left( +b_{m+1}>0\right) $). Thus, the statement $\mathcal{A}\left( m+1\right) $ holds. + +Now, forget that we fixed $m$. We thus have shown that if $m\in\left\{ +g,g+1,\ldots,h-1\right\} $ is such that $\mathcal{A}\left( m\right) $ +holds, then $\mathcal{A}\left( m+1\right) $ also holds. This proves +(\ref{pf.prop.ind.dc.step}).] + +Now, both assumptions of Theorem \ref{thm.ind.IPgh} are satisfied (indeed, +Assumption 1 holds because of (\ref{pf.prop.ind.dc.base}), whereas Assumption +2 holds because of (\ref{pf.prop.ind.dc.step})). Thus, Theorem +\ref{thm.ind.IPgh} shows that $\mathcal{A}\left( n\right) $ holds for each +$n\in\left\{ g,g+1,\ldots,h\right\} $. In other words, $b_{n}>0$ holds for +each $n\in\left\{ g,g+1,\ldots,h\right\} $ (since $\mathcal{A}\left( +n\right) $ is the statement $\left( b_{n}>0\right) $). This proves +Proposition \ref{prop.ind.dc}. 
+\end{proof} + +\subsubsection{Conventions for writing induction proofs in intervals} + +Next, we shall introduce some standard language that is commonly used in +proofs by induction starting at $g$ and ending at $h$. This language closely +imitates the one we use for proofs by standard induction: + +\begin{convention} +\label{conv.ind.IPghlang}Let $g\in\mathbb{Z}$ and $h\in\mathbb{Z}$. For each +$n\in\left\{ g,g+1,\ldots,h\right\} $, let $\mathcal{A}\left( n\right) $ +be a logical statement. Assume that you want to prove that $\mathcal{A}\left( +n\right) $ holds for each $n\in\left\{ g,g+1,\ldots,h\right\} $. + +Theorem \ref{thm.ind.IPgh} offers the following strategy for proving this: +First show that Assumption 1 of Theorem \ref{thm.ind.IPgh} is satisfied; then, +show that Assumption 2 of Theorem \ref{thm.ind.IPgh} is satisfied; then, +Theorem \ref{thm.ind.IPgh} automatically completes your proof. + +A proof that follows this strategy is called a \textit{proof by induction on +}$n$ (or \textit{proof by induction over }$n$) \textit{starting at }$g$ +\textit{and ending at }$h$ or (less precisely) an \textit{inductive proof}. +Most of the time, the words \textquotedblleft starting at $g$ and ending at +$h$\textquotedblright\ are omitted, since they merely repeat what is clear +from the context anyway: For example, if you make a claim about all integers +$n\in\left\{ 3,4,5,6\right\} $, and you say that you are proving it by +induction on $n$, it is clear that you are using induction on $n$ starting at +$3$ and ending at $6$. + +The proof that Assumption 1 is satisfied is called the \textit{induction base} +(or \textit{base case}) of the proof. The proof that Assumption 2 is satisfied +is called the \textit{induction step} of the proof. + +In order to prove that Assumption 2 is satisfied, you will usually want to fix +an $m\in\left\{ g,g+1,\ldots,h-1\right\} $ such that $\mathcal{A}\left( +m\right) $ holds, and then prove that $\mathcal{A}\left( m+1\right) $ +holds. 
In other words, you will usually want to fix $m\in\left\{
+g,g+1,\ldots,h-1\right\} $, assume that $\mathcal{A}\left( m\right) $
+holds, and then prove that $\mathcal{A}\left( m+1\right) $ holds. When doing
+so, it is common to refer to the assumption that $\mathcal{A}\left( m\right)
+$ holds as the \textit{induction hypothesis} (or \textit{induction assumption}).
+\end{convention}
+
+Unsurprisingly, this language parallels the language introduced in Convention
+\ref{conv.ind.IP0lang} and in Convention \ref{conv.ind.IPglang}.
+
+Again, we can shorten our inductive proofs by omitting some sentences that
+convey no information. In particular, we can leave out the explicit definition
+of the statement $\mathcal{A}\left( n\right) $ when this statement is
+precisely the claim that we are proving (without the \textquotedblleft for
+each $n\in\left\{ g,g+1,\ldots,h\right\} $\textquotedblright\ part).
+Furthermore, it is common to
+leave the \textquotedblleft If $g\leq h$\textquotedblright\ part of Assumption
+1 unsaid (i.e., to pretend that Assumption 1 simply says that $\mathcal{A}%
+\left( g\right) $ holds). Strictly speaking, this is somewhat imprecise,
+since $\mathcal{A}\left( g\right) $ is not defined when $g>h$; but of
+course, the whole claim that is being proven is moot anyway when $g>h$
+(because there exist no $n\in\left\{ g,g+1,\ldots,h\right\} $ in this case),
+so this imprecision doesn't matter.
+
+Thus, we can rewrite our above proof of Proposition \ref{prop.ind.dc} as follows:
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.dc} (second version).]We claim that%
+\begin{equation}
+b_{n}>0 \label{pf.prop.ind.dc.2nd.claim}%
+\end{equation}
+for each $n\in\left\{ g,g+1,\ldots,h\right\} $.
+
+Indeed, we shall prove (\ref{pf.prop.ind.dc.2nd.claim}) by induction on $n$:
+
+\textit{Induction base:} All the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$
+are nonzero (by assumption). Thus, in particular, $b_{g}$ is nonzero. In other
+words, $b_{g}\neq0$. 
Combining this with $b_{g}\geq0$, we obtain $b_{g}>0$. In +other words, (\ref{pf.prop.ind.dc.2nd.claim}) holds for $n=g$. This completes +the induction base. + +\textit{Induction step:} Let $m\in\left\{ g,g+1,\ldots,h-1\right\} $. Assume +that (\ref{pf.prop.ind.dc.2nd.claim}) holds for $n=m$. We must show that +(\ref{pf.prop.ind.dc.2nd.claim}) also holds for $n=m+1$. + +We have assumed that (\ref{pf.prop.ind.dc.2nd.claim}) holds for $n=m$. In +other words, $b_{m}>0$. Now, (\ref{eq.prop.ind.dc.ass}) (applied to $i=m$) +yields $\left\vert b_{m+1}-b_{m}\right\vert \leq1$. But it is well-known (and +easy to see) that every integer $x$ satisfies $-x\leq\left\vert x\right\vert +$. Applying this to $x=b_{m+1}-b_{m}$, we obtain $-\left( b_{m+1}% +-b_{m}\right) \leq\left\vert b_{m+1}-b_{m}\right\vert \leq1$. In other words, +$1\geq-\left( b_{m+1}-b_{m}\right) =b_{m}-b_{m+1}$. In other words, +$1+b_{m+1}\geq b_{m}$. Hence, $1+b_{m+1}\geq b_{m}>0$, so that $1+b_{m+1}% +\geq1$ (since $1+b_{m+1}$ is an integer). In other words, $b_{m+1}\geq0$. + +But all the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by +assumption). Thus, in particular, $b_{m+1}$ is nonzero. In other words, +$b_{m+1}\neq0$. Combining this with $b_{m+1}\geq0$, we obtain $b_{m+1}>0$. In +other words, (\ref{pf.prop.ind.dc.2nd.claim}) holds for $n=m+1$. This +completes the induction step. Thus, (\ref{pf.prop.ind.dc.2nd.claim}) is proven +by induction. This proves Proposition \ref{prop.ind.dc}. +\end{proof} + +\subsection{\label{sect.ind.strong-interval}Strong induction in an interval} + +\subsubsection{The strong induction principle for intervals} + +We shall next state yet another induction principle -- one that combines the +idea of strong induction (as in Theorem \ref{thm.ind.SIP}) with the idea of +working inside an interval $\left\{ g,g+1,\ldots,h\right\} $ (as in Theorem +\ref{thm.ind.IPgh}): + +\begin{theorem} +\label{thm.ind.SIPgh}Let $g\in\mathbb{Z}$ and $h\in\mathbb{Z}$. 
For each
+$n\in\left\{ g,g+1,\ldots,h\right\} $, let $\mathcal{A}\left( n\right) $
+be a logical statement.
+
+Assume the following:
+
+\begin{statement}
+\textit{Assumption 1:} If $m\in\left\{ g,g+1,\ldots,h\right\} $ is such that%
+\[
+\left( \mathcal{A}\left( n\right) \text{ holds for every }n\in\left\{
+g,g+1,\ldots,h\right\} \text{ satisfying }n<m\right) ,
+\]
+then $\mathcal{A}\left( m\right) $ also holds.
+\end{statement}
+
+Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\left\{
+g,g+1,\ldots,h\right\} $.
+\end{theorem}
+
+With Theorem \ref{thm.ind.SIPgh}, we can prove the following fact:
+
+\begin{proposition}
+\label{prop.ind.dcs}Let $g$ and $h$ be integers such that $g\leq h$. Let
+$b_{g},b_{g+1},\ldots,b_{h}$ be any $h-g+1$ nonzero integers. Assume that
+$b_{g}\geq0$. Assume further that%
+\begin{equation}
+\left( \text{there exists some }j\in\left\{ g,g+1,\ldots,p-1\right\}
+\text{ such that }b_{p}\geq b_{j}-1\right)
+\ \ \ \ \ \ \ \ \ \ \text{for every }p\in\left\{ g+1,g+2,\ldots,h\right\} .
+\label{eq.prop.ind.dcs.ass}%
+\end{equation}
+Then, $b_{n}>0$ for each $n\in\left\{
+g,g+1,\ldots,h\right\} $.
+\end{proposition}
+
+Proposition \ref{prop.ind.dcs} is a more general (although less intuitive)
+version of Proposition \ref{prop.ind.dc}; indeed, it is easy to see that the
+condition (\ref{eq.prop.ind.dc.ass}) is stronger than the condition
+(\ref{eq.prop.ind.dcs.ass}) (when required for all $p\in\left\{
+g+1,g+2,\ldots,h\right\} $).
+
+\begin{example}
+For this example, set $g=3$ and $h=7$. Then, if we set $\left( b_{3}%
+,b_{4},b_{5},b_{6},b_{7}\right) =\left( 4,5,3,4,2\right) $, then the
+condition (\ref{eq.prop.ind.dcs.ass}) holds for all $p\in\left\{
+g+1,g+2,\ldots,h\right\} $. (For example, it holds for $p=5$, since
+$b_{5}=3\geq4-1=b_{3}-1$ and $3\in\left\{ g,g+1,\ldots,5-1\right\} $.) On
+the other hand, if we set $\left( b_{3},b_{4},b_{5},b_{6},b_{7}\right)
+=\left( 4,5,2,4,3\right) $, then this condition does not hold (indeed, it
+fails for $p=5$, since $b_{5}=2$ is neither $\geq4-1$ nor $\geq5-1$).
+\end{example}
+
+Let us now prove Proposition \ref{prop.ind.dcs} using Theorem
+\ref{thm.ind.SIPgh}:
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.dcs}.]For each $n\in\left\{
+g,g+1,\ldots,h\right\} $, we let $\mathcal{A}\left( n\right) $ be the
+statement $\left( b_{n}>0\right) $.
+
+Our next goal is to prove the statement $\mathcal{A}\left( n\right) $ for
+each $n\in\left\{ g,g+1,\ldots,h\right\} $.
+
+All the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by
+assumption). Thus, in particular, $b_{g}$ is nonzero. In other words,
+$b_{g}\neq0$. Combining this with $b_{g}\geq0$, we obtain $b_{g}>0$. 
In other
+words, the statement $\mathcal{A}\left( g\right) $ holds (since this
+statement $\mathcal{A}\left( g\right) $ is defined to be $\left(
+b_{g}>0\right) $).
+
+Now, we make the following claim:
+
+\begin{statement}
+\textit{Claim 1:} If $m\in\left\{ g,g+1,\ldots,h\right\} $ is such that%
+\[
+\left( \mathcal{A}\left( n\right) \text{ holds for every }n\in\left\{
+g,g+1,\ldots,h\right\} \text{ satisfying }n<m\right) ,
+\]
+then $\mathcal{A}\left( m\right) $ also holds.
+\end{statement}
+
+[\textit{Proof of Claim 1:} Let $m\in\left\{ g,g+1,\ldots,h\right\} $ be
+such that%
+\[
+\left( \mathcal{A}\left( n\right) \text{ holds for every }n\in\left\{
+g,g+1,\ldots,h\right\} \text{ satisfying }n<m\right) .
+\]
+We must show that the statement $\mathcal{A}\left( m\right) $ holds.
+
+If $m=g$, then this follows from the fact that the statement $\mathcal{A}%
+\left( g\right) $ holds. Thus, for the rest of this proof, we WLOG assume
+that we don't have $m=g$. Hence, $m\neq g$. Combining this with $m\in\left\{
+g,g+1,\ldots,h\right\} $, we obtain $m\in\left\{ g,g+1,\ldots,h\right\}
+\setminus\left\{ g\right\} \subseteq\left\{ g+1,g+2,\ldots,h\right\} $.
+Hence, (\ref{eq.prop.ind.dcs.ass}) (applied to $p=m$) shows that there exists
+some $j\in\left\{ g,g+1,\ldots,m-1\right\} $ such that $b_{m}\geq b_{j}-1$.
+Consider this $j$. From $m\in\left\{ g+1,g+2,\ldots,h\right\} $, we obtain
+$m\leq h$.
+
+From $j\in\left\{ g,g+1,\ldots,m-1\right\} $, we obtain $j\leq m-1<m$, so
+that $j<m\leq h$. Combining this with $j\geq g$, we conclude that
+$j\in\left\{ g,g+1,\ldots,h\right\} $ (since $j$ is an integer). Hence, our
+assumption shows that $\mathcal{A}\left( j\right) $ holds. In other words,
+$b_{j}>0$ holds (since $\mathcal{A}\left(
+j\right) $ is defined to be the statement $\left( b_{j}>0\right) $). Thus,
+$b_{j}\geq1$ (since $b_{j}$ is an integer), so that $b_{j}-1\geq0$. But recall
+that $b_{m}\geq b_{j}-1\geq0$.
+
+But all the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by
+assumption). Thus, in particular, $b_{m}$ is nonzero. In other words,
+$b_{m}\neq0$. Combining this with $b_{m}\geq0$, we obtain $b_{m}>0$. But this
+is precisely the statement $\mathcal{A}\left( m\right) $ (because
+$\mathcal{A}\left( m\right) $ is defined to be the statement $\left(
+b_{m}>0\right) $). Thus, the statement $\mathcal{A}\left( m\right) $ holds.
+This completes the proof of Claim 1.]
+
+Claim 1 says that Assumption 1 of Theorem \ref{thm.ind.SIPgh} is satisfied.
+Thus, Theorem \ref{thm.ind.SIPgh} shows that $\mathcal{A}\left( n\right) $
+holds for each $n\in\left\{ g,g+1,\ldots,h\right\} $. In other words,
+$b_{n}>0$ holds for each $n\in\left\{ g,g+1,\ldots,h\right\} $ (since
+$\mathcal{A}\left( n\right) $ is the statement $\left( b_{n}>0\right) $).
+This proves Proposition \ref{prop.ind.dcs}.
+\end{proof}
+
+\subsubsection{Conventions for writing strong induction proofs in intervals}
+
+Next, we shall introduce some standard language that is commonly used in
+proofs by strong induction starting at $g$ and ending at $h$. This language
+closely imitates the one we use for proofs by \textquotedblleft
+usual\textquotedblright\ strong induction:
+
+\begin{convention}
+\label{conv.ind.SIPghlang}Let $g\in\mathbb{Z}$ and $h\in\mathbb{Z}$. 
For each
+$n\in\left\{ g,g+1,\ldots,h\right\} $, let $\mathcal{A}\left( n\right) $
+be a logical statement. Assume that you want to prove that $\mathcal{A}\left(
+n\right) $ holds for each $n\in\left\{ g,g+1,\ldots,h\right\} $.
+
+Theorem \ref{thm.ind.SIPgh} offers the following strategy for proving this:
+Show that Assumption 1 of Theorem \ref{thm.ind.SIPgh} is satisfied; then,
+Theorem \ref{thm.ind.SIPgh} automatically completes your proof.
+
+A proof that follows this strategy is called a \textit{proof by strong
+induction on }$n$ \textit{starting at }$g$ \textit{and ending at }$h$. Most of
+the time, the words \textquotedblleft starting at $g$ and ending at
+$h$\textquotedblright\ are omitted. The proof that Assumption 1 is satisfied
+is called the \textit{induction step} of the proof. This kind of proof has no
+\textquotedblleft induction base\textquotedblright.
+
+In order to prove that Assumption 1 is satisfied, you will usually want to fix
+an $m\in\left\{ g,g+1,\ldots,h\right\} $ such that%
+\begin{equation}
+\left( \mathcal{A}\left( n\right) \text{ holds for every }n\in\left\{
+g,g+1,\ldots,h\right\} \text{ satisfying }n<m\right) ,
+\end{equation}
+and then prove that $\mathcal{A}\left( m\right) $ holds. When doing so, it
+is common to refer to the assumption that $\mathcal{A}\left( n\right) $
+holds for every $n\in\left\{ g,g+1,\ldots,h\right\} $ satisfying $n<m$ as
+the \textit{induction hypothesis} (or \textit{induction assumption}).
+\end{convention}
+
+Thus, we can rewrite our above proof of Proposition \ref{prop.ind.dcs} as
+follows:
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.dcs} (second version).]We claim that%
+\begin{equation}
+b_{n}>0 \label{pf.prop.ind.dcs.2nd.claim}%
+\end{equation}
+for each $n\in\left\{ g,g+1,\ldots,h\right\} $.
+
+Indeed, we shall prove (\ref{pf.prop.ind.dcs.2nd.claim}) by strong induction
+on $n$:
+
+Let $m\in\left\{ g,g+1,\ldots,h\right\} $. Assume that
+(\ref{pf.prop.ind.dcs.2nd.claim}) holds for every $n\in\left\{ g,g+1,\ldots
+,h\right\} $ satisfying $n<m$. We must show that
+(\ref{pf.prop.ind.dcs.2nd.claim}) holds for $n=m$; in other words, we must
+show that $b_{m}>0$.
+
+All the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by
+assumption). Thus, in particular, $b_{g}$ is nonzero. In other words,
+$b_{g}\neq0$. Combining this with $b_{g}\geq0$, we obtain $b_{g}>0$.
+
+We have assumed that (\ref{pf.prop.ind.dcs.2nd.claim}) holds for every
+$n\in\left\{ g,g+1,\ldots,h\right\} $ satisfying $n<m$. In other words,%
+\[
+b_{n}>0\text{ for every }n\in\left\{ g,g+1,\ldots,h\right\} \text{
+satisfying }n<m.
+\]
+
+
+We must prove that $b_{m}>0$. If $m=g$, then this follows from
+$b_{g}>0$. 
Thus, for the rest of this induction step, we WLOG assume that we
+don't have $m=g$. Hence, $m\neq g$. Combining this with $m\in\left\{
+g,g+1,\ldots,h\right\} $, we obtain $m\in\left\{ g,g+1,\ldots,h\right\}
+\setminus\left\{ g\right\} \subseteq\left\{ g+1,g+2,\ldots,h\right\} $.
+Hence, (\ref{eq.prop.ind.dcs.ass}) (applied to $p=m$) shows that there exists
+some $j\in\left\{ g,g+1,\ldots,m-1\right\} $ such that $b_{m}\geq b_{j}-1$.
+Consider this $j$. From $m\in\left\{ g+1,g+2,\ldots,h\right\} $, we obtain
+$m\leq h$.
+
+From $j\in\left\{ g,g+1,\ldots,m-1\right\} $, we obtain $j\leq m-1<m$, so
+that $j<m\leq h$. Combining this with $j\geq g$, we conclude that
+$j\in\left\{ g,g+1,\ldots,h\right\} $ (since $j$ is an integer). Hence, our
+assumption shows that (\ref{pf.prop.ind.dcs.2nd.claim}) holds for $n=j$. In
+other words, $b_{j}>0$.
+Thus, $b_{j}\geq1$ (since $b_{j}$ is an integer), so that $b_{j}-1\geq0$. But
+recall that $b_{m}\geq b_{j}-1\geq0$.
+
+But all the $h-g+1$ integers $b_{g},b_{g+1},\ldots,b_{h}$ are nonzero (by
+assumption). Thus, in particular, $b_{m}$ is nonzero. In other words,
+$b_{m}\neq0$. Combining this with $b_{m}\geq0$, we obtain $b_{m}>0$.
+
+Thus, we have proven that $b_{m}>0$. In other words,
+(\ref{pf.prop.ind.dcs.2nd.claim}) holds for $n=m$. This completes the
+induction step. Thus, (\ref{pf.prop.ind.dcs.2nd.claim}) is proven by strong
+induction. This proves Proposition \ref{prop.ind.dcs}.
+\end{proof}
+
+\subsection{\label{sect.ind.gen-ass}General associativity for composition of
+maps}
+
+\subsubsection{Associativity of map composition}
+
+Recall that if $f:X\rightarrow Y$ and $g:Y\rightarrow Z$ are two maps, then
+the \textit{composition} $g\circ f$ of the maps $g$ and $f$ is defined to be
+the map%
+\[
+X\rightarrow Z,\ x\mapsto g\left( f\left( x\right) \right) .
+\]
+
+
+Now, if we have four sets $X$, $Y$, $Z$ and $W$ and three maps $c:X\rightarrow
+Y$, $b:Y\rightarrow Z$ and $a:Z\rightarrow W$, then we can build two possible
+compositions that use all three of these maps: namely, the two compositions
+$\left( a\circ b\right) \circ c$ and $a\circ\left( b\circ c\right) $. 
It +turns out that these two compositions are the same map:\footnote{Of course, +when some of the four sets $X$, $Y$, $Z$ and $W$ are equal, then more +compositions can be built: For example, if $Y=Z=W$, then we can also build the +composition $\left( b\circ a\right) \circ c$ or the composition $\left( +\left( b\circ b\right) \circ a\right) \circ c$. But these compositions are +not the same map as the two that we previously constructed.} + +\begin{proposition} +\label{prop.ind.gen-ass-maps.fgh}Let $X$, $Y$, $Z$ and $W$ be four sets. Let +$c:X\rightarrow Y$, $b:Y\rightarrow Z$ and $a:Z\rightarrow W$ be three maps. +Then,% +\[ +\left( a\circ b\right) \circ c=a\circ\left( b\circ c\right) . +\] + +\end{proposition} + +Proposition \ref{prop.ind.gen-ass-maps.fgh} is called the +\textit{associativity of map composition}, and is proven straightforwardly: + +\begin{proof} +[Proof of Proposition \ref{prop.ind.gen-ass-maps.fgh}.]Let $x\in X$. Then, the +definition of $b\circ c$ yields $\left( b\circ c\right) \left( x\right) +=b\left( c\left( x\right) \right) $. But% +\begin{align*} +\left( \left( a\circ b\right) \circ c\right) \left( x\right) & +=\left( a\circ b\right) \left( c\left( x\right) \right) +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }\left( a\circ +b\right) \circ c\right) \\ +& =a\left( b\left( c\left( x\right) \right) \right) +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }a\circ b\right) . +\end{align*} +Comparing this with% +\[ +\left( a\circ\left( b\circ c\right) \right) \left( x\right) =a\left( +\underbrace{\left( b\circ c\right) \left( x\right) }_{=b\left( c\left( +x\right) \right) }\right) =a\left( b\left( c\left( x\right) \right) +\right) , +\] +we obtain $\left( \left( a\circ b\right) \circ c\right) \left( x\right) +=\left( a\circ\left( b\circ c\right) \right) \left( x\right) $. + +Now, forget that we fixed $x$. 
We thus have shown that
+\[
+\left( \left( a\circ b\right) \circ c\right) \left( x\right) =\left(
+a\circ\left( b\circ c\right) \right) \left( x\right)
+\ \ \ \ \ \ \ \ \ \ \text{for each }x\in X.
+\]
+In other words, $\left( a\circ b\right) \circ c=a\circ\left( b\circ
+c\right) $. This proves Proposition \ref{prop.ind.gen-ass-maps.fgh}.
+\end{proof}
+
+\subsubsection{Composing more than $3$ maps: exploration}
+
+Proposition \ref{prop.ind.gen-ass-maps.fgh} can be restated as follows: If
+$a$, $b$ and $c$ are three maps such that the compositions $a\circ b$ and
+$b\circ c$ are well-defined, then $\left( a\circ b\right) \circ
+c=a\circ\left( b\circ c\right) $. This allows us to write \textquotedblleft%
+$a\circ b\circ c$\textquotedblright\ for each of the compositions $\left(
+a\circ b\right) \circ c$ and $a\circ\left( b\circ c\right) $ without having
+to disambiguate this expression by means of parentheses. It is natural to ask
+whether we can do the same thing for more than three maps. For example, let us
+consider four maps $a$, $b$, $c$ and $d$ for which the compositions $a\circ
+b$, $b\circ c$ and $c\circ d$ are well-defined:
+
+\begin{example}
+\label{exa.ind.gen-ass-maps.abcd}Let $X$, $Y$, $Z$, $W$ and $U$ be five sets.
+Let $d:X\rightarrow Y$, $c:Y\rightarrow Z$, $b:Z\rightarrow W$ and
+$a:W\rightarrow U$ be four maps. Then, we can construct five
+compositions that use all four of these maps; these five compositions are%
+\begin{align}
+& \left( \left( a\circ b\right) \circ c\right) \circ
+d,\ \ \ \ \ \ \ \ \ \ \left( a\circ\left( b\circ c\right) \right) \circ
+d,\ \ \ \ \ \ \ \ \ \ \left( a\circ b\right) \circ\left( c\circ d\right)
+,\label{eq.exa.ind.gen-ass-maps.abcd.cp1}\\
+& a\circ\left( \left( b\circ c\right) \circ d\right)
+,\ \ \ \ \ \ \ \ \ \ a\circ\left( b\circ\left( c\circ d\right) \right) .
+\label{eq.exa.ind.gen-ass-maps.abcd.cp2}%
+\end{align}
+It turns out that these five compositions are all the same map. 
Indeed, this +follows by combining the following observations: + +\begin{itemize} +\item We have $\left( \left( a\circ b\right) \circ c\right) \circ +d=\left( a\circ\left( b\circ c\right) \right) \circ d$ (since Proposition +\ref{prop.ind.gen-ass-maps.fgh} yields $\left( a\circ b\right) \circ +c=a\circ\left( b\circ c\right) $). + +\item We have $a\circ\left( \left( b\circ c\right) \circ d\right) +=a\circ\left( b\circ\left( c\circ d\right) \right) $ (since Proposition +\ref{prop.ind.gen-ass-maps.fgh} yields $\left( b\circ c\right) \circ +d=b\circ\left( c\circ d\right) $). + +\item We have $\left( a\circ\left( b\circ c\right) \right) \circ +d=a\circ\left( \left( b\circ c\right) \circ d\right) $ (by Proposition +\ref{prop.ind.gen-ass-maps.fgh}, applied to $W$, $U$, $b\circ c$ and $d$ +instead of $Z$, $W$, $b$ and $c$). + +\item We have $\left( \left( a\circ b\right) \circ c\right) \circ +d=\left( a\circ b\right) \circ\left( c\circ d\right) $ (by Proposition +\ref{prop.ind.gen-ass-maps.fgh}, applied to $U$, $a\circ b$, $c$ and $d$ +instead of $W$, $a$, $b$ and $c$). +\end{itemize} + +Hence, all five compositions are equal. Thus, we can write \textquotedblleft% +$a\circ b\circ c\circ d$\textquotedblright\ for each of these five +compositions, again dropping the parentheses. + +We shall refer to the five compositions listed in +(\ref{eq.exa.ind.gen-ass-maps.abcd.cp1}) and +(\ref{eq.exa.ind.gen-ass-maps.abcd.cp2}) as the \textquotedblleft complete +parenthesizations of $a\circ b\circ c\circ d$\textquotedblright. Here, the +word \textquotedblleft parenthesization\textquotedblright\ means a way to put +parentheses into the expression \textquotedblleft$a\circ b\circ c\circ +d$\textquotedblright, whereas the word \textquotedblleft +complete\textquotedblright\ means that these parentheses unambiguously +determine which two maps any given $\circ$ sign is composing. 
(For example, +the parenthesization \textquotedblleft$\left( a\circ b\circ c\right) \circ +d$\textquotedblright\ is not complete, because the first $\circ$ sign in it +could be either composing $a$ with $b$ or composing $a$ with $b\circ c$. But +the parenthesization \textquotedblleft$\left( \left( a\circ b\right) \circ +c\right) \circ d$\textquotedblright\ is complete, because its first $\circ$ +sign composes $a$ and $b$, whereas its second $\circ$ sign composes $a\circ b$ +with $c$, and finally its third $\circ$ sign composes $\left( a\circ +b\right) \circ c$ with $d$.) + +Thus, we have seen that all five complete parenthesizations of $a\circ b\circ +c\circ d$ are the same map. +\end{example} + +What happens if we compose more than four maps? Clearly, the more maps we +have, the more complete parenthesizations can be constructed. We have good +reasons to suspect that these parenthesizations will all be the same map (so +we can again drop the parentheses); but if we try to prove it in the ad-hoc +way we did in Example \ref{exa.ind.gen-ass-maps.abcd}, then we have more and +more work to do the more maps we are composing. Clearly, if we want to prove +our suspicion for arbitrarily many maps, we need a more general approach. + +\subsubsection{Formalizing general associativity} + +So let us make a general statement; but first, let us formally define the +notion of a \textquotedblleft complete parenthesization\textquotedblright: + +\begin{definition} +\label{def.ind.gen-ass-maps.cp}Let $n$ be a positive integer. Let $X_{1}% +,X_{2},\ldots,X_{n+1}$ be $n+1$ sets. For each $i\in\left\{ 1,2,\ldots +,n\right\} $, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then, we want to +define the notion of a \textit{complete parenthesization} of $f_{n}\circ +f_{n-1}\circ\cdots\circ f_{1}$. 
We define this notion by recursion on $n$ as follows: + +\begin{itemize} +\item For $n=1$, there is only one complete parenthesization of $f_{n}\circ +f_{n-1}\circ\cdots\circ f_{1}$, and this is simply the map $f_{1}% +:X_{1}\rightarrow X_{2}$. + +\item If $n>1$, then the complete parenthesizations of $f_{n}\circ +f_{n-1}\circ\cdots\circ f_{1}$ are all the maps of the form $\alpha\circ\beta +$, where + +\begin{itemize} +\item $k$ is some element of $\left\{ 1,2,\ldots,n-1\right\} $; + +\item $\alpha$ is a complete parenthesization of $f_{n}\circ f_{n-1}% +\circ\cdots\circ f_{k+1}$; + +\item $\beta$ is a complete parenthesization of $f_{k}\circ f_{k-1}\circ +\cdots\circ f_{1}$. +\end{itemize} +\end{itemize} +\end{definition} + +\begin{example} +Let us see what this definition yields for small values of $n$: + +\begin{itemize} +\item For $n=1$, the only complete parenthesization of $f_{1}$ is $f_{1}$. + +\item For $n=2$, the only complete parenthesization of $f_{2}\circ f_{1}$ is +the composition $f_{2}\circ f_{1}$ (because here, the only possible values of +$k$, $\alpha$ and $\beta$ are $1$, $f_{2}$ and $f_{1}$, respectively). + +\item For $n=3$, the complete parenthesizations of $f_{3}\circ f_{2}\circ +f_{1}$ are the two compositions $\left( f_{3}\circ f_{2}\right) \circ f_{1}$ +and $f_{3}\circ\left( f_{2}\circ f_{1}\right) $ (because here, the only +possible values of $k$ are $1$ and $2$, and each value of $k$ uniquely +determines $\alpha$ and $\beta$). Proposition \ref{prop.ind.gen-ass-maps.fgh} +shows that they are equal (as maps). 
+ +\item For $n=4$, the complete parenthesizations of $f_{4}\circ f_{3}\circ +f_{2}\circ f_{1}$ are the five compositions% +\begin{align*} +& \left( \left( f_{4}\circ f_{3}\right) \circ f_{2}\right) \circ +f_{1},\ \ \ \ \ \ \ \ \ \ \left( f_{4}\circ\left( f_{3}\circ f_{2}\right) +\right) \circ f_{1},\ \ \ \ \ \ \ \ \ \ \left( f_{4}\circ f_{3}\right) +\circ\left( f_{2}\circ f_{1}\right) ,\\ +& f_{4}\circ\left( \left( f_{3}\circ f_{2}\right) \circ f_{1}\right) +,\ \ \ \ \ \ \ \ \ \ f_{4}\circ\left( f_{3}\circ\left( f_{2}\circ +f_{1}\right) \right) . +\end{align*} +(These are exactly the five compositions listed in +(\ref{eq.exa.ind.gen-ass-maps.abcd.cp1}) and +(\ref{eq.exa.ind.gen-ass-maps.abcd.cp2}), except that the maps $d,c,b,a$ are +now called $f_{1},f_{2},f_{3},f_{4}$.) We have seen in Example +\ref{exa.ind.gen-ass-maps.abcd} that these five compositions are equal as maps. + +\item For $n=5$, the complete parenthesizations of $f_{5}\circ f_{4}\circ +f_{3}\circ f_{2}\circ f_{1}$ are $14$ compositions, one of which is $\left( +f_{5}\circ f_{4}\right) \circ\left( f_{3}\circ\left( f_{2}\circ +f_{1}\right) \right) $. Again, it is laborious but not difficult to check +that all the $14$ compositions are equal as maps. +\end{itemize} +\end{example} + +Now, we want to prove the following general statement: + +\begin{theorem} +\label{thm.ind.gen-ass-maps.cp}Let $n$ be a positive integer. Let $X_{1}% +,X_{2},\ldots,X_{n+1}$ be $n+1$ sets. For each $i\in\left\{ 1,2,\ldots +,n\right\} $, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then, all +complete parenthesizations of $f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$ are +the same map (from $X_{1}$ to $X_{n+1}$). +\end{theorem} + +Theorem \ref{thm.ind.gen-ass-maps.cp} is sometimes called the \textit{general +associativity} theorem, and is often proved in the context of monoids (see, +e.g., \cite[Proposition 2.1.4]{Artin}); while the context is somewhat +different from ours, the proofs usually given still apply in ours. 
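The recursive definition of complete parenthesizations lends itself to a small computational check. The following Python sketch (an illustration only; the five sample maps are arbitrary choices, not taken from the text) generates all complete parenthesizations of $f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$ by recursing over the split position $k$, and tabulates them as maps on sample inputs:

```python
def parenthesizations(fs):
    # fs = [f_1, f_2, ..., f_n]; returns all complete parenthesizations
    # of f_n o f_{n-1} o ... o f_1, following the recursive definition:
    # for n > 1, pick k in {1, ..., n-1}, a complete parenthesization
    # alpha of the top n-k maps and a complete parenthesization beta of
    # the bottom k maps, and form alpha o beta.
    n = len(fs)
    if n == 1:
        return [fs[0]]
    result = []
    for k in range(1, n):
        for alpha in parenthesizations(fs[k:]):
            for beta in parenthesizations(fs[:k]):
                # default arguments freeze alpha and beta in the closure
                result.append(lambda x, a=alpha, b=beta: a(b(x)))
    return result

# arbitrary sample maps Z -> Z standing in for f_1, ..., f_5
fs = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3,
      lambda x: 5 * x, lambda x: x * x]

for n in range(1, 6):
    ps = parenthesizations(fs[:n])
    # collect the value tables of the parenthesizations on sample inputs;
    # a single table means they all agree as maps (on these inputs)
    tables = {tuple(p(x) for x in range(-5, 6)) for p in ps}
    print(n, len(ps), len(tables))
```

For $n=1,2,3,4,5$ this reports $1,1,2,5,14$ complete parenthesizations (the counts seen above; these are the Catalan numbers), and a single value table for each $n$, which is exactly what Theorem \ref{thm.ind.gen-ass-maps.cp} predicts.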
+ +\subsubsection{Defining the \textquotedblleft canonical\textquotedblright% +\ composition $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) $} + +We shall prove Theorem \ref{thm.ind.gen-ass-maps.cp} in a slightly indirect +way: We first define a \textit{specific} complete parenthesization of +$f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$, which we shall call $C\left( +f_{n},f_{n-1},\ldots,f_{1}\right) $; then we will show that it satisfies +certain equalities (Proposition \ref{prop.ind.gen-ass-maps.Ceq}), and then +prove that every complete parenthesization of $f_{n}\circ f_{n-1}\circ +\cdots\circ f_{1}$ equals this map $C\left( f_{n},f_{n-1},\ldots +,f_{1}\right) $ (Proposition \ref{prop.ind.gen-ass-maps.Ceq-cp}). Each step +of this strategy will rely on induction. + +We begin with the definition of $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) $: + +\begin{definition} +\label{def.ind.gen-ass-maps.C}Let $n$ be a positive integer. Let $X_{1}% +,X_{2},\ldots,X_{n+1}$ be $n+1$ sets. For each $i\in\left\{ 1,2,\ldots +,n\right\} $, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then, we want to +define a map $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) :X_{1}\rightarrow +X_{n+1}$. We define this map by recursion on $n$ as follows: + +\begin{itemize} +\item If $n=1$, then we define $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) $ +to be the map $f_{1}:X_{1}\rightarrow X_{2}$. (Note that in this case, +$C\left( f_{n},f_{n-1},\ldots,f_{1}\right) =C\left( f_{1}\right) $, +because $\left( f_{n},f_{n-1},\ldots,f_{1}\right) =\left( f_{1}% +,f_{1-1},\ldots,f_{1}\right) =\left( f_{1}\right) $.) + +\item If $n>1$, then we define $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) +:X_{1}\rightarrow X_{n+1}$ by% +\begin{equation} +C\left( f_{n},f_{n-1},\ldots,f_{1}\right) =f_{n}\circ C\left( +f_{n-1},f_{n-2},\ldots,f_{1}\right) . 
\label{eq.def.ind.gen-ass-maps.C.rec}% +\end{equation} + +\end{itemize} +\end{definition} + +\begin{example} +\label{exa.ind.gen-ass-maps.Cex}Consider the situation of Definition +\ref{def.ind.gen-ass-maps.C}. + +\textbf{(a)} If $n=1$, then +\begin{equation} +C\left( f_{1}\right) =f_{1} \label{eq.exa.ind.gen-ass-maps.Cex.1}% +\end{equation} +(by the $n=1$ case of the definition). + +\textbf{(b)} If $n=2$, then% +\begin{align} +C\left( f_{2},f_{1}\right) & =f_{2}\circ\underbrace{C\left( f_{1}\right) +}_{\substack{=f_{1}\\\text{(by (\ref{eq.exa.ind.gen-ass-maps.Cex.1}))}% +}}\ \ \ \ \ \ \ \ \ \ \left( \text{by (\ref{eq.def.ind.gen-ass-maps.C.rec}), +applied to }n=2\right) \nonumber\\ +& =f_{2}\circ f_{1}. \label{eq.exa.ind.gen-ass-maps.Cex.2}% +\end{align} + + +\textbf{(c)} If $n=3$, then +\begin{align} +C\left( f_{3},f_{2},f_{1}\right) & =f_{3}\circ\underbrace{C\left( +f_{2},f_{1}\right) }_{\substack{=f_{2}\circ f_{1}\\\text{(by +(\ref{eq.exa.ind.gen-ass-maps.Cex.2}))}}}\ \ \ \ \ \ \ \ \ \ \left( \text{by +(\ref{eq.def.ind.gen-ass-maps.C.rec}), applied to }n=3\right) \nonumber\\ +& =f_{3}\circ\left( f_{2}\circ f_{1}\right) . +\label{eq.exa.ind.gen-ass-maps.Cex.3}% +\end{align} + + +\textbf{(d)} If $n=4$, then% +\begin{align} +C\left( f_{4},f_{3},f_{2},f_{1}\right) & =f_{4}\circ\underbrace{C\left( +f_{3},f_{2},f_{1}\right) }_{\substack{=f_{3}\circ\left( f_{2}\circ +f_{1}\right) \\\text{(by (\ref{eq.exa.ind.gen-ass-maps.Cex.3}))}% +}}\ \ \ \ \ \ \ \ \ \ \left( \text{by (\ref{eq.def.ind.gen-ass-maps.C.rec}), +applied to }n=4\right) \nonumber\\ +& =f_{4}\circ\left( f_{3}\circ\left( f_{2}\circ f_{1}\right) \right) . +\label{eq.exa.ind.gen-ass-maps.Cex.4}% +\end{align} + + +\textbf{(e)} For an arbitrary $n\geq1$, we can informally write $C\left( +f_{n},f_{n-1},\ldots,f_{1}\right) $ as% +\[ +C\left( f_{n},f_{n-1},\ldots,f_{1}\right) =f_{n}\circ\left( f_{n-1}% +\circ\left( f_{n-2}\circ\left( \cdots\circ\left( f_{2}\circ f_{1}\right) +\cdots\right) \right) \right) . 
+\] +The right hand side of this equality is a complete parenthesization of +$f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$, where all the parentheses are +\textquotedblleft concentrated as far right as possible\textquotedblright% +\ (i.e., there is an opening parenthesis after each \textquotedblleft$\circ +$\textquotedblright\ sign except for the last one; and there are $n-2$ closing +parentheses at the end of the expression). This is merely a visual restatement +of the recursive definition of $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) $ +we gave above. +\end{example} + +\subsubsection{The crucial property of $C\left( f_{n},f_{n-1},\ldots +,f_{1}\right) $} + +The following proposition will be key to our proof of Theorem +\ref{thm.ind.gen-ass-maps.cp}: + +\begin{proposition} +\label{prop.ind.gen-ass-maps.Ceq}Let $n$ be a positive integer. Let +$X_{1},X_{2},\ldots,X_{n+1}$ be $n+1$ sets. For each $i\in\left\{ +1,2,\ldots,n\right\} $, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then,% +\[ +C\left( f_{n},f_{n-1},\ldots,f_{1}\right) =C\left( f_{n},f_{n-1}% +,\ldots,f_{k+1}\right) \circ C\left( f_{k},f_{k-1},\ldots,f_{1}\right) +\] +for each $k\in\left\{ 1,2,\ldots,n-1\right\} $. +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.gen-ass-maps.Ceq}.]Forget that we fixed +$n$, $X_{1},X_{2},\ldots,X_{n+1}$ and the maps $f_{i}$. We shall prove +Proposition \ref{prop.ind.gen-ass-maps.Ceq} by induction on $n$% +:\ \ \ \ \footnote{The induction principle that we are applying here is +Theorem \ref{thm.ind.IPg} with $g=1$ (since $\mathbb{Z}_{\geq1}$ is the set of +all positive integers).} + +\textit{Induction base:} If $n=1$, then $\left\{ 1,2,\ldots,n-1\right\} +=\left\{ 1,2,\ldots,1-1\right\} =\varnothing$. Hence, if $n=1$, then there +exists no $k\in\left\{ 1,2,\ldots,n-1\right\} $. 
Thus, if $n=1$, then +Proposition \ref{prop.ind.gen-ass-maps.Ceq} is vacuously true (since +Proposition \ref{prop.ind.gen-ass-maps.Ceq} has a \textquotedblleft for each +$k\in\left\{ 1,2,\ldots,n-1\right\} $\textquotedblright\ clause). This +completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{Z}_{\geq1}$. Assume that Proposition +\ref{prop.ind.gen-ass-maps.Ceq} holds under the condition that $n=m$. We must +now prove that Proposition \ref{prop.ind.gen-ass-maps.Ceq} holds under the +condition that $n=m+1$. In other words, we must prove the following claim: + +\begin{statement} +\textit{Claim 1:} Let $X_{1},X_{2},\ldots,X_{\left( m+1\right) +1}$ be +$\left( m+1\right) +1$ sets. For each $i\in\left\{ 1,2,\ldots,m+1\right\} +$, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then, +\[ +C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{1}\right) =C\left( +f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right) \circ C\left( +f_{k},f_{k-1},\ldots,f_{1}\right) +\] +for each $k\in\left\{ 1,2,\ldots,\left( m+1\right) -1\right\} $. +\end{statement} + +[\textit{Proof of Claim 1:} Let $k\in\left\{ 1,2,\ldots,\left( m+1\right) +-1\right\} $. Thus, $k\in\left\{ 1,2,\ldots,\left( m+1\right) -1\right\} +=\left\{ 1,2,\ldots,m\right\} $ (since $\left( m+1\right) -1=m$). + +We know that $X_{1},X_{2},\ldots,X_{\left( m+1\right) +1}$ are $\left( +m+1\right) +1$ sets. In other words, \newline$X_{1},X_{2},\ldots,X_{m+2}$ are +$m+2$ sets (since $\left( m+1\right) +1=m+2$). We have $m\in\mathbb{Z}% +_{\geq1}$, thus $m\geq1>0$; hence, $m+1>1$. 
Thus,
+(\ref{eq.def.ind.gen-ass-maps.C.rec}) (applied to $n=m+1$) yields%
+\begin{align}
+C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{1}\right) &
+=f_{m+1}\circ C\left( f_{\left( m+1\right) -1},f_{\left( m+1\right)
+-2},\ldots,f_{1}\right) \nonumber\\
+& =f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{1}\right)
+\label{pf.prop.ind.gen-ass-maps.Ceq.c1.pf.1}%
+\end{align}
+(since $\left( m+1\right) -1=m$ and $\left( m+1\right) -2=m-1$).
+
+But we are in one of the following two cases:
+
+\textit{Case 1:} We have $k=m$.
+
+\textit{Case 2:} We have $k\neq m$.
+
+Let us first consider Case 1. In this case, we have $k=m$. Hence,%
+\[
+C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right) =C\left(
+f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{m+1}\right) =C\left(
+f_{m+1}\right) =f_{m+1}%
+\]
+(by (\ref{eq.exa.ind.gen-ass-maps.Cex.1}), applied to $X_{m+1}$, $X_{m+2}$ and
+$f_{m+1}$ instead of $X_{1}$, $X_{2}$ and $f_{1}$), so that%
+\[
+\underbrace{C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right)
+}_{=f_{m+1}}\circ\underbrace{C\left( f_{k},f_{k-1},\ldots,f_{1}\right)
+}_{\substack{=C\left( f_{m},f_{m-1},\ldots,f_{1}\right) \\\text{(since
+}k=m\text{)}}}=f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{1}\right) .
+\]
+Comparing this with (\ref{pf.prop.ind.gen-ass-maps.Ceq.c1.pf.1}), we obtain%
+\[
+C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{1}\right) =C\left(
+f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right) \circ C\left(
+f_{k},f_{k-1},\ldots,f_{1}\right) .
+\]
+Hence, Claim 1 is proven in Case 1.
+
+Let us now consider Case 2. In this case, we have $k\neq m$. Combining
+$k\in\left\{ 1,2,\ldots,m\right\} $ with $k\neq m$, we obtain%
+\[
+k\in\left\{ 1,2,\ldots,m\right\} \setminus\left\{ m\right\} =\left\{
+1,2,\ldots,m-1\right\} .
+\]
+Hence, $k\leq m-1<m$, so that $m+1-k>m+1-m=1$.
+
+But we assumed that Proposition \ref{prop.ind.gen-ass-maps.Ceq} holds under
+the condition that $n=m$. 
Hence, we can apply Proposition +\ref{prop.ind.gen-ass-maps.Ceq} to $m$ instead of $n$. We thus obtain% +\[ +C\left( f_{m},f_{m-1},\ldots,f_{1}\right) =C\left( f_{m},f_{m-1}% +,\ldots,f_{k+1}\right) \circ C\left( f_{k},f_{k-1},\ldots,f_{1}\right) +\] +(since $k\in\left\{ 1,2,\ldots,m-1\right\} $). Now, +(\ref{pf.prop.ind.gen-ass-maps.Ceq.c1.pf.1}) yields% +\begin{align} +& C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{1}\right) \nonumber\\ +& =f_{m+1}\circ\underbrace{C\left( f_{m},f_{m-1},\ldots,f_{1}\right) +}_{=C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) \circ C\left( f_{k}% +,f_{k-1},\ldots,f_{1}\right) }\nonumber\\ +& =f_{m+1}\circ\left( C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) \circ +C\left( f_{k},f_{k-1},\ldots,f_{1}\right) \right) . +\label{pf.prop.ind.gen-ass-maps.Ceq.c1.pf.4}% +\end{align} + + +On the other hand, $m+1-k>1$. Hence, (\ref{eq.def.ind.gen-ass-maps.C.rec}) +(applied to $m+1-k$, $X_{k+i}$ and $f_{k+i}$ instead of $n$, $X_{i}$ and +$f_{i}$) yields% +\begin{align*} +& C\left( f_{k+\left( m+1-k\right) },f_{k+\left( \left( m+1-k\right) +-1\right) },\ldots,f_{k+1}\right) \\ +& =f_{k+\left( m+1-k\right) }\circ C\left( f_{k+\left( \left( +m+1-k\right) -1\right) },f_{k+\left( \left( m+1-k\right) -2\right) +},\ldots,f_{k+1}\right) \\ +& =f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) \\ +& \ \ \ \ \ \ \ \ \ \ \left( +\begin{array} +[c]{c}% +\text{since }k+\left( m+1-k\right) =m+1\text{ and }k+\left( \left( +m+1-k\right) -1\right) =m\\ +\text{and }k+\left( \left( m+1-k\right) -2\right) =m-1 +\end{array} +\right) . +\end{align*} +Since $k+\left( m+1-k\right) =m+1$ and $k+\left( \left( m+1-k\right) +-1\right) =\left( m+1\right) -1$, this rewrites as% +\[ +C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right) +=f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) . 
+\] +Hence,% +\begin{align*} +& \underbrace{C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots +,f_{k+1}\right) }_{=f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{k+1}% +\right) }\circ C\left( f_{k},f_{k-1},\ldots,f_{1}\right) \\ +& =\left( f_{m+1}\circ C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) +\right) \circ C\left( f_{k},f_{k-1},\ldots,f_{1}\right) \\ +& =f_{m+1}\circ\left( C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) \circ +C\left( f_{k},f_{k-1},\ldots,f_{1}\right) \right) +\end{align*} +(by Proposition \ref{prop.ind.gen-ass-maps.fgh}, applied to $X=X_{1}$, +$Y=X_{k+1}$, $Z=X_{m+1}$, $W=X_{m+2}$, $c=C\left( f_{k},f_{k-1},\ldots +,f_{1}\right) $, $b=C\left( f_{m},f_{m-1},\ldots,f_{k+1}\right) $ and +$a=f_{m+1}$). Comparing this with (\ref{pf.prop.ind.gen-ass-maps.Ceq.c1.pf.4}% +), we obtain% +\[ +C\left( f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{1}\right) =C\left( +f_{m+1},f_{\left( m+1\right) -1},\ldots,f_{k+1}\right) \circ C\left( +f_{k},f_{k-1},\ldots,f_{1}\right) . +\] +Hence, Claim 1 is proven in Case 2. + +We have now proven Claim 1 in each of the two Cases 1 and 2. Since these two +Cases cover all possibilities, we thus conclude that Claim 1 always holds.] + +Now, we have proven Claim 1. In other words, we have proven that Proposition +\ref{prop.ind.gen-ass-maps.Ceq} holds under the condition that $n=m+1$. This +completes the induction step. Hence, Proposition +\ref{prop.ind.gen-ass-maps.Ceq} is proven by induction. +\end{proof} + +\subsubsection{Proof of general associativity} + +\begin{proposition} +\label{prop.ind.gen-ass-maps.Ceq-cp}Let $n$ be a positive integer. Let +$X_{1},X_{2},\ldots,X_{n+1}$ be $n+1$ sets. For each $i\in\left\{ +1,2,\ldots,n\right\} $, let $f_{i}:X_{i}\rightarrow X_{i+1}$ be a map. Then, +every complete parenthesization of $f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$ +equals $C\left( f_{n},f_{n-1},\ldots,f_{1}\right) $. 
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.gen-ass-maps.Ceq-cp}.]Forget that we fixed
+$n$, $X_{1},X_{2},\ldots,X_{n+1}$ and the maps $f_{i}$. We shall prove
+Proposition \ref{prop.ind.gen-ass-maps.Ceq-cp} by strong induction on
+$n$:\ \ \ \ \footnote{The induction principle that we are applying here is
+Theorem \ref{thm.ind.SIP} with $g=1$ (since $\mathbb{Z}_{\geq1}$ is the set of
+all positive integers).}
+
+\textit{Induction step:} Let $m\in\mathbb{Z}_{\geq1}$. Assume that Proposition
+\ref{prop.ind.gen-ass-maps.Ceq-cp} holds under the condition that $n<m$. We
+must now prove that Proposition \ref{prop.ind.gen-ass-maps.Ceq-cp} holds under
+the condition that $n=m$. So let $X_{1},X_{2},\ldots,X_{m+1}$ be $m+1$ sets.
+For each $i\in\left\{ 1,2,\ldots,m\right\} $, let $f_{i}:X_{i}\rightarrow
+X_{i+1}$ be a map. Let $\gamma$ be any complete parenthesization of
+$f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$. We must prove that $\gamma
+=C\left( f_{m},f_{m-1},\ldots,f_{1}\right) $. We have $m\geq1$, so that
+$m=1$ or $m>1$.
+Thus, we are in one of the following two cases:
+
+\textit{Case 1:} We have $m=1$.
+
+\textit{Case 2:} We have $m>1$.
+
+Let us first consider Case 1. In this case, we have $m=1$. Thus, $C\left(
+f_{m},f_{m-1},\ldots,f_{1}\right) =f_{1}$ (by the definition of $C\left(
+f_{m},f_{m-1},\ldots,f_{1}\right) $).
+
+Recall that $m=1$. Thus, the definition of a \textquotedblleft complete
+parenthesization of $f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$%
+\textquotedblright\ shows that there is only one complete parenthesization of
+$f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$, and this is simply the map
+$f_{1}:X_{1}\rightarrow X_{2}$. Hence, $\gamma$ is simply the map $f_{1}%
+:X_{1}\rightarrow X_{2}$ (since $\gamma$ is a complete parenthesization of
+$f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$). Thus, $\gamma=f_{1}=C\left(
+f_{m},f_{m-1},\ldots,f_{1}\right) $ (since $C\left( f_{m},f_{m-1}%
+,\ldots,f_{1}\right) =f_{1}$). Thus, $\gamma=C\left( f_{m},f_{m-1}%
+,\ldots,f_{1}\right) $ is proven in Case 1.
+
+Now, let us consider Case 2. In this case, we have $m>1$. 
Hence, the
+definition of a \textquotedblleft complete parenthesization of $f_{m}\circ
+f_{m-1}\circ\cdots\circ f_{1}$\textquotedblright\ shows that any complete
+parenthesization of $f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$ is a map of the
+form $\alpha\circ\beta$, where
+
+\begin{itemize}
+\item $k$ is some element of $\left\{ 1,2,\ldots,m-1\right\} $;
+
+\item $\alpha$ is a complete parenthesization of $f_{m}\circ f_{m-1}%
+\circ\cdots\circ f_{k+1}$;
+
+\item $\beta$ is a complete parenthesization of $f_{k}\circ f_{k-1}\circ
+\cdots\circ f_{1}$.
+\end{itemize}
+
+Thus, $\gamma$ is a map of this form (since $\gamma$ is a complete
+parenthesization of $f_{m}\circ f_{m-1}\circ\cdots\circ f_{1}$). In other
+words, we can write $\gamma$ in the form $\gamma=\alpha\circ\beta$, where $k$
+is some element of $\left\{ 1,2,\ldots,m-1\right\} $, where $\alpha$ is a
+complete parenthesization of $f_{m}\circ f_{m-1}\circ\cdots\circ f_{k+1}$, and
+where $\beta$ is a complete parenthesization of $f_{k}\circ f_{k-1}\circ
+\cdots\circ f_{1}$. Consider these $k$, $\alpha$ and $\beta$.
+
+We have $k\in\left\{ 1,2,\ldots,m-1\right\} $, thus $k\leq m-1<m$.
+
+We have $m+1>m\geq0$, so that $m+1$ is a positive integer. Hence, Proposition
+\ref{prop.ind.map-powers.1} \textbf{(b)} (applied to $m+1$ and $f^{\circ a}$
+instead of $n$ and $f$) yields
+\begin{equation}
+\left( f^{\circ a}\right) ^{\circ\left( m+1\right) }=\left( f^{\circ
+a}\right) ^{\circ\left( \left( m+1\right) -1\right) }\circ f^{\circ
+a}=\left( f^{\circ a}\right) ^{\circ m}\circ f^{\circ a}
+\label{pf.prop.ind.map-powers.2.7}%
+\end{equation}
+(since $\left( m+1\right) -1=m$).
+
+But $a\left( m+1\right) =am+a$. 
Thus,%
+\begin{align*}
+f^{\circ\left( a\left( m+1\right) \right) } & =f^{\circ\left(
+am+a\right) }=\underbrace{f^{\circ\left( am\right) }}_{=\left( f^{\circ
+a}\right) ^{\circ m}}\circ f^{\circ a}\\
+& \ \ \ \ \ \ \ \ \ \ \left(
+\begin{array}
+[c]{c}%
+\text{by Proposition \ref{prop.ind.map-powers.2} \textbf{(a)}}\\
+\text{(applied to }am\text{ and }a\text{ instead of }a\text{ and }b\text{)}%
+\end{array}
+\right) \\
+& =\left( f^{\circ a}\right) ^{\circ m}\circ f^{\circ a}=\left( f^{\circ
+a}\right) ^{\circ\left( m+1\right) }\ \ \ \ \ \ \ \ \ \ \left( \text{by
+(\ref{pf.prop.ind.map-powers.2.7})}\right) .
+\end{align*}
+In other words, (\ref{pf.prop.ind.map-powers.2.goal}) holds for $b=m+1$. This
+completes the induction step. Thus, (\ref{pf.prop.ind.map-powers.2.goal}) is
+proven by induction. Hence, Proposition \ref{prop.ind.map-powers.2}
+\textbf{(b)} is proven.
+\end{proof}
+
+Note that Proposition \ref{prop.ind.map-powers.2} is similar to the rules of
+exponents%
+\[
+n^{a+b}=n^{a}n^{b}\ \ \ \ \ \ \ \ \ \ \text{and}\ \ \ \ \ \ \ \ \ \ n^{ab}%
+=\left( n^{a}\right) ^{b}%
+\]
+that hold for $n\in\mathbb{Q}$ and $a,b\in\mathbb{N}$ (and for various other
+situations). Can we find similar analogues for other rules of exponents, such
+as $\left( mn\right) ^{a}=m^{a}n^{a}$? The simplest analogue one could think
+of for this rule would be $\left( f\circ g\right) ^{\circ a}=f^{\circ
+a}\circ g^{\circ a}$; but this does not hold in general (unless $a\leq1$).
+However, it turns out that this does hold if we assume that $f\circ g=g\circ
+f$ (which is not automatically true, unlike the analogous equality $mn=nm$ for
+integers). Let us prove this:
+
+\begin{proposition}
+\label{prop.ind.map-powers.3}Let $X$ be a set. Let $f:X\rightarrow X$ and
+$g:X\rightarrow X$ be two maps such that $f\circ g=g\circ f$. Then:
+
+\textbf{(a)} We have $f\circ g^{\circ b}=g^{\circ b}\circ f$ for each
+$b\in\mathbb{N}$. 
+ +\textbf{(b)} We have $f^{\circ a}\circ g^{\circ b}=g^{\circ b}\circ f^{\circ +a}$ for each $a\in\mathbb{N}$ and $b\in\mathbb{N}$. + +\textbf{(c)} We have $\left( f\circ g\right) ^{\circ a}=f^{\circ a}\circ +g^{\circ a}$ for each $a\in\mathbb{N}$. +\end{proposition} + +\begin{example} +Let us see why the requirement $f\circ g=g\circ f$ is needed in Proposition +\ref{prop.ind.map-powers.3}: + +Let $X$ be the set $\mathbb{Z}$. Let $f:X\rightarrow X$ be the map that sends +every integer $x$ to $-x$. Let $g:X\rightarrow X$ be the map that sends every +integer $x$ to $1-x$. Then, $f^{\circ2}=\operatorname*{id}\nolimits_{X}$ +(since $f^{\circ2}\left( x\right) =f\left( f\left( x\right) \right) +=-\left( -x\right) =x$ for each $x\in X$) and $g^{\circ2}=\operatorname*{id}% +\nolimits_{X}$ (since $g^{\circ2}\left( x\right) =g\left( g\left( +x\right) \right) =1-\left( 1-x\right) =x$ for each $x\in X$). But the map +$f\circ g$ satisfies $\left( f\circ g\right) \left( x\right) =f\left( +g\left( x\right) \right) =-\left( 1-x\right) =x-1$ for each $x\in X$. +Hence, $\left( f\circ g\right) ^{\circ2}\left( x\right) =\left( f\circ +g\right) \left( \left( f\circ g\right) \left( x\right) \right) =\left( +x-1\right) -1=x-2$ for each $x\in X$. Thus, $\left( f\circ g\right) +^{\circ2}\neq\operatorname*{id}\nolimits_{X}$. Comparing this with +$\underbrace{f^{\circ2}}_{=\operatorname*{id}\nolimits_{X}}\circ +\underbrace{g^{\circ2}}_{=\operatorname*{id}\nolimits_{X}}=\operatorname*{id}% +\nolimits_{X}\circ\operatorname*{id}\nolimits_{X}=\operatorname*{id}% +\nolimits_{X}$, we obtain $\left( f\circ g\right) ^{\circ2}\neq f^{\circ +2}\circ g^{\circ2}$. This shows that Proposition \ref{prop.ind.map-powers.3} +\textbf{(c)} would not hold without the requirement $f\circ g=g\circ f$. 
+\end{example} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.map-powers.3}.]\textbf{(a)} We claim that% +\begin{equation} +f\circ g^{\circ b}=g^{\circ b}\circ f\ \ \ \ \ \ \ \ \ \ \text{for each }% +b\in\mathbb{N}. \label{pf.prop.ind.map-powers.3.a.1}% +\end{equation} + + +Indeed, let us prove (\ref{pf.prop.ind.map-powers.3.a.1}) by induction on $b$: + +\textit{Induction base:} We have $g^{\circ0}=\operatorname*{id}\nolimits_{X}$ +(by (\ref{eq.def.ind.gen-ass-maps.fn.f0}), applied to $g$ instead of $f$). +Hence, $f\circ\underbrace{g^{\circ0}}_{=\operatorname*{id}\nolimits_{X}% +}=f\circ\operatorname*{id}\nolimits_{X}=f$ and $\underbrace{g^{\circ0}% +}_{=\operatorname*{id}\nolimits_{X}}\circ f=\operatorname*{id}\nolimits_{X}% +\circ f=f$. Comparing these two equalities, we obtain $f\circ g^{\circ +0}=g^{\circ0}\circ f$. In other words, (\ref{pf.prop.ind.map-powers.3.a.1}) +holds for $b=0$. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m$. We must prove that +(\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m+1$. + +We have assumed that (\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m$. In +other words,% +\begin{equation} +f\circ g^{\circ m}=g^{\circ m}\circ f. \label{pf.prop.ind.map-powers.3.a.2}% +\end{equation} + + +Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied to $Y=X$, $Z=X$, $W=X$, +$c=g$, $b=g^{\circ m}$ and $a=f$) yields +\begin{equation} +\left( f\circ g^{\circ m}\right) \circ g=f\circ\left( g^{\circ m}\circ +g\right) . \label{pf.prop.ind.map-powers.3.a.3}% +\end{equation} + + +Proposition \ref{prop.ind.map-powers.2} \textbf{(a)} (applied to $g$, $m$ and +$1$ instead of $f$, $a$ and $b$) yields +\begin{equation} +g^{\circ\left( m+1\right) }=g^{\circ m}\circ\underbrace{g^{\circ1}}% +_{=g}=g^{\circ m}\circ g. 
\label{pf.prop.ind.map-powers.3.a.4}% +\end{equation} +Hence,% +\begin{align} +f\circ\underbrace{g^{\circ\left( m+1\right) }}_{=g^{\circ m}\circ g} & +=f\circ\left( g^{\circ m}\circ g\right) =\underbrace{\left( f\circ g^{\circ +m}\right) }_{\substack{=g^{\circ m}\circ f\\\text{(by +(\ref{pf.prop.ind.map-powers.3.a.2}))}}}\circ g\ \ \ \ \ \ \ \ \ \ \left( +\text{by (\ref{pf.prop.ind.map-powers.3.a.3})}\right) \nonumber\\ +& =\left( g^{\circ m}\circ f\right) \circ g=g^{\circ m}\circ +\underbrace{\left( f\circ g\right) }_{=g\circ f}\nonumber\\ +& \ \ \ \ \ \ \ \ \ \ \left( +\begin{array} +[c]{c}% +\text{by Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied}\\ +\text{to }Y=X\text{, }Z=X\text{, }W=X\text{, }c=g\text{, }b=f\text{ and +}a=g^{\circ m}\text{)}% +\end{array} +\right) \nonumber\\ +& =g^{\circ m}\circ\left( g\circ f\right) . +\label{pf.prop.ind.map-powers.3.a.5}% +\end{align} + + +On the other hand, Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied to +$Y=X$, $Z=X$, $W=X$, $c=f$, $b=g$ and $a=g^{\circ m}$) yields +\[ +\left( g^{\circ m}\circ g\right) \circ f=g^{\circ m}\circ\left( g\circ +f\right) . +\] +Comparing this with (\ref{pf.prop.ind.map-powers.3.a.5}), we obtain% +\[ +f\circ g^{\circ\left( m+1\right) }=\underbrace{\left( g^{\circ m}\circ +g\right) }_{\substack{=g^{\circ\left( m+1\right) }\\\text{(by +(\ref{pf.prop.ind.map-powers.3.a.4}))}}}\circ f=g^{\circ\left( m+1\right) +}\circ f. +\] +In other words, (\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m+1$. This +completes the induction step. Thus, (\ref{pf.prop.ind.map-powers.3.a.1}) is +proven by induction. + +Therefore, Proposition \ref{prop.ind.map-powers.3} \textbf{(a)} follows. + +\textbf{(b)} Let $a\in\mathbb{N}$ and $b\in\mathbb{N}$. From $f\circ g=g\circ +f$, we obtain $g\circ f=f\circ g$. Hence, Proposition +\ref{prop.ind.map-powers.3} \textbf{(a)} (applied to $g$, $f$ and $a$ instead +of $f$, $g$ and $b$) yields $g\circ f^{\circ a}=f^{\circ a}\circ g$. 
In other +words, $f^{\circ a}\circ g=g\circ f^{\circ a}$. Hence, Proposition +\ref{prop.ind.map-powers.3} \textbf{(a)} (applied to $f^{\circ a}$ instead of +$f$) yields $f^{\circ a}\circ g^{\circ b}=g^{\circ b}\circ f^{\circ a}$. This +proves Proposition \ref{prop.ind.map-powers.3} \textbf{(b)}. + +\textbf{(c)} We claim that% +\begin{equation} +\left( f\circ g\right) ^{\circ a}=f^{\circ a}\circ g^{\circ a}% +\ \ \ \ \ \ \ \ \ \ \text{for each }a\in\mathbb{N}. +\label{pf.prop.ind.map-powers.3.c.claim}% +\end{equation} + + +Indeed, let us prove (\ref{pf.prop.ind.map-powers.3.c.claim}) by induction on +$a$: + +\textit{Induction base:} From (\ref{eq.def.ind.gen-ass-maps.fn.f0}), we obtain +$f^{\circ0}=\operatorname*{id}\nolimits_{X}$ and $g^{\circ0}% +=\operatorname*{id}\nolimits_{X}$ and $\left( f\circ g\right) ^{\circ +0}=\operatorname*{id}\nolimits_{X}$. Thus,% +\[ +\left( f\circ g\right) ^{\circ0}=\operatorname*{id}\nolimits_{X}% +=\underbrace{\operatorname*{id}\nolimits_{X}}_{=f^{\circ0}}\circ +\underbrace{\operatorname*{id}\nolimits_{X}}_{=g^{\circ0}}=f^{\circ0}\circ +g^{\circ0}. +\] +In other words, (\ref{pf.prop.ind.map-powers.3.c.claim}) holds for $a=0$. This +completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{pf.prop.ind.map-powers.3.c.claim}) holds for $a=m$. We must prove that +(\ref{pf.prop.ind.map-powers.3.c.claim}) holds for $a=m+1$. + +We have assumed that (\ref{pf.prop.ind.map-powers.3.c.claim}) holds for $a=m$. +In other words, +\begin{equation} +\left( f\circ g\right) ^{\circ m}=f^{\circ m}\circ g^{\circ m}. +\label{pf.prop.ind.map-powers.3.c.0}% +\end{equation} + + +But Proposition \ref{prop.ind.map-powers.2} \textbf{(a)} (applied to $g$, $m$ +and $1$ instead of $f$, $a$ and $b$) yields +\[ +g^{\circ\left( m+1\right) }=g^{\circ m}\circ\underbrace{g^{\circ1}}% +_{=g}=g^{\circ m}\circ g. +\] +The same argument (applied to $f$ instead of $g$) yields $f^{\circ\left( +m+1\right) }=f^{\circ m}\circ f$. 
Hence,% +\begin{equation} +\underbrace{f^{\circ\left( m+1\right) }}_{=f^{\circ m}\circ f}\circ +g^{\circ\left( m+1\right) }=\left( f^{\circ m}\circ f\right) \circ +g^{\circ\left( m+1\right) }=f^{\circ m}\circ\left( f\circ g^{\circ\left( +m+1\right) }\right) \label{pf.prop.ind.map-powers.3.c.1}% +\end{equation} +(by Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied to $Y=X$, $Z=X$, +$W=X$, $c=g^{\circ\left( m+1\right) }$, $b=f$ and $a=f^{\circ m}$)). + +But Proposition \ref{prop.ind.map-powers.3} \textbf{(a)} (applied to $b=m+1$) +yields% +\[ +f\circ g^{\circ\left( m+1\right) }=\underbrace{g^{\circ\left( m+1\right) +}}_{=g^{\circ m}\circ g}\circ f=\left( g^{\circ m}\circ g\right) \circ +f=g^{\circ m}\circ\left( g\circ f\right) +\] +(by Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied to $Y=X$, $Z=X$, +$W=X$, $c=f$, $b=g$ and $a=g^{\circ m}$)). Hence,% +\begin{equation} +f\circ g^{\circ\left( m+1\right) }=g^{\circ m}\circ\underbrace{\left( +g\circ f\right) }_{\substack{=f\circ g\\\text{(since }f\circ g=g\circ +f\text{)}}}=g^{\circ m}\circ\left( f\circ g\right) . +\label{pf.prop.ind.map-powers.3.c.3}% +\end{equation} + + +On the other hand, Proposition \ref{prop.ind.map-powers.2} \textbf{(a)} +(applied to $f\circ g$, $m$ and $1$ instead of $f$, $a$ and $b$) yields +\[ +\left( f\circ g\right) ^{\circ\left( m+1\right) }=\underbrace{\left( +f\circ g\right) ^{\circ m}}_{\substack{=f^{\circ m}\circ g^{\circ +m}\\\text{(by (\ref{pf.prop.ind.map-powers.3.c.0}))}}}\circ\underbrace{\left( +f\circ g\right) ^{\circ1}}_{=f\circ g}=\left( f^{\circ m}\circ g^{\circ +m}\right) \circ\left( f\circ g\right) =f^{\circ m}\circ\left( g^{\circ +m}\circ\left( f\circ g\right) \right) +\] +(by Proposition \ref{prop.ind.gen-ass-maps.fgh} (applied to $Y=X$, $Z=X$, +$W=X$, $c=f\circ g$, $b=g^{\circ m}$ and $a=f^{\circ m}$)). 
Hence,% +\[ +\left( f\circ g\right) ^{\circ\left( m+1\right) }=f^{\circ m}% +\circ\underbrace{\left( g^{\circ m}\circ\left( f\circ g\right) \right) +}_{\substack{=f\circ g^{\circ\left( m+1\right) }\\\text{(by +(\ref{pf.prop.ind.map-powers.3.c.3}))}}}=f^{\circ m}\circ\left( f\circ +g^{\circ\left( m+1\right) }\right) =f^{\circ\left( m+1\right) }\circ +g^{\circ\left( m+1\right) }% +\] +(by (\ref{pf.prop.ind.map-powers.3.c.1})). In other words, +(\ref{pf.prop.ind.map-powers.3.c.claim}) holds for $a=m+1$. This completes the +induction step. Thus, (\ref{pf.prop.ind.map-powers.3.c.claim}) is proven by +induction. Therefore, Proposition \ref{prop.ind.map-powers.3} \textbf{(c)} follows. +\end{proof} + +\begin{remark} +In our above proof of Proposition \ref{prop.ind.map-powers.3}, we have not +used the notation $f_{n}\circ f_{n-1}\circ\cdots\circ f_{1}$ introduced in +Definition \ref{def.ind.gen-ass-maps.comp}, but instead relied on parentheses +and compositions of two maps (i.e., we have never composed more than two maps +at the same time). Thus, for example, in the proof of Proposition +\ref{prop.ind.map-powers.3} \textbf{(a)}, we wrote \textquotedblleft$\left( +g^{\circ m}\circ g\right) \circ f$\textquotedblright\ and \textquotedblleft% +$g^{\circ m}\circ\left( g\circ f\right) $\textquotedblright\ rather than +\textquotedblleft$g^{\circ m}\circ g\circ f$\textquotedblright. But Remark +\ref{rmk.ind.gen-ass-maps.drop} says that we could have just as well dropped +all the parentheses. This would have saved us the trouble of explicitly +applying Proposition \ref{prop.ind.gen-ass-maps.fgh} (since if we drop all +parentheses, then there is no difference between \textquotedblleft$\left( +g^{\circ m}\circ g\right) \circ f$\textquotedblright\ and \textquotedblleft% +$g^{\circ m}\circ\left( g\circ f\right) $\textquotedblright\ any more). 
This +way, the induction step in the proof of Proposition +\ref{prop.ind.map-powers.3} \textbf{(a)} could have been made much shorter: + +\textit{Induction step (second version):} Let $m\in\mathbb{N}$. Assume that +(\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m$. We must prove that +(\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m+1$. + +We have assumed that (\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m$. In +other words,% +\begin{equation} +f\circ g^{\circ m}=g^{\circ m}\circ f. +\label{pf.prop.ind.map-powers.3.a.2nd.2}% +\end{equation} + + +Proposition \ref{prop.ind.map-powers.2} \textbf{(a)} (applied to $g$, $m$ and +$1$ instead of $f$, $a$ and $b$) yields $g^{\circ\left( m+1\right) +}=g^{\circ m}\circ\underbrace{g^{\circ1}}_{=g}=g^{\circ m}\circ g$. Hence,% +\[ +f\circ\underbrace{g^{\circ\left( m+1\right) }}_{=g^{\circ m}\circ +g}=\underbrace{f\circ g^{\circ m}}_{\substack{=g^{\circ m}\circ f\\\text{(by +(\ref{pf.prop.ind.map-powers.3.a.2nd.2}))}}}\circ g=g^{\circ m}\circ +\underbrace{f\circ g}_{=g\circ f}=\underbrace{g^{\circ m}\circ g}% +_{=g^{\circ\left( m+1\right) }}\circ f=g^{\circ\left( m+1\right) }\circ +f. +\] +In other words, (\ref{pf.prop.ind.map-powers.3.a.1}) holds for $b=m+1$. This +completes the induction step. + +Similarly, we can simplify the proof of Proposition +\ref{prop.ind.map-powers.3} \textbf{(c)} by dropping the parentheses. (The +details are left to the reader.) +\end{remark} + +\subsection{\label{sect.ind.gen-com}General commutativity for addition of +numbers} + +\subsubsection{The setup and the problem} + +Throughout Section \ref{sect.ind.gen-com}, we let $\mathbb{A}$ be one of the +sets $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$. +The elements of $\mathbb{A}$ will be simply called \textit{numbers}. 
+ +There is an analogue of Proposition \ref{prop.ind.gen-ass-maps.fgh} for numbers: + +\begin{proposition} +\label{prop.ind.gen-com.fgh}Let $a$, $b$ and $c$ be three numbers (i.e., +elements of $\mathbb{A}$). Then, $\left( a+b\right) +c=a+\left( b+c\right) +$. +\end{proposition} + +Proposition \ref{prop.ind.gen-com.fgh} is known as the \textit{associativity +of addition} (in $\mathbb{A}$), and is fundamental; its proof can be found in +any textbook on the construction of the number system\footnote{For example, +Proposition \ref{prop.ind.gen-com.fgh} is proven in \cite[Theorem 3.2.3 +(3)]{Swanso18} for the case when $\mathbb{A}=\mathbb{N}$; in \cite[Theorem +3.5.4 (3)]{Swanso18} for the case when $\mathbb{A}=\mathbb{Z}$; in +\cite[Theorem 3.6.4 (3)]{Swanso18} for the case when $\mathbb{A}=\mathbb{Q}$; +in \cite[Theorem 3.7.10]{Swanso18} for the case when $\mathbb{A}=\mathbb{R}$; +in \cite[Theorem 3.9.3]{Swanso18} for the case when $\mathbb{A}=\mathbb{C}$.}. + +In Section \ref{sect.ind.gen-ass}, we have used Proposition +\ref{prop.ind.gen-ass-maps.fgh} to show that we can \textquotedblleft drop the +parentheses\textquotedblright\ in a composition $f_{n}\circ f_{n-1}\circ +\cdots\circ f_{1}$ of maps (i.e., all possible complete parenthesizations of +this composition are actually the same map). Likewise, we can use Proposition +\ref{prop.ind.gen-com.fgh} to show that we can \textquotedblleft drop the +parentheses\textquotedblright\ in a sum $a_{1}+a_{2}+\cdots+a_{n}$ of numbers +(i.e., all possible complete parenthesizations of this sum are actually the +same number). 
For example, if $a,b,c,d$ are four numbers, then the complete +parenthesizations of $a+b+c+d$ are% +\begin{align*} +& \left( \left( a+b\right) +c\right) +d,\ \ \ \ \ \ \ \ \ \ \left( +a+\left( b+c\right) \right) +d,\ \ \ \ \ \ \ \ \ \ \left( a+b\right) ++\left( c+d\right) ,\\ +& a+\left( \left( b+c\right) +d\right) ,\ \ \ \ \ \ \ \ \ \ a+\left( +b+\left( c+d\right) \right) , +\end{align*} +and all of these five complete parenthesizations are the same number. + +However, numbers behave better than maps. In particular, along with +Proposition \ref{prop.ind.gen-com.fgh}, they satisfy another law that maps +(generally) don't satisfy: + +\begin{proposition} +\label{prop.ind.gen-com.fg}Let $a$ and $b$ be two numbers (i.e., elements of +$\mathbb{A}$). Then, $a+b=b+a$. +\end{proposition} + +Proposition \ref{prop.ind.gen-com.fg} is known as the \textit{commutativity of +addition} (in $\mathbb{A}$), and again is a fundamental result whose proofs +are found in standard textbooks\footnote{For example, Proposition +\ref{prop.ind.gen-com.fg} is proven in \cite[Theorem 3.2.3 (4)]{Swanso18} for +the case when $\mathbb{A}=\mathbb{N}$; in \cite[Theorem 3.5.4 (4)]{Swanso18} +for the case when $\mathbb{A}=\mathbb{Z}$; in \cite[Theorem 3.6.4 +(4)]{Swanso18} for the case when $\mathbb{A}=\mathbb{Q}$; in \cite[Theorem +3.7.10]{Swanso18} for the case when $\mathbb{A}=\mathbb{R}$; in \cite[Theorem +3.9.3]{Swanso18} for the case when $\mathbb{A}=\mathbb{C}$.}. + +Furthermore, numbers can \textbf{always} be added, whereas maps can only be +composed if the domain of one is the codomain of the other. Thus, when we want +to take the sum of $n$ numbers $a_{1},a_{2},\ldots,a_{n}$, we can not only +choose where to put the parentheses, but also in what order the numbers should +appear in the sum. It turns out that neither of these choices affects the +result. 
For example, if $a,b,c$ are three numbers, then all $12$ possible sums% +\begin{align*} +& \left( a+b\right) +c,\ \ \ \ \ \ \ \ \ \ a+\left( b+c\right) +,\ \ \ \ \ \ \ \ \ \ \left( a+c\right) +b,\ \ \ \ \ \ \ \ \ \ a+\left( +c+b\right) ,\\ +& \left( b+a\right) +c,\ \ \ \ \ \ \ \ \ \ b+\left( a+c\right) +,\ \ \ \ \ \ \ \ \ \ \left( b+c\right) +a,\ \ \ \ \ \ \ \ \ \ b+\left( +c+a\right) ,\\ +& \left( c+a\right) +b,\ \ \ \ \ \ \ \ \ \ c+\left( a+b\right) +,\ \ \ \ \ \ \ \ \ \ \left( c+b\right) +a,\ \ \ \ \ \ \ \ \ \ c+\left( +b+a\right) +\end{align*} +are actually the same number. The reader can easily verify this for three +numbers $a,b,c$ (using Proposition \ref{prop.ind.gen-com.fgh} and Proposition +\ref{prop.ind.gen-com.fg}), but of course the general case (with $n$ numbers) +is more difficult. The independence of the result on the parenthesization can +be proven using the same arguments that we gave in Section +\ref{sect.ind.gen-ass} (except that the $\circ$ symbol is now replaced by +$+$), but the independence on the order cannot easily be shown (or even +stated) in this way. + +Thus, we shall proceed differently: We shall rigorously define the sum of $n$ +numbers without specifying an order in which they are added or using +parentheses. Unlike the composition of $n$ maps, which was defined for an +\textit{ordered list} of $n$ maps, we shall define the sum of $n$ numbers for +a \textit{family} of $n$ numbers (see the next subsection for the definition +of a \textquotedblleft family\textquotedblright). Families don't come with an +ordering chosen in advance, so we cannot single out any specific ordering for +use in the definition. Thus, the independence on the order will be baked right +into the definition. 
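The claim about the twelve sums is easy to verify mechanically. The following Python sketch (not part of these notes; the function name is hypothetical) enumerates all $3! = 6$ orderings of three numbers together with both complete parenthesizations of each, and confirms that they all evaluate to the same value on a sample input:

```python
from itertools import permutations

def all_threefold_sums(a, b, c):
    # Collect the values of all 12 sums: 6 orderings times 2 parenthesizations.
    results = set()
    for x, y, z in permutations([a, b, c]):
        results.add((x + y) + z)  # left parenthesization
        results.add(x + (y + z))  # right parenthesization
    return results

print(all_threefold_sums(2, 5, 11))  # a single value: {18}
```

Of course, checking one sample proves nothing; the point of this section is to establish the general statement rigorously.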
+
+Different solutions to the problem of formalizing the concept of the sum of
+$n$ numbers can be found in \cite[Chapter 1, \S 1.5]{Bourba74}%
+\footnote{Bourbaki, in \cite[Chapter 1, \S 1.5]{Bourba74}, define something
+more general than a sum of $n$ numbers: They define the \textquotedblleft
+composition\textquotedblright\ of a finite family of elements of a commutative
+magma. The sum of $n$ numbers is a particular case of this concept when the
+magma is the set $\mathbb{A}$ (endowed with its addition).} and in
+\cite[\S 3.2]{GalQua18}.
+
+\subsubsection{Families}
+
+Let us first define what we mean by a \textquotedblleft
+family\textquotedblright\ of $n$ numbers. More generally, we can define a
+family of elements of any set, or even a family of elements of
+\textbf{different} sets. To motivate the definition, we first recall the
+concept of an $n$-tuple:
+
+\begin{remark}
+\label{rmk.ind.families.tups}Let $n\in\mathbb{N}$.
+
+\textbf{(a)} Let $A$ be a set. Then, to specify an $n$\textit{-tuple of
+elements of }$A$ means specifying an element $a_{i}$ of $A$ for each
+$i\in\left\{ 1,2,\ldots,n\right\} $. This $n$-tuple is then denoted by
+$\left( a_{1},a_{2},\ldots,a_{n}\right) $ or by $\left( a_{i}\right)
+_{i\in\left\{ 1,2,\ldots,n\right\} }$. For each $i\in\left\{ 1,2,\ldots
+,n\right\} $, we refer to $a_{i}$ as the $i$\textit{-th entry} of this $n$-tuple.
+
+The set of all $n$-tuples of elements of $A$ is denoted by $A^{n}$ or by
+$A^{\times n}$; it is called the $n$\textit{-th Cartesian power} of the set
+$A$.
+
+\textbf{(b)} More generally, we can define $n$-tuples of elements from
+\textbf{different} sets: For each $i\in\left\{ 1,2,\ldots,n\right\} $, let
+$A_{i}$ be a set. Then, to specify an $n$\textit{-tuple of elements of }%
+$A_{1},A_{2},\ldots,A_{n}$ means specifying an element $a_{i}$ of $A_{i}$ for
+each $i\in\left\{ 1,2,\ldots,n\right\} $. 
This $n$-tuple is (again) denoted +by $\left( a_{1},a_{2},\ldots,a_{n}\right) $ or by $\left( a_{i}\right) +_{i\in\left\{ 1,2,\ldots,n\right\} }$. For each $i\in\left\{ 1,2,\ldots +,n\right\} $, we refer to $a_{i}$ as the $i$\textit{-th entry} of this $n$-tuple. + +The set of all $n$-tuples of elements of $A_{1},A_{2},\ldots,A_{n}$ is denoted +by $A_{1}\times A_{2}\times\cdots\times A_{n}$ or by $\prod_{i=1}^{n}A_{i}$; +it is called the \textit{Cartesian product} of the $n$ sets $A_{1}% +,A_{2},\ldots,A_{n}$. These $n$ sets $A_{1},A_{2},\ldots,A_{n}$ are called the +\textit{factors} of this Cartesian product. +\end{remark} + +\begin{example} +\label{exa.ind.families.tups}\textbf{(a)} The $3$-tuple $\left( 7,8,9\right) +$ is a $3$-tuple of elements of $\mathbb{N}$, and also a $3$-tuple of elements +of $\mathbb{Z}$. It can also be written in the form $\left( 6+i\right) +_{i\in\left\{ 1,2,3\right\} }$. Thus, $\left( 6+i\right) _{i\in\left\{ +1,2,3\right\} }=\left( 6+1,6+2,6+3\right) =\left( 7,8,9\right) +\in\mathbb{N}^{3}$ and also $\left( 6+i\right) _{i\in\left\{ 1,2,3\right\} +}\in\mathbb{Z}^{3}$. + +\textbf{(b)} The $5$-tuple $\left( \left\{ 1\right\} ,\left\{ 2\right\} +,\left\{ 3\right\} ,\varnothing,\mathbb{N}\right) $ is a $5$-tuple of +elements of the powerset of $\mathbb{N}$ (since $\left\{ 1\right\} ,\left\{ +2\right\} ,\left\{ 3\right\} ,\varnothing,\mathbb{N}$ are subsets of +$\mathbb{N}$, thus elements of the powerset of $\mathbb{N}$). + +\textbf{(c)} The $0$-tuple $\left( {}\right) $ can be viewed as a $0$-tuple +of elements of \textbf{any} set $A$. + +\textbf{(d)} If we let $\left[ n\right] $ be the set $\left\{ +1,2,\ldots,n\right\} $ for each $n\in\mathbb{N}$, then $\left( +1,2,2,3,3\right) $ is a $5$-tuple of elements of $\left[ 1\right] ,\left[ +2\right] ,\left[ 3\right] ,\left[ 4\right] ,\left[ 5\right] $ (because +$1\in\left[ 1\right] $, $2\in\left[ 2\right] $, $2\in\left[ 3\right] $, +$3\in\left[ 4\right] $ and $3\in\left[ 5\right] $). 
In other words, +$\left( 1,2,2,3,3\right) \in\left[ 1\right] \times\left[ 2\right] +\times\left[ 3\right] \times\left[ 4\right] \times\left[ 5\right] $. + +\textbf{(e)} A $2$-tuple is the same as an ordered pair. A $3$-tuple is the +same as an ordered triple. A $1$-tuple of elements of a set $A$ is +\textquotedblleft almost\textquotedblright\ the same as a single element of +$A$; more precisely, there is a bijection +\[ +A\rightarrow A^{1},\ \ \ \ \ \ \ \ \ \ a\mapsto\left( a\right) +\] +from $A$ to the set of $1$-tuples of elements of $A$. +\end{example} + +The notation \textquotedblleft$\left( a_{i}\right) _{i\in\left\{ +1,2,\ldots,n\right\} }$\textquotedblright\ in Remark +\ref{rmk.ind.families.tups} should be pronounced as \textquotedblleft the +$n$-tuple whose $i$-th entry is $a_{i}$ for each $i\in\left\{ 1,2,\ldots +,n\right\} $\textquotedblright. The letter \textquotedblleft$i$% +\textquotedblright\ is used as a variable in this notation (similar to the +\textquotedblleft$i$\textquotedblright\ in the expression \textquotedblleft% +$\sum_{i=1}^{n}i$\textquotedblright\ or in the expression \textquotedblleft +the map $\mathbb{N}\rightarrow\mathbb{N},\ i\mapsto i+1$\textquotedblright\ or +in the expression \textquotedblleft for all $i\in\mathbb{N}$, we have +$i+1>i$\textquotedblright); it does not refer to any specific element of +$\left\{ 1,2,\ldots,n\right\} $. As usual, it does not matter which letter +we are using for this variable (as long as it does not already have a +different meaning); thus, for example, the $3$-tuples $\left( 6+i\right) +_{i\in\left\{ 1,2,3\right\} }$ and $\left( 6+j\right) _{j\in\left\{ +1,2,3\right\} }$ and $\left( 6+x\right) _{x\in\left\{ 1,2,3\right\} }$ +are all identical (and equal $\left( 7,8,9\right) $). 
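Example \ref{exa.ind.families.tups} \textbf{(d)} can be sanity-checked mechanically. Here is a Python sketch (with a hypothetical helper `bracket(n)` standing for the set $\left[ n\right] =\left\{ 1,2,\ldots,n\right\}$) that builds the Cartesian product $\left[ 1\right] \times\left[ 2\right] \times\left[ 3\right] \times\left[ 4\right] \times\left[ 5\right]$ and confirms the membership claimed there:

```python
from itertools import product

def bracket(n):
    # The set [n] = {1, 2, ..., n}, modeled as a Python range.
    return range(1, n + 1)

# The Cartesian product [1] x [2] x [3] x [4] x [5], as a list of 5-tuples.
cartesian = list(product(bracket(1), bracket(2), bracket(3), bracket(4), bracket(5)))

print((1, 2, 2, 3, 3) in cartesian)  # True
print(len(cartesian))                # 1 * 2 * 3 * 4 * 5 = 120
```

Note that `itertools.product` returns the tuples in a definite order, but we only care about membership here.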
+ +We also note that the \textquotedblleft$\prod$\textquotedblright\ sign in +Remark \ref{rmk.ind.families.tups} \textbf{(b)} has a different meaning than +the \textquotedblleft$\prod$\textquotedblright\ sign in Section +\ref{sect.sums-repetitorium}. The former stands for a Cartesian product of +sets, whereas the latter stands for a product of numbers. In particular, a +product $\prod_{i=1}^{n}a_{i}$ of numbers does not change when its factors are +swapped, whereas a Cartesian product $\prod_{i=1}^{n}A_{i}$ of sets does. (In +particular, if $A$ and $B$ are two sets, then $A\times B$ and $B\times A$ are +different sets in general. The $2$-tuple $\left( 1,-1\right) $ belongs to +$\mathbb{N}\times\mathbb{Z}$, but not to $\mathbb{Z}\times\mathbb{N}$.) + +Thus, the purpose of an $n$-tuple is storing several elements (possibly of +different sets) in one \textquotedblleft container\textquotedblright. This is +a highly useful notion, but sometimes one wants a more general concept, which +can store several elements but not necessarily organized in a +\textquotedblleft linear order\textquotedblright. For example, assume you want +to store four integers $a,b,c,d$ in the form of a rectangular table $\left( +\begin{array} +[c]{cc}% +a & b\\ +c & d +\end{array} +\right) $ (also known as a \textquotedblleft$2\times2$-table of +integers\textquotedblright). Such a table doesn't have a well-defined +\textquotedblleft$1$-st entry\textquotedblright\ or \textquotedblleft$2$-nd +entry\textquotedblright\ (unless you agree on a specific order in which you +read it); instead, it makes sense to speak of a \textquotedblleft$\left( +1,2\right) $-th entry\textquotedblright\ (i.e., the entry in row $1$ and +column $2$, which is $b$) or of a \textquotedblleft$\left( 2,2\right) $-th +entry\textquotedblright\ (i.e., the entry in row $2$ and column $2$, which is +$d$). 
Thus, such tables work similarly to $n$-tuples, but they are +\textquotedblleft indexed\textquotedblright\ by pairs $\left( i,j\right) $ +of appropriate integers rather than by the numbers $1,2,\ldots,n$. + +The concept of a \textquotedblleft family\textquotedblright\ generalizes both +$n$-tuples and rectangular tables: It allows the entries to be indexed by the +elements of an arbitrary (possibly infinite) set $I$ instead of the numbers +$1,2,\ldots,n$. Here is its definition (watch the similarities to Remark +\ref{rmk.ind.families.tups}): + +\begin{definition} +\label{def.ind.families.fams}Let $I$ be a set. + +\textbf{(a)} Let $A$ be a set. Then, to specify an $I$\textit{-family of +elements of }$A$ means specifying an element $a_{i}$ of $A$ for each $i\in I$. +This $I$-family is then denoted by $\left( a_{i}\right) _{i\in I}$. For each +$i\in I$, we refer to $a_{i}$ as the $i$\textit{-th entry} of this $I$-family. +(Unlike the case of $n$-tuples, there is no notation like $\left( a_{1}% +,a_{2},\ldots,a_{n}\right) $ for $I$-families, because there is no natural +way in which their entries should be listed.) + +An $I$-family of elements of $A$ is also called an $A$\textit{-valued }% +$I$\textit{-family}. + +The set of all $I$-families of elements of $A$ is denoted by $A^{I}$ or by +$A^{\times I}$. (Note that the notation $A^{I}$ is also used for the set of +all maps from $I$ to $A$. But this set is more or less the same as the set of +all $I$-families of elements of $A$; see Remark \ref{rmk.ind.families.maps} +below for the details.) + +\textbf{(b)} More generally, we can define $I$-families of elements from +\textbf{different} sets: For each $i\in I$, let $A_{i}$ be a set. Then, to +specify an $I$\textit{-family of elements of }$\left( A_{i}\right) _{i\in +I}$ means specifying an element $a_{i}$ of $A_{i}$ for each $i\in I$. This +$I$-family is (again) denoted by $\left( a_{i}\right) _{i\in I}$. 
For each +$i\in I$, we refer to $a_{i}$ as the $i$\textit{-th entry} of this $I$-family. + +The set of all $I$-families of elements of $\left( A_{i}\right) _{i\in I}$ +is denoted by $\prod_{i\in I}A_{i}$. + +The word \textquotedblleft$I$-family\textquotedblright\ (without further +qualifications) means an $I$-family of elements of $\left( A_{i}\right) +_{i\in I}$ for some sets $A_{i}$. + +The word \textquotedblleft family\textquotedblright\ (without further +qualifications) means an $I$-family for some set $I$. +\end{definition} + +\begin{example} +\label{exa.ind.families.fams}\textbf{(a)} The family $\left( 6+i\right) +_{i\in\left\{ 0,3,5\right\} }$ is a $\left\{ 0,3,5\right\} $-family of +elements of $\mathbb{N}$ (that is, an $\mathbb{N}$-valued $\left\{ +0,3,5\right\} $-family). It has three entries: Its $0$-th entry is $6+0=6$; +its $3$-rd entry is $6+3=9$; its $5$-th entry is $6+5=11$. Of course, this +family is also a $\left\{ 0,3,5\right\} $-family of elements of $\mathbb{Z}% +$. If we squint hard enough, we can pretend that this family is simply the +$3$-tuple $\left( 6,9,11\right) $; but this is not advisable, and also does +not extend to situations in which there is no natural order on the set $I$. + +\textbf{(b)} Let $X$ be the set $\left\{ \text{\textquotedblleft +cat\textquotedblright},\text{\textquotedblleft chicken\textquotedblright% +},\text{\textquotedblleft dog\textquotedblright}\right\} $ consisting of +three words. Then, we can define an $X$-family $\left( a_{i}\right) _{i\in +X}$ of elements of $\mathbb{N}$ by setting% +\[ +a_{\text{\textquotedblleft cat\textquotedblright}}% +=4,\ \ \ \ \ \ \ \ \ \ a_{\text{\textquotedblleft chicken\textquotedblright}% +}=2,\ \ \ \ \ \ \ \ \ \ a_{\text{\textquotedblleft dog\textquotedblright}}=4. +\] +This family has $3$ entries, which are $4$, $2$ and $4$; but there is no +natural order on the set $X$, so we cannot identify it with a $3$-tuple. 
+ +We can also rewrite this family as +\[ +\left( \text{the number of legs of a typical specimen of animal }i\right) +_{i\in X}. +\] +Of course, not every family will have a description like this; sometimes a +family is just a choice of elements without any underlying pattern. + +\textbf{(c)} If $I$ is the empty set $\varnothing$, and if $A$ is any set, +then there is exactly one $I$-family of elements of $A$; namely, the +\textit{empty family}. Indeed, specifying such a family means specifying no +elements at all, and there is just one way to do that. We can denote the empty +family by $\left( {}\right) $, just like the empty $0$-tuple. + +\textbf{(d)} The family $\left( \left\vert i\right\vert \right) +_{i\in\mathbb{Z}}$ is a $\mathbb{Z}$-family of elements of $\mathbb{N}$ +(because $\left\vert i\right\vert $ is an element of $\mathbb{N}$ for each +$i\in\mathbb{Z}$). It can also be regarded as a $\mathbb{Z}$-family of +elements of $\mathbb{Z}$. + +\textbf{(e)} If $I$ is the set $\left\{ 1,2,\ldots,n\right\} $ for some +$n\in\mathbb{N}$, and if $A$ is any set, then an $I$-family $\left( +a_{i}\right) _{i\in\left\{ 1,2,\ldots,n\right\} }$ of elements of $A$ is +the same as an $n$-tuple of elements of $A$. The same holds for families and +$n$-tuples of elements from different sets. Thus, any $n$ sets $A_{1}% +,A_{2},\ldots,A_{n}$ satisfy $\prod_{i\in\left\{ 1,2,\ldots,n\right\} }% +A_{i}=\prod_{i=1}^{n}A_{i}$. +\end{example} + +The notation \textquotedblleft$\left( a_{i}\right) _{i\in I}$% +\textquotedblright\ in Definition \ref{def.ind.families.fams} should be +pronounced as \textquotedblleft the $I$-family whose $i$-th entry is $a_{i}$ +for each $i\in I$\textquotedblright. The letter \textquotedblleft% +$i$\textquotedblright\ is used as a variable in this notation (similar to the +\textquotedblleft$i$\textquotedblright\ in the expression \textquotedblleft% +$\sum_{i=1}^{n}i$\textquotedblright); it does not refer to any specific +element of $I$. 
As usual, it does not matter which letter we are using for +this variable (as long as it does not already have a different meaning); thus, +for example, the $\mathbb{Z}$-families $\left( \left\vert i\right\vert +\right) _{i\in\mathbb{Z}}$ and $\left( \left\vert p\right\vert \right) +_{p\in\mathbb{Z}}$ and $\left( \left\vert w\right\vert \right) +_{w\in\mathbb{Z}}$ are all identical. + +\begin{remark} +\label{rmk.ind.families.maps}Let $I$ and $A$ be two sets. What is the +difference between an $A$-valued $I$-family and a map from $I$ to $A$ ? Both +of these objects consist of a choice of an element of $A$ for each $i\in I$. + +The main difference is terminological: e.g., when we speak of a family, the +elements of $A$ that constitute it are called its \textquotedblleft +entries\textquotedblright, whereas for a map they are called its +\textquotedblleft images\textquotedblright\ or \textquotedblleft +values\textquotedblright. Also, the notations for them are different: The +$A$-valued $I$-family $\left( a_{i}\right) _{i\in I}$ corresponds to the map +$I\rightarrow A,\ i\mapsto a_{i}$. + +There is also another, subtler difference: A map from $I$ to $A$ +\textquotedblleft knows\textquotedblright\ what the set $A$ is (so that, for +example, the maps $\mathbb{N}\rightarrow\mathbb{N},\ i\mapsto i$ and +$\mathbb{N}\rightarrow\mathbb{Z},\ i\mapsto i$ are considered different, even +though they map every element of $\mathbb{N}$ to the same value); but an +$A$-valued $I$-family does not \textquotedblleft know\textquotedblright\ what +the set $A$ is (so that, for example, the $\mathbb{N}$-valued $\mathbb{N}% +$-family $\left( i\right) _{i\in\mathbb{N}}$ is considered identical with +the $\mathbb{Z}$-valued $\mathbb{N}$-family $\left( i\right) _{i\in +\mathbb{N}}$). 
This matters occasionally when one wants to consider maps or +families for different sets simultaneously; it is not relevant if we just work +with $A$-valued $I$-families (or maps from $I$ to $A$) for two fixed sets $I$ +and $A$. And either way, these conventions are not universal across the +mathematical literature; for some authors, maps from $I$ to $A$ do not +\textquotedblleft know\textquotedblright\ what $A$ is, whereas other authors +want families to \textquotedblleft know\textquotedblright\ this too. + +What is certainly true, independently of any conventions, is the following +fact: If $I$ and $A$ are two sets, then the map% +\begin{align*} +\left\{ \text{maps from }I\text{ to }A\right\} & \rightarrow\left\{ +A\text{-valued }I\text{-families}\right\} ,\\ +f & \mapsto\left( f\left( i\right) \right) _{i\in I}% +\end{align*} +is bijective. (Its inverse map sends every $A$-valued $I$-family $\left( +a_{i}\right) _{i\in I}$ to the map $I\rightarrow A,\ i\mapsto a_{i}$.) Thus, +there is little harm in equating $\left\{ \text{maps from }I\text{ to +}A\right\} $ with $\left\{ A\text{-valued }I\text{-families}\right\} $. +\end{remark} + +We already know from Example \ref{exa.ind.families.fams} \textbf{(e)} that +$n$-tuples are a particular case of families; the same holds for rectangular tables: + +\begin{definition} +\label{def.ind.families.rectab}Let $A$ be a set. Let $n\in\mathbb{N}$ and +$m\in\mathbb{N}$. Then, an $n\times m$\textit{-table} of elements of $A$ means +an $A$-valued $\left\{ 1,2,\ldots,n\right\} \times\left\{ 1,2,\ldots +,m\right\} $-family. 
According to Remark \ref{rmk.ind.families.maps}, this is +tantamount to saying that an $n\times m$-table of elements of $A$ means a map +from $\left\{ 1,2,\ldots,n\right\} \times\left\{ 1,2,\ldots,m\right\} $ to +$A$, except for notational differences (such as referring to the elements that +constitute the $n\times m$-table as \textquotedblleft +entries\textquotedblright\ rather than \textquotedblleft +values\textquotedblright) and for the fact that an $n\times m$-table does not +\textquotedblleft know\textquotedblright\ $A$ (whereas a map would do). + +In future chapters, we shall consider \textquotedblleft$n\times m$% +-matrices\textquotedblright, which are defined as maps from $\left\{ +1,2,\ldots,n\right\} \times\left\{ 1,2,\ldots,m\right\} $ to $A$ rather +than as $A$-valued $\left\{ 1,2,\ldots,n\right\} \times\left\{ +1,2,\ldots,m\right\} $-families. We shall keep using the same notations for +them as for $n\times m$-tables, but unlike $n\times m$-tables, they will +\textquotedblleft know\textquotedblright\ $A$ (that is, two $n\times +m$-matrices with the same entries but different sets $A$ will be considered +different). Anyway, this difference is minor. +\end{definition} + +\begin{noncompile} +(The following has been removed, since I don't use these notations.) + +\textbf{(b)} Let $C=\left( c_{i,j}\right) _{\left( i,j\right) \in\left\{ +1,2,\ldots,n\right\} \times\left\{ 1,2,\ldots,m\right\} }$ be an $n\times +m$-table of elements of $A$. Then, $C$ is often written as% +\[ +\left( +\begin{array} +[c]{cccc}% +c_{1,1} & c_{1,2} & \cdots & c_{1,m}\\ +c_{2,1} & c_{2,2} & \cdots & c_{2,m}\\ +\vdots & \vdots & \ddots & \vdots\\ +c_{n,1} & c_{n,2} & \cdots & c_{n,m}% +\end{array} +\right) +\] +(that is, as a rectangular table with $n$ rows and $m$ columns such that the +entry in the $i$-th row and the $j$-th column is $c_{i,j}$). 
For any +$p\in\left\{ 1,2,\ldots,n\right\} $, we denote the $1\times m$-table% +\[ +\left( c_{p,j}\right) _{\left( i,j\right) \in\left\{ 1\right\} +\times\left\{ 1,2,\ldots,m\right\} }=\left( +\begin{array} +[c]{cccc}% +c_{p,1} & c_{p,2} & \cdots & c_{p,m}% +\end{array} +\right) +\] +as the $p$\textit{-th row} of $C$. For any $q\in\left\{ 1,2,\ldots,m\right\} +$, we denote the $n\times1$-table% +\[ +\left( c_{i,q}\right) _{\left( i,j\right) \in\left\{ 1,2,\ldots +,n\right\} \times\left\{ 1\right\} }=\left( +\begin{array} +[c]{c}% +c_{1,q}\\ +c_{2,q}\\ +\vdots\\ +c_{n,q}% +\end{array} +\right) +\] +as the $q$\textit{-th column} of $C$. +\end{noncompile} + +\subsubsection{A desirable definition} + +We now know what an $\mathbb{A}$-valued $S$-family is (for some set $S$): It +is just a way of choosing some element of $\mathbb{A}$ for each $s\in S$. When +this element is called $a_{s}$, the $S$-family is called $\left( +a_{s}\right) _{s\in S}$. + +We now want to define the sum of an $\mathbb{A}$-valued $S$-family $\left( +a_{s}\right) _{s\in S}$ when the set $S$ is finite. Actually, we have already +seen a definition of this sum (which is called $\sum_{s\in S}a_{s}$) in +Section \ref{sect.sums-repetitorium}. The only problem with that definition is +that we don't know yet that it is legitimate. Let us nevertheless recall it +(rewriting it using the notion of an $\mathbb{A}$-valued $S$-family): + +\begin{definition} +\label{def.ind.gen-com.defsum1}If $S$ is a finite set, and if $\left( +a_{s}\right) _{s\in S}$ is an $\mathbb{A}$-valued $S$-family, then we want to +define the number $\sum_{s\in S}a_{s}$. We define this number by recursion on +$\left\vert S\right\vert $ as follows: + +\begin{itemize} +\item If $\left\vert S\right\vert =0$, then $\sum_{s\in S}a_{s}$ is defined to +be $0$. + +\item Let $n\in\mathbb{N}$. 
Assume that we have defined $\sum_{s\in S}a_{s}$ +for every finite set $S$ with $\left\vert S\right\vert =n$ and any +$\mathbb{A}$-valued $S$-family $\left( a_{s}\right) _{s\in S}$. Now, if $S$ +is a finite set with $\left\vert S\right\vert =n+1$, and if $\left( +a_{s}\right) _{s\in S}$ is any $\mathbb{A}$-valued $S$-family, then +$\sum_{s\in S}a_{s}$ is defined by picking any $t\in S$ and setting% +\begin{equation} +\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. +\label{eq.def.ind.gen-com.defsum1.step}% +\end{equation} + +\end{itemize} +\end{definition} + +As we already observed in Section \ref{sect.sums-repetitorium}, it is not +obvious that this definition is legitimate: The right hand side of +(\ref{eq.def.ind.gen-com.defsum1.step}) is defined using a choice of $t$, but +we want our value of $\sum_{s\in S}a_{s}$ to depend only on $S$ and $\left( +a_{s}\right) _{s\in S}$ (not on some arbitrarily chosen $t\in S$). Thus, we +cannot use this definition yet. Our main goal in this section is to prove that +it is indeed legitimate. + +\subsubsection{The set of all possible sums} + +There are two ways to approach this goal. One is to prove the legitimacy of +Definition \ref{def.ind.gen-com.defsum1} by strong induction on $\left\vert +S\right\vert $; the statement $\mathcal{A}\left( n\right) $ that we would be +proving for each $n\in\mathbb{N}$ here would be saying that Definition +\ref{def.ind.gen-com.defsum1} is legitimate for all finite sets $S$ satisfying +$\left\vert S\right\vert =n$. This is not hard, but conceptually confusing, as +it would require us to use Definition \ref{def.ind.gen-com.defsum1} for +\textbf{some} sets $S$ while its legitimacy for other sets $S$ is yet unproven. 
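Before settling on an approach, the claim itself can at least be spot-checked numerically. The following Python sketch (purely illustrative, and no substitute for a proof; the names `recursive_sum` and `pick` are ours) implements the recursion of Definition \ref{def.ind.gen-com.defsum1}, with the choice of $t$ delegated to a caller-supplied rule, and observes that two very different choice rules agree on an example family:

```python
from fractions import Fraction  # exact arithmetic stands in for the numbers in A

def recursive_sum(family, pick):
    """Sum an S-family (modelled as a dict s -> a_s) by the recursion of
    Definition defsum1: if S is empty the sum is 0; otherwise pick some
    t in S and return a_t plus the sum over S \\ {t}."""
    if not family:
        return 0  # the |S| = 0 case
    t = pick(family)
    rest = {s: a for s, a in family.items() if s != t}
    return family[t] + recursive_sum(rest, pick)

family = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}
# Two different rules for choosing t at every step:
first_choice = recursive_sum(family, lambda f: min(f))  # always remove the smallest key
last_choice = recursive_sum(family, lambda f: max(f))   # always remove the largest key
```

Both rules return $1/2+1/3+1/6=1$ here; the point of this section is to prove that no choice rule can ever give anything else.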
+ +We prefer to proceed in a different way: We shall first define a set +$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $ for any +$\mathbb{A}$-valued $S$-family $\left( a_{s}\right) _{s\in S}$; this set +shall consist (roughly speaking) of \textquotedblleft all possible values that +$\sum_{s\in S}a_{s}$ could have according to Definition +\ref{def.ind.gen-com.defsum1}\textquotedblright. This set will be defined +recursively, more or less following Definition \ref{def.ind.gen-com.defsum1}, +but instead of relying on a choice of \textbf{some} $t\in S$, it will use +\textbf{all} possible elements $t\in S$. (See Definition +\ref{def.ind.gen-com.Sums} for the precise definition.) Unlike $\sum_{s\in +S}a_{s}$ itself, it will be a set of numbers, not a single number; however, it +has the advantage that the legitimacy of its definition will be immediately +obvious. Then, we will prove (in Theorem \ref{thm.ind.gen-com.Sums1}) that +this set $\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) +$ is actually a $1$-element set; this will allow us to define $\sum_{s\in +S}a_{s}$ to be the unique element of $\operatorname*{Sums}\left( \left( +a_{s}\right) _{s\in S}\right) $ for any $\mathbb{A}$-valued $S$-family +$\left( a_{s}\right) _{s\in S}$ (see Definition +\ref{def.ind.gen-com.defsum2}). Then, we will retroactively legitimize +Definition \ref{def.ind.gen-com.defsum1} by showing that Definition +\ref{def.ind.gen-com.defsum1} leads to the same value of $\sum_{s\in S}a_{s}$ +as Definition \ref{def.ind.gen-com.defsum2} (no matter which $t\in S$ is +chosen). Having thus justified Definition \ref{def.ind.gen-com.defsum1}, we +will forget about the set $\operatorname*{Sums}\left( \left( a_{s}\right) +_{s\in S}\right) $ and about Definition \ref{def.ind.gen-com.defsum2}. + +In later subsections, we shall prove some basic properties of sums. 
+ +Let us define the set $\operatorname*{Sums}\left( \left( a_{s}\right) +_{s\in S}\right) $, as promised: + +\begin{definition} +\label{def.ind.gen-com.Sums}If $S$ is a finite set, and if $\left( +a_{s}\right) _{s\in S}$ is an $\mathbb{A}$-valued $S$-family, then we want to +define the set $\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in +S}\right) $ of numbers. We define this set by recursion on $\left\vert +S\right\vert $ as follows: + +\begin{itemize} +\item If $\left\vert S\right\vert =0$, then $\operatorname*{Sums}\left( +\left( a_{s}\right) _{s\in S}\right) $ is defined to be $\left\{ +0\right\} $. + +\item Let $n\in\mathbb{N}$. Assume that we have defined $\operatorname*{Sums}% +\left( \left( a_{s}\right) _{s\in S}\right) $ for every finite set $S$ +with $\left\vert S\right\vert =n$ and any $\mathbb{A}$-valued $S$-family +$\left( a_{s}\right) _{s\in S}$. Now, if $S$ is a finite set with +$\left\vert S\right\vert =n+1$, and if $\left( a_{s}\right) _{s\in S}$ is +any $\mathbb{A}$-valued $S$-family, then $\operatorname*{Sums}\left( \left( +a_{s}\right) _{s\in S}\right) $ is defined by% +\begin{align} +& \operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) +\nonumber\\ +& =\left\{ a_{t}+b\ \mid\ t\in S\text{ and }b\in\operatorname*{Sums}\left( +\left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right) \right\} +. \label{eq.def.ind.gen-com.Sums.rec}% +\end{align} +(The sets $\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in +S\setminus\left\{ t\right\} }\right) $ on the right hand side of this +equation are well-defined, because for each $t\in S$, we have $\left\vert +S\setminus\left\{ t\right\} \right\vert =\left\vert S\right\vert -1=n$ +(since $\left\vert S\right\vert =n+1$), and therefore $\operatorname*{Sums}% +\left( \left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right) $ +is well-defined by our assumption.) +\end{itemize} +\end{definition} + +\begin{example} +\label{exa.def.ind.gen-com.Sums.1}Let $S$ be a finite set. 
Let $\left( +a_{s}\right) _{s\in S}$ be an $\mathbb{A}$-valued $S$-family. Let us see what +Definition \ref{def.ind.gen-com.Sums} says when $S$ has only few elements: + +\textbf{(a)} If $S=\varnothing$, then +\begin{equation} +\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\varnothing}\right) +=\left\{ 0\right\} \label{eq.exa.def.ind.gen-com.Sums.1.0}% +\end{equation} +(directly by Definition \ref{def.ind.gen-com.Sums}, since $\left\vert +S\right\vert =\left\vert \varnothing\right\vert =0$ in this case). + +\textbf{(b)} If $S=\left\{ x\right\} $ for some element $x$, then Definition +\ref{def.ind.gen-com.Sums} yields% +\begin{align} +& \operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ x\right\} +}\right) \nonumber\\ +& =\left\{ a_{t}+b\ \mid\ t\in\left\{ x\right\} \text{ and }% +b\in\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x\right\} \setminus\left\{ t\right\} }\right) \right\} \nonumber\\ +& =\left\{ a_{x}+b\ \mid\ b\in\operatorname*{Sums}\left( \left( +a_{s}\right) _{s\in\left\{ x\right\} \setminus\left\{ x\right\} }\right) +\right\} \ \ \ \ \ \ \ \ \ \ \left( \text{since the only }t\in\left\{ +x\right\} \text{ is }x\right) \nonumber\\ +& =\left\{ a_{x}+b\ \mid\ b\in\underbrace{\operatorname*{Sums}\left( +\left( a_{s}\right) _{s\in\varnothing}\right) }_{=\left\{ 0\right\} +}\right\} \ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ x\right\} +\setminus\left\{ x\right\} =\varnothing\right) \nonumber\\ +& =\left\{ a_{x}+b\ \mid\ b\in\left\{ 0\right\} \right\} =\left\{ +a_{x}+0\right\} =\left\{ a_{x}\right\} . 
+\label{eq.exa.def.ind.gen-com.Sums.1.1}% +\end{align} + + +\textbf{(c)} If $S=\left\{ x,y\right\} $ for two distinct elements $x$ and +$y$, then Definition \ref{def.ind.gen-com.Sums} yields% +\begin{align*} +& \operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x,y\right\} }\right) \\ +& =\left\{ a_{t}+b\ \mid\ t\in\left\{ x,y\right\} \text{ and }% +b\in\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x,y\right\} \setminus\left\{ t\right\} }\right) \right\} \\ +& =\left\{ a_{x}+b\ \mid\ b\in\operatorname*{Sums}\left( \left( +a_{s}\right) _{s\in\left\{ x,y\right\} \setminus\left\{ x\right\} +}\right) \right\} \\ +& \ \ \ \ \ \ \ \ \ \ \cup\left\{ a_{y}+b\ \mid\ b\in\operatorname*{Sums}% +\left( \left( a_{s}\right) _{s\in\left\{ x,y\right\} \setminus\left\{ +y\right\} }\right) \right\} \\ +& =\left\{ a_{x}+b\ \mid\ b\in\underbrace{\operatorname*{Sums}\left( +\left( a_{s}\right) _{s\in\left\{ y\right\} }\right) }% +_{\substack{=\left\{ a_{y}\right\} \\\text{(by +(\ref{eq.exa.def.ind.gen-com.Sums.1.1}), applied to }y\text{ instead of +}x\text{)}}}\right\} \\ +& \ \ \ \ \ \ \ \ \ \ \cup\left\{ a_{y}+b\ \mid\ b\in +\underbrace{\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x\right\} }\right) }_{\substack{=\left\{ a_{x}\right\} \\\text{(by +(\ref{eq.exa.def.ind.gen-com.Sums.1.1}))}}}\right\} \\ +& \ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ x,y\right\} +\setminus\left\{ x\right\} =\left\{ y\right\} \text{ and }\left\{ +x,y\right\} \setminus\left\{ y\right\} =\left\{ x\right\} \right) \\ +& =\underbrace{\left\{ a_{x}+b\ \mid\ b\in\left\{ a_{y}\right\} \right\} +}_{=\left\{ a_{x}+a_{y}\right\} }\cup\underbrace{\left\{ a_{y}% ++b\ \mid\ b\in\left\{ a_{x}\right\} \right\} }_{=\left\{ a_{y}% ++a_{x}\right\} }\\ +& =\left\{ a_{x}+a_{y}\right\} \cup\left\{ a_{y}+a_{x}\right\} =\left\{ +a_{x}+a_{y},a_{y}+a_{x}\right\} =\left\{ a_{x}+a_{y}\right\} +\end{align*} +(since $a_{y}+a_{x}=a_{x}+a_{y}$). 
+ +\textbf{(d)} Similar reasoning shows that if $S=\left\{ x,y,z\right\} $ for +three distinct elements $x$, $y$ and $z$, then% +\[ +\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x,y,z\right\} }\right) =\left\{ a_{x}+\left( a_{y}+a_{z}\right) +,a_{y}+\left( a_{x}+a_{z}\right) ,a_{z}+\left( a_{x}+a_{y}\right) +\right\} . +\] +It is not hard to check (using Proposition \ref{prop.ind.gen-com.fgh} and +Proposition \ref{prop.ind.gen-com.fg}) that the three elements $a_{x}+\left( +a_{y}+a_{z}\right) $, $a_{y}+\left( a_{x}+a_{z}\right) $ and $a_{z}+\left( +a_{x}+a_{y}\right) $ of this set are equal, so we may call them $a_{x}% ++a_{y}+a_{z}$; thus, we can rewrite this equality as% +\[ +\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x,y,z\right\} }\right) =\left\{ a_{x}+a_{y}+a_{z}\right\} . +\] + + +\textbf{(e)} Going further, we can see that if $S=\left\{ x,y,z,w\right\} $ +for four distinct elements $x$, $y$, $z$ and $w$, then% +\begin{align*} +\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in\left\{ +x,y,z,w\right\} }\right) & =\left\{ a_{x}+\left( a_{y}+a_{z}% ++a_{w}\right) ,a_{y}+\left( a_{x}+a_{z}+a_{w}\right) ,\right. \\ +& \ \ \ \ \ \ \ \ \ \ \left. a_{z}+\left( a_{x}+a_{y}+a_{w}\right) +,a_{w}+\left( a_{x}+a_{y}+a_{z}\right) \right\} . +\end{align*} +Again, it is not hard to prove that +\begin{align*} +a_{x}+\left( a_{y}+a_{z}+a_{w}\right) & =a_{y}+\left( a_{x}+a_{z}% ++a_{w}\right) \\ +& =a_{z}+\left( a_{x}+a_{y}+a_{w}\right) =a_{w}+\left( a_{x}+a_{y}% ++a_{z}\right) , +\end{align*} +and thus the set $\operatorname*{Sums}\left( \left( a_{s}\right) +_{s\in\left\{ x,y,z,w\right\} }\right) $ is again a $1$-element set, whose +unique element can be called $a_{x}+a_{y}+a_{z}+a_{w}$. +\end{example} + +These examples suggest that the set $\operatorname*{Sums}\left( \left( +a_{s}\right) _{s\in S}\right) $ should always be a $1$-element set. 
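This observation can also be tested mechanically for small families. The following Python sketch (purely illustrative; the function name `all_sums` is ours) mirrors the recursion (\ref{eq.def.ind.gen-com.Sums.rec}), collecting $a_{t}+b$ over \textbf{all} choices of $t$, and finds a $1$-element set on an example with four elements:

```python
from fractions import Fraction  # exact arithmetic stands in for the numbers in A

def all_sums(family):
    """The set Sums((a_s)) of Definition def.ind.gen-com.Sums, for a family
    modelled as a dict s -> a_s: for |S| = 0 it is {0}; otherwise it collects
    a_t + b over ALL t in S and all b in the Sums-set of the family
    restricted to S \\ {t}.  (Exponential-time; only for tiny S.)"""
    if not family:
        return {0}
    return {family[t] + b
            for t in family
            for b in all_sums({s: a for s, a in family.items() if s != t})}

family = {"x": Fraction(3), "y": Fraction(5), "z": Fraction(7), "w": Fraction(11)}
sums_set = all_sums(family)  # a set with the single element 26
```

Of course, checking a handful of examples proves nothing about general finite sets $S$; that requires the theorem below.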
This is
+precisely what we are going to claim now:
+
+\begin{theorem}
+\label{thm.ind.gen-com.Sums1}If $S$ is a finite set, and if $\left(
+a_{s}\right) _{s\in S}$ is an $\mathbb{A}$-valued $S$-family, then the set
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $ is a
+$1$-element set.
+\end{theorem}
+
+\subsubsection{The set of all possible sums is a $1$-element set: proof}
+
+Before we proceed to the proof of Theorem \ref{thm.ind.gen-com.Sums1}, we
+observe an almost trivial lemma:
+
+\begin{lemma}
+\label{lem.ind.gen-com.Sums-lem}Let $a$, $b$ and $c$ be three numbers (i.e.,
+elements of $\mathbb{A}$). Then, $a+\left( b+c\right) =b+\left( a+c\right)
+$.
+\end{lemma}
+
+\begin{proof}
+[Proof of Lemma \ref{lem.ind.gen-com.Sums-lem}.]Proposition
+\ref{prop.ind.gen-com.fgh} (applied to $b$ and $a$ instead of $a$ and $b$)
+yields $\left( b+a\right) +c=b+\left( a+c\right) $. Also, Proposition
+\ref{prop.ind.gen-com.fgh} yields $\left( a+b\right) +c=a+\left(
+b+c\right) $. Hence,%
+\[
+a+\left( b+c\right) =\underbrace{\left( a+b\right) }%
+_{\substack{=b+a\\\text{(by Proposition \ref{prop.ind.gen-com.fg})}%
+}}+c=\left( b+a\right) +c=b+\left( a+c\right) .
+\]
+This proves Lemma \ref{lem.ind.gen-com.Sums-lem}.
+\end{proof}
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.gen-com.Sums1}.]We shall prove Theorem
+\ref{thm.ind.gen-com.Sums1} by strong induction on $\left\vert S\right\vert $:
+
+Let $m\in\mathbb{N}$. Assume that Theorem \ref{thm.ind.gen-com.Sums1} holds
+under the condition that $\left\vert S\right\vert <m$. We must now prove that
+Theorem \ref{thm.ind.gen-com.Sums1} holds under the condition that
+$\left\vert S\right\vert =m$.
+
+So let $S$ be a finite set with $\left\vert S\right\vert =m$, and let
+$\left( a_{s}\right) _{s\in S}$ be an $\mathbb{A}$-valued $S$-family. We
+must show that $\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in
+S}\right) $ is a $1$-element set.
+
+If $m=0$, then $\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in
+S}\right) =\left\{ 0\right\} $ (by Definition \ref{def.ind.gen-com.Sums}),
+which is a $1$-element set. Hence, we WLOG assume that $m>0$. Thus, the set
+$S$ is nonempty. Therefore, the set $\operatorname*{Sums}\left( \left(
+a_{s}\right) _{s\in S}\right) $ is nonempty as well. [Indeed, pick any
+$t\in S$. The induction hypothesis shows that $\operatorname*{Sums}\left(
+\left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right) $ is a
+$1$-element set (since $\left\vert S\setminus\left\{ t\right\} \right\vert
+=m-1<m$); if $b$ denotes its unique element, then
+(\ref{eq.def.ind.gen-com.Sums.rec}) shows that $a_{t}+b\in
+\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $.]
+
+It thus remains to prove that any two elements $x$ and $y$ of
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $ are
+equal. Because of (\ref{eq.def.ind.gen-com.Sums.rec}), we can write $x$ in
+the form $x=a_{t}+b$ for some $t\in S$ and some $b\in\operatorname*{Sums}%
+\left( \left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right)
+$, and we can write $y$ in the form $y=a_{u}+c$ for some $u\in S$ and some
+$c\in\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S\setminus
+\left\{ u\right\} }\right) $. We are in one of the following two cases:
+
+\textit{Case 1:} We have $t=u$. Then, $b$ and $c$ both belong to the set
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S\setminus\left\{
+t\right\} }\right) $, which is a $1$-element set (by the induction
+hypothesis, since $\left\vert S\setminus\left\{ t\right\} \right\vert
+=m-1<m$). Hence, $b=c$, so that $x=a_{t}+b=a_{u}+c=y$.
+
+\textit{Case 2:} We have $t\neq u$. Then, $\left\vert S\setminus\left\{
+t,u\right\} \right\vert =m-2<m$, so the induction hypothesis shows that
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S\setminus\left\{
+t,u\right\} }\right) $ is a $1$-element set; let $d$ be its unique element.
+Applying (\ref{eq.def.ind.gen-com.Sums.rec}) to $S\setminus\left\{
+t\right\} $ instead of $S$, we see that $a_{u}+d\in\operatorname*{Sums}%
+\left( \left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right)
+$. But the latter set is a $1$-element set (by the induction hypothesis) and
+contains $b$; hence, $b=a_{u}+d$. Similarly, $c=a_{t}+d$. Now, Lemma
+\ref{lem.ind.gen-com.Sums-lem} (applied to $a=a_{t}$, $b=a_{u}$ and $c=d$)
+yields $a_{t}+\left( a_{u}+d\right) =a_{u}+\left( a_{t}+d\right) $. In
+view of $b=a_{u}+d$ and $c=a_{t}+d$, this rewrites as $a_{t}+b=a_{u}+c$. In
+other words, $x=y$.
+
+Thus, $x=y$ holds in either case. We have now shown that the set
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $ is
+nonempty and that any two of its elements are equal. In other words, it is a
+$1$-element set. This completes the induction step; thus, Theorem
+\ref{thm.ind.gen-com.Sums1} is proven.
+\end{proof}
+
+Theorem \ref{thm.ind.gen-com.Sums1} allows us to make the following definition:
+
+\begin{definition}
+\label{def.ind.gen-com.defsum2}If $S$ is a finite set, and if $\left(
+a_{s}\right) _{s\in S}$ is an $\mathbb{A}$-valued $S$-family, then we define
+$\sum_{s\in S}a_{s}$ to be the unique element of the $1$-element set
+$\operatorname*{Sums}\left( \left( a_{s}\right) _{s\in S}\right) $.
+\end{definition}
+
+\begin{lemma}
+\label{lem.ind.gen-com.same-sum}Let $S$ be a finite set, and let $\left(
+a_{s}\right) _{s\in S}$ be an $\mathbb{A}$-valued $S$-family. Here, sums are
+understood according to Definition \ref{def.ind.gen-com.defsum2}.
+
+\textbf{(a)} If $\left\vert S\right\vert =0$, then $\sum_{s\in S}a_{s}=0$.
+
+\textbf{(b)} If $t\in S$, then $\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in
+S\setminus\left\{ t\right\} }a_{s}$.
+\end{lemma}
+
+\begin{proof}
+[Proof of Lemma \ref{lem.ind.gen-com.same-sum}.]\textbf{(a)} If $\left\vert
+S\right\vert =0$, then $\operatorname*{Sums}\left( \left( a_{s}\right)
+_{s\in S}\right) =\left\{ 0\right\} $ (by Definition
+\ref{def.ind.gen-com.Sums}). Hence, the unique element of this set is $0$; in
+other words, $\sum_{s\in S}a_{s}=0$. This proves Lemma
+\ref{lem.ind.gen-com.same-sum} \textbf{(a)}.
+
+\textbf{(b)} Let $t\in S$. The number $\sum_{s\in S\setminus\left\{
+t\right\} }a_{s}$ is the unique element of $\operatorname*{Sums}\left(
+\left( a_{s}\right) _{s\in S\setminus\left\{ t\right\} }\right) $ (by
+Definition \ref{def.ind.gen-com.defsum2}). Hence,
+(\ref{eq.def.ind.gen-com.Sums.rec}) shows that $a_{t}+\sum_{s\in
+S\setminus\left\{ t\right\} }a_{s}\in\operatorname*{Sums}\left( \left(
+a_{s}\right) _{s\in S}\right) $. But the unique element of the latter set
+is $\sum_{s\in S}a_{s}$ (by Definition \ref{def.ind.gen-com.defsum2}). Thus,
+$\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}$.
+This proves Lemma \ref{lem.ind.gen-com.same-sum} \textbf{(b)}.
+\end{proof}
+
+We can now state the result that legitimizes Definition
+\ref{def.ind.gen-com.defsum1}:
+
+\begin{theorem}
+\label{thm.ind.gen-com.wd}\textbf{(a)} Definition
+\ref{def.ind.gen-com.defsum1} is legitimate (i.e., the value of $\sum_{s\in
+S}a_{s}$ constructed in it does not depend on the choice of $t$).
+
+\textbf{(b)} Definition \ref{def.ind.gen-com.defsum1} is equivalent to
+Definition \ref{def.ind.gen-com.defsum2}.
+\end{theorem}
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.gen-com.wd}.]For the duration of this proof,
+we understand $\sum_{s\in S}a_{s}$ according to Definition
+\ref{def.ind.gen-com.defsum2}. Lemma \ref{lem.ind.gen-com.same-sum} then
+shows that this sum can be computed by the following recursive algorithm:
+
+\begin{itemize}
+\item If $\left\vert S\right\vert =0$, then $\sum_{s\in S}a_{s}=0$. (This
+follows from Lemma \ref{lem.ind.gen-com.same-sum} \textbf{(a)}.)
+
+\item Let $n\in\mathbb{N}$. Assume that $\sum_{s\in S}a_{s}$ has already
+been computed for every finite set $S$ with $\left\vert S\right\vert =n$ and
+any $\mathbb{A}$-valued $S$-family $\left( a_{s}\right) _{s\in S}$. Now,
+let $S$ be a finite set with $\left\vert S\right\vert =n+1$, and let
+$\left( a_{s}\right) _{s\in S}$ be an $\mathbb{A}$-valued $S$-family. We
+have $\left\vert S\right\vert =n+1>0$, so that the set $S$ is nonempty. Fix
+any $t\in S$. (Such a $t$ exists, since the set $S$ is nonempty.) We have
+$\left\vert S\setminus\left\{ t\right\} \right\vert =\left\vert
+S\right\vert -1=n$ (since $\left\vert S\right\vert =n+1$), so that we can
+assume (because we are using recursion) that $\sum_{s\in S\setminus\left\{
+t\right\} }a_{s}$ has already been
+computed.
Then, $\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ +t\right\} }a_{s}$. (This follows from Lemma \ref{lem.ind.gen-com.same-sum} +\textbf{(b)}.) +\end{itemize} + +We can restate this algorithm as an alternative definition of $\sum_{s\in +S}a_{s}$; it then takes the following form: + +\begin{statement} +\textit{Alternative definition of }$\sum_{s\in S}a_{s}$\textit{ for any finite +set }$S$ \textit{and any }$\mathbb{A}$\textit{-valued }$S$\textit{-family +}$\left( a_{s}\right) _{s\in S}$\textit{:} If $S$ is a finite set, and if +$\left( a_{s}\right) _{s\in S}$ is an $\mathbb{A}$-valued $S$-family, then +we define $\sum_{s\in S}a_{s}$ by recursion on $\left\vert S\right\vert $ as follows: + +\begin{itemize} +\item If $\left\vert S\right\vert =0$, then $\sum_{s\in S}a_{s}$ is defined to +be $0$. + +\item Let $n\in\mathbb{N}$. Assume that we have defined $\sum_{s\in S}a_{s}$ +for every finite set $S$ with $\left\vert S\right\vert =n$ and any +$\mathbb{A}$-valued $S$-family $\left( a_{s}\right) _{s\in S}$. Now, if $S$ +is a finite set with $\left\vert S\right\vert =n+1$, and if $\left( +a_{s}\right) _{s\in S}$ is any $\mathbb{A}$-valued $S$-family, then +$\sum_{s\in S}a_{s}$ is defined by picking any $t\in S$ and setting% +\begin{equation} +\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. +\label{pf.thm.ind.gen-com.wd.ad.t}% +\end{equation} + +\end{itemize} +\end{statement} + +This alternative definition of $\sum_{s\in S}a_{s}$ merely follows the above +algorithm for computing $\sum_{s\in S}a_{s}$. Thus, it is guaranteed to always +yield the same value of $\sum_{s\in S}a_{s}$ as Definition +\ref{def.ind.gen-com.defsum2}, independently of the choice of $t$. 
Hence, we +obtain the following: + +\begin{statement} +\textit{Claim 1:} This alternative definition is legitimate (i.e., the value +of $\sum_{s\in S}a_{s}$ in (\ref{pf.thm.ind.gen-com.wd.ad.t}) does not depend +on the choice of $t$), and is equivalent to Definition +\ref{def.ind.gen-com.defsum2}. +\end{statement} + +But on the other hand, this alternative definition is precisely Definition +\ref{def.ind.gen-com.defsum1}. Hence, Claim 1 rewrites as follows: Definition +\ref{def.ind.gen-com.defsum1} is legitimate (i.e., the value of $\sum_{s\in +S}a_{s}$ in Definition \ref{def.ind.gen-com.defsum1} does not depend on the +choice of $t$), and is equivalent to Definition \ref{def.ind.gen-com.defsum2}. +This proves both parts \textbf{(a)} and \textbf{(b)} of Theorem +\ref{thm.ind.gen-com.wd}. +\end{proof} + +Theorem \ref{thm.ind.gen-com.wd} \textbf{(a)} shows that Definition +\ref{def.ind.gen-com.defsum1} is legitimate. + +Thus, at last, we have vindicated the notation $\sum_{s\in S}a_{s}$ that was +introduced in Section \ref{sect.sums-repetitorium} (because the definition of +this notation we gave in Section \ref{sect.sums-repetitorium} was precisely +Definition \ref{def.ind.gen-com.defsum1}). We can now forget about Definition +\ref{def.ind.gen-com.defsum2}, since it has served its purpose (which was to +justify Definition \ref{def.ind.gen-com.defsum1}). (Of course, we could also +forget about Definition \ref{def.ind.gen-com.defsum1} instead, and use +Definition \ref{def.ind.gen-com.defsum2} as our definition of $\sum_{s\in +S}a_{s}$ (after all, these two definitions are equivalent, as we now know). +Then, we would have to replace every reference to the definition of +$\sum_{s\in S}a_{s}$ by a reference to Lemma \ref{lem.ind.gen-com.same-sum}; +in particular, we would have to replace every use of (\ref{eq.sum.def.1}) by a +use of Lemma \ref{lem.ind.gen-com.same-sum} \textbf{(b)}. Other than this, +everything would work the same way.) 
+ +The notation $\sum_{s\in S}a_{s}$ has several properties, many of which were +collected in Section \ref{sect.sums-repetitorium}. We shall prove some of +these properties later in this section. + +From now on, we shall be using all the conventions and notations regarding +sums that we introduced in Section \ref{sect.sums-repetitorium}. In +particular, expressions of the form \textquotedblleft$\sum_{s\in S}a_{s}% ++b$\textquotedblright\ shall always be interpreted as $\left( \sum_{s\in +S}a_{s}\right) +b$, not as $\sum_{s\in S}\left( a_{s}+b\right) $; but +expressions of the form \textquotedblleft$\sum_{s\in S}ba_{s}c$% +\textquotedblright\ shall always be understood to mean $\sum_{s\in S}\left( +ba_{s}c\right) $. + +\subsubsection{Triangular numbers revisited} + +Recall one specific notation we introduced in Section +\ref{sect.sums-repetitorium}: If $u$ and $v$ are two integers, and if $a_{s}$ +is a number for each $s\in\left\{ u,u+1,\ldots,v\right\} $, then $\sum +_{s=u}^{v}a_{s}$ is defined by% +\[ +\sum_{s=u}^{v}a_{s}=\sum_{s\in\left\{ u,u+1,\ldots,v\right\} }a_{s}. +\] +This sum $\sum_{s=u}^{v}a_{s}$ is also denoted by $a_{u}+a_{u+1}+\cdots+a_{v}$. + +We are now ready to do something that we evaded in Section +\ref{sect.ind.trinum}: namely, to speak of the sum of the first $n$ positive +integers without having to define it recursively. Indeed, we can now interpret +this sum as $\sum_{i\in\left\{ 1,2,\ldots,n\right\} }i$, an expression which +has a well-defined meaning because we have shown that the notation $\sum_{s\in +S}a_{s}$ is well-defined. We can also rewrite this expression as $\sum +_{i=1}^{n}i$ or as $1+2+\cdots+n$. 
+ +Thus, the classical fact that the sum of the first $n$ positive integers is +$\dfrac{n\left( n+1\right) }{2}$ can now be stated as follows: + +\begin{proposition} +\label{prop.ind.gen-com.n(n+1)/2}We have% +\begin{equation} +\sum_{i\in\left\{ 1,2,\ldots,n\right\} }i=\dfrac{n\left( n+1\right) }% +{2}\ \ \ \ \ \ \ \ \ \ \text{for each }n\in\mathbb{N}. +\label{eq.prop.ind.gen-com.n(n+1)/2.claim}% +\end{equation} + +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.gen-com.n(n+1)/2}.]We shall prove +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) by induction on $n$: + +\textit{Induction base:} We have $\left\{ 1,2,\ldots,0\right\} =\varnothing$ +and thus $\left\vert \left\{ 1,2,\ldots,0\right\} \right\vert =\left\vert +\varnothing\right\vert =0$. Hence, the definition of $\sum_{i\in\left\{ +1,2,\ldots,0\right\} }i$ yields% +\begin{equation} +\sum_{i\in\left\{ 1,2,\ldots,0\right\} }i=0. +\label{pf.prop.ind.gen-com.n(n+1)/2.IB.1}% +\end{equation} +(To be more precise, we have used the first bullet point of Definition +\ref{def.ind.gen-com.defsum1} here, which says that $\sum_{s\in S}a_{s}=0$ +whenever the set $S$ and the $\mathbb{A}$-valued $S$-family $\left( +a_{s}\right) _{s\in S}$ satisfy $\left\vert S\right\vert =0$. If you are +using Definition \ref{def.ind.gen-com.defsum2} instead of Definition +\ref{def.ind.gen-com.defsum1}, you should instead be using Lemma +\ref{lem.ind.gen-com.same-sum} \textbf{(a)} to argue this.) + +Comparing (\ref{pf.prop.ind.gen-com.n(n+1)/2.IB.1}) with $\dfrac{0\left( +0+1\right) }{2}=0$, we obtain $\sum_{i\in\left\{ 1,2,\ldots,0\right\} +}i=\dfrac{0\left( 0+1\right) }{2}$. In other words, +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) holds for $n=0$. This completes the +induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) holds for $n=m$. We must prove that +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) holds for $n=m+1$. 
+ +We have assumed that (\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) holds for +$n=m$. In other words, we have +\begin{equation} +\sum_{i\in\left\{ 1,2,\ldots,m\right\} }i=\dfrac{m\left( m+1\right) }{2}. +\label{pf.prop.ind.gen-com.n(n+1)/2.IH}% +\end{equation} + + +Now, $\left\vert \left\{ 1,2,\ldots,m+1\right\} \right\vert =m+1$ and +$m+1\in\left\{ 1,2,\ldots,m+1\right\} $ (since $m+1$ is a positive integer +(since $m\in\mathbb{N}$)). Hence, (\ref{eq.sum.def.1}) (applied to $n=m$, +$S=\left\{ 1,2,\ldots,m+1\right\} $, $t=m+1$ and $\left( a_{s}\right) +_{s\in S}=\left( i\right) _{i\in\left\{ 1,2,\ldots,m+1\right\} }$) yields% +\begin{equation} +\sum_{i\in\left\{ 1,2,\ldots,m+1\right\} }i=\left( m+1\right) +\sum +_{i\in\left\{ 1,2,\ldots,m+1\right\} \setminus\left\{ m+1\right\} }i. +\label{pf.prop.ind.gen-com.n(n+1)/2.3}% +\end{equation} +(Here, we have relied on the equality (\ref{eq.sum.def.1}), which appears +verbatim in Definition \ref{def.ind.gen-com.defsum1}. If you are using +Definition \ref{def.ind.gen-com.defsum2} instead of Definition +\ref{def.ind.gen-com.defsum1}, you should instead be using Lemma +\ref{lem.ind.gen-com.same-sum} \textbf{(b)} to argue this.) 
+ +Now, (\ref{pf.prop.ind.gen-com.n(n+1)/2.3}) becomes% +\begin{align*} +\sum_{i\in\left\{ 1,2,\ldots,m+1\right\} }i & =\left( m+1\right) ++\sum_{i\in\left\{ 1,2,\ldots,m+1\right\} \setminus\left\{ m+1\right\} +}i=\left( m+1\right) +\underbrace{\sum_{i\in\left\{ 1,2,\ldots,m\right\} +}i}_{\substack{=\dfrac{m\left( m+1\right) }{2}\\\text{(by +(\ref{pf.prop.ind.gen-com.n(n+1)/2.IH}))}}}\\ +& \ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ 1,2,\ldots,m+1\right\} +\setminus\left\{ m+1\right\} =\left\{ 1,2,\ldots,m\right\} \right) \\ +& =\left( m+1\right) +\dfrac{m\left( m+1\right) }{2}=\dfrac{2\left( +m+1\right) +m\left( m+1\right) }{2}\\ +& =\dfrac{\left( m+1\right) \left( \left( m+1\right) +1\right) }{2}% +\end{align*} +(since $2\left( m+1\right) +m\left( m+1\right) =\left( m+1\right) +\left( \left( m+1\right) +1\right) $). In other words, +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) holds for $n=m+1$. This completes +the induction step. Thus, the induction proof of +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) is finished. Hence, Proposition +\ref{prop.ind.gen-com.n(n+1)/2} holds. +\end{proof} + +\subsubsection{Sums of a few numbers} + +Merely for the sake of future convenience, let us restate (\ref{eq.sum.def.1}) +in a slightly more direct way (without mentioning $\left\vert S\right\vert $): + +\begin{proposition} +\label{prop.ind.gen-com.split-off}Let $S$ be a finite set, and let $\left( +a_{s}\right) _{s\in S}$ be an $\mathbb{A}$-valued $S$-family. Let $t\in S$. +Then,% +\[ +\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. +\] + +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.gen-com.split-off}.]Let $n=\left\vert +S\setminus\left\{ t\right\} \right\vert $; thus, $n\in\mathbb{N}$ (since +$S\setminus\left\{ t\right\} $ is a finite set). Also, $n=\left\vert +S\setminus\left\{ t\right\} \right\vert =\left\vert S\right\vert -1$ (since +$t\in S$), and thus $\left\vert S\right\vert =n+1$. 
Hence, (\ref{eq.sum.def.1}% +) yields $\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} +}a_{s}$. This proves Proposition \ref{prop.ind.gen-com.split-off}. +\end{proof} + +(Alternatively, we can argue that Proposition \ref{prop.ind.gen-com.split-off} +is the same as Lemma \ref{lem.ind.gen-com.same-sum} \textbf{(b)}, except that +we are now using Definition \ref{def.ind.gen-com.defsum1} instead of +Definition \ref{def.ind.gen-com.defsum2} to define the sums involved -- but +this difference is insubstantial, since we have shown that these two +definitions are equivalent.) + +\begin{noncompile} +(Proof of Proposition \ref{prop.ind.gen-com.split-off}.) If we use Definition +\ref{def.ind.gen-com.defsum2} instead of Definition +\ref{def.ind.gen-com.defsum1} (for defining sums), then the claim of +Proposition \ref{prop.ind.gen-com.split-off} becomes precisely the claim of +Lemma \ref{lem.ind.gen-com.same-sum} \textbf{(b)}. Since we know that +Definition \ref{def.ind.gen-com.defsum2} is equivalent to Definition +\ref{def.ind.gen-com.defsum1} (according to Theorem \ref{thm.ind.gen-com.wd} +\textbf{(b)}), we thus conclude that Proposition +\ref{prop.ind.gen-com.split-off} is equivalent to the claim of Lemma +\ref{lem.ind.gen-com.same-sum} \textbf{(b)}. Thus, Proposition +\ref{prop.ind.gen-com.split-off} holds (since Lemma +\ref{lem.ind.gen-com.same-sum} \textbf{(b)} holds). +\end{noncompile} + +In Section \ref{sect.sums-repetitorium}, we have introduced $a_{u}% ++a_{u+1}+\cdots+a_{v}$ as an abbreviation for the sum $\sum_{s=u}^{v}% +a_{s}=\sum_{s\in\left\{ u,u+1,\ldots,v\right\} }a_{s}$ (whenever $u$ and $v$ +are two integers, and $a_{s}$ is a number for each $s\in\left\{ +u,u+1,\ldots,v\right\} $). 
In order to ensure that this abbreviation does not +create any nasty surprises, we need to check that it behaves as we would +expect -- i.e., that it satisfies the following four properties: + +\begin{itemize} +\item If the sum $a_{u}+a_{u+1}+\cdots+a_{v}$ has no addends (i.e., if $u>v$), +then it equals $0$. + +\item If the sum $a_{u}+a_{u+1}+\cdots+a_{v}$ has exactly one addend (i.e., if +$u=v$), then it equals $a_{u}$. + +\item If the sum $a_{u}+a_{u+1}+\cdots+a_{v}$ has exactly two addends (i.e., +if $u=v-1$), then it equals $a_{u}+a_{v}$. + +\item If $v\geq u$, then +\begin{align*} +a_{u}+a_{u+1}+\cdots+a_{v} & =\left( a_{u}+a_{u+1}+\cdots+a_{v-1}\right) ++a_{v}\\ +& =a_{u}+\left( a_{u+1}+a_{u+2}+\cdots+a_{v}\right) . +\end{align*} + +\end{itemize} + +The first of these four properties follows from the definition (indeed, if +$u>v$, then the set $\left\{ u,u+1,\ldots,v\right\} $ is empty and thus +satisfies $\left\vert \left\{ u,u+1,\ldots,v\right\} \right\vert =0$; but +this yields $\sum_{s\in\left\{ u,u+1,\ldots,v\right\} }a_{s}=0$). The fourth +of these four properties can easily be obtained from Proposition +\ref{prop.ind.gen-com.split-off}\footnote{In more detail: Assume that $v\geq +u$. Thus, both $u$ and $v$ belong to the set $\left\{ u,u+1,\ldots,v\right\} +$. Hence, Proposition \ref{prop.ind.gen-com.split-off} (applied to $S=\left\{ +u,u+1,\ldots,v\right\} $ and $t=v$) yields $\sum_{s\in\left\{ u,u+1,\ldots +,v\right\} }a_{s}=a_{v}+\sum_{s\in\left\{ u,u+1,\ldots,v\right\} +\setminus\left\{ v\right\} }a_{s}$. 
Thus,% +\begin{align*} +& a_{u}+a_{u+1}+\cdots+a_{v}\\ +& =\sum_{s\in\left\{ u,u+1,\ldots,v\right\} }a_{s}=a_{v}+\sum_{s\in\left\{ +u,u+1,\ldots,v\right\} \setminus\left\{ v\right\} }a_{s}\\ +& =a_{v}+\sum_{s\in\left\{ u,u+1,\ldots,v-1\right\} }a_{s}% +\ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ u,u+1,\ldots,v\right\} +\setminus\left\{ v\right\} =\left\{ u,u+1,\ldots,v-1\right\} \right) \\ +& =a_{v}+\left( a_{u}+a_{u+1}+\cdots+a_{v-1}\right) =\left( a_{u}% ++a_{u+1}+\cdots+a_{v-1}\right) +a_{v}. +\end{align*} +\par +Also, Proposition \ref{prop.ind.gen-com.split-off} (applied to $S=\left\{ +u,u+1,\ldots,v\right\} $ and $t=u$) yields $\sum_{s\in\left\{ u,u+1,\ldots +,v\right\} }a_{s}=a_{u}+\sum_{s\in\left\{ u,u+1,\ldots,v\right\} +\setminus\left\{ u\right\} }a_{s}$. Thus,% +\begin{align*} +& a_{u}+a_{u+1}+\cdots+a_{v}\\ +& =\sum_{s\in\left\{ u,u+1,\ldots,v\right\} }a_{s}=a_{u}+\sum_{s\in\left\{ +u,u+1,\ldots,v\right\} \setminus\left\{ u\right\} }a_{s}\\ +& =a_{u}+\sum_{s\in\left\{ u+1,u+2,\ldots,v\right\} }a_{s}% +\ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ u,u+1,\ldots,v\right\} +\setminus\left\{ u\right\} =\left\{ u+1,u+2,\ldots,v\right\} \right) \\ +& =a_{u}+\left( a_{u+1}+a_{u+2}+\cdots+a_{v}\right) . +\end{align*} +Hence,% +\[ +a_{u}+a_{u+1}+\cdots+a_{v}=\left( a_{u}+a_{u+1}+\cdots+a_{v-1}\right) ++a_{v}=a_{u}+\left( a_{u+1}+a_{u+2}+\cdots+a_{v}\right) . +\] +}. The second and third properties follow from the following fact: + +\begin{proposition} +\label{prop.ind.gen-com.sum12}Let $S$ be a finite set. For every $s\in S$, let +$a_{s}$ be an element of $\mathbb{A}$. + +\textbf{(a)} If $S=\left\{ p\right\} $ for some element $p$, then this $p$ +satisfies% +\[ +\sum_{s\in S}a_{s}=a_{p}. +\] + + +\textbf{(b)} If $S=\left\{ p,q\right\} $ for two distinct elements $p$ and +$q$, then these $p$ and $q$ satisfy +\[ +\sum_{s\in S}a_{s}=a_{p}+a_{q}. 
+\] + +\end{proposition} + +\begin{proof} +[Proof of Proposition \ref{prop.ind.gen-com.sum12}.]\textbf{(a)} Assume that +$S=\left\{ p\right\} $ for some element $p$. Consider this $p$. + +The first bullet point of Definition \ref{def.ind.gen-com.defsum1} shows that +$\sum_{s\in\varnothing}a_{s}=0$ (since $\left\vert \varnothing\right\vert +=0$). But $p\in\left\{ p\right\} =S$. Hence, Proposition +\ref{prop.ind.gen-com.split-off} (applied to $t=p$) yields +\begin{align*} +\sum_{s\in S}a_{s} & =a_{p}+\sum_{s\in S\setminus\left\{ p\right\} }% +a_{s}=a_{p}+\underbrace{\sum_{s\in\varnothing}a_{s}}_{=0}% +\ \ \ \ \ \ \ \ \ \ \left( \text{since }\underbrace{S}_{=\left\{ p\right\} +}\setminus\left\{ p\right\} =\left\{ p\right\} \setminus\left\{ +p\right\} =\varnothing\right) \\ +& =a_{p}+0=a_{p}. +\end{align*} +This proves Proposition \ref{prop.ind.gen-com.sum12} \textbf{(a)}. + +\textbf{(b)} Assume that $S=\left\{ p,q\right\} $ for two distinct elements +$p$ and $q$. Consider these $p$ and $q$. Thus, $q\neq p$ (since $p$ and $q$ +are distinct), so that $q\notin\left\{ p\right\} $. + +Proposition \ref{prop.ind.gen-com.sum12} \textbf{(a)} (applied to $\left\{ +p\right\} $ instead of $S$) yields $\sum_{s\in\left\{ p\right\} }% +a_{s}=a_{p}$ (since $\left\{ p\right\} =\left\{ p\right\} $). + +We have $\underbrace{S}_{=\left\{ p,q\right\} =\left\{ p\right\} +\cup\left\{ q\right\} }\setminus\left\{ q\right\} =\left( \left\{ +p\right\} \cup\left\{ q\right\} \right) \setminus\left\{ q\right\} +=\left\{ p\right\} \setminus\left\{ q\right\} =\left\{ p\right\} $ +(since $q\notin\left\{ p\right\} $). Also, $q\in\left\{ p,q\right\} =S$. 
+Hence, Proposition \ref{prop.ind.gen-com.split-off} (applied to $t=q$) yields +\begin{align*} +\sum_{s\in S}a_{s} & =a_{q}+\sum_{s\in S\setminus\left\{ q\right\} }% +a_{s}=a_{q}+\underbrace{\sum_{s\in\left\{ p\right\} }a_{s}}_{=a_{p}\text{ }% +}\ \ \ \ \ \ \ \ \ \ \left( \text{since }S\setminus\left\{ q\right\} +=\left\{ p\right\} \right) \\ +& =a_{q}+a_{p}=a_{p}+a_{q}. +\end{align*} +This proves Proposition \ref{prop.ind.gen-com.sum12} \textbf{(b)}. +\end{proof} + +\subsubsection{Linearity of sums} + +We shall now prove some general properties of finite sums. We begin with the +equality (\ref{eq.sum.linear1}) from Section \ref{sect.sums-repetitorium}: + +\begin{theorem} +\label{thm.ind.gen-com.sum(a+b)}Let $S$ be a finite set. For every $s\in S$, +let $a_{s}$ and $b_{s}$ be elements of $\mathbb{A}$. Then,% +\[ +\sum_{s\in S}\left( a_{s}+b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in +S}b_{s}. +\] + +\end{theorem} + +Before we prove this theorem, let us show a simple lemma: + +\begin{lemma} +\label{lem.ind.gen-com.xyuv}Let $x$, $y$, $u$ and $v$ be four numbers (i.e., +elements of $\mathbb{A}$). Then,% +\[ +\left( x+y\right) +\left( u+v\right) =\left( x+u\right) +\left( +y+v\right) . +\] + +\end{lemma} + +\begin{proof} +[Proof of Lemma \ref{lem.ind.gen-com.xyuv}.]Proposition +\ref{prop.ind.gen-com.fgh} (applied to $a=y$, $b=u$ and $c=v$) yields% +\begin{equation} +\left( y+u\right) +v=y+\left( u+v\right) . +\label{pf.lem.ind.gen-com.xyuv.1}% +\end{equation} +Also, Proposition \ref{prop.ind.gen-com.fgh} (applied to $a=x$, $b=y$ and +$c=u+v$) yields% +\begin{align} +\left( x+y\right) +\left( u+v\right) & =x+\underbrace{\left( y+\left( +u+v\right) \right) }_{\substack{=\left( y+u\right) +v\\\text{(by +(\ref{pf.lem.ind.gen-com.xyuv.1}))}}}\nonumber\\ +& =x+\left( \left( y+u\right) +v\right) . 
+\label{pf.lem.ind.gen-com.xyuv.2}% +\end{align} +The same argument (with $y$ and $u$ replaced by $u$ and $y$) yields% +\begin{equation} +\left( x+u\right) +\left( y+v\right) =x+\left( \left( u+y\right) ++v\right) . \label{pf.lem.ind.gen-com.xyuv.3}% +\end{equation} +But Proposition \ref{prop.ind.gen-com.fg} (applied to $a=y$ and $b=u$) yields +$y+u=u+y$. Thus, (\ref{pf.lem.ind.gen-com.xyuv.2}) becomes% +\[ +\left( x+y\right) +\left( u+v\right) =x+\left( \underbrace{\left( +y+u\right) }_{=u+y}+v\right) =x+\left( \left( u+y\right) +v\right) +=\left( x+u\right) +\left( y+v\right) +\] +(by (\ref{pf.lem.ind.gen-com.xyuv.3})). This proves Lemma +\ref{lem.ind.gen-com.xyuv}. +\end{proof} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.sum(a+b)}.]Forget that we fixed $S$, +$a_{s}$ and $b_{s}$. We shall prove Theorem \ref{thm.ind.gen-com.sum(a+b)} by +induction on $\left\vert S\right\vert $: + +\textit{Induction base:} Theorem \ref{thm.ind.gen-com.sum(a+b)} holds under +the condition that $\left\vert S\right\vert =0$% +\ \ \ \ \footnote{\textit{Proof.} Let $S$, $a_{s}$ and $b_{s}$ be as in +Theorem \ref{thm.ind.gen-com.sum(a+b)}. Assume that $\left\vert S\right\vert +=0$. Thus, the first bullet point of Definition \ref{def.ind.gen-com.defsum1} +yields $\sum_{s\in S}a_{s}=0$ and $\sum_{s\in S}b_{s}=0$ and $\sum_{s\in +S}\left( a_{s}+b_{s}\right) =0$. Hence,% +\[ +\sum_{s\in S}\left( a_{s}+b_{s}\right) =0=\underbrace{0}_{=\sum_{s\in +S}a_{s}}+\underbrace{0}_{=\sum_{s\in S}b_{s}}=\sum_{s\in S}a_{s}+\sum_{s\in +S}b_{s}. +\] +\par +Now, forget that we fixed $S$, $a_{s}$ and $b_{s}$. We thus have proved that +if $S$, $a_{s}$ and $b_{s}$ are as in Theorem \ref{thm.ind.gen-com.sum(a+b)}, +and if $\left\vert S\right\vert =0$, then $\sum_{s\in S}\left( a_{s}% ++b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in S}b_{s}$. In other words, +Theorem \ref{thm.ind.gen-com.sum(a+b)} holds under the condition that +$\left\vert S\right\vert =0$. Qed.}. This completes the induction base. 
+ +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem +\ref{thm.ind.gen-com.sum(a+b)} holds under the condition that $\left\vert +S\right\vert =m$. We must now prove that Theorem +\ref{thm.ind.gen-com.sum(a+b)} holds under the condition that $\left\vert +S\right\vert =m+1$. + +We have assumed that Theorem \ref{thm.ind.gen-com.sum(a+b)} holds under the +condition that $\left\vert S\right\vert =m$. In other words, the following +claim holds: + +\begin{statement} +\textit{Claim 1:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m$. For every $s\in S$, let $a_{s}$ and $b_{s}$ be elements of $\mathbb{A}$. +Then,% +\[ +\sum_{s\in S}\left( a_{s}+b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in +S}b_{s}. +\] + +\end{statement} + +Next, we shall show the following claim: + +\begin{statement} +\textit{Claim 2:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m+1$. For every $s\in S$, let $a_{s}$ and $b_{s}$ be elements of $\mathbb{A}% +$. Then,% +\[ +\sum_{s\in S}\left( a_{s}+b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in +S}b_{s}. +\] + +\end{statement} + +[\textit{Proof of Claim 2:} We have $\left\vert S\right\vert =m+1>m\geq0$. +Hence, the set $S$ is nonempty. Thus, there exists some $t\in S$. Consider +this $t$. + +From $t\in S$, we obtain $\left\vert S\setminus\left\{ t\right\} \right\vert +=\left\vert S\right\vert -1=m$ (since $\left\vert S\right\vert =m+1$). Hence, +Claim 1 (applied to $S\setminus\left\{ t\right\} $ instead of $S$) yields% +\begin{equation} +\sum_{s\in S\setminus\left\{ t\right\} }\left( a_{s}+b_{s}\right) +=\sum_{s\in S\setminus\left\{ t\right\} }a_{s}+\sum_{s\in S\setminus\left\{ +t\right\} }b_{s}. \label{pf.thm.ind.gen-com.sum(a+b).c2.pf.1}% +\end{equation} + + +Now, Proposition \ref{prop.ind.gen-com.split-off} yields% +\begin{equation} +\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. 
+\label{pf.thm.ind.gen-com.sum(a+b).c2.pf.a}% +\end{equation} +Also, Proposition \ref{prop.ind.gen-com.split-off} (applied to $b_{s}$ instead +of $a_{s}$) yields% +\begin{equation} +\sum_{s\in S}b_{s}=b_{t}+\sum_{s\in S\setminus\left\{ t\right\} }b_{s}. +\label{pf.thm.ind.gen-com.sum(a+b).c2.pf.b}% +\end{equation} +Finally, Proposition \ref{prop.ind.gen-com.split-off} (applied to $a_{s}% ++b_{s}$ instead of $a_{s}$) yields% +\begin{align*} +\sum_{s\in S}\left( a_{s}+b_{s}\right) & =\left( a_{t}+b_{t}\right) ++\underbrace{\sum_{s\in S\setminus\left\{ t\right\} }\left( a_{s}% ++b_{s}\right) }_{\substack{=\sum_{s\in S\setminus\left\{ t\right\} }% +a_{s}+\sum_{s\in S\setminus\left\{ t\right\} }b_{s}\\\text{(by +(\ref{pf.thm.ind.gen-com.sum(a+b).c2.pf.1}))}}}\\ +& =\left( a_{t}+b_{t}\right) +\left( \sum_{s\in S\setminus\left\{ +t\right\} }a_{s}+\sum_{s\in S\setminus\left\{ t\right\} }b_{s}\right) \\ +& =\underbrace{\left( a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }% +a_{s}\right) }_{\substack{=\sum_{s\in S}a_{s}\\\text{(by +(\ref{pf.thm.ind.gen-com.sum(a+b).c2.pf.a}))}}}+\underbrace{\left( b_{t}% ++\sum_{s\in S\setminus\left\{ t\right\} }b_{s}\right) }_{\substack{=\sum +_{s\in S}b_{s}\\\text{(by (\ref{pf.thm.ind.gen-com.sum(a+b).c2.pf.b}))}}}\\ +& \ \ \ \ \ \ \ \ \ \ \left( +\begin{array} +[c]{c}% +\text{by Lemma \ref{lem.ind.gen-com.xyuv} (applied}\\ +\text{to }x=a_{t}\text{, }y=b_{t}\text{, }u=\sum_{s\in S\setminus\left\{ +t\right\} }a_{s}\text{ and }v=\sum_{s\in S\setminus\left\{ t\right\} }% +b_{s}\text{)}% +\end{array} +\right) \\ +& =\sum_{s\in S}a_{s}+\sum_{s\in S}b_{s}. +\end{align*} +This proves Claim 2.] + +But Claim 2 says precisely that Theorem \ref{thm.ind.gen-com.sum(a+b)} holds +under the condition that $\left\vert S\right\vert =m+1$. Hence, we conclude +that Theorem \ref{thm.ind.gen-com.sum(a+b)} holds under the condition that +$\left\vert S\right\vert =m+1$ (since Claim 2 is proven). This completes the +induction step. 
Thus, Theorem \ref{thm.ind.gen-com.sum(a+b)} is proven by induction. +\end{proof} + +We shall next prove (\ref{eq.sum.linear2}): + +\begin{theorem} +\label{thm.ind.gen-com.sum(la)}Let $S$ be a finite set. For every $s\in S$, +let $a_{s}$ be an element of $\mathbb{A}$. Also, let $\lambda$ be an element +of $\mathbb{A}$. Then,% +\[ +\sum_{s\in S}\lambda a_{s}=\lambda\sum_{s\in S}a_{s}. +\] + +\end{theorem} + +To prove this theorem, we need the following fundamental fact of arithmetic: + +\begin{proposition} +\label{prop.ind.gen-com.distr}Let $x$, $y$ and $z$ be three numbers (i.e., +elements of $\mathbb{A}$). Then, $x\left( y+z\right) =xy+xz$. +\end{proposition} + +Proposition \ref{prop.ind.gen-com.distr} is known as the +\textit{distributivity} (or \textit{left distributivity}) in $\mathbb{A}$. It +is a fundamental result, and its proof can be found in standard +textbooks\footnote{For example, Proposition \ref{prop.ind.gen-com.distr} is +proven in \cite[Theorem 3.2.3 (6)]{Swanso18} for the case when $\mathbb{A}% +=\mathbb{N}$; in \cite[Theorem 3.5.4 (6)]{Swanso18} for the case when +$\mathbb{A}=\mathbb{Z}$; in \cite[Theorem 3.6.4 (6)]{Swanso18} for the case +when $\mathbb{A}=\mathbb{Q}$; in \cite[Theorem 3.7.13]{Swanso18} for the case +when $\mathbb{A}=\mathbb{R}$; in \cite[Theorem 3.9.3]{Swanso18} for the case +when $\mathbb{A}=\mathbb{C}$.}. + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.sum(la)}.]Forget that we fixed $S$, +$a_{s}$ and $\lambda$. We shall prove Theorem \ref{thm.ind.gen-com.sum(la)} by +induction on $\left\vert S\right\vert $: + +\begin{vershort} +\textit{Induction base:} The induction base (i.e., proving that Theorem +\ref{thm.ind.gen-com.sum(la)} holds under the condition that $\left\vert +S\right\vert =0$) is similar to the induction base in the proof of Theorem +\ref{thm.ind.gen-com.sum(a+b)} above; we thus leave it to the reader. 
+\end{vershort} + +\begin{verlong} +\textit{Induction base:} Theorem \ref{thm.ind.gen-com.sum(la)} holds under the +condition that $\left\vert S\right\vert =0$\ \ \ \ \footnote{\textit{Proof.} +Let $S$, $a_{s}$ and $\lambda$ be as in Theorem \ref{thm.ind.gen-com.sum(la)}. +Assume that $\left\vert S\right\vert =0$. Thus, the first bullet point of +Definition \ref{def.ind.gen-com.defsum1} yields $\sum_{s\in S}a_{s}=0$ and +$\sum_{s\in S}\lambda a_{s}=0$. Hence,% +\[ +\sum_{s\in S}\lambda a_{s}=0=\lambda\underbrace{0}_{=\sum_{s\in S}a_{s}% +}=\lambda\sum_{s\in S}a_{s}. +\] +\par +Now, forget that we fixed $S$, $a_{s}$ and $\lambda$. We thus have proved that +if $S$, $a_{s}$ and $\lambda$ are as in Theorem \ref{thm.ind.gen-com.sum(la)}, +and if $\left\vert S\right\vert =0$, then $\sum_{s\in S}\lambda a_{s}% +=\lambda\sum_{s\in S}a_{s}$. In other words, Theorem +\ref{thm.ind.gen-com.sum(la)} holds under the condition that $\left\vert +S\right\vert =0$. Qed.}. This completes the induction base. +\end{verlong} + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem +\ref{thm.ind.gen-com.sum(la)} holds under the condition that $\left\vert +S\right\vert =m$. We must now prove that Theorem \ref{thm.ind.gen-com.sum(la)} +holds under the condition that $\left\vert S\right\vert =m+1$. + +We have assumed that Theorem \ref{thm.ind.gen-com.sum(la)} holds under the +condition that $\left\vert S\right\vert =m$. In other words, the following +claim holds: + +\begin{statement} +\textit{Claim 1:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m$. For every $s\in S$, let $a_{s}$ be an element of $\mathbb{A}$. Also, let +$\lambda$ be an element of $\mathbb{A}$. Then,% +\[ +\sum_{s\in S}\lambda a_{s}=\lambda\sum_{s\in S}a_{s}. +\] + +\end{statement} + +Next, we shall show the following claim: + +\begin{statement} +\textit{Claim 2:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m+1$. For every $s\in S$, let $a_{s}$ be an element of $\mathbb{A}$. 
Also, +let $\lambda$ be an element of $\mathbb{A}$. Then,% +\[ +\sum_{s\in S}\lambda a_{s}=\lambda\sum_{s\in S}a_{s}. +\] + +\end{statement} + +[\textit{Proof of Claim 2:} We have $\left\vert S\right\vert =m+1>m\geq0$. +Hence, the set $S$ is nonempty. Thus, there exists some $t\in S$. Consider +this $t$. + +From $t\in S$, we obtain $\left\vert S\setminus\left\{ t\right\} \right\vert +=\left\vert S\right\vert -1=m$ (since $\left\vert S\right\vert =m+1$). Hence, +Claim 1 (applied to $S\setminus\left\{ t\right\} $ instead of $S$) yields% +\begin{equation} +\sum_{s\in S\setminus\left\{ t\right\} }\lambda a_{s}=\lambda\sum_{s\in +S\setminus\left\{ t\right\} }a_{s}. +\label{pf.thm.ind.gen-com.sum(la).c2.pf.1}% +\end{equation} + + +Now, Proposition \ref{prop.ind.gen-com.split-off} yields% +\[ +\sum_{s\in S}a_{s}=a_{t}+\sum_{s\in S\setminus\left\{ t\right\} }a_{s}. +\] +Multiplying both sides of this equality by $\lambda$, we obtain% +\[ +\lambda\sum_{s\in S}a_{s}=\lambda\left( a_{t}+\sum_{s\in S\setminus\left\{ +t\right\} }a_{s}\right) =\lambda a_{t}+\lambda\sum_{s\in S\setminus\left\{ +t\right\} }a_{s}% +\] +(by Proposition \ref{prop.ind.gen-com.distr} (applied to $x=\lambda$, +$y=a_{t}$ and $z=\sum_{s\in S\setminus\left\{ t\right\} }a_{s}$)). Also, +Proposition \ref{prop.ind.gen-com.split-off} (applied to $\lambda a_{s}$ +instead of $a_{s}$) yields% +\[ +\sum_{s\in S}\lambda a_{s}=\lambda a_{t}+\underbrace{\sum_{s\in S\setminus +\left\{ t\right\} }\lambda a_{s}}_{\substack{=\lambda\sum_{s\in +S\setminus\left\{ t\right\} }a_{s}\\\text{(by +(\ref{pf.thm.ind.gen-com.sum(la).c2.pf.1}))}}}=\lambda a_{t}+\lambda\sum_{s\in +S\setminus\left\{ t\right\} }a_{s}. +\] +Comparing the preceding two equalities, we find% +\[ +\sum_{s\in S}\lambda a_{s}=\lambda\sum_{s\in S}a_{s}. +\] +This proves Claim 2.] + +But Claim 2 says precisely that Theorem \ref{thm.ind.gen-com.sum(la)} holds +under the condition that $\left\vert S\right\vert =m+1$. 
Hence, we conclude +that Theorem \ref{thm.ind.gen-com.sum(la)} holds under the condition that +$\left\vert S\right\vert =m+1$ (since Claim 2 is proven). This completes the +induction step. Thus, Theorem \ref{thm.ind.gen-com.sum(la)} is proven by induction. +\end{proof} + +Finally, let us prove (\ref{eq.sum.sum0}): + +\begin{theorem} +\label{thm.ind.gen-com.sum(0)}Let $S$ be a finite set. Then, +\[ +\sum_{s\in S}0=0. +\] + +\end{theorem} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.sum(0)}.]It is completely +straightforward to prove Theorem \ref{thm.ind.gen-com.sum(0)} by induction on +$\left\vert S\right\vert $ (as we proved Theorem \ref{thm.ind.gen-com.sum(la)}% +, for example). But let us give an even shorter argument: Theorem +\ref{thm.ind.gen-com.sum(la)} (applied to $a_{s}=0$ and $\lambda=0$) yields% +\[ +\sum_{s\in S}0\cdot0=0\sum_{s\in S}0=0. +\] +In view of $0\cdot0=0$, this rewrites as $\sum_{s\in S}0=0$. This proves +Theorem \ref{thm.ind.gen-com.sum(0)}. +\end{proof} + +\subsubsection{Splitting a sum by a value of a function} + +We shall now prove a more complicated (but crucial) property of finite sums -- +namely, the equality (\ref{eq.sum.sheph}) in the case when $W$ is +finite\footnote{We prefer to only treat the case when $W$ is finite for now. +The case when $W$ is infinite would require us to properly introduce the +notion of an infinite sum with only finitely many nonzero terms. While this is +not hard to do, we aren't quite ready for it yet (see Theorem +\ref{thm.ind.gen-com.sheph} further below for this).}: + +\begin{theorem} +\label{thm.ind.gen-com.shephf}Let $S$ be a finite set. Let $W$ be a finite +set. Let $f:S\rightarrow W$ be a map. Let $a_{s}$ be an element of +$\mathbb{A}$ for each $s\in S$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. 
+\]
+
+\end{theorem}
+
+Here, we are using the following convention (made in Section
+\ref{sect.sums-repetitorium}):
+
+\begin{convention}
+Let $S$ be a finite set. Let $\mathcal{A}\left( s\right) $ be a logical
+statement defined for every $s\in S$. For each $s\in S$ satisfying
+$\mathcal{A}\left( s\right) $, let $a_{s}$ be a number (i.e., an element of
+$\mathbb{A}$). Then, we set%
+\[
+\sum_{\substack{s\in S;\\\mathcal{A}\left( s\right) }}a_{s}=\sum
+_{s\in\left\{ t\in S\ \mid\ \mathcal{A}\left( t\right) \right\} }a_{s}.
+\]
+
+\end{convention}
+
+Thus, the sum $\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}$ in
+Theorem \ref{thm.ind.gen-com.shephf} can be rewritten as $\sum_{s\in\left\{
+t\in S\ \mid\ f\left( t\right) =w\right\} }a_{s}$.
+
+Our proof of Theorem \ref{thm.ind.gen-com.shephf} relies on the following
+simple set-theoretic fact:
+
+\begin{lemma}
+\label{lem.ind.gen-com.shephf-l1}Let $S$ and $W$ be two sets. Let
+$f:S\rightarrow W$ be a map. Let $q\in S$. Let $g$ be the restriction
+$f\mid_{S\setminus\left\{ q\right\} }$ of the map $f$ to $S\setminus\left\{
+q\right\} $. Let $w\in W$. Then,%
+\[
+\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right)
+=w\right\} =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\}
+\setminus\left\{ q\right\} .
+\]
+
+\end{lemma}
+
+\begin{vershort}
+\begin{proof}
+[Proof of Lemma \ref{lem.ind.gen-com.shephf-l1}.]We know that $g$ is the
+restriction $f\mid_{S\setminus\left\{ q\right\} }$ of the map $f$ to
+$S\setminus\left\{ q\right\} $. Thus, $g$ is a map from $S\setminus\left\{
+q\right\} $ to $W$ and satisfies%
+\begin{equation}
+g\left( t\right) =f\left( t\right) \ \ \ \ \ \ \ \ \ \ \text{for each
+}t\in S\setminus\left\{ q\right\} .
+\label{pf.lem.ind.gen-com.shephf-l1.short.1}% +\end{equation} + + +Now,% +\begin{align*} +& \left\{ t\in S\setminus\left\{ q\right\} \ \mid\ \underbrace{g\left( +t\right) }_{\substack{=f\left( t\right) \\\text{(by +(\ref{pf.lem.ind.gen-com.shephf-l1.short.1}))}}}=w\right\} \\ +& =\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ f\left( t\right) +=w\right\} =\left\{ t\in S\ \mid\ f\left( t\right) =w\text{ and }t\in +S\setminus\left\{ q\right\} \right\} \\ +& =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \cap +\underbrace{\left\{ t\in S\ \mid\ t\in S\setminus\left\{ q\right\} +\right\} }_{=S\setminus\left\{ q\right\} }\\ +& =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \cap\left( +S\setminus\left\{ q\right\} \right) =\left\{ t\in S\ \mid\ f\left( +t\right) =w\right\} \setminus\left\{ q\right\} . +\end{align*} +This proves Lemma \ref{lem.ind.gen-com.shephf-l1}. +\end{proof} +\end{vershort} + +\begin{verlong} +\begin{proof} +[Proof of Lemma \ref{lem.ind.gen-com.shephf-l1}.]We know that $g$ is the +restriction $f\mid_{S\setminus\left\{ q\right\} }$ of the map $f$ to +$S\setminus\left\{ q\right\} $. Thus, $g$ is a map from $S\setminus\left\{ +q\right\} $ to $W$ and satisfies% +\begin{equation} +\left( g\left( s\right) =f\left( s\right) \ \ \ \ \ \ \ \ \ \ \text{for +each }s\in S\setminus\left\{ q\right\} \right) . +\label{pf.lem.ind.gen-com.shephf-l1.long.1}% +\end{equation} + + +Let $p\in\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( +t\right) =w\right\} $. Thus, $p$ is a $t\in S\setminus\left\{ q\right\} $ +satisfying $g\left( t\right) =w$. In other words, $p$ is an element of +$S\setminus\left\{ q\right\} $ and satisfies $g\left( p\right) =w$. +Applying (\ref{pf.lem.ind.gen-com.shephf-l1.long.1}) to $s=p$, we obtain +$g\left( p\right) =f\left( p\right) $. Hence, $f\left( p\right) +=g\left( p\right) =w$. Thus, $p$ is an element of $S$ (since $p\in +S\setminus\left\{ q\right\} \subseteq S$) and satisfies $f\left( p\right) +=w$. 
In other words, $p$ is a $t\in S$ satisfying $f\left( t\right) =w$. In +other words, $p\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $. +Moreover, $p\notin\left\{ q\right\} $ (since $p\in S\setminus\left\{ +q\right\} $). Combining $p\in\left\{ t\in S\ \mid\ f\left( t\right) +=w\right\} $ with $p\notin\left\{ q\right\} $, we obtain $p\in\left\{ t\in +S\ \mid\ f\left( t\right) =w\right\} \setminus\left\{ q\right\} $. + +Now, forget that we fixed $p$. We thus have proven that $p\in\left\{ t\in +S\ \mid\ f\left( t\right) =w\right\} \setminus\left\{ q\right\} $ for +each $p\in\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( +t\right) =w\right\} $. In other words,% +\begin{equation} +\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) +=w\right\} \subseteq\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} +\setminus\left\{ q\right\} . \label{pf.lem.ind.gen-com.shephf-l1.long.4}% +\end{equation} + + +On the other hand, let $s\in\left\{ t\in S\ \mid\ f\left( t\right) +=w\right\} \setminus\left\{ q\right\} $. Thus, $s\in\left\{ t\in +S\ \mid\ f\left( t\right) =w\right\} $ and $s\notin\left\{ q\right\} $. +In particular, we have $s\in\left\{ t\in S\ \mid\ f\left( t\right) +=w\right\} $. In other words, $s$ is a $t\in S$ satisfying $f\left( +t\right) =w$. In other words, $s$ is an element of $S$ and satisfies +$f\left( s\right) =w$. Since $s$ is an element of $S$, we have $s\in S$. +Combining this with $s\notin\left\{ q\right\} $, we obtain $s\in +S\setminus\left\{ q\right\} $. Thus, +(\ref{pf.lem.ind.gen-com.shephf-l1.long.1}) yields $g\left( s\right) +=f\left( s\right) =w$. Hence, $s$ is an element of $S\setminus\left\{ +q\right\} $ (since $s\in S\setminus\left\{ q\right\} $) and satisfies +$g\left( s\right) =w$. In other words, $s$ is a $t\in S\setminus\left\{ +q\right\} $ satisfying $g\left( t\right) =w$. In other words, $s\in\left\{ +t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) =w\right\} $. + +Now, forget that we fixed $s$. 
We thus have shown that $s\in\left\{ t\in
+S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) =w\right\} $ for
+each $s\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \setminus
+\left\{ q\right\} $. In other words,%
+\[
+\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \setminus\left\{
+q\right\} \subseteq\left\{ t\in S\setminus\left\{ q\right\} \ \mid
+\ g\left( t\right) =w\right\} .
+\]
+Combining this with (\ref{pf.lem.ind.gen-com.shephf-l1.long.4}), we obtain%
+\begin{equation}
+\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right)
+=w\right\} =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\}
+\setminus\left\{ q\right\} .\nonumber
+\end{equation}
+This proves Lemma \ref{lem.ind.gen-com.shephf-l1}.
+\end{proof}
+\end{verlong}
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.gen-com.shephf}.]We shall prove Theorem
+\ref{thm.ind.gen-com.shephf} by induction on $\left\vert S\right\vert $:
+
+\textit{Induction base:} Theorem \ref{thm.ind.gen-com.shephf} holds under the
+condition that $\left\vert S\right\vert =0$\ \ \ \ \footnote{\textit{Proof.}
+Let $S$, $W$, $f$ and $a_{s}$ be as in Theorem \ref{thm.ind.gen-com.shephf}.
+Assume that $\left\vert S\right\vert =0$. Thus, the first bullet point of
+Definition \ref{def.ind.gen-com.defsum1} yields $\sum_{s\in S}a_{s}=0$.
+Moreover, $S=\varnothing$ (since $\left\vert S\right\vert =0$). Hence, each
+$w\in W$ satisfies%
+\begin{align*}
+\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s} & =\sum_{s\in\left\{
+t\in S\ \mid\ f\left( t\right) =w\right\} }a_{s}=\sum_{s\in\varnothing
+}a_{s}\\
+& \ \ \ \ \ \ \ \ \ \ \left(
+\begin{array}
+[c]{c}%
+\text{since }\left\{ t\in S\ \mid\ f\left( t\right) =w\right\}
+=\varnothing\\
+\text{(because }\left\{ t\in S\ \mid\ f\left( t\right) =w\right\}
+\subseteq S=\varnothing\text{)}%
+\end{array}
+\right) \\
+& =\left( \text{empty sum}\right) =0.
+\end{align*} +Summing these equalities over all $w\in W$, we obtain% +\[ +\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}=\sum_{w\in +W}0=0 +\] +(by an application of Theorem \ref{thm.ind.gen-com.sum(0)}). Comparing this +with $\sum_{s\in S}a_{s}=0$, we obtain $\sum_{s\in S}a_{s}=\sum_{w\in W}% +\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}$. +\par +Now, forget that we fixed $S$, $W$, $f$ and $a_{s}$. We thus have proved that +if $S$, $W$, $f$ and $a_{s}$ are as in Theorem \ref{thm.ind.gen-com.shephf}, +and if $\left\vert S\right\vert =0$, then $\sum_{s\in S}a_{s}=\sum_{w\in +W}\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}$. In other words, +Theorem \ref{thm.ind.gen-com.shephf} holds under the condition that +$\left\vert S\right\vert =0$. Qed.}. This completes the induction base. + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem +\ref{thm.ind.gen-com.shephf} holds under the condition that $\left\vert +S\right\vert =m$. We must now prove that Theorem \ref{thm.ind.gen-com.shephf} +holds under the condition that $\left\vert S\right\vert =m+1$. + +We have assumed that Theorem \ref{thm.ind.gen-com.shephf} holds under the +condition that $\left\vert S\right\vert =m$. In other words, the following +claim holds: + +\begin{statement} +\textit{Claim 1:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m$. Let $W$ be a finite set. Let $f:S\rightarrow W$ be a map. Let $a_{s}$ be +an element of $\mathbb{A}$ for each $s\in S$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. +\] + +\end{statement} + +Next, we shall show the following claim: + +\begin{statement} +\textit{Claim 2:} Let $S$ be a finite set such that $\left\vert S\right\vert +=m+1$. Let $W$ be a finite set. Let $f:S\rightarrow W$ be a map. Let $a_{s}$ +be an element of $\mathbb{A}$ for each $s\in S$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. 
+\] + +\end{statement} + +[\textit{Proof of Claim 2:} We have $\left\vert S\right\vert =m+1>m\geq0$. +Hence, the set $S$ is nonempty. Thus, there exists some $q\in S$. Consider +this $q$. + +From $q\in S$, we obtain $\left\vert S\setminus\left\{ q\right\} \right\vert +=\left\vert S\right\vert -1=m$ (since $\left\vert S\right\vert =m+1$). + +Let $g$ be the restriction $f\mid_{S\setminus\left\{ q\right\} }$ of the map +$f$ to $S\setminus\left\{ q\right\} $. Thus, $g$ is a map from +$S\setminus\left\{ q\right\} $ to $W$. + +For each $w\in W$, we define a number $b_{w}$ by% +\begin{equation} +b_{w}=\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}. +\label{pf.lem.ind.gen-com.shephf.c2.bw=}% +\end{equation} + + +Furthermore, for each $w\in W$, we define a number $c_{w}$ by% +\begin{equation} +c_{w}=\sum_{\substack{s\in S\setminus\left\{ q\right\} ;\\g\left( s\right) +=w}}a_{s}. \label{pf.lem.ind.gen-com.shephf.c2.cw=}% +\end{equation} + + +Recall that $\left\vert S\setminus\left\{ q\right\} \right\vert =m$. Hence, +Claim 1 (applied to $S\setminus\left\{ q\right\} $ and $g$ instead of $S$ +and $f$) yields% +\begin{equation} +\sum_{s\in S\setminus\left\{ q\right\} }a_{s}=\sum_{w\in W}\underbrace{\sum +_{\substack{s\in S\setminus\left\{ q\right\} ;\\g\left( s\right) =w}% +}a_{s}}_{\substack{=c_{w}\\\text{(by (\ref{pf.lem.ind.gen-com.shephf.c2.cw=}% +))}}}=\sum_{w\in W}c_{w}. \label{pf.lem.ind.gen-com.shephf.c2.usec1}% +\end{equation} + + +Every $w\in W\setminus\left\{ f\left( q\right) \right\} $ satisfies% +\begin{equation} +b_{w}=c_{w}. \label{pf.lem.ind.gen-com.shephf.c2.4b}% +\end{equation} + + +[\textit{Proof of (\ref{pf.lem.ind.gen-com.shephf.c2.4b}):} Let $w\in +W\setminus\left\{ f\left( q\right) \right\} $. Thus, $w\in W$ and +$w\notin\left\{ f\left( q\right) \right\} $. 
+ +\begin{vershort} +If we had $q\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $, then +we would have $f\left( q\right) =w$, which would lead to $w=f\left( +q\right) \in\left\{ f\left( q\right) \right\} $; but this would +contradict $w\notin\left\{ f\left( q\right) \right\} $. Hence, we cannot +have $q\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $. Hence, we +have $q\notin\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $. +\end{vershort} + +\begin{verlong} +Let us next prove that $q\notin\left\{ t\in S\ \mid\ f\left( t\right) +=w\right\} $. Indeed, assume the contrary (for the sake of contradiction). +Thus, $q\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $. In other +words, $q$ is a $t\in S$ satisfying $f\left( t\right) =w$. In other words, +$q$ is an element of $S$ and satisfies $f\left( q\right) =w$. Hence, +$w=f\left( q\right) \in\left\{ f\left( q\right) \right\} $; but this +contradicts $w\notin\left\{ f\left( q\right) \right\} $. + +This contradiction shows that our assumption was false. Hence, $q\notin% +\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $ is proven. +\end{verlong} + +But $w\in W$; thus, Lemma \ref{lem.ind.gen-com.shephf-l1} yields% +\begin{align} +\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) +=w\right\} & =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} +\setminus\left\{ q\right\} \nonumber\\ +& =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} +\label{pf.lem.ind.gen-com.shephf.c2.4b.pf.2}% +\end{align} +(since $q\notin\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $). + +On the other hand, the definition of $b_{w}$ yields% +\begin{equation} +b_{w}=\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}=\sum_{s\in\left\{ +t\in S\ \mid\ f\left( t\right) =w\right\} }a_{s} +\label{pf.lem.ind.gen-com.shephf.c2.4b.pf.1}% +\end{equation} +(by the definition of the \textquotedblleft$\sum_{\substack{s\in S;\\f\left( +s\right) =w}}$\textquotedblright\ symbol). 
Also, the definition of $c_{w}$ +yields% +\begin{align*} +c_{w} & =\sum_{\substack{s\in S\setminus\left\{ q\right\} ;\\g\left( +s\right) =w}}a_{s}=\sum_{s\in\left\{ t\in S\setminus\left\{ q\right\} +\ \mid\ g\left( t\right) =w\right\} }a_{s}=\sum_{s\in\left\{ t\in +S\ \mid\ f\left( t\right) =w\right\} }a_{s}\\ +& \ \ \ \ \ \ \ \ \ \ \left( +\begin{array} +[c]{c}% +\text{since }\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( +t\right) =w\right\} =\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \\ +\text{(by (\ref{pf.lem.ind.gen-com.shephf.c2.4b.pf.2}))}% +\end{array} +\right) \\ +& =b_{w}\ \ \ \ \ \ \ \ \ \ \left( \text{by +(\ref{pf.lem.ind.gen-com.shephf.c2.4b.pf.1})}\right) . +\end{align*} +Thus, $b_{w}=c_{w}$. This proves (\ref{pf.lem.ind.gen-com.shephf.c2.4b}).] + +Also, +\begin{equation} +b_{f\left( q\right) }=a_{q}+c_{f\left( q\right) }. +\label{pf.lem.ind.gen-com.shephf.c2.5b}% +\end{equation} + + +[\textit{Proof of (\ref{pf.lem.ind.gen-com.shephf.c2.5b}):} Define a subset +$U$ of $S$ by% +\begin{equation} +U=\left\{ t\in S\ \mid\ f\left( t\right) =f\left( q\right) \right\} . +\label{pf.lem.ind.gen-com.shephf.c2.5b.pf.U=}% +\end{equation} + + +\begin{verlong} +This set $U$ is a subset of $S$, and thus is finite (since $S$ is finite). +\end{verlong} + +We can apply Lemma \ref{lem.ind.gen-com.shephf-l1} to $w=f\left( q\right) $. +We thus obtain% +\begin{align} +\left\{ t\in S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) +=f\left( q\right) \right\} & =\underbrace{\left\{ t\in S\ \mid\ f\left( +t\right) =f\left( q\right) \right\} }_{\substack{=U\\\text{(by +(\ref{pf.lem.ind.gen-com.shephf.c2.5b.pf.U=}))}}}\setminus\left\{ q\right\} +\nonumber\\ +& =U\setminus\left\{ q\right\} . +\label{pf.lem.ind.gen-com.shephf.c2.5b.pf.1}% +\end{align} + + +We know that $q$ is a $t\in S$ satisfying $f\left( t\right) =f\left( +q\right) $ (since $q\in S$ and $f\left( q\right) =f\left( q\right) $). 
In +other words, $q\in\left\{ t\in S\ \mid\ f\left( t\right) =f\left( +q\right) \right\} $. In other words, $q\in U$ (since $U=\left\{ t\in +S\ \mid\ f\left( t\right) =f\left( q\right) \right\} $). Thus, +Proposition \ref{prop.ind.gen-com.split-off} (applied to $U$ and $q$ instead +of $S$ and $t$) yields% +\begin{equation} +\sum_{s\in U}a_{s}=a_{q}+\sum_{s\in U\setminus\left\{ q\right\} }a_{s}. +\label{pf.lem.ind.gen-com.shephf.c2.5b.pf.2}% +\end{equation} + + +But (\ref{pf.lem.ind.gen-com.shephf.c2.5b.pf.1}) shows that $U\setminus +\left\{ q\right\} =\left\{ t\in S\setminus\left\{ q\right\} +\ \mid\ g\left( t\right) =f\left( q\right) \right\} $. Thus,% +\begin{equation} +\sum_{s\in U\setminus\left\{ q\right\} }a_{s}=\sum_{s\in\left\{ t\in +S\setminus\left\{ q\right\} \ \mid\ g\left( t\right) =f\left( q\right) +\right\} }a_{s}=c_{f\left( q\right) } +\label{pf.lem.ind.gen-com.shephf.c2.5b.pf.3}% +\end{equation} +(since the definition of $c_{f\left( q\right) }$ yields $c_{f\left( +q\right) }=\sum_{\substack{s\in S\setminus\left\{ q\right\} ;\\g\left( +s\right) =f\left( q\right) }}a_{s}=\sum_{s\in\left\{ t\in S\setminus +\left\{ q\right\} \ \mid\ g\left( t\right) =f\left( q\right) \right\} +}a_{s}$). + +On the other hand, the definition of $b_{f\left( q\right) }$ yields% +\begin{align*} +b_{f\left( q\right) } & =\sum_{\substack{s\in S;\\f\left( s\right) +=f\left( q\right) }}a_{s}=\sum_{s\in\left\{ t\in S\ \mid\ f\left( +t\right) =f\left( q\right) \right\} }a_{s}\\ +& =\sum_{s\in U}a_{s}\ \ \ \ \ \ \ \ \ \ \left( \text{since }\left\{ t\in +S\ \mid\ f\left( t\right) =f\left( q\right) \right\} =U\right) \\ +& =a_{q}+\underbrace{\sum_{s\in U\setminus\left\{ q\right\} }a_{s}% +}_{\substack{=c_{f\left( q\right) }\\\text{(by +(\ref{pf.lem.ind.gen-com.shephf.c2.5b.pf.3}))}}}\ \ \ \ \ \ \ \ \ \ \left( +\text{by (\ref{pf.lem.ind.gen-com.shephf.c2.5b.pf.2})}\right) \\ +& =a_{q}+c_{f\left( q\right) }. +\end{align*} +This proves (\ref{pf.lem.ind.gen-com.shephf.c2.5b}).] 
+ +Now, recall that $q\in S$. Hence, Proposition \ref{prop.ind.gen-com.split-off} +(applied to $t=q$) yields% +\begin{equation} +\sum_{s\in S}a_{s}=a_{q}+\sum_{s\in S\setminus\left\{ q\right\} }a_{s}. +\label{pf.lem.ind.gen-com.shephf.c2.6}% +\end{equation} + + +Also, $f\left( q\right) \in W$. Hence, Proposition +\ref{prop.ind.gen-com.split-off} (applied to $W$, $\left( c_{w}\right) +_{w\in W}$ and $f\left( q\right) $ instead of $S$, $\left( a_{s}\right) +_{s\in S}$ and $t$) yields% +\[ +\sum_{w\in W}c_{w}=c_{f\left( q\right) }+\sum_{w\in W\setminus\left\{ +f\left( q\right) \right\} }c_{w}. +\] +Hence, (\ref{pf.lem.ind.gen-com.shephf.c2.usec1}) becomes% +\begin{equation} +\sum_{s\in S\setminus\left\{ q\right\} }a_{s}=\sum_{w\in W}c_{w}=c_{f\left( +q\right) }+\sum_{w\in W\setminus\left\{ f\left( q\right) \right\} }c_{w}. +\label{pf.lem.ind.gen-com.shephf.c2.8}% +\end{equation} + + +Also, Proposition \ref{prop.ind.gen-com.split-off} (applied to $W$, $\left( +b_{w}\right) _{w\in W}$ and $f\left( q\right) $ instead of $S$, $\left( +a_{s}\right) _{s\in S}$ and $t$) yields% +\begin{align*} +\sum_{w\in W}b_{w} & =\underbrace{b_{f\left( q\right) }}_{\substack{=a_{q}% ++c_{f\left( q\right) }\\\text{(by (\ref{pf.lem.ind.gen-com.shephf.c2.5b}))}% +}}+\sum_{w\in W\setminus\left\{ f\left( q\right) \right\} }% +\underbrace{b_{w}}_{\substack{=c_{w}\\\text{(by +(\ref{pf.lem.ind.gen-com.shephf.c2.4b}))}}}\\ +& =\left( a_{q}+c_{f\left( q\right) }\right) +\sum_{w\in W\setminus +\left\{ f\left( q\right) \right\} }c_{w}=a_{q}+\left( c_{f\left( +q\right) }+\sum_{w\in W\setminus\left\{ f\left( q\right) \right\} }% +c_{w}\right) +\end{align*} +(by Proposition \ref{prop.ind.gen-com.fgh}, applied to $a_{q}$, $c_{f\left( +q\right) }$ and $\sum_{w\in W\setminus\left\{ f\left( q\right) \right\} +}c_{w}$ instead of $a$, $b$ and $c$). 
Thus,% +\[ +\sum_{w\in W}b_{w}=a_{q}+\underbrace{\left( c_{f\left( q\right) }% ++\sum_{w\in W\setminus\left\{ f\left( q\right) \right\} }c_{w}\right) +}_{\substack{=\sum_{s\in S\setminus\left\{ q\right\} }a_{s}\\\text{(by +(\ref{pf.lem.ind.gen-com.shephf.c2.8}))}}}=a_{q}+\sum_{s\in S\setminus\left\{ +q\right\} }a_{s}=\sum_{s\in S}a_{s}% +\] +(by (\ref{pf.lem.ind.gen-com.shephf.c2.6})). Hence,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\underbrace{b_{w}}_{\substack{=\sum +_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}\\\text{(by +(\ref{pf.lem.ind.gen-com.shephf.c2.bw=}))}}}=\sum_{w\in W}\sum_{\substack{s\in +S;\\f\left( s\right) =w}}a_{s}. +\] +This proves Claim 2.] + +But Claim 2 says precisely that Theorem \ref{thm.ind.gen-com.shephf} holds +under the condition that $\left\vert S\right\vert =m+1$. Hence, we conclude +that Theorem \ref{thm.ind.gen-com.shephf} holds under the condition that +$\left\vert S\right\vert =m+1$ (since Claim 2 is proven). This completes the +induction step. Thus, Theorem \ref{thm.ind.gen-com.shephf} is proven by induction. +\end{proof} + +\subsubsection{Splitting a sum into two} + +Next, we shall prove the equality (\ref{eq.sum.split}): + +\begin{theorem} +\label{thm.ind.gen-com.split2}Let $S$ be a finite set. Let $X$ and $Y$ be two +subsets of $S$ such that $X\cap Y=\varnothing$ and $X\cup Y=S$. (Equivalently, +$X$ and $Y$ are two subsets of $S$ such that each element of $S$ lies in +\textbf{exactly} one of $X$ and $Y$.) Let $a_{s}$ be a number (i.e., an +element of $\mathbb{A}$) for each $s\in S$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{s\in X}a_{s}+\sum_{s\in Y}a_{s}. +\] + +\end{theorem} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.split2}.]From the assumptions $X\cap +Y=\varnothing$ and $X\cup Y=S$, we can easily obtain $S\setminus X=Y$. + +\begin{verlong} +[\textit{Proof:} The set $X\cap Y$ is empty (since $X\cap Y=\varnothing$); +thus, it has no elements. + +Let $y\in Y$. 
If we had $y\in X$, then we would have $y\in X\cap Y$ (since +$y\in X$ and $y\in Y$), which would show that the set $X\cap Y$ has at least +one element (namely, $y$); but this would contradict the fact that this set +$X\cap Y$ has no elements. Thus, we cannot have $y\in X$. Hence, we have +$y\notin X$. Combining $y\in Y\subseteq S$ with $y\notin X$, we obtain $y\in +S\setminus X$. + +Now, forget that we fixed $y$. We thus have shown that $y\in S\setminus X$ for +each $y\in Y$. In other words, $Y\subseteq S\setminus X$. + +Combining this with% +\[ +\underbrace{S}_{=X\cup Y}\setminus X=\left( X\cup Y\right) \setminus +X=Y\setminus X\subseteq Y, +\] +we obtain $S\setminus X=Y$. Qed.] +\end{verlong} + +We define a map $f:S\rightarrow\left\{ 0,1\right\} $ by setting% +\[ +\left( f\left( s\right) =% +\begin{cases} +0, & \text{if }s\in X;\\ +1, & \text{if }s\notin X +\end{cases} +\ \ \ \ \ \ \ \ \ \ \text{for every }s\in S\right) . +\] + + +For each $w\in\left\{ 0,1\right\} $, we define a number $b_{w}$ by% +\begin{equation} +b_{w}=\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}. +\label{pf.thm.ind.gen-com.split2.bw=}% +\end{equation} + + +\begin{vershort} +Proposition \ref{prop.ind.gen-com.sum12} \textbf{(b)} (applied to $\left\{ +0,1\right\} $, $0$, $1$ and $\left( b_{w}\right) _{w\in\left\{ +0,1\right\} }$ instead of $S$, $p$, $q$ and $\left( a_{s}\right) _{s\in S}% +$) yields $\sum_{w\in\left\{ 0,1\right\} }b_{w}=b_{0}+b_{1}$. +\end{vershort} + +\begin{verlong} +We have $\left\{ 0,1\right\} =\left\{ 0,1\right\} $, where $0$ and $1$ are +two distinct elements. Hence, Proposition \ref{prop.ind.gen-com.sum12} +\textbf{(b)} (applied to $\left\{ 0,1\right\} $, $0$, $1$ and $\left( +b_{w}\right) _{w\in\left\{ 0,1\right\} }$ instead of $S$, $p$, $q$ and +$\left( a_{s}\right) _{s\in S}$) yields $\sum_{w\in\left\{ 0,1\right\} +}b_{w}=b_{0}+b_{1}$. 
+\end{verlong} + +Now, Theorem \ref{thm.ind.gen-com.shephf} (applied to $W=\left\{ 0,1\right\} +$) yields +\begin{equation} +\sum_{s\in S}a_{s}=\sum_{w\in\left\{ 0,1\right\} }\underbrace{\sum +_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}}_{\substack{=b_{w}% +\\\text{(by (\ref{pf.thm.ind.gen-com.split2.bw=}))}}}=\sum_{w\in\left\{ +0,1\right\} }b_{w}=b_{0}+b_{1}. \label{pf.thm.ind.gen-com.split2.1}% +\end{equation} + + +On the other hand, +\begin{equation} +b_{0}=\sum_{s\in X}a_{s}. \label{pf.thm.ind.gen-com.split2.b0=}% +\end{equation} + + +\begin{vershort} +[\textit{Proof of (\ref{pf.thm.ind.gen-com.split2.b0=}):} The definition of +the map $f$ shows that an element $t\in S$ satisfies $f\left( t\right) =0$ +\textbf{if and only if} it belongs to $X$. Hence, the set of all elements +$t\in S$ that satisfy $f\left( t\right) =0$ is precisely $X$. In other +words,% +\[ +\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} =X. +\] +But the definition of $b_{0}$ yields% +\[ +b_{0}=\sum_{\substack{s\in S;\\f\left( s\right) =0}}a_{s}=\sum_{s\in\left\{ +t\in S\ \mid\ f\left( t\right) =0\right\} }a_{s}=\sum_{s\in X}a_{s}% +\] +(since $\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} =X$). This +proves (\ref{pf.thm.ind.gen-com.split2.b0=}).] +\end{vershort} + +\begin{verlong} +[\textit{Proof of (\ref{pf.thm.ind.gen-com.split2.b0=}):} Let $p\in X$. Thus, +the definition of $f$ yields $f\left( p\right) =% +\begin{cases} +0, & \text{if }p\in X;\\ +1, & \text{if }p\notin X +\end{cases} +=0$ (since $p\in X$). Hence, $p$ is a $t\in S$ satisfying $f\left( t\right) +=0$ (since $p\in S$ and $f\left( p\right) =0$). In other words, +$p\in\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} $. + +Now, forget that we fixed $p$. We thus have proven that $p\in\left\{ t\in +S\ \mid\ f\left( t\right) =0\right\} $ for each $p\in X$. In other words, +\begin{equation} +X\subseteq\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} . 
+\label{pf.thm.ind.gen-com.split2.b0=.pf.1}% +\end{equation} + + +On the other hand, let $s\in\left\{ t\in S\ \mid\ f\left( t\right) +=0\right\} $. Thus, $s$ is a $t\in S$ satisfying $f\left( t\right) =0$. In +other words, $s$ is an element of $S$ and satisfies $f\left( s\right) =0$. +If we had $s\notin X$, then we would have% +\begin{align*} +f\left( s\right) & =% +\begin{cases} +0, & \text{if }s\in X;\\ +1, & \text{if }s\notin X +\end{cases} +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }f\right) \\ +& =1\ \ \ \ \ \ \ \ \ \ \left( \text{since }s\notin X\right) , +\end{align*} +which would contradict $f\left( s\right) =0\neq1$. Hence, we cannot have +$s\notin X$. Thus, we have $s\in X$. + +Now, forget that we fixed $s$. We thus have proven that $s\in X$ for each +$s\in\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} $. In other words, +$\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} \subseteq X$. Combining +this with (\ref{pf.thm.ind.gen-com.split2.b0=.pf.1}), we obtain% +\[ +X=\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} . +\] +Hence,% +\[ +\sum_{s\in X}a_{s}=\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) +=0\right\} }a_{s}. +\] +Comparing this with% +\begin{align*} +b_{0} & =\sum_{\substack{s\in S;\\f\left( s\right) =0}}a_{s}% +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }b_{0}\right) \\ +& =\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) =0\right\} }a_{s}, +\end{align*} +we obtain $b_{0}=\sum_{s\in X}a_{s}$. This proves +(\ref{pf.thm.ind.gen-com.split2.b0=}).] +\end{verlong} + +Furthermore, +\begin{equation} +b_{1}=\sum_{s\in Y}a_{s}. \label{pf.thm.ind.gen-com.split2.b1=}% +\end{equation} + + +\begin{vershort} +[\textit{Proof of (\ref{pf.thm.ind.gen-com.split2.b1=}):} The definition of +the map $f$ shows that an element $t\in S$ satisfies $f\left( t\right) =1$ +\textbf{if and only if} $t\notin X$. 
Thus, for each $t\in S$, we have the +following chain of equivalences:% +\[ +\left( f\left( t\right) =1\right) \ \Longleftrightarrow\ \left( t\notin +X\right) \ \Longleftrightarrow\ \left( t\in S\setminus X\right) +\ \Longleftrightarrow\ \left( t\in Y\right) +\] +(since $S\setminus X=Y$). In other words, an element $t\in S$ satisfies +$f\left( t\right) =1$ \textbf{if and only if} $t$ belongs to $Y$. Hence, the +set of all elements $t\in S$ that satisfy $f\left( t\right) =1$ is precisely +$Y$. In other words,% +\[ +\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} =Y. +\] +But the definition of $b_{1}$ yields% +\[ +b_{1}=\sum_{\substack{s\in S;\\f\left( s\right) =1}}a_{s}=\sum_{s\in\left\{ +t\in S\ \mid\ f\left( t\right) =1\right\} }a_{s}=\sum_{s\in Y}a_{s}% +\] +(since $\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} =Y$). This +proves (\ref{pf.thm.ind.gen-com.split2.b1=}).] +\end{vershort} + +\begin{verlong} +[\textit{Proof of (\ref{pf.thm.ind.gen-com.split2.b1=}):} Let $p\in Y$. Thus, +$p\in Y=S\setminus X$ (since $S\setminus X=Y$). In other words, $p\in S$ and +$p\notin X$. The definition of $f$ yields $f\left( p\right) =% +\begin{cases} +0, & \text{if }p\in X;\\ +1, & \text{if }p\notin X +\end{cases} +=1$ (since $p\notin X$). Hence, $p$ is a $t\in S$ satisfying $f\left( +t\right) =1$ (since $p\in S$ and $f\left( p\right) =1$). In other words, +$p\in\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} $. + +Now, forget that we fixed $p$. We thus have proven that $p\in\left\{ t\in +S\ \mid\ f\left( t\right) =1\right\} $ for each $p\in Y$. In other words, +\begin{equation} +Y\subseteq\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} . +\label{pf.thm.ind.gen-com.split2.b1=.pf.1}% +\end{equation} + + +On the other hand, let $s\in\left\{ t\in S\ \mid\ f\left( t\right) +=1\right\} $. Thus, $s$ is a $t\in S$ satisfying $f\left( t\right) =1$. In +other words, $s$ is an element of $S$ and satisfies $f\left( s\right) =1$. 
+If we had $s\in X$, then we would have% +\begin{align*} +f\left( s\right) & =% +\begin{cases} +0, & \text{if }s\in X;\\ +1, & \text{if }s\notin X +\end{cases} +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }f\right) \\ +& =0\ \ \ \ \ \ \ \ \ \ \left( \text{since }s\in X\right) , +\end{align*} +which would contradict $f\left( s\right) =1\neq0$. Hence, we cannot have +$s\in X$. Thus, we have $s\notin X$. Combining $s\in\left\{ t\in +S\ \mid\ f\left( t\right) =1\right\} \subseteq S$ with $s\notin X$, we +obtain $s\in S\setminus X=Y$. + +Now, forget that we fixed $s$. We thus have proven that $s\in Y$ for each +$s\in\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} $. In other words, +$\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} \subseteq Y$. Combining +this with (\ref{pf.thm.ind.gen-com.split2.b1=.pf.1}), we obtain% +\[ +Y=\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} . +\] +Hence,% +\[ +\sum_{s\in Y}a_{s}=\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) +=1\right\} }a_{s}. +\] +Comparing this with% +\begin{align*} +b_{1} & =\sum_{\substack{s\in S;\\f\left( s\right) =1}}a_{s}% +\ \ \ \ \ \ \ \ \ \ \left( \text{by the definition of }b_{1}\right) \\ +& =\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) =1\right\} }a_{s}, +\end{align*} +we obtain $b_{1}=\sum_{s\in Y}a_{s}$. This proves +(\ref{pf.thm.ind.gen-com.split2.b1=}).] +\end{verlong} + +Now, (\ref{pf.thm.ind.gen-com.split2.1}) becomes% +\[ +\sum_{s\in S}a_{s}=\underbrace{b_{0}}_{\substack{=\sum_{s\in X}a_{s}% +\\\text{(by (\ref{pf.thm.ind.gen-com.split2.b0=}))}}}+\underbrace{b_{1}% +}_{\substack{=\sum_{s\in Y}a_{s}\\\text{(by +(\ref{pf.thm.ind.gen-com.split2.b1=}))}}}=\sum_{s\in X}a_{s}+\sum_{s\in +Y}a_{s}. +\] +This proves Theorem \ref{thm.ind.gen-com.split2}. +\end{proof} + +Similarly, we can prove the equality (\ref{eq.sum.split-n}). (This proof was +already outlined in Section \ref{sect.sums-repetitorium}.) 
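+
+For a concrete illustration of Theorem \ref{thm.ind.gen-com.split2} (a
+sanity check on specific numbers, not part of the proof): take $S=\left\{
+1,2,3,4\right\} $, $X=\left\{ 1,3\right\} $ and $Y=\left\{ 2,4\right\}
+$, and let $a_{s}=s$ for each $s\in S$. Then, $X\cap Y=\varnothing$ and
+$X\cup Y=S$, and indeed%
+\[
+\sum_{s\in S}a_{s}=1+2+3+4=10=\left( 1+3\right) +\left( 2+4\right)
+=\sum_{s\in X}a_{s}+\sum_{s\in Y}a_{s}.
+\]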
+ +A consequence of Theorem \ref{thm.ind.gen-com.split2} is the following fact, +which has appeared as the equality (\ref{eq.sum.drop0}) in Section +\ref{sect.sums-repetitorium}: + +\begin{corollary} +\label{cor.ind.gen-com.drop0}Let $S$ be a finite set. Let $a_{s}$ be an +element of $\mathbb{A}$ for each $s\in S$. Let $T$ be a subset of $S$ such +that every $s\in T$ satisfies $a_{s}=0$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{s\in S\setminus T}a_{s}. +\] + +\end{corollary} + +\begin{proof} +[Proof of Corollary \ref{cor.ind.gen-com.drop0}.]We have assumed that every +$s\in T$ satisfies $a_{s}=0$. Thus, $\sum_{s\in T}\underbrace{a_{s}}_{=0}% +=\sum_{s\in T}0=0$ (by Theorem \ref{thm.ind.gen-com.sum(0)} (applied to $T$ +instead of $S$)). + +But $T$ and $S\setminus T$ are subsets of $S$. These two subsets satisfy +$T\cap\left( S\setminus T\right) =\varnothing$ and $T\cup\left( S\setminus +T\right) =S$ (since $T\subseteq S$). Hence, Theorem +\ref{thm.ind.gen-com.split2} (applied to $X=T$ and $Y=S\setminus T$) yields% +\[ +\sum_{s\in S}a_{s}=\underbrace{\sum_{s\in T}a_{s}}_{=0}+\sum_{s\in S\setminus +T}a_{s}=\sum_{s\in S\setminus T}a_{s}. +\] +This proves Corollary \ref{cor.ind.gen-com.drop0}. +\end{proof} + +\subsubsection{Substituting the summation index} + +Next, we shall show the equality (\ref{eq.sum.subs1}): + +\begin{theorem} +\label{thm.ind.gen-com.subst1}Let $S$ and $T$ be two finite sets. Let +$f:S\rightarrow T$ be a \textbf{bijective} map. Let $a_{t}$ be an element of +$\mathbb{A}$ for each $t\in T$. Then,% +\[ +\sum_{t\in T}a_{t}=\sum_{s\in S}a_{f\left( s\right) }. +\] + +\end{theorem} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.subst1}.]Each $w\in T$ satisfies% +\begin{equation} +\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{f\left( s\right) }=a_{w}. +\label{pf.thm.ind.gen-com.subst1.1}% +\end{equation} + + +[\textit{Proof of (\ref{pf.thm.ind.gen-com.subst1.1}):} Let $w\in T$. 
+ +\begin{vershort} +The map $f$ is bijective; thus, it is invertible. In other words, its inverse +map $f^{-1}:T\rightarrow S$ exists. Hence, $f^{-1}\left( w\right) $ is a +well-defined element of $S$, and is the only element $t\in S$ satisfying +$f\left( t\right) =w$. Therefore, +\begin{equation} +\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} =\left\{ f^{-1}\left( +w\right) \right\} . \label{pf.thm.ind.gen-com.subst1.1.pf.short.0}% +\end{equation} + +\end{vershort} + +\begin{verlong} +The map $f$ is bijective; thus, it is invertible. In other words, its inverse +map $f^{-1}:T\rightarrow S$ exists. Hence, $f^{-1}\left( w\right) $ is a +well-defined element of $S$. This element $f^{-1}\left( w\right) $ belongs +to $S$ and satisfies $f\left( f^{-1}\left( w\right) \right) =w$. In other +words, $f^{-1}\left( w\right) $ is a $t\in S$ satisfying $f\left( t\right) +=w$. In other words, $f^{-1}\left( w\right) \in\left\{ t\in S\ \mid +\ f\left( t\right) =w\right\} $. Hence,% +\begin{equation} +\left\{ f^{-1}\left( w\right) \right\} \subseteq\left\{ t\in +S\ \mid\ f\left( t\right) =w\right\} . +\label{pf.thm.ind.gen-com.subst1.1.pf.1}% +\end{equation} + + +On the other hand, let $p\in\left\{ t\in S\ \mid\ f\left( t\right) +=w\right\} $. Thus, $p$ is a $t\in S$ satisfying $f\left( t\right) =w$. In +other words, $p$ is an element of $S$ and satisfies $f\left( p\right) =w$. +From $f\left( p\right) =w$, we obtain $p=f^{-1}\left( w\right) $ (since +$f^{-1}$ is the inverse of $f$), and thus $p=f^{-1}\left( w\right) +\in\left\{ f^{-1}\left( w\right) \right\} $. Now, forget that we fixed +$p$. We thus have proven that $p\in\left\{ f^{-1}\left( w\right) \right\} +$ for every $p\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} $. In +other words, +\[ +\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} \subseteq\left\{ +f^{-1}\left( w\right) \right\} . 
+\] +Combining this with (\ref{pf.thm.ind.gen-com.subst1.1.pf.1}), we obtain% +\[ +\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} =\left\{ f^{-1}\left( +w\right) \right\} . +\] + +\end{verlong} + +\begin{vershort} +Now,% +\begin{align*} +& \sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{f\left( s\right) }\\ +& =\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} +}a_{f\left( s\right) }=\sum_{s\in\left\{ f^{-1}\left( w\right) \right\} +}a_{f\left( s\right) }\ \ \ \ \ \ \ \ \ \ \left( \text{by +(\ref{pf.thm.ind.gen-com.subst1.1.pf.short.0})}\right) \\ +& =a_{f\left( f^{-1}\left( w\right) \right) }\ \ \ \ \ \ \ \ \ \ \left( +\begin{array} +[c]{c}% +\text{by Proposition \ref{prop.ind.gen-com.sum12} \textbf{(a)} (applied to +}\left\{ f^{-1}\left( w\right) \right\} \text{, }a_{f\left( s\right) }\\ +\text{and }f^{-1}\left( w\right) \text{ instead of }S\text{, }a_{s}\text{ +and }p\text{)}% +\end{array} +\right) \\ +& =a_{w}\ \ \ \ \ \ \ \ \ \ \left( \text{since }f\left( f^{-1}\left( +w\right) \right) =w\right) . +\end{align*} +This proves (\ref{pf.thm.ind.gen-com.subst1.1}).] +\end{vershort} + +\begin{verlong} +Now,% +\[ +\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{f\left( s\right) }% +=\sum_{s\in\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} }a_{f\left( +s\right) }=\sum_{s\in\left\{ f^{-1}\left( w\right) \right\} }a_{f\left( +s\right) }% +\] +(since $\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} =\left\{ +f^{-1}\left( w\right) \right\} $). + +But $\left\{ f^{-1}\left( w\right) \right\} =\left\{ f^{-1}\left( +w\right) \right\} $. 
Hence, Proposition \ref{prop.ind.gen-com.sum12} +\textbf{(a)} (applied to $\left\{ f^{-1}\left( w\right) \right\} $, +$a_{f\left( s\right) }$ and $f^{-1}\left( w\right) $ instead of $S$, +$a_{s}$ and $p$) yields +\[ +\sum_{s\in\left\{ f^{-1}\left( w\right) \right\} }a_{f\left( s\right) +}=a_{f\left( f^{-1}\left( w\right) \right) }=a_{w}% +\ \ \ \ \ \ \ \ \ \ \left( \text{since }f\left( f^{-1}\left( w\right) +\right) =w\right) . +\] +Hence,% +\[ +\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{f\left( s\right) }% +=\sum_{s\in\left\{ f^{-1}\left( w\right) \right\} }a_{f\left( s\right) +}=a_{w}. +\] +This proves (\ref{pf.thm.ind.gen-com.subst1.1}).] +\end{verlong} + +Renaming the summation index $w$ as $t$ in the sum $\sum_{w\in T}a_{w}$ does +not change the sum (since $\left( a_{w}\right) _{w\in T}$ and $\left( +a_{t}\right) _{t\in T}$ are the same $\mathbb{A}$-valued $T$-family). In +other words, $\sum_{w\in T}a_{w}=\sum_{t\in T}a_{t}$. + +Theorem \ref{thm.ind.gen-com.shephf} (applied to $T$ and $a_{f\left( +s\right) }$ instead of $W$ and $a_{s}$) yields% +\[ +\sum_{s\in S}a_{f\left( s\right) }=\sum_{w\in T}\underbrace{\sum +_{\substack{s\in S;\\f\left( s\right) =w}}a_{f\left( s\right) }% +}_{\substack{=a_{w}\\\text{(by (\ref{pf.thm.ind.gen-com.subst1.1}))}}% +}=\sum_{w\in T}a_{w}=\sum_{t\in T}a_{t}. +\] +This proves Theorem \ref{thm.ind.gen-com.subst1}. +\end{proof} + +\subsubsection{Sums of congruences} + +Proposition \ref{prop.mod.+-*} \textbf{(a)} says that we can add two +congruences modulo an integer $n$. We shall now see that we can add +\textbf{any} number of congruences modulo an integer $n$: + +\begin{theorem} +\label{thm.ind.gen-com.sum-mod}Let $n$ be an integer. Let $S$ be a finite set. +For each $s\in S$, let $a_{s}$ and $b_{s}$ be two integers. Assume that% +\[ +a_{s}\equiv b_{s}\operatorname{mod}n\ \ \ \ \ \ \ \ \ \ \text{for each }s\in +S. +\] +Then,% +\[ +\sum_{s\in S}a_{s}\equiv\sum_{s\in S}b_{s}\operatorname{mod}n. 
+\] + +\end{theorem} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.gen-com.sum-mod}.]We forget that we fixed $n$, +$S$, $a_{s}$ and $b_{s}$. We shall prove Theorem \ref{thm.ind.gen-com.sum-mod} +by induction on $\left\vert S\right\vert $: + +\begin{vershort} +\textit{Induction base:} The induction base (i.e., proving that Theorem +\ref{thm.ind.gen-com.sum-mod} holds under the condition that $\left\vert +S\right\vert =0$) is left to the reader (as it boils down to the trivial fact +that $0\equiv0\operatorname{mod}n$). +\end{vershort} + +\begin{verlong} +\textit{Induction base:} Theorem \ref{thm.ind.gen-com.sum-mod} holds under the +condition that $\left\vert S\right\vert =0$\ \ \ \ \footnote{\textit{Proof.} +Let $n$, $S$, $a_{s}$ and $b_{s}$ be as in Theorem +\ref{thm.ind.gen-com.sum-mod}. Assume that $\left\vert S\right\vert =0$. Thus, +the first bullet point of Definition \ref{def.ind.gen-com.defsum1} yields +$\sum_{s\in S}a_{s}=0$ and $\sum_{s\in S}b_{s}=0$. Hence,% +\[ +\sum_{s\in S}a_{s}=0\equiv0=\sum_{s\in S}b_{s}\operatorname{mod}n. +\] +\par +Now, forget that we fixed $n$, $S$, $a_{s}$ and $b_{s}$. We thus have proved +that if $n$, $S$, $a_{s}$ and $b_{s}$ are as in Theorem +\ref{thm.ind.gen-com.sum-mod}, and if $\left\vert S\right\vert =0$, then +$\sum_{s\in S}a_{s}\equiv\sum_{s\in S}b_{s}\operatorname{mod}n$. In other +words, Theorem \ref{thm.ind.gen-com.sum-mod} holds under the condition that +$\left\vert S\right\vert =0$. Qed.}. This completes the induction base. +\end{verlong} + +\textit{Induction step:} Let $m\in\mathbb{N}$. Assume that Theorem +\ref{thm.ind.gen-com.sum-mod} holds under the condition that $\left\vert +S\right\vert =m$. We must now prove that Theorem \ref{thm.ind.gen-com.sum-mod} +holds under the condition that $\left\vert S\right\vert =m+1$. + +We have assumed that Theorem \ref{thm.ind.gen-com.sum-mod} holds under the +condition that $\left\vert S\right\vert =m$. 
In other words, the following +claim holds: + +\begin{statement} +\textit{Claim 1:} Let $n$ be an integer. Let $S$ be a finite set such that +$\left\vert S\right\vert =m$. For each $s\in S$, let $a_{s}$ and $b_{s}$ be +two integers. Assume that% +\[ +a_{s}\equiv b_{s}\operatorname{mod}n\ \ \ \ \ \ \ \ \ \ \text{for each }s\in +S. +\] +Then,% +\[ +\sum_{s\in S}a_{s}\equiv\sum_{s\in S}b_{s}\operatorname{mod}n. +\] + +\end{statement} + +Next, we shall show the following claim: + +\begin{statement} +\textit{Claim 2:} Let $n$ be an integer. Let $S$ be a finite set such that +$\left\vert S\right\vert =m+1$. For each $s\in S$, let $a_{s}$ and $b_{s}$ be +two integers. Assume that% +\begin{equation} +a_{s}\equiv b_{s}\operatorname{mod}n\ \ \ \ \ \ \ \ \ \ \text{for each }s\in +S. \label{pf.thm.ind.gen-com.sum-mod.c2.ass}% +\end{equation} +Then,% +\[ +\sum_{s\in S}a_{s}\equiv\sum_{s\in S}b_{s}\operatorname{mod}n. +\] + +\end{statement} + +[\textit{Proof of Claim 2:} We have $\left\vert S\right\vert =m+1>m\geq0$. +Hence, the set $S$ is nonempty. Thus, there exists some $t\in S$. Consider +this $t$. + +From $t\in S$, we obtain $\left\vert S\setminus\left\{ t\right\} \right\vert +=\left\vert S\right\vert -1=m$ (since $\left\vert S\right\vert =m+1$). Also, +every $s\in S\setminus\left\{ t\right\} $ satisfies $s\in S\setminus\left\{ +t\right\} \subseteq S$ and thus $a_{s}\equiv b_{s}\operatorname{mod}n$ (by +(\ref{pf.thm.ind.gen-com.sum-mod.c2.ass})). In other words, we have% +\[ +a_{s}\equiv b_{s}\operatorname{mod}n\ \ \ \ \ \ \ \ \ \ \text{for each }s\in +S\setminus\left\{ t\right\} . +\] +Hence, Claim 1 (applied to $S\setminus\left\{ t\right\} $ instead of $S$) +yields% +\begin{equation} +\sum_{s\in S\setminus\left\{ t\right\} }a_{s}\equiv\sum_{s\in S\setminus +\left\{ t\right\} }b_{s}\operatorname{mod}n. +\label{pf.thm.ind.gen-com.sum-mod.c2.pf.1}% +\end{equation} + + +But $t\in S$. 
Hence, (\ref{pf.thm.ind.gen-com.sum-mod.c2.ass}) (applied to +$s=t$) yields $a_{t}\equiv b_{t}\operatorname{mod}n$. + +Now, Proposition \ref{prop.ind.gen-com.split-off} (applied to $b_{s}$ instead +of $a_{s}$) yields% +\begin{equation} +\sum_{s\in S}b_{s}=b_{t}+\sum_{s\in S\setminus\left\{ t\right\} }b_{s}. +\label{pf.thm.ind.gen-com.sum-mod.c2.pf.3}% +\end{equation} + + +But Proposition \ref{prop.ind.gen-com.split-off} yields% +\[ +\sum_{s\in S}a_{s}=\underbrace{a_{t}}_{\equiv b_{t}\operatorname{mod}% +n}+\underbrace{\sum_{s\in S\setminus\left\{ t\right\} }a_{s}}% +_{\substack{\equiv\sum_{s\in S\setminus\left\{ t\right\} }b_{s}% +\operatorname{mod}n\\\text{(by (\ref{pf.thm.ind.gen-com.sum-mod.c2.pf.1}))}% +}}\equiv b_{t}+\sum_{s\in S\setminus\left\{ t\right\} }b_{s}=\sum_{s\in +S}b_{s}\operatorname{mod}n +\] +(by (\ref{pf.thm.ind.gen-com.sum-mod.c2.pf.3})). This proves Claim 2.] + +But Claim 2 says precisely that Theorem \ref{thm.ind.gen-com.sum-mod} holds +under the condition that $\left\vert S\right\vert =m+1$. Hence, we conclude +that Theorem \ref{thm.ind.gen-com.sum-mod} holds under the condition that +$\left\vert S\right\vert =m+1$ (since Claim 2 is proven). This completes the +induction step. Thus, Theorem \ref{thm.ind.gen-com.sum-mod} is proven by induction. +\end{proof} + +As we said, Theorem \ref{thm.ind.gen-com.sum-mod} shows that we can sum up +several congruences. Thus, we can extend our principle of substitutivity for +congruences as follows: + +\begin{statement} +\textit{Principle of substitutivity for congruences (stronger version):} Fix +an integer $n$. 
If two numbers $x$ and $x^{\prime}$ are congruent to each +other modulo $n$ (that is, $x\equiv x^{\prime}\operatorname{mod}n$), and if we +have any expression $A$ that involves only integers, addition, subtraction, +multiplication \textbf{and summation signs}, and involves the object $x$, then +we can replace this $x$ (or, more precisely, any arbitrary appearance of $x$ +in $A$) in $A$ by $x^{\prime}$; the resulting expression $A^{\prime}$ will be +congruent to $A$ modulo $n$. +\end{statement} + +For example, if $p\in\mathbb{N}$, then% +\[ +\sum_{s\in\left\{ 1,2,\ldots,p\right\} }s^{2}\left( 5-3s\right) \equiv +\sum_{s\in\left\{ 1,2,\ldots,p\right\} }s\left( 5-3s\right) +\operatorname{mod}2 +\] +(here, we have replaced the \textquotedblleft$s^{2}$\textquotedblright\ inside +the sum by \textquotedblleft$s$\textquotedblright), because every +$s\in\left\{ 1,2,\ldots,p\right\} $ satisfies $s^{2}\equiv +s\operatorname{mod}2$ (this is easy to check\footnote{\textit{Proof.} Let +$p\in\mathbb{N}$ and $s\in\left\{ 1,2,\ldots,p\right\} $. We must prove that +$s^{2}\equiv s\operatorname{mod}2$. +\par +We have $s\in\left\{ 1,2,\ldots,p\right\} $ and thus $s-1\in\left\{ +0,1,\ldots,p-1\right\} \subseteq\mathbb{N}$. Hence, +(\ref{eq.prop.ind.gen-com.n(n+1)/2.claim}) (applied to $n=s-1$) yields +$\sum_{i\in\left\{ 1,2,\ldots,s-1\right\} }i=\dfrac{\left( s-1\right) +\left( \left( s-1\right) +1\right) }{2}=\dfrac{\left( s-1\right) s}{2}$. +Hence, $\dfrac{\left( s-1\right) s}{2}$ is an integer (since $\sum +_{i\in\left\{ 1,2,\ldots,s-1\right\} }i$ is an integer). In other words, +$2\mid\left( s-1\right) s$. In other words, $2\mid s^{2}-s$ (since $\left( +s-1\right) s=s^{2}-s$). In other words, $s^{2}\equiv s\operatorname{mod}2$ +(by the definition of \textquotedblleft congruent\textquotedblright), qed.}). 
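+
+To make this example concrete, let us evaluate both sums for $p=3$ (a
+sanity check on specific numbers, not needed for the general claim): we
+have%
+\[
+\sum_{s\in\left\{ 1,2,3\right\} }s^{2}\left( 5-3s\right) =1\cdot
+2+4\cdot\left( -1\right) +9\cdot\left( -4\right) =-38
+\]
+and%
+\[
+\sum_{s\in\left\{ 1,2,3\right\} }s\left( 5-3s\right) =1\cdot2+2\cdot
+\left( -1\right) +3\cdot\left( -4\right) =-12,
+\]
+and indeed $-38\equiv-12\operatorname{mod}2$, since $-38-\left(
+-12\right) =-26$ is divisible by $2$.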
+ +\subsubsection{\label{subsect.ind.gen-com.prods}Finite products} + +Proposition \ref{prop.ind.gen-com.fgh} is a property of the addition of +numbers; it has an analogue for multiplication of numbers: + +\begin{proposition} +\label{prop.ind.gen-com.fgh*}Let $a$, $b$ and $c$ be three numbers (i.e., +elements of $\mathbb{A}$). Then, $\left( ab\right) c=a\left( bc\right) $. +\end{proposition} + +Proposition \ref{prop.ind.gen-com.fgh*} is known as the \textit{associativity +of multiplication} (in $\mathbb{A}$), and is fundamental; its proof can be +found in any textbook on the construction of the number system\footnote{For +example, Proposition \ref{prop.ind.gen-com.fgh*} is proven in \cite[Theorem +3.2.3 (7)]{Swanso18} for the case when $\mathbb{A}=\mathbb{N}$; in +\cite[Theorem 3.5.4 (7)]{Swanso18} for the case when $\mathbb{A}=\mathbb{Z}$; +in \cite[Theorem 3.6.4 (7)]{Swanso18} for the case when $\mathbb{A}% +=\mathbb{Q}$; in \cite[Theorem 3.7.13]{Swanso18} for the case when +$\mathbb{A}=\mathbb{R}$; in \cite[Theorem 3.9.3]{Swanso18} for the case when +$\mathbb{A}=\mathbb{C}$.}. + +Proposition \ref{prop.ind.gen-com.fg} also has an analogue for multiplication: + +\begin{proposition} +\label{prop.ind.gen-com.fg*}Let $a$ and $b$ be two numbers (i.e., elements of +$\mathbb{A}$). Then, $ab=ba$. 
+\end{proposition} + +Proposition \ref{prop.ind.gen-com.fg*} is known as the \textit{commutativity +of multiplication} (in $\mathbb{A}$), and again is a fundamental result whose +proofs are found in standard textbooks\footnote{For example, Proposition +\ref{prop.ind.gen-com.fg*} is proven in \cite[Theorem 3.2.3 (8)]{Swanso18} for +the case when $\mathbb{A}=\mathbb{N}$; in \cite[Theorem 3.5.4 (8)]{Swanso18} +for the case when $\mathbb{A}=\mathbb{Z}$; in \cite[Theorem 3.6.4 +(8)]{Swanso18} for the case when $\mathbb{A}=\mathbb{Q}$; in \cite[Theorem +3.7.13]{Swanso18} for the case when $\mathbb{A}=\mathbb{R}$; in \cite[Theorem +3.9.3]{Swanso18} for the case when $\mathbb{A}=\mathbb{C}$.}. + +Proposition \ref{prop.ind.gen-com.distr} has an analogue for multiplication as +well (but note that $x$ now needs to be in $\mathbb{N}$, in order to guarantee +that the powers are well-defined): + +\begin{proposition} +\label{prop.ind.gen-com.distr*}Let $x\in\mathbb{N}$. Let $y$ and $z$ be two +numbers (i.e., elements of $\mathbb{A}$). Then, $\left( yz\right) ^{x}% +=y^{x}z^{x}$. +\end{proposition} + +Proposition \ref{prop.ind.gen-com.distr*} is one of the laws of exponents, and +can easily be shown by induction on $x$ (using Proposition +\ref{prop.ind.gen-com.fg*}). + +So far in Section \ref{sect.ind.gen-com}, we have been studying \textbf{sums} +of $\mathbb{A}$-valued $S$-families (when $S$ is a finite set): We have proven +that the definition of $\sum_{s\in S}a_{s}$ given in Section +\ref{sect.sums-repetitorium} is legitimate, and we have proven several +properties of such sums. By the exact same reasoning (but with addition +replaced by multiplication), we can study \textbf{products} of $\mathbb{A}% +$-valued $S$-families. 
In particular, we can similarly prove that the +definition of $\prod_{s\in S}a_{s}$ given in Section +\ref{sect.sums-repetitorium} is legitimate, and we can prove properties of +such products that are analogous to the properties of sums proven above +(except for Proposition \ref{prop.ind.gen-com.n(n+1)/2}, which does not have +an analogue for products)\footnote{We need to be slightly careful when we +adapt our above proofs to products instead of sums: Apart from replacing +addition by multiplication everywhere, we need to: +\par +\begin{itemize} +\item replace the number $0$ by $1$ whenever it appears in a computation +inside $\mathbb{A}$ (but, of course, not when it appears as the size of a +set); +\par +\item replace every $\sum$ sign by a $\prod$ sign; +\par +\item replace \textquotedblleft let $\lambda$ be an element of $\mathbb{A}% +$\textquotedblright\ by \textquotedblleft let $\lambda$ be an element of +$\mathbb{N}$\textquotedblright\ in Theorem \ref{thm.ind.gen-com.sum(la)}; +\par +\item replace any expression of the form \textquotedblleft$\lambda +b$\textquotedblright\ by \textquotedblleft$b^{\lambda}$\textquotedblright\ in +Theorem \ref{thm.ind.gen-com.sum(la)} (so that the claim of Theorem +\ref{thm.ind.gen-com.sum(la)} becomes $\prod_{s\in S}\left( a_{s}\right) +^{\lambda}=\left( \prod_{s\in S}a_{s}\right) ^{\lambda}$) and in its proof; +\par +\item replace every reference to Proposition \ref{prop.ind.gen-com.fgh} by a +reference to Proposition \ref{prop.ind.gen-com.fgh*}; +\par +\item replace every reference to Proposition \ref{prop.ind.gen-com.fg} by a +reference to Proposition \ref{prop.ind.gen-com.fg*}; +\par +\item replace every reference to Proposition \ref{prop.ind.gen-com.distr} by a +reference to Proposition \ref{prop.ind.gen-com.distr*}. 
+\end{itemize} +\par +And, to be fully precise: We should not replace addition by multiplication +\textbf{everywhere} (e.g., we should not replace \textquotedblleft$\left\vert +S\right\vert =m+1$\textquotedblright\ by \textquotedblleft$\left\vert +S\right\vert =m\cdot1$\textquotedblright\ in the proof of Theorem +\ref{thm.ind.gen-com.shephf}), but of course only where it stands for the +addition \textbf{inside }$\mathbb{A}$.}. For example, the following theorems +are analogues of Theorem \ref{thm.ind.gen-com.sum(a+b)}, Theorem +\ref{thm.ind.gen-com.sum(la)}, Theorem \ref{thm.ind.gen-com.sum(0)}, Theorem +\ref{thm.ind.gen-com.shephf}, Theorem \ref{thm.ind.gen-com.subst1} and Theorem +\ref{thm.ind.gen-com.sum-mod}, respectively: + +\begin{theorem} +\label{thm.ind.gen-com.prod(a+b)}Let $S$ be a finite set. For every $s\in S$, +let $a_{s}$ and $b_{s}$ be elements of $\mathbb{A}$. Then,% +\[ +\prod_{s\in S}\left( a_{s}b_{s}\right) =\left( \prod_{s\in S}a_{s}\right) +\cdot\left( \prod_{s\in S}b_{s}\right) . +\] + +\end{theorem} + +\begin{theorem} +\label{thm.ind.gen-com.prod(la)}Let $S$ be a finite set. For every $s\in S$, +let $a_{s}$ be an element of $\mathbb{A}$. Also, let $\lambda$ be an element +of $\mathbb{N}$. Then,% +\[ +\prod_{s\in S}\left( a_{s}\right) ^{\lambda}=\left( \prod_{s\in S}% +a_{s}\right) ^{\lambda}. +\] + +\end{theorem} + +\begin{theorem} +\label{thm.ind.gen-com.prod(1)}Let $S$ be a finite set. Then, +\[ +\prod_{s\in S}1=1. +\] + +\end{theorem} + +\begin{theorem} +\label{thm.ind.gen-com.shephf*}Let $S$ be a finite set. Let $W$ be a finite +set. Let $f:S\rightarrow W$ be a map. Let $a_{s}$ be an element of +$\mathbb{A}$ for each $s\in S$. Then,% +\[ +\prod_{s\in S}a_{s}=\prod_{w\in W}\prod_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. +\] + +\end{theorem} + +\begin{theorem} +\label{thm.ind.gen-com.subst1*}Let $S$ and $T$ be two finite sets. Let +$f:S\rightarrow T$ be a \textbf{bijective} map. 
Let $a_{t}$ be an element of +$\mathbb{A}$ for each $t\in T$. Then,% +\[ +\prod_{t\in T}a_{t}=\prod_{s\in S}a_{f\left( s\right) }. +\] + +\end{theorem} + +\begin{theorem} +\label{thm.ind.gen-com.prod-mod}Let $n$ be an integer. Let $S$ be a finite +set. For each $s\in S$, let $a_{s}$ and $b_{s}$ be two integers. Assume that% +\[ +a_{s}\equiv b_{s}\operatorname{mod}n\ \ \ \ \ \ \ \ \ \ \text{for each }s\in +S. +\] +Then,% +\[ +\prod_{s\in S}a_{s}\equiv\prod_{s\in S}b_{s}\operatorname{mod}n. +\] + +\end{theorem} + +\subsubsection{Finitely supported (but possibly infinite) sums} + +In Section \ref{sect.sums-repetitorium}, we mentioned that a sum of the form +$\sum_{s\in S}a_{s}$ can be well-defined even when the set $S$ is not finite. +Indeed, for it to be well-defined, it suffices that \textbf{only finitely many +among the }$a_{s}$ \textbf{are nonzero} (or, more rigorously: only finitely +many $s\in S$ satisfy $a_{s}\neq0$). As we already mentioned, the sum +$\sum_{s\in S}a_{s}$ in this case is defined by discarding the zero addends +and summing the finitely many addends that remain. Let us briefly discuss such +sums (without focussing on the proofs): + +\begin{definition} +\label{def.ind.gen-com.fin-sup.def}Let $S$ be any set. An $\mathbb{A}$-valued +$S$-family $\left( a_{s}\right) _{s\in S}$ is said to be \textit{finitely +supported} if only finitely many $s\in S$ satisfy $a_{s}\neq0$. +\end{definition} + +So the sums we want to discuss are sums $\sum_{s\in S}a_{s}$ for which the set +$S$ may be infinite but the $S$-family $\left( a_{s}\right) _{s\in S}$ is +finitely supported. Let us repeat the definition of such sums in more rigorous language: + +\begin{definition} +\label{def.ind.gen-com.fin-sup.sum}Let $S$ be any set. Let $\left( +a_{s}\right) _{s\in S}$ be a finitely supported $\mathbb{A}$-valued +$S$-family. Thus, there exists a \textbf{finite} subset $T$ of $S$ such that% +\begin{equation} +\text{every }s\in S\setminus T\text{ satisfies }a_{s}=0. 
+\label{eq.def.ind.gen-com.fin-sup.sum.0}% +\end{equation} +(This is because only finitely many $s\in S$ satisfy $a_{s}\neq0$.) We then +define the sum $\sum_{s\in S}a_{s}$ to be $\sum_{s\in T}a_{s}$. (This +definition is legitimate, because Proposition +\ref{prop.ind.gen-com.fin-sup.leg} \textbf{(a)} below shows that $\sum_{s\in +T}a_{s}$ does not depend on the choice of $T$.) +\end{definition} + +This definition formalizes what we said above about making sense of +$\sum_{s\in S}a_{s}$: Namely, we discard zero addends (namely, the addends +corresponding to $s\in S\setminus T$) and only sum the finitely many addends +that remain (these are the addends corresponding to $s\in T$); thus, we get +$\sum_{s\in T}a_{s}$. Note that we are not requiring that every $s\in T$ +satisfies $a_{s}\neq0$; that is, we are not necessarily discarding +\textbf{all} the zero addends from our sum (but merely discarding enough of +them to ensure that only finitely many remain). This may appear like a strange +choice (why introduce extra freedom into the definition?), but is reasonable +from the viewpoint of constructive mathematics (where it is not always +decidable if a number is $0$ or not). + +\begin{proposition} +\label{prop.ind.gen-com.fin-sup.leg}Let $S$ be any set. Let $\left( +a_{s}\right) _{s\in S}$ be a finitely supported $\mathbb{A}$-valued $S$-family. + +\textbf{(a)} If $T$ is a finite subset of $S$ such that +(\ref{eq.def.ind.gen-com.fin-sup.sum.0}) holds, then the sum $\sum_{s\in +T}a_{s}$ does not depend on the choice of $T$. (That is, if $T_{1}$ and +$T_{2}$ are two finite subsets $T$ of $S$ satisfying +(\ref{eq.def.ind.gen-com.fin-sup.sum.0}), then $\sum_{s\in T_{1}}a_{s}% +=\sum_{s\in T_{2}}a_{s}$.) + +\textbf{(b)} If the set $S$ is finite, then the sum $\sum_{s\in S}a_{s}$ +defined in Definition \ref{def.ind.gen-com.fin-sup.sum} is identical with the +sum $\sum_{s\in S}a_{s}$ defined in Definition \ref{def.ind.gen-com.defsum1}. 
+(Thus, Definition \ref{def.ind.gen-com.fin-sup.sum} does not conflict with the +previous definition of $\sum_{s\in S}a_{s}$ for finite sets $S$.) +\end{proposition} + +Proposition \ref{prop.ind.gen-com.fin-sup.leg} is fairly easy to prove using +Corollary \ref{cor.ind.gen-com.drop0}; we leave this argument to the reader. + +Most properties of finite sums have analogues for sums of finitely supported +$\mathbb{A}$-valued $S$-families. For example, here is an analogue of Theorem +\ref{thm.ind.gen-com.sum(a+b)}: + +\begin{theorem} +\label{thm.ind.gen-com.sum(a+b).gen}Let $S$ be a set. Let $\left( +a_{s}\right) _{s\in S}$ and $\left( b_{s}\right) _{s\in S}$ be two finitely +supported $\mathbb{A}$-valued $S$-families. Then, the $\mathbb{A}$-valued +$S$-family $\left( a_{s}+b_{s}\right) _{s\in S}$ is finitely supported as +well, and we have% +\[ +\sum_{s\in S}\left( a_{s}+b_{s}\right) =\sum_{s\in S}a_{s}+\sum_{s\in +S}b_{s}. +\] + +\end{theorem} + +The proof of Theorem \ref{thm.ind.gen-com.sum(a+b).gen} is fairly simple (it +relies prominently on the fact that the union of two finite sets is finite), +and is left to the reader. + +It is also easy to state and prove analogues of Theorem +\ref{thm.ind.gen-com.sum(la)} and Theorem \ref{thm.ind.gen-com.sum(0)}. + +We can next prove (\ref{eq.sum.sheph}) in full generality (not only when $W$ +is finite): + +\begin{theorem} +\label{thm.ind.gen-com.sheph}Let $S$ be a finite set. Let $W$ be a set. Let +$f:S\rightarrow W$ be a map. Let $a_{s}$ be an element of $\mathbb{A}$ for +each $s\in S$. Then,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. 
+\]
+
+\end{theorem}
+
+Note that the sum on the right-hand side of Theorem
+\ref{thm.ind.gen-com.sheph} makes sense even when $W$ is infinite, because the
+$W$-family $\left( \sum_{\substack{s\in S;\\f\left( s\right) =w}%
+}a_{s}\right) _{w\in W}$ is finitely supported (i.e., only finitely many
+$w\in W$ satisfy $\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}\neq
+0$). The easiest way to prove Theorem \ref{thm.ind.gen-com.sheph} is probably
+by reducing it to Theorem \ref{thm.ind.gen-com.shephf} (since $f\left(
+S\right) $ is a finite subset of $W$, and every $w\in W\setminus f\left(
+S\right) $ satisfies $\sum_{\substack{s\in S;\\f\left( s\right) =w}%
+}a_{s}=\left( \text{empty sum}\right) =0$). Again, we leave the details to
+the interested reader.
+
+\begin{noncompile}
+We can now prove Theorem \ref{thm.ind.gen-com.sheph} itself:
+
+\begin{proof}
+[Proof of Theorem \ref{thm.ind.gen-com.sheph}.]Let $V$ be the subset $f\left(
+S\right) $ of $W$. (Thus, $V=f\left( S\right) =\left\{ f\left( s\right)
+\ \mid\ s\in S\right\} $.) The set $f\left( S\right) $ is finite (since the
+set $S$ is finite). In other words, the set $V$ is finite (since $V=f\left(
+S\right) $).
+
+Define a map $g:S\rightarrow V$ by%
+\[
+\left( g\left( s\right) =f\left( s\right) \text{ for each }s\in S\right)
+.
+\]
+(This map $g$ is well-defined, since each $s\in S$ satisfies $f\left(
+s\right) \in f\left( S\right) =V$.)
+
+Theorem \ref{thm.ind.gen-com.shephf} (applied to $V$ and $g$ instead of $W$
+and $f$) yields%
+\[
+\sum_{s\in S}a_{s}=\sum_{w\in V}\sum_{\substack{s\in S;\\g\left( s\right)
+=w}}a_{s}.
+\]
+
+
+(....)
+\end{proof}
+\end{noncompile}
+
+Actually, Theorem \ref{thm.ind.gen-com.sheph} can be generalized even further:
+
+\begin{theorem}
+\label{thm.ind.gen-com.sheph-gen}Let $S$ be a set. Let $W$ be a set. Let
+$f:S\rightarrow W$ be a map. Let $\left( a_{s}\right) _{s\in S}$ be a
+finitely supported $\mathbb{A}$-valued $S$-family.
Then, for each $w\in W$, +the $\mathbb{A}$-valued $\left\{ t\in S\ \mid\ f\left( t\right) =w\right\} +$-family $\left( a_{s}\right) _{s\in\left\{ t\in S\ \mid\ f\left( +t\right) =w\right\} }$ is finitely supported as well (so that the sum +$\sum_{\substack{s\in S;\\f\left( s\right) =w}}a_{s}$ is well-defined). +Furthermore, the $\mathbb{A}$-valued $W$-family $\left( \sum_{\substack{s\in +S;\\f\left( s\right) =w}}a_{s}\right) _{w\in W}$ is also finitely +supported. Finally,% +\[ +\sum_{s\in S}a_{s}=\sum_{w\in W}\sum_{\substack{s\in S;\\f\left( s\right) +=w}}a_{s}. +\] + +\end{theorem} + +It is not hard to derive this theorem from Theorem \ref{thm.ind.gen-com.sheph}% +. This theorem can be used to obtain an analogue of Theorem +\ref{thm.ind.gen-com.split2} for finitely supported $\mathbb{A}$-valued $S$-families. + +Thus, we have defined the values of certain infinite sums (although not nearly +as many infinite sums as analysis can make sense of). We can similarly define +the values of certain infinite products: In order for $\prod_{s\in S}a_{s}$ to +be well-defined, it suffices that \textbf{only finitely many among the }% +$a_{s}$ \textbf{are distinct from }$1$ (or, more rigorously: only finitely +many $s\in S$ satisfy $a_{s}\neq1$). We leave the details and properties of +this definition to the reader. + +\subsection{Two-sided induction} + +\subsubsection{The principle of two-sided induction} + +Let us now return to studying induction principles. We have seen several +induction principles that allow us to prove statements about nonnegative +integers, integers in $\mathbb{Z}_{\geq g}$ or integers in an interval. What +about proving statements about \textbf{arbitrary} integers? 
The induction +principles we have seen so far do not suffice to prove such statements +directly, since our induction steps always \textquotedblleft go +up\textquotedblright\ (in the sense that they begin by assuming that our +statement $\mathcal{A}\left( k\right) $ holds for some integers $k$, and +involve proving that it also holds for a \textbf{larger} value of $k$), but it +is impossible to traverse all the integers by starting at some integer $g$ and +going up (you will never get to $g-1$ this way). In contrast, the following +induction principle includes both an \textquotedblleft +upwards\textquotedblright\ and a \textquotedblleft downwards\textquotedblright% +\ induction step, which makes it suited for proving statements about all integers: + +\begin{theorem} +\label{thm.ind.IPg+-}Let $g\in\mathbb{Z}$. Let $\mathbb{Z}_{\leq g}$ be the +set $\left\{ g,g-1,g-2,\ldots\right\} $ (that is, the set of all integers +that are $\leq g$). + +For each $n\in\mathbb{Z}$, let $\mathcal{A}\left( n\right) $ be a logical statement. + +Assume the following: + +\begin{statement} +\textit{Assumption 1:} The statement $\mathcal{A}\left( g\right) $ holds. +\end{statement} + +\begin{statement} +\textit{Assumption 2:} If $m\in\mathbb{Z}_{\geq g}$ is such that +$\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( m+1\right) $ +also holds. +\end{statement} + +\begin{statement} +\textit{Assumption 3:} If $m\in\mathbb{Z}_{\leq g}$ is such that +$\mathcal{A}\left( m\right) $ holds, then $\mathcal{A}\left( m-1\right) $ +also holds. +\end{statement} + +Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{Z}$. +\end{theorem} + +Theorem \ref{thm.ind.IPg+-} is known as the \textit{principle of two-sided +induction}. 
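+
+Schematically, the three assumptions of Theorem \ref{thm.ind.IPg+-} cover all
+integers by starting at $g$ and walking in both directions:%
+\[
+\cdots\Longleftarrow\mathcal{A}\left( g-2\right) \Longleftarrow
+\mathcal{A}\left( g-1\right) \Longleftarrow\mathcal{A}\left( g\right)
+\Longrightarrow\mathcal{A}\left( g+1\right) \Longrightarrow\mathcal{A}%
+\left( g+2\right) \Longrightarrow\cdots,
+\]
+where each \textquotedblleft$\Longrightarrow$\textquotedblright\ is an
+application of Assumption 2, and each \textquotedblleft$\Longleftarrow
+$\textquotedblright\ is an application of Assumption 3.
+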
Roughly speaking, a proof using Theorem \ref{thm.ind.IPg+-} will +involve two induction steps: one that \textquotedblleft goes +up\textquotedblright\ (proving that Assumption 2 holds), and one that +\textquotedblleft goes down\textquotedblright\ (proving that Assumption 3 +holds). However, in practice, Theorem \ref{thm.ind.IPg+-} is seldom used, +which is why we shall not make any conventions about how to write proofs using +Theorem \ref{thm.ind.IPg+-}. We will only give one example for such a proof. + +Let us first prove Theorem \ref{thm.ind.IPg+-} itself: + +\begin{proof} +[Proof of Theorem \ref{thm.ind.IPg+-}.]Assumptions 1 and 2 of Theorem +\ref{thm.ind.IPg+-} are exactly Assumptions 1 and 2 of Theorem +\ref{thm.ind.IPg}. Hence, Assumptions 1 and 2 of Theorem \ref{thm.ind.IPg} +hold (since Assumptions 1 and 2 of Theorem \ref{thm.ind.IPg+-} hold). Thus, +Theorem \ref{thm.ind.IPg} shows that% +\begin{equation} +\mathcal{A}\left( n\right) \text{ holds for each }n\in\mathbb{Z}_{\geq g}. +\label{pf.thm.ind.IPg+-.pospart}% +\end{equation} + + +On the other hand, for each $n\in\mathbb{Z}$, we define a logical statement +$\mathcal{B}\left( n\right) $ by $\mathcal{B}\left( n\right) +=\mathcal{A}\left( 2g-n\right) $. We shall now consider the Assumptions A +and B of Corollary \ref{cor.ind.IPg.renamed}. + +The definition of $\mathcal{B}\left( g\right) $ yields $\mathcal{B}\left( +g\right) =\mathcal{A}\left( 2g-g\right) =\mathcal{A}\left( g\right) $ +(since $2g-g=g$). Hence, the statement $\mathcal{B}\left( g\right) $ holds +(since the statement $\mathcal{A}\left( g\right) $ holds (by Assumption 1)). +In other words, Assumption A is satisfied. + +Next, let $p\in\mathbb{Z}_{\geq g}$ be such that $\mathcal{B}\left( p\right) +$ holds. We shall show that $\mathcal{B}\left( p+1\right) $ holds. + +Indeed, we have $\mathcal{B}\left( p\right) =\mathcal{A}\left( 2g-p\right) +$ (by the definition of $\mathcal{B}\left( p\right) $). 
Thus, $\mathcal{A}%
+\left( 2g-p\right) $ holds (since $\mathcal{B}\left( p\right) $ holds).
+But $p\in\mathbb{Z}_{\geq g}$; hence, $p$ is an integer that is $\geq g$.
+Thus, $p\geq g$, so that $2g-\underbrace{p}_{\geq g}\leq2g-g=g$. Hence, $2g-p$
+is an integer that is $\leq g$. In other words, $2g-p\in\mathbb{Z}_{\leq g}$.
+Therefore, Assumption 3 (applied to $m=2g-p$) shows that $\mathcal{A}\left(
+2g-p-1\right) $ also holds (since $\mathcal{A}\left( 2g-p\right) $ holds).
+But the definition of $\mathcal{B}\left( p+1\right) $ yields $\mathcal{B}%
+\left( p+1\right) =\mathcal{A}\left( \underbrace{2g-\left( p+1\right)
+}_{=2g-p-1}\right) =\mathcal{A}\left( 2g-p-1\right) $. Hence,
+$\mathcal{B}\left( p+1\right) $ holds (since $\mathcal{A}\left(
+2g-p-1\right) $ holds).
+
+Now, forget that we fixed $p$. We thus have shown that if $p\in\mathbb{Z}%
+_{\geq g}$ is such that $\mathcal{B}\left( p\right) $ holds, then
+$\mathcal{B}\left( p+1\right) $ also holds. In other words, Assumption B is satisfied.
+
+We now have shown that both Assumptions A and B are satisfied. Hence,
+Corollary \ref{cor.ind.IPg.renamed} shows that%
+\begin{equation}
+\mathcal{B}\left( n\right) \text{ holds for each }n\in\mathbb{Z}_{\geq g}.
+\label{pf.thm.ind.IPg+-.negpart1}%
+\end{equation}
+
+
+Now, let $n\in\mathbb{Z}$. We shall prove that $\mathcal{A}\left( n\right) $ holds.
+
+Indeed, we have either $n\geq g$ or $n<g$. In the former case, we have
+$n\in\mathbb{Z}_{\geq g}$, and thus $\mathcal{A}\left( n\right) $ holds (by
+(\ref{pf.thm.ind.IPg+-.pospart})). Hence, for the rest of this proof, we WLOG
+assume that $n<g$. Thus, $2g-\underbrace{n}_{<g}>2g-g=g$, so that $2g-n$ is an
+integer that is $\geq g$. In other words, $2g-n\in\mathbb{Z}_{\geq g}$. Hence,
+(\ref{pf.thm.ind.IPg+-.negpart1}) (applied to $2g-n$ instead of $n$) shows
+that $\mathcal{B}\left( 2g-n\right) $ holds. But the definition of
+$\mathcal{B}\left( 2g-n\right) $ yields $\mathcal{B}\left( 2g-n\right)
+=\mathcal{A}\left( \underbrace{2g-\left( 2g-n\right) }_{=n}\right)
+=\mathcal{A}\left( n\right) $. Hence, $\mathcal{A}\left( n\right) $ holds
+(since $\mathcal{B}\left( 2g-n\right) $ holds).
+
+Now, forget that we fixed $n$. We thus have shown that $\mathcal{A}\left(
+n\right) $ holds for each $n\in\mathbb{Z}$. This proves Theorem
+\ref{thm.ind.IPg+-}.
+\end{proof}
+
+\subsubsection{Division with remainder}
+
+As the promised example of a proof by two-sided induction, let us now prove
+the classical fact that any integer can be divided with remainder by any
+positive integer:
+
+\begin{theorem}
+\label{thm.ind.quo-rem}Let $N$ be a positive integer. Let $n\in\mathbb{Z}$.
+Then, there is a unique pair $\left( q,r\right) \in\mathbb{Z}\times\left\{
+0,1,\ldots,N-1\right\} $ such that $n=qN+r$.
+\end{theorem}
+
+We shall prove Theorem \ref{thm.ind.quo-rem} by splitting it into an
+existence part and a uniqueness part. The existence part is the following
+proposition (whose proof is where two-sided induction will be used):
+
+\begin{proposition}
+\label{prop.ind.quo-rem-ex}Let $N$ be a positive integer. Let $n\in
+\mathbb{Z}$. Then, there exist $q\in\mathbb{Z}$ and $r\in\left\{
+0,1,\ldots,N-1\right\} $ such that $n=qN+r$.
+\end{proposition}
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.quo-rem-ex}.]For each $n\in\mathbb{Z}$,
+let $\mathcal{A}\left( n\right) $ be the statement%
+\[
+\left( \text{there exist }q\in\mathbb{Z}\text{ and }r\in\left\{
+0,1,\ldots,N-1\right\} \text{ such that }n=qN+r\right) .
+\]
+We shall prove that $\mathcal{A}\left( n\right) $ holds for each
+$n\in\mathbb{Z}$, using Theorem \ref{thm.ind.IPg+-} (applied to $g=0$).
+
+The statement $\mathcal{A}\left( 0\right) $ holds, since $0=0\cdot N+0$ and
+$0\in\left\{ 0,1,\ldots,N-1\right\} $ (because $N-1\geq0$). In other words,
+Assumption 1 of Theorem \ref{thm.ind.IPg+-} is satisfied.
+
+Next, let $m\in\mathbb{Z}_{\geq0}$ be such that $\mathcal{A}\left( m\right)
+$ holds. Thus, there exist $q\in\mathbb{Z}$ and $r\in\left\{
+0,1,\ldots,N-1\right\} $ such that $m=qN+r$. Consider these $q$ and $r$. If
+$r<N-1$, then $m+1=qN+\left( r+1\right) $ with $r+1\in\left\{
+0,1,\ldots,N-1\right\} $. If $r=N-1$, then $m+1=qN+\left( N-1\right)
++1=\left( q+1\right) N+0$ with $0\in\left\{ 0,1,\ldots,N-1\right\} $. In
+either case, $\mathcal{A}\left( m+1\right) $ holds. In other words,
+Assumption 2 of Theorem \ref{thm.ind.IPg+-} is satisfied.
+
+Finally, let $m\in\mathbb{Z}_{\leq0}$ be such that $\mathcal{A}\left(
+m\right) $ holds. Thus, there exist $q\in\mathbb{Z}$ and $r\in\left\{
+0,1,\ldots,N-1\right\} $ such that $m=qN+r$. Consider these $q$ and $r$. If
+$r>0$, then $m-1=qN+\left( r-1\right) $ with $r-1\in\left\{
+0,1,\ldots,N-1\right\} $. If $r=0$, then $m-1=qN-1=\left( q-1\right)
+N+\left( N-1\right) $ with $N-1\in\left\{ 0,1,\ldots,N-1\right\} $. In
+either case, $\mathcal{A}\left( m-1\right) $ holds. In other words,
+Assumption 3 of Theorem \ref{thm.ind.IPg+-} is satisfied.
+
+Hence, Theorem \ref{thm.ind.IPg+-} (applied to $g=0$) shows that
+$\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{Z}$. This proves
+Proposition \ref{prop.ind.quo-rem-ex}.
+\end{proof}
+
+The uniqueness part of Theorem \ref{thm.ind.quo-rem} shall follow from the
+following lemma:
+
+\begin{lemma}
+\label{lem.ind.quo-rem-uni}Let $N$ be a positive integer. Let $\left(
+q_{1},r_{1}\right) $ and $\left( q_{2},r_{2}\right) $ be two pairs in
+$\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ satisfying $q_{1}%
+N+r_{1}=q_{2}N+r_{2}$. Then, $\left( q_{1},r_{1}\right) =\left(
+q_{2},r_{2}\right) $.
+\end{lemma}
+
+\begin{proof}
+[Proof of Lemma \ref{lem.ind.quo-rem-uni}.]We have $r_{1}\in\left\{
+0,1,\ldots,N-1\right\} $, so that $r_{1}\geq0$. Also, $r_{2}\in\left\{
+0,1,\ldots,N-1\right\} $, so that $r_{2}\leq N-1$. Hence, $r_{2}-r_{1}%
+\leq\left( N-1\right) -0=N-1<N$ (since $r_{2}\leq N-1$ and $r_{1}\geq0$).
+
+We shall first show that $q_{1}=q_{2}$. Indeed, assume the contrary. Thus,
+$q_{1}\neq q_{2}$. We WLOG assume that $q_{1}\geq q_{2}$ (otherwise, we can
+simply swap the pair $\left( q_{1},r_{1}\right) $ with the pair $\left(
+q_{2},r_{2}\right) $). Combining $q_{1}\geq q_{2}$ with $q_{1}\neq q_{2}$,
+we obtain $q_{1}>q_{2}$, so that $q_{1}-q_{2}>0$. Hence, $q_{1}-q_{2}\geq1$ (since
+$q_{1}-q_{2}$ is an integer). Therefore, $q_{1}-q_{2}-1\geq0$, so that
+$N\left( q_{1}-q_{2}-1\right) \geq0$ (because $N>0$ and $q_{1}-q_{2}-1\geq0$).
+
+But $q_{1}N+r_{1}=q_{2}N+r_{2}$, so that $q_{2}N+r_{2}=q_{1}N+r_{1}$. Hence,%
+\[
+r_{2}-r_{1}=q_{1}N-q_{2}N=N\left( q_{1}-q_{2}\right) =\underbrace{N\left(
+q_{1}-q_{2}-1\right) }_{\geq0}+N\geq N.
+\]
+This contradicts $r_{2}-r_{1}<N$. This contradiction shows that our
+assumption was false. Hence, $q_{1}=q_{2}$ is proven. Subtracting the equality
+$q_{1}N=q_{2}N$ (which follows from $q_{1}=q_{2}$) from the equality
+$q_{1}N+r_{1}=q_{2}N+r_{2}$, we obtain $r_{1}=r_{2}$. Combining $q_{1}=q_{2}$
+with $r_{1}=r_{2}$, we obtain $\left( q_{1},r_{1}\right) =\left(
+q_{2},r_{2}\right) $. This proves Lemma \ref{lem.ind.quo-rem-uni}.
+\end{proof}
+
+\begin{noncompile}
+\begin{lemma}
+\label{lem.ind.divi-geq}Let $a$ and $d$ be two integers such that $a>0$ and
+$d\mid a$. Then, $a\geq d$.
+\end{lemma} + +\begin{proof} +[Proof of Lemma \ref{lem.ind.divi-geq}.]We must prove that $a\geq d$. If +$d\leq0$, then this is obvious (because if $d\leq0$, then $a>0\geq d$). Hence, +for the rest of this proof, we WLOG assume that we don't have $d\leq0$. Thus, +we have $d>0$. + +We have $d\mid a$. In other words, there exists an integer $w$ such that +$a=dw$ (by the definition of \textquotedblleft divides\textquotedblright). +Consider this $w$. If we had $w\leq0$, then we would have $dw\leq0$ (because +$d>0$ and $w\leq0$), which would contradict $dw=a>0$. Thus, we cannot have +$w\leq0$. Hence, we have $w>0$. Thus, $w\geq1$ (since $w$ is an integer). In +other words, $w-1\geq0$. From $d>0$ and $w-1\geq0$, we obtain $d\left( +w-1\right) \geq0$. Now, $a=dw=\underbrace{d\left( w-1\right) }_{\geq +0}+d\geq d$. This proves Lemma \ref{lem.ind.divi-geq}. +\end{proof} +\end{noncompile} + +\begin{proof} +[Proof of Theorem \ref{thm.ind.quo-rem}.]Proposition \ref{prop.ind.quo-rem-ex} +shows that there exist $q\in\mathbb{Z}$ and $r\in\left\{ 0,1,\ldots +,N-1\right\} $ such that $n=qN+r$. Consider these $q$ and $r$, and denote +them by $q_{0}$ and $r_{0}$. Thus, $q_{0}\in\mathbb{Z}$ and $r_{0}\in\left\{ +0,1,\ldots,N-1\right\} $ and $n=q_{0}N+r_{0}$. From $q_{0}\in\mathbb{Z}$ and +$r_{0}\in\left\{ 0,1,\ldots,N-1\right\} $, we obtain $\left( q_{0}% +,r_{0}\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $. Hence, +there exists \textbf{at least one} pair $\left( q,r\right) \in +\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ such that $n=qN+r$ (namely, +$\left( q,r\right) =\left( q_{0},r_{0}\right) $). + +Now, let $\left( q_{1},r_{1}\right) $ and $\left( q_{2},r_{2}\right) $ be +two pairs $\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots +,N-1\right\} $ such that $n=qN+r$. We shall prove that $\left( q_{1}% +,r_{1}\right) =\left( q_{2},r_{2}\right) $. 
+ +We have assumed that $\left( q_{1},r_{1}\right) $ is a pair $\left( +q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ such that +$n=qN+r$. In other words, $\left( q_{1},r_{1}\right) $ is a pair in +$\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ and satisfies +$n=q_{1}N+r_{1}$. Similarly, $\left( q_{2},r_{2}\right) $ is a pair in +$\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ and satisfies +$n=q_{2}N+r_{2}$. + +Hence, $q_{1}N+r_{1}=n=q_{2}N+r_{2}$. Thus, Lemma \ref{lem.ind.quo-rem-uni} +yields $\left( q_{1},r_{1}\right) =\left( q_{2},r_{2}\right) $. + +Let us now forget that we fixed $\left( q_{1},r_{1}\right) $ and $\left( +q_{2},r_{2}\right) $. We thus have shown that if $\left( q_{1},r_{1}\right) +$ and $\left( q_{2},r_{2}\right) $ are two pairs $\left( q,r\right) +\in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ such that $n=qN+r$, then +$\left( q_{1},r_{1}\right) =\left( q_{2},r_{2}\right) $. In other words, +any two pairs $\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots +,N-1\right\} $ such that $n=qN+r$ must be equal. In other words, there exists +\textbf{at most one} pair $\left( q,r\right) \in\mathbb{Z}\times\left\{ +0,1,\ldots,N-1\right\} $ such that $n=qN+r$. Since we also know that there +exists \textbf{at least one} such pair, we can therefore conclude that there +exists \textbf{exactly one} such pair. In other words, there is a unique pair +$\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ +such that $n=qN+r$. This proves Theorem \ref{thm.ind.quo-rem}. +\end{proof} + +\begin{definition} +Let $N$ be a positive integer. Let $n\in\mathbb{Z}$. Theorem +\ref{thm.ind.quo-rem} says that there is a unique pair $\left( q,r\right) +\in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ such that $n=qN+r$. +Consider this pair $\left( q,r\right) $. 
Then, $q$ is called the
+\textit{quotient of the division of }$n$ \textit{by }$N$ (or the
+\textit{quotient obtained when }$n$ \textit{is divided by }$N$), whereas $r$
+is called the \textit{remainder of the division of }$n$ \textit{by }$N$ (or
+the \textit{remainder obtained when }$n$ \textit{is divided by }$N$).
+\end{definition}
+
+For example, the quotient of the division of $7$ by $3$ is $2$, whereas the
+remainder of the division of $7$ by $3$ is $1$ (because $\left( 2,1\right) $
+is a pair in $\mathbb{Z}\times\left\{ 0,1,2\right\} $ such that
+$7=2\cdot3+1$).
+
+The following corollary collects some basic properties of remainders:
+
+\begin{corollary}
+\label{cor.ind.quo-rem.remmod}Let $N$ be a positive integer. Let
+$n\in\mathbb{Z}$. Let $n\%N$ denote the remainder of the division of $n$ by
+$N$.
+
+\textbf{(a)} Then, $n\%N\in\left\{ 0,1,\ldots,N-1\right\} $ and $n\%N\equiv
+n\operatorname{mod}N$.
+
+\textbf{(b)} We have $N\mid n$ if and only if $n\%N=0$.
+
+\textbf{(c)} Let $c\in\left\{ 0,1,\ldots,N-1\right\} $ be such that $c\equiv
+n\operatorname{mod}N$. Then, $c=n\%N$.
+\end{corollary}
+
+\begin{proof}
+[Proof of Corollary \ref{cor.ind.quo-rem.remmod}.]Theorem
+\ref{thm.ind.quo-rem} says that there is a unique pair $\left( q,r\right)
+\in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $ such that $n=qN+r$.
+Consider this pair $\left( q,r\right) $. Then, the remainder of the division
+of $n$ by $N$ is $r$ (because this is how this remainder was defined). In
+other words, $n\%N$ is $r$ (since $n\%N$ is the remainder of the division of
+$n$ by $N$). Thus, $n\%N=r$. But $N\mid qN$ (since $q$ is an integer), so that
+$qN\equiv0\operatorname{mod}N$. Hence, $\underbrace{qN}_{\equiv
+0\operatorname{mod}N}+r\equiv0+r=r\operatorname{mod}N$. Hence, $r\equiv
+qN+r=n\operatorname{mod}N$, so that $n\%N=r\equiv n\operatorname{mod}N$.
+Furthermore, $n\%N=r\in\left\{ 0,1,\ldots,N-1\right\} $ (since $\left(
+q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $).
This +completes the proof of Corollary \ref{cor.ind.quo-rem.remmod} \textbf{(a)}. + +\textbf{(b)} We have the following implication:% +\begin{equation} +\left( N\mid n\right) \Longrightarrow\left( n\%N=0\right) . +\label{pf.cor.ind.quo-rem.remmod.1}% +\end{equation} + + +[\textit{Proof of (\ref{pf.cor.ind.quo-rem.remmod.1}):} Assume that $N\mid n$. +We must prove that $n\%N=0$. + +We have $N\mid n$. In other words, there exists some integer $w$ such that +$n=Nw$. Consider this $w$. + +We have $N-1\in\mathbb{N}$ (since $N$ is a positive integer), thus +$0\in\left\{ 0,1,\ldots,N-1\right\} $. From $w\in\mathbb{Z}$ and +$0\in\left\{ 0,1,\ldots,N-1\right\} $, we obtain $\left( w,0\right) +\in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $. Also, +$wN+0=wN=Nw=n=qN+r$. Hence, Lemma \ref{lem.ind.quo-rem-uni} (applied to +$\left( q_{1},r_{1}\right) =\left( w,0\right) $ and $\left( q_{2}% +,r_{2}\right) =\left( q,r\right) $) yields $\left( w,0\right) =\left( +q,r\right) $. In other words, $w=q$ and $0=r$. Hence, $r=0$, so that +$n\%N=r=0$. This proves the implication (\ref{pf.cor.ind.quo-rem.remmod.1}).] + +Next, we have the following implication:% +\begin{equation} +\left( n\%N=0\right) \Longrightarrow\left( N\mid n\right) . +\label{pf.cor.ind.quo-rem.remmod.2}% +\end{equation} + -The rough idea behind the definition of a polynomial is that a polynomial with -rational coefficients should be a \textquotedblleft formal -expression\textquotedblright\ which is built out of rational numbers, an -\textquotedblleft indeterminate\textquotedblright\ $X$ as well as addition, -subtraction and multiplication signs, such as $X^{4}-27X+\dfrac{3}{2}$ or -$-X^{3}+2X+1$ or $\dfrac{1}{3}\left( X-3\right) \cdot X^{2}$ or -$X^{4}+7X^{3}\left( X-2\right) $ or $-15$. We have not explicitly allowed -powers, but we understand $X^{n}$ to mean the product $\underbrace{XX\cdots -X}_{n\text{ times}}$ (or $1$ when $n=0$). 
Notice that division is not allowed, -so we cannot get $\dfrac{X}{X+1}$ (but we can get $\dfrac{3}{2}X$, because -$\dfrac{3}{2}$ is a rational number). Notice also that a polynomial can be a -single rational number, since we never said that $X$ must necessarily be used; -for instance, $-15$ and $0$ are polynomials. +[\textit{Proof of (\ref{pf.cor.ind.quo-rem.remmod.2}):} Assume that $n\%N=0$. +We must prove that $N\mid n$. -This is, of course, not a valid definition. One problem with it that it does -not explain what a \textquotedblleft formal expression\textquotedblright\ is. -For starters, we want an expression that is well-defined -- i.e., into that we -can substitute a rational number for $X$ and obtain a valid term. For example, -$X-+\cdot5$ is not well-defined, so it does not fit our bill; neither is the -\textquotedblleft empty expression\textquotedblright. Furthermore, when do we -want two \textquotedblleft formal expressions\textquotedblright\ to be viewed -as one and the same polynomial? Do we want to equate $X\left( X+2\right) $ -with $X^{2}+2X$ ? Do we want to equate $0X^{3}+2X+1$ with $2X+1$ ? The answer -is \textquotedblleft yes\textquotedblright\ both times, but a general rule is -not easy to give if we keep talking of \textquotedblleft formal -expressions\textquotedblright. +We have $n=qN+\underbrace{r}_{=n\%N=0}=qN$. Thus, $N\mid n$. This proves the +implication (\ref{pf.cor.ind.quo-rem.remmod.2}).] -We \textit{could} define two polynomials $p\left( X\right) $ and $q\left( -X\right) $ to be equal if and only if, for every number $\alpha\in\mathbb{Q}% -$, the values $p\left( \alpha\right) $ and $q\left( \alpha\right) $ -(obtained by substituting $\alpha$ for $X$ in $p$ and in $q$, respectively) -are equal. This would be tantamount to treating polynomials as -\textit{functions}: it would mean that we identify a polynomial $p\left( -X\right) $ with the function $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha\mapsto -p\left( \alpha\right) $. 
Such a definition would work well as long as we -would do only rather basic things with it\footnote{And some authors, such as -Axler in \cite[Chapter 4]{Axler}, do use this definition.}, but as soon as we -would try to go deeper, we would encounter technical issues which would make -it inadequate and painful\footnote{Here are the three most important among -these issues: -\par -\begin{itemize} -\item One of the strengths of polynomials is that we can evaluate them not -only at numbers, but also at many other things, e.g., at square matrices: -Evaluating the polynomial $X^{2}-3X$ at the square matrix $\left( -\begin{array} -[c]{cc}% -1 & 3\\ --1 & 2 -\end{array} -\right) $ gives $\left( -\begin{array} -[c]{cc}% -1 & 3\\ --1 & 2 -\end{array} -\right) ^{2}-3\left( -\begin{array} -[c]{cc}% -1 & 3\\ --1 & 2 -\end{array} -\right) =\left( -\begin{array} -[c]{cc}% --5 & 0\\ -0 & -5 -\end{array} -\right) $. However, a function must have a well-defined domain, and does not -make sense outside of this domain. So, if the polynomial $X^{2}-3X$ is -regarded as the function $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha -\mapsto\alpha^{2}-3\alpha$, then it makes no sense to evaluate this polynomial -at the matrix $\left( -\begin{array} -[c]{cc}% -1 & 3\\ --1 & 2 -\end{array} -\right) $, just because this matrix does not lie in the domain $\mathbb{Q}$ -of the function. We could, of course, extend the domain of the function to -(say) the set of square matrices over $\mathbb{Q}$, but then we would still -have the same problem with other things that we want to evaluate polynomials -at. At some point we want to be able to evaluate polynomials at functions and -at other polynomials, and if we would try to achieve this by extending the -domain, we would have to do this over and over, because each time we extend -the domain, we get even more polynomials to evaluate our polynomials at; thus, -the definition would be eternally \textquotedblleft hunting its own -tail\textquotedblright! 
(We could resolve this difficulty by defining -polynomials as \textit{natural transformations} in the sense of category -theory. I do not want to even go into this definition here, as it would take -several pages to properly introduce. At this point, it is not worth the -hassle.) -\par -\item Let $p\left( X\right) $ be a polynomial with real coefficients. Then, -it should be obvious that $p\left( X\right) $ can also be viewed as a -polynomial with complex coefficients: For instance, if $p\left( X\right) $ -was defined as $3X+\dfrac{7}{2}X\left( X-1\right) $, then we can view the -numbers $3$, $\dfrac{7}{2}$ and $-1$ appearing in its definition as complex -numbers, and thus get a polynomial with complex coefficients. But wait! What -if two polynomials $p\left( X\right) $ and $q\left( X\right) $ are equal -when viewed as polynomials with real coefficients, but when viewed as -polynomials with complex coefficients become distinct (because when we view -them as polynomials with complex coefficients, their domains become extended, -and a new complex $\alpha$ might perhaps no longer satisfy $p\left( -\alpha\right) =q\left( \alpha\right) $ )? This does not actually happen, -but ruling this out is not obvious if you regard polynomials as functions. -\par -\item (This requires some familiarity with finite fields:) Treating -polynomials as functions works reasonably well for polynomials with integer, -rational, real and complex coefficients (as long as one is not too demanding). -But we will eventually want to consider polynomials with coefficients in any -arbitrary commutative ring $\mathbb{K}$. An example for a commutative ring -$\mathbb{K}$ is the finite field $\mathbb{F}_{p}$ with $p$ elements, where $p$ -is a prime. (This finite field $\mathbb{F}_{p}$ is better known as the ring of -integers modulo $p$.) 
If we define polynomials with coefficients in -$\mathbb{F}_{p}$ as functions $\mathbb{F}_{p}\rightarrow\mathbb{F}_{p}$, then -we really run into problems; for example, the polynomials $X$ and $X^{p}$ over -this field become identical as functions! -\end{itemize} -}. Also, if we equated polynomials with the functions they describe, then we -would waste the word \textquotedblleft polynomial\textquotedblright\ on a -concept (a function described by a polynomial) that already has a word for it -(namely, \textit{polynomial function}). +Combining the two implications (\ref{pf.cor.ind.quo-rem.remmod.1}) and +(\ref{pf.cor.ind.quo-rem.remmod.2}), we obtain the logical equivalence +$\left( N\mid n\right) \Longleftrightarrow\left( n\%N=0\right) $. In other +words, we have $N\mid n$ if and only if $n\%N=0$. This proves Corollary +\ref{cor.ind.quo-rem.remmod} \textbf{(b)}. -The preceding paragraphs should have convinced you that it is worth defining -\textquotedblleft polynomials\textquotedblright\ in a way that, on the one -hand, conveys the concept that they are more \textquotedblleft formal -expressions\textquotedblright\ than \textquotedblleft -functions\textquotedblright, but on the other hand, is less nebulous than -\textquotedblleft formal expression\textquotedblright. Here is one such definition: +\textbf{(c)} We have $c\equiv n\operatorname{mod}N$. In other words, $N\mid +c-n$. In other words, there exists some integer $w$ such that $c-n=Nw$. +Consider this $w$. + +From $-w\in\mathbb{Z}$ and $c\in\left\{ 0,1,\ldots,N-1\right\} $, we obtain +$\left( -w,c\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,N-1\right\} $. +Also, from $c-n=Nw$, we obtain $n=c-Nw=\left( -w\right) N+c$, so that +$\left( -w\right) N+c=n=qN+r$. Hence, Lemma \ref{lem.ind.quo-rem-uni} +(applied to $\left( q_{1},r_{1}\right) =\left( -w,c\right) $ and $\left( +q_{2},r_{2}\right) =\left( q,r\right) $) yields $\left( -w,c\right) +=\left( q,r\right) $. In other words, $-w=q$ and $c=r$. Hence, $c=r=n\%N$. 
+This proves Corollary \ref{cor.ind.quo-rem.remmod} \textbf{(c)}. +\end{proof} + +Note that parts \textbf{(a)} and \textbf{(c)} of Corollary +\ref{cor.ind.quo-rem.remmod} (taken together) characterize the remainder +$n\%N$ as the unique element of $\left\{ 0,1,\ldots,N-1\right\} $ that is +congruent to $n$ modulo $N$. Corollary \ref{cor.ind.quo-rem.remmod} +\textbf{(b)} provides a simple algorithm to check whether a given integer $n$ +is divisible by a given positive integer $N$; namely, it suffices to compute +the remainder $n\%N$ and check whether $n\%N=0$. + +Let us further illustrate the usefulness of Theorem \ref{thm.ind.quo-rem} by +proving a fundamental property of odd numbers. Recall the following standard definitions: \begin{definition} -\label{def.polynomial-univar}\textbf{(a)} A \textit{univariate polynomial with -rational coefficients} means a sequence $\left( p_{0},p_{1},p_{2}% -,\ldots\right) \in\mathbb{Q}^{\infty}$ of elements of $\mathbb{Q}$ such that% +Let $n\in\mathbb{Z}$. + +\textbf{(a)} We say that the integer $n$ is \textit{even} if and only if $n$ +is divisible by $2$. + +\textbf{(b)} We say that the integer $n$ is \textit{odd} if and only if $n$ is +not divisible by $2$. +\end{definition} + +This definition shows that any integer $n$ is either even or odd (but not both +at the same time). + +It is clear that an integer $n$ is even if and only if it can be written in +the form $n=2m$ for some $m\in\mathbb{Z}$. Moreover, this $m$ is unique +(because $n=2m$ implies $m=n/2$). Let us prove a similar property for odd numbers: + +\begin{proposition} +\label{prop.ind.quo-rem.odd}Let $n\in\mathbb{Z}$. + +\textbf{(a)} The integer $n$ is odd if and only if $n$ can be written in the +form $n=2m+1$ for some $m\in\mathbb{Z}$. + +\textbf{(b)} This $m$ is unique if it exists. (That is, any two integers +$m\in\mathbb{Z}$ satisfying $n=2m+1$ must be equal.) 
+\end{proposition}
+
+We shall use Theorem \ref{thm.ind.quo-rem} several times in the proof below
+(far more often than necessary), mostly to illustrate how it can be applied.
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.quo-rem.odd}.]\textbf{(a)} Let us first
+prove the logical implication%
 \begin{equation}
-\text{all but finitely many }k\in\mathbb{N}\text{ satisfy }p_{k}=0.
-\label{eq.def.polynomial-univar.finite}%
+\left( n\text{ is odd}\right) \ \Longrightarrow\ \left( \text{there exists
+an }m\in\mathbb{Z}\text{ such that }n=2m+1\right) .
+\label{pf.prop.ind.quo-rem.odd.a.1}%
 \end{equation}
-Here, the phrase \textquotedblleft all but finitely many $k\in\mathbb{N}$
-satisfy $p_{k}=0$\textquotedblright\ means \textquotedblleft there exists some
-finite subset $J$ of $\mathbb{N}$ such that every $k\in\mathbb{N}\setminus J$
-satisfies $p_{k}=0$\textquotedblright. (See Definition \ref{def.allbutfin} for
-the general definition of \textquotedblleft all but finitely
-many\textquotedblright, and Section \ref{sect.infperm} for some practice with
-this concept.) More concretely, the condition
-(\ref{eq.def.polynomial-univar.finite}) can be rewritten as follows: The
-sequence $\left( p_{0},p_{1},p_{2},\ldots\right) $ contains only zeroes from
-some point on (i.e., there exists some $N\in\mathbb{N}$ such that
-$p_{N}=p_{N+1}=p_{N+2}=\cdots=0$).
-For the remainder of this definition, \textquotedblleft univariate polynomial
-with rational coefficients\textquotedblright\ will be abbreviated as
-\textquotedblleft polynomial\textquotedblright.
-For example, the sequences $\left( 0,0,0,\ldots\right) $, $\left( -1,3,5,0,0,0,\ldots\right) $, $\left( 4,0,-\dfrac{2}{3},5,0,0,0,\ldots -\right) $, $\left( 0,-1,\dfrac{1}{2},0,0,0,\ldots\right) $ (where the -\textquotedblleft$\ldots$\textquotedblright\ stand for infinitely many zeroes) -are polynomials, but the sequence $\left( 1,1,1,\ldots\right) $ (where the -\textquotedblleft$\ldots$\textquotedblright\ stands for infinitely many $1$'s) -is not (since it does not satisfy (\ref{eq.def.polynomial-univar.finite})). +[\textit{Proof of (\ref{pf.prop.ind.quo-rem.odd.a.1}):} Assume that $n$ is +odd. We must prove that there exists an $m\in\mathbb{Z}$ such that $n=2m+1$. -So we have defined a polynomial as an infinite sequence of rational numbers -with a certain property. So far, this does not seem to reflect any intuition -of polynomials as \textquotedblleft formal expressions\textquotedblright. -However, we shall soon (namely, in Definition \ref{def.polynomial-univar} -\textbf{(j)}) identify the polynomial $\left( p_{0},p_{1},p_{2}% -,\ldots\right) \in\mathbb{Q}^{\infty}$ with the \textquotedblleft formal -expression\textquotedblright\ $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ (this is an -infinite sum, but due to (\ref{eq.def.polynomial-univar.finite}) all but its -first few terms are $0$ and thus can be neglected). For instance, the -polynomial $\left( 1,3,5,0,0,0,\ldots\right) $ will be identified with the -\textquotedblleft formal expression\textquotedblright\ $1+3X+5X^{2}% -+0X^{3}+0X^{4}+0X^{5}+\cdots=1+3X+5X^{2}$. Of course, we cannot do this -identification right now, since we do not have a reasonable definition of $X$. +Theorem \ref{thm.ind.quo-rem} (applied to $N=2$) yields that there is a unique +pair $\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} +$ such that $n=q\cdot2+r$. Consider this $\left( q,r\right) $. 
From $\left( +q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $, we obtain +$q\in\mathbb{Z}$ and $r\in\left\{ 0,1,\ldots,2-1\right\} =\left\{ +0,1\right\} $. -\textbf{(b)} We let $\mathbb{Q}\left[ X\right] $ denote the set of all -univariate polynomials with rational coefficients. Given a polynomial -$p=\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $, -we denote the numbers $p_{0},p_{1},p_{2},\ldots$ as the \textit{coefficients} -of $p$. More precisely, for every $i\in\mathbb{N}$, we shall refer to $p_{i}$ -as the $i$\textit{-th coefficient} of $p$. (Do not forget that we are counting -from $0$ here: any polynomial \textquotedblleft begins\textquotedblright\ with -its $0$-th coefficient.) The $0$-th coefficient of $p$ is also known as the -\textit{constant term} of $p$. +We know that $n$ is odd; in other words, $n$ is not divisible by $2$ (by the +definition of \textquotedblleft odd\textquotedblright). If we had $n=2q$, then +$n$ would be divisible by $2$, which would contradict the fact that $n$ is not +divisible by $2$. Hence, we cannot have $n=2q$. If we had $r=0$, then we would +have $n=\underbrace{q\cdot2}_{=2q}+\underbrace{r}_{=0}=2q$, which would +contradict the fact that we cannot have $n=2q$. Hence, we cannot have $r=0$. +Thus, $r\neq0$. -Instead of \textquotedblleft the $i$-th coefficient of $p$\textquotedblright, -we often also say \textquotedblleft the \textit{coefficient before }$X^{i}% -$\textit{ of }$p$\textquotedblright\ or \textquotedblleft the -\textit{coefficient of }$X^{i}$ \textit{in }$p$\textquotedblright. +Combining $r\in\left\{ 0,1\right\} $ with $r\neq0$, we obtain $r\in\left\{ +0,1\right\} \setminus\left\{ 0\right\} =\left\{ 1\right\} $. Thus, $r=1$. +Hence, $n=\underbrace{q\cdot2}_{=2q}+\underbrace{r}_{=1}=2q+1$. Thus, there +exists an $m\in\mathbb{Z}$ such that $n=2m+1$ (namely, $m=q$). This proves the +implication (\ref{pf.prop.ind.quo-rem.odd.a.1}).] 
-Thus, any polynomial $p\in\mathbb{Q}\left[ X\right] $ is the sequence of its coefficients. +Next, we shall prove the logical implication% +\begin{equation} +\left( \text{there exists an }m\in\mathbb{Z}\text{ such that }n=2m+1\right) +\ \Longrightarrow\ \left( n\text{ is odd}\right) . +\label{pf.prop.ind.quo-rem.odd.a.2}% +\end{equation} + + +[\textit{Proof of (\ref{pf.prop.ind.quo-rem.odd.a.2}):} Assume that there +exists an $m\in\mathbb{Z}$ such that $n=2m+1$. We must prove that $n$ is odd. + +We have assumed that there exists an $m\in\mathbb{Z}$ such that $n=2m+1$. +Consider this $m$. Thus, the pair $\left( m,1\right) $ belongs to +$\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $ (since $m\in\mathbb{Z}$ +and $1\in\left\{ 0,1,\ldots,2-1\right\} $) and satisfies $n=m\cdot2+1$ +(since $n=\underbrace{2m}_{=m\cdot2}+1=m\cdot2+1$). In other words, the pair +$\left( m,1\right) $ is a pair $\left( q,r\right) \in\mathbb{Z}% +\times\left\{ 0,1,\ldots,2-1\right\} $ such that $n=q\cdot2+r$. + +Now, assume (for the sake of contradiction) that $n$ is divisible by $2$. +Thus, there exists some integer $w$ such that $n=2w$. Consider this $w$. Thus, +the pair $\left( w,0\right) $ belongs to $\mathbb{Z}\times\left\{ +0,1,\ldots,2-1\right\} $ (since $w\in\mathbb{Z}$ and $0\in\left\{ +0,1,\ldots,2-1\right\} $) and satisfies $n=w\cdot2+0$ (since $n=2w=w\cdot +2=w\cdot2+0$). In other words, the pair $\left( w,0\right) $ is a pair +$\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $ +such that $n=q\cdot2+r$. + +Theorem \ref{thm.ind.quo-rem} (applied to $N=2$) yields that there is a unique +pair $\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} +$ such that $n=q\cdot2+r$. Thus, there exists \textbf{at most} one such pair. +In other words, any two such pairs must be equal. 
Hence, the two pairs +$\left( m,1\right) $ and $\left( w,0\right) $ must be equal (since +$\left( m,1\right) $ and $\left( w,0\right) $ are two pairs $\left( +q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $ such that +$n=q\cdot2+r$). In other words, $\left( m,1\right) =\left( w,0\right) $. +In other words, $m=w$ and $1=0$. But $1=0$ is clearly absurd. Thus, we have +obtained a contradiction. This shows that our assumption (that $n$ is +divisible by $2$) was wrong. Hence, $n$ is not divisible by $2$. In other +words, $n$ is odd (by the definition of \textquotedblleft +odd\textquotedblright). This proves the implication +(\ref{pf.prop.ind.quo-rem.odd.a.2}).] + +Combining the two implications (\ref{pf.prop.ind.quo-rem.odd.a.1}) and +(\ref{pf.prop.ind.quo-rem.odd.a.2}), we obtain the logical equivalence% +\begin{align*} +\left( n\text{ is odd}\right) \ & \Longleftrightarrow\ \left( \text{there +exists an }m\in\mathbb{Z}\text{ such that }n=2m+1\right) \\ +& \Longleftrightarrow\ \left( n\text{ can be written in the form +}n=2m+1\text{ for some }m\in\mathbb{Z}\right) . +\end{align*} +In other words, the integer $n$ is odd if and only if $n$ can be written in +the form $n=2m+1$ for some $m\in\mathbb{Z}$. This proves Proposition +\ref{prop.ind.quo-rem.odd} \textbf{(a)}. + +\textbf{(b)} This is easy to prove in any way, but let us prove this using +Theorem \ref{thm.ind.quo-rem} just in order to illustrate the use of the +latter theorem. + +We must prove that any two integers $m\in\mathbb{Z}$ satisfying $n=2m+1$ must +be equal. + +Let $m_{1}$ and $m_{2}$ be two integers $m\in\mathbb{Z}$ satisfying $n=2m+1$. +We shall show that $m_{1}=m_{2}$. + +We know that $m_{1}$ is an integer $m\in\mathbb{Z}$ satisfying $n=2m+1$. In +other words, $m_{1}$ is an integer in $\mathbb{Z}$ and satisfies $n=2m_{1}+1$. 
+Thus, the pair $\left( m_{1},1\right) $ belongs to $\mathbb{Z}\times\left\{ +0,1,\ldots,2-1\right\} $ (since $m_{1}\in\mathbb{Z}$ and $1\in\left\{ +0,1,\ldots,2-1\right\} $) and satisfies $n=m_{1}\cdot2+1$ (since +$n=\underbrace{2m_{1}}_{=m_{1}\cdot2}+1=m_{1}\cdot2+1$). In other words, the +pair $\left( m_{1},1\right) $ is a pair $\left( q,r\right) \in +\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $ such that $n=q\cdot2+r$. +The same argument (applied to $m_{2}$ instead of $m_{1}$) shows that $\left( +m_{2},1\right) $ is a pair $\left( q,r\right) \in\mathbb{Z}\times\left\{ +0,1,\ldots,2-1\right\} $ such that $n=q\cdot2+r$. + +Theorem \ref{thm.ind.quo-rem} (applied to $N=2$) yields that there is a unique +pair $\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} +$ such that $n=q\cdot2+r$. Thus, there exists \textbf{at most} one such pair. +In other words, any two such pairs must be equal. Hence, the two pairs +$\left( m_{1},1\right) $ and $\left( m_{2},1\right) $ must be equal (since +$\left( m_{1},1\right) $ and $\left( m_{2},1\right) $ are two pairs +$\left( q,r\right) \in\mathbb{Z}\times\left\{ 0,1,\ldots,2-1\right\} $ +such that $n=q\cdot2+r$). In other words, $\left( m_{1},1\right) =\left( +m_{2},1\right) $. In other words, $m_{1}=m_{2}$ and $1=1$. Hence, we have +shown that $m_{1}=m_{2}$. + +Now, forget that we fixed $m_{1}$ and $m_{2}$. We thus have proven that if +$m_{1}$ and $m_{2}$ are two integers $m\in\mathbb{Z}$ satisfying $n=2m+1$, +then $m_{1}=m_{2}$. In other words, any two integers $m\in\mathbb{Z}$ +satisfying $n=2m+1$ must be equal. In other words, the $m$ in Proposition +\ref{prop.ind.quo-rem.odd} \textbf{(a)} is unique. This proves Proposition +\ref{prop.ind.quo-rem.odd} \textbf{(b)}. +\end{proof} + +We can use this to obtain the following fundamental fact: -\textbf{(c)} We denote the polynomial $\left( 0,0,0,\ldots\right) -\in\mathbb{Q}\left[ X\right] $ by $\mathbf{0}$. 
We will also write $0$ for -it when no confusion with the number $0$ is possible. The polynomial -$\mathbf{0}$ is called the \textit{zero polynomial}. A polynomial -$p\in\mathbb{Q}\left[ X\right] $ is said to be \textit{nonzero} if -$p\neq\mathbf{0}$. +\begin{corollary} +\label{cor.mod.-1powers}Let $n\in\mathbb{Z}$. -\textbf{(d)} We denote the polynomial $\left( 1,0,0,0,\ldots\right) -\in\mathbb{Q}\left[ X\right] $ by $\mathbf{1}$. We will also write $1$ for -it when no confusion with the number $1$ is possible. +\textbf{(a)} If $n$ is even, then $\left( -1\right) ^{n}=1$. -\textbf{(e)} For any $\lambda\in\mathbb{Q}$, we denote the polynomial $\left( -\lambda,0,0,0,\ldots\right) \in\mathbb{Q}\left[ X\right] $ by -$\operatorname*{const}\lambda$. We call it the \textit{constant polynomial -with value }$\lambda$. It is often useful to identify $\lambda\in\mathbb{Q}$ -with $\operatorname*{const}\lambda\in\mathbb{Q}\left[ X\right] $. Notice -that $\mathbf{0}=\operatorname*{const}0$ and $\mathbf{1}=\operatorname*{const}% -1$. +\textbf{(b)} If $n$ is odd, then $\left( -1\right) ^{n}=-1$. +\end{corollary} -\textbf{(f)} Now, let us define the sum, the difference and the product of two -polynomials. Indeed, let $a=\left( a_{0},a_{1},a_{2},\ldots\right) -\in\mathbb{Q}\left[ X\right] $ and $b=\left( b_{0},b_{1},b_{2}% -,\ldots\right) \in\mathbb{Q}\left[ X\right] $ be two polynomials. Then, we -define three polynomials $a+b$, $a-b$ and $a\cdot b$ in $\mathbb{Q}\left[ -X\right] $ by% -\begin{align*} -a+b & =\left( a_{0}+b_{0},a_{1}+b_{1},a_{2}+b_{2},\ldots\right) ;\\ -a-b & =\left( a_{0}-b_{0},a_{1}-b_{1},a_{2}-b_{2},\ldots\right) ;\\ -a\cdot b & =\left( c_{0},c_{1},c_{2},\ldots\right) , -\end{align*} -where% +\begin{proof} +[Proof of Corollary \ref{cor.mod.-1powers}.]\textbf{(a)} Assume that $n$ is +even. In other words, $n$ is divisible by $2$ (by the definition of +\textquotedblleft even\textquotedblright). In other words, $2\mid n$. 
In other +words, there exists an integer $w$ such that $n=2w$. Consider this $w$. From +$n=2w$, we obtain $\left( -1\right) ^{n}=\left( -1\right) ^{2w}=\left( +\underbrace{\left( -1\right) ^{2}}_{=1}\right) ^{w}=1^{w}=1$. This proves +Corollary \ref{cor.mod.-1powers} \textbf{(a)}. + +\textbf{(b)} Assume that $n$ is odd. Proposition \ref{prop.ind.quo-rem.odd} +\textbf{(a)} shows that the integer $n$ is odd if and only if $n$ can be +written in the form $n=2m+1$ for some $m\in\mathbb{Z}$. Hence, $n$ can be +written in the form $n=2m+1$ for some $m\in\mathbb{Z}$ (since the integer $n$ +is odd). Consider this $m$. From $n=2m+1$, we obtain% \[ -c_{k}=\sum_{i=0}^{k}a_{i}b_{k-i}\ \ \ \ \ \ \ \ \ \ \text{for every }% -k\in\mathbb{N}. +\left( -1\right) ^{n}=\left( -1\right) ^{2m+1}=\left( -1\right) +^{2m}\left( -1\right) =-\underbrace{\left( -1\right) ^{2m}}_{=\left( +\left( -1\right) ^{2}\right) ^{m}}=-\left( \underbrace{\left( -1\right) +^{2}}_{=1}\right) ^{m}=-\underbrace{1^{m}}_{=1}=-1. \] -We call $a+b$ the \textit{sum} of $a$ and $b$; we call $a-b$ the -\textit{difference} of $a$ and $b$; we call $a\cdot b$ the \textit{product} of -$a$ and $b$. We abbreviate $a\cdot b$ by $ab$. +This proves Corollary \ref{cor.mod.-1powers} \textbf{(b)}. +\end{proof} -For example,% +Let us state one more fundamental fact, which follows easily from Corollary +\ref{cor.mod.-1powers} (the details are left to the reader): + +\begin{proposition} +\label{prop.mod.parity}Let $u$ and $v$ be two integers. Then, we have the +following chain of logical equivalences:% \begin{align*} -\left( 1,2,2,0,0,\ldots\right) +\left( 3,0,-1,0,0,0,\ldots\right) & -=\left( 4,2,1,0,0,0,\ldots\right) ;\\ -\left( 1,2,2,0,0,\ldots\right) -\left( 3,0,-1,0,0,0,\ldots\right) & -=\left( -2,2,3,0,0,0,\ldots\right) ;\\ -\left( 1,2,2,0,0,\ldots\right) \cdot\left( 3,0,-1,0,0,0,\ldots\right) & -=\left( 3,6,5,-2,-2,0,0,0,\ldots\right) . 
+\left( u\equiv v\operatorname{mod}2\right) \ & \Longleftrightarrow +\ \left( u\text{ and }v\text{ are either both even or both odd}\right) \\ +& \Longleftrightarrow\ \left( \left( -1\right) ^{u}=\left( -1\right) +^{v}\right) . \end{align*} +\end{proposition} -The definition of $a+b$ essentially says that \textquotedblleft polynomials -are added coefficientwise\textquotedblright\ (i.e., in order to obtain the sum -of two polynomials $a$ and $b$, it suffices to add each coefficient of $a$ to -the corresponding coefficient of $b$). Similarly, the definition of $a-b$ says -the same thing about subtraction. The definition of $a\cdot b$ is more -surprising. However, it loses its mystique when we identify the polynomials -$a$ and $b$ with the \textquotedblleft formal expressions\textquotedblright% -\ $a_{0}+a_{1}X+a_{2}X^{2}+\cdots$ and $b_{0}+b_{1}X+b_{2}X^{2}+\cdots$ -(although, at this point, we do not know what these expressions really mean); -indeed, it simply says that -\[ -\left( a_{0}+a_{1}X+a_{2}X^{2}+\cdots\right) \left( b_{0}+b_{1}X+b_{2}% -X^{2}+\cdots\right) =c_{0}+c_{1}X+c_{2}X^{2}+\cdots, -\] -where $c_{k}=\sum_{i=0}^{k}a_{i}b_{k-i}$ for every $k\in\mathbb{N}$. This is -precisely what one would expect, because if you expand $\left( a_{0}% -+a_{1}X+a_{2}X^{2}+\cdots\right) \left( b_{0}+b_{1}X+b_{2}X^{2}% -+\cdots\right) $ using the distributive law and collect equal powers of $X$, -then you get precisely $c_{0}+c_{1}X+c_{2}X^{2}+\cdots$. Thus, the definition -of $a\cdot b$ has been tailored to make the distributive law hold. +\subsection{Induction from $k-1$ to $k$} -(By the way, why is $a\cdot b$ a polynomial? That is, why does it satisfy -(\ref{eq.def.polynomial-univar.finite}) ? The proof is easy, but we omit it.) +\subsubsection{The principle} -Addition, subtraction and multiplication of polynomials satisfy some of the -same rules as addition, subtraction and multiplication of numbers. 
For -example, the commutative laws $a+b=b+a$ and $ab=ba$ are valid for polynomials -just as they are for numbers; same holds for the associative laws $\left( -a+b\right) +c=a+\left( b+c\right) $ and $\left( ab\right) c=a\left( -bc\right) $ and the distributive laws $\left( a+b\right) c=ac+bc$ and -$a\left( b+c\right) =ab+ac$. +Let us next show yet another \textquotedblleft alternative induction +principle\textquotedblright, which differs from Theorem \ref{thm.ind.IPg} in a +mere notational detail: -The set $\mathbb{Q}\left[ X\right] $, endowed with the operations $+$ and -$\cdot$ just defined, and with the elements $\mathbf{0}$ and $\mathbf{1}$, is -a commutative ring (where we are using the notations of Definition -\ref{def.commring}). It is called the \textit{(univariate) polynomial ring -over }$\mathbb{Q}$. +\begin{theorem} +\label{thm.ind.IPg-1}Let $g\in\mathbb{Z}$. For each $n\in\mathbb{Z}_{\geq g}$, +let $\mathcal{A}\left( n\right) $ be a logical statement. -\textbf{(g)} Let $a=\left( a_{0},a_{1},a_{2},\ldots\right) \in -\mathbb{Q}\left[ X\right] $ and $\lambda\in\mathbb{Q}$. Then, $\lambda a$ -denotes the polynomial $\left( \lambda a_{0},\lambda a_{1},\lambda -a_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $. (This equals the -polynomial $\left( \operatorname*{const}\lambda\right) \cdot a$; thus, -identifying $\lambda$ with $\operatorname*{const}\lambda$ does not cause any -inconsistencies here.) +Assume the following: -\textbf{(h)} If $p=\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}% -\left[ X\right] $ is a nonzero polynomial, then the \textit{degree} of $p$ -is defined to be the maximum $i\in\mathbb{N}$ satisfying $p_{i}\neq0$. If -$p\in\mathbb{Q}\left[ X\right] $ is the zero polynomial, then the degree of -$p$ is defined to be $-\infty$. (Here, $-\infty$ is just a fancy symbol, not a -number.) For example, $\deg\left( 1,4,0,-1,0,0,0,\ldots\right) =3$. +\begin{statement} +\textit{Assumption 1:} The statement $\mathcal{A}\left( g\right) $ holds. 
+\end{statement} -\textbf{(i)} If $a=\left( a_{0},a_{1},a_{2},\ldots\right) \in\mathbb{Q}% -\left[ X\right] $ and $n\in\mathbb{N}$, then a polynomial $a^{n}% -\in\mathbb{Q}\left[ X\right] $ is defined to be the product -$\underbrace{aa\cdots a}_{n\text{ times}}$. (This is understood to be -$\mathbf{1}$ when $n=0$. In general, an empty product of polynomials is always -understood to be $\mathbf{1}$.) +\begin{statement} +\textit{Assumption 2:} If $k\in\mathbb{Z}_{\geq g+1}$ is such that +$\mathcal{A}\left( k-1\right) $ holds, then $\mathcal{A}\left( k\right) $ +also holds. +\end{statement} -\textbf{(j)} We let $X$ denote the polynomial $\left( 0,1,0,0,0,\ldots -\right) \in\mathbb{Q}\left[ X\right] $. (This is the polynomial whose -$1$-st coefficient is $1$ and whose other coefficients are $0$.) This -polynomial is called the \textit{indeterminate} of $\mathbb{Q}\left[ -X\right] $. It is easy to see that, for any $n\in\mathbb{N}$, we have% -\[ -X^{n}=\left( \underbrace{0,0,\ldots,0}_{n\text{ zeroes}},1,0,0,0,\ldots -\right) . -\] +Then, $\mathcal{A}\left( n\right) $ holds for each $n\in\mathbb{Z}_{\geq g}$. +\end{theorem} +Roughly speaking, this Theorem \ref{thm.ind.IPg-1} is just Theorem +\ref{thm.ind.IPg}, except that the variable $m$ in Assumption 2 has been +renamed as $k-1$. Consequently, it stands to reason that Theorem +\ref{thm.ind.IPg-1} can easily be derived from Theorem \ref{thm.ind.IPg}. Here +is the derivation in full detail: + +\begin{proof} +[Proof of Theorem \ref{thm.ind.IPg-1}.]For each $n\in\mathbb{Z}_{\geq g}$, we +define the logical statement $\mathcal{B}\left( n\right) $ to be the +statement $\mathcal{A}\left( n\right) $. Thus, $\mathcal{B}\left( n\right) +=\mathcal{A}\left( n\right) $ for each $n\in\mathbb{Z}_{\geq g}$. Applying +this to $n=g$, we obtain $\mathcal{B}\left( g\right) =\mathcal{A}\left( +g\right) $ (since $g\in\mathbb{Z}_{\geq g}$). 
+ +We shall now show that the two Assumptions A and B of Corollary +\ref{cor.ind.IPg.renamed} are satisfied. + +Indeed, recall that Assumption 1 is satisfied. In other words, the statement +$\mathcal{A}\left( g\right) $ holds. In other words, the statement +$\mathcal{B}\left( g\right) $ holds (since $\mathcal{B}\left( g\right) +=\mathcal{A}\left( g\right) $). In other words, Assumption A is satisfied. + +We shall next show that Assumption B is satisfied. Indeed, let $p\in +\mathbb{Z}_{\geq g}$ be such that $\mathcal{B}\left( p\right) $ holds. +Recall that the statement $\mathcal{B}\left( p\right) $ was defined to be +the statement $\mathcal{A}\left( p\right) $. Thus, $\mathcal{B}\left( +p\right) =\mathcal{A}\left( p\right) $. Hence, $\mathcal{A}\left( +p\right) $ holds (since $\mathcal{B}\left( p\right) $ holds). Now, let +$k=p+1$. We know that $p\in\mathbb{Z}_{\geq g}$; in other words, $p$ is an +integer and satisfies $p\geq g$. Hence, $k=p+1$ is an integer as well and +satisfies $k=\underbrace{p}_{\geq g}+1\geq g+1$. In other words, +$k\in\mathbb{Z}_{\geq g+1}$. Moreover, from $k=p+1$, we obtain $k-1=p$. Hence, +$\mathcal{A}\left( k-1\right) =\mathcal{A}\left( p\right) $. Thus, +$\mathcal{A}\left( k-1\right) $ holds (since $\mathcal{A}\left( p\right) $ +holds). Thus, Assumption 2 shows that $\mathcal{A}\left( k\right) $ also +holds. But the statement $\mathcal{B}\left( k\right) $ was defined to be the +statement $\mathcal{A}\left( k\right) $. Hence, $\mathcal{B}\left( +k\right) =\mathcal{A}\left( k\right) $, so that $\mathcal{A}\left( +k\right) =\mathcal{B}\left( k\right) =\mathcal{B}\left( p+1\right) $ +(since $k=p+1$). Thus, the statement $\mathcal{B}\left( p+1\right) $ holds +(since $\mathcal{A}\left( k\right) $ holds). Now, forget that we fixed $p$. +We thus have shown that if $p\in\mathbb{Z}_{\geq g}$ is such that +$\mathcal{B}\left( p\right) $ holds, then $\mathcal{B}\left( p+1\right) $ +also holds. In other words, Assumption B is satisfied. 
+ +We have now proven that both Assumptions A and B of Corollary +\ref{cor.ind.IPg.renamed} are satisfied. Hence, Corollary +\ref{cor.ind.IPg.renamed} shows that $\mathcal{B}\left( n\right) $ holds for +each $n\in\mathbb{Z}_{\geq g}$. In other words, $\mathcal{A}\left( n\right) +$ holds for each $n\in\mathbb{Z}_{\geq g}$ (because each $n\in\mathbb{Z}_{\geq +g}$ satisfies $\mathcal{B}\left( n\right) =\mathcal{A}\left( n\right) $ +(by the definition of $\mathcal{B}\left( n\right) $)). This proves Theorem +\ref{thm.ind.IPg-1}. +\end{proof} + +Proofs that use Theorem \ref{thm.ind.IPg-1} are usually called \textit{proofs +by induction} or \textit{induction proofs}. As an example of such a proof, let +us show the following identity: -This polynomial $X$ finally provides an answer to the questions -\textquotedblleft what is an indeterminate\textquotedblright\ and -\textquotedblleft what is a formal expression\textquotedblright. Namely, let -$\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $ be -any polynomial. Then, the sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ is well-defined -(it is an infinite sum, but due to (\ref{eq.def.polynomial-univar.finite}) it -has only finitely many nonzero addends), and it is easy to see that this sum -equals $\left( p_{0},p_{1},p_{2},\ldots\right) $. Thus, +\begin{proposition} +\label{prop.ind.alt-harm}For every $n\in\mathbb{N}$, we have% +\begin{equation} +\sum_{i=1}^{2n}\dfrac{\left( -1\right) ^{i-1}}{i}=\sum_{i=n+1}^{2n}\dfrac +{1}{i}. \label{eq.prop.ind.alt-harm.claim}% +\end{equation} + +\end{proposition} + +The equality (\ref{eq.prop.ind.alt-harm.claim}) can be rewritten as% \[ -\left( p_{0},p_{1},p_{2},\ldots\right) =p_{0}+p_{1}X+p_{2}X^{2}% -+\cdots\ \ \ \ \ \ \ \ \ \ \text{for every }\left( p_{0},p_{1},p_{2}% -,\ldots\right) \in\mathbb{Q}\left[ X\right] . 
+\dfrac{1}{1}-\dfrac{1}{2}+\dfrac{1}{3}-\dfrac{1}{4}\pm\cdots+\dfrac{1}% +{2n-1}-\dfrac{1}{2n}=\dfrac{1}{n+1}+\dfrac{1}{n+2}+\cdots+\dfrac{1}{2n}% \] -This finally allows us to write a polynomial $\left( p_{0},p_{1},p_{2}% -,\ldots\right) $ as a sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ while remaining -honest; the sum $p_{0}+p_{1}X+p_{2}X^{2}+\cdots$ is no longer a -\textquotedblleft formal expression\textquotedblright\ of unclear meaning, nor -a function, but it is just an alternative way to write the sequence $\left( -p_{0},p_{1},p_{2},\ldots\right) $. So, at last, our notion of a polynomial -resembles the intuitive notion of a polynomial! +(where all the signs on the right hand side are $+$ signs, whereas the signs +on the left hand side alternate between $+$ signs and $-$ signs). -Of course, we can write polynomials as finite sums as well. Indeed, if -$\left( p_{0},p_{1},p_{2},\ldots\right) \in\mathbb{Q}\left[ X\right] $ is -a polynomial and $N$ is a nonnegative integer such that every $n>N$ satisfies -$p_{n}=0$, then% +\begin{proof} +[Proof of Proposition \ref{prop.ind.alt-harm}.]For each $n\in\mathbb{Z}% +_{\geq0}$, we let $\mathcal{A}\left( n\right) $ be the statement% \[ -\left( p_{0},p_{1},p_{2},\ldots\right) =p_{0}+p_{1}X+p_{2}X^{2}+\cdots -=p_{0}+p_{1}X+\cdots+p_{N}X^{N}% +\left( \sum_{i=1}^{2n}\dfrac{\left( -1\right) ^{i-1}}{i}=\sum_{i=n+1}% +^{2n}\dfrac{1}{i}\right) . \] -(because addends can be discarded when they are $0$). For example, $\left( -4,1,0,0,0,\ldots\right) =4+1X=4+X$ and $\left( \dfrac{1}{2},0,\dfrac{1}% -{3},0,0,0,\ldots\right) =\dfrac{1}{2}+0X+\dfrac{1}{3}X^{2}=\dfrac{1}% -{2}+\dfrac{1}{3}X^{2}$. +Our next goal is to prove the statement $\mathcal{A}\left( n\right) $ for +each $n\in\mathbb{Z}_{\geq0}$. -\textbf{(k)} For our definition of polynomials to be fully compatible with our -intuition, we are missing only one more thing: a way to evaluate a polynomial -at a number, or some other object (e.g., another polynomial or a function). 
-This is easy: Let $p=\left( p_{0},p_{1},p_{2},\ldots\right) \in -\mathbb{Q}\left[ X\right] $ be a polynomial, and let $\alpha\in\mathbb{Q}$. -Then, $p\left( \alpha\right) $ means the number $p_{0}+p_{1}\alpha -+p_{2}\alpha^{2}+\cdots\in\mathbb{Q}$. (Again, the infinite sum $p_{0}% -+p_{1}\alpha+p_{2}\alpha^{2}+\cdots$ makes sense because of -(\ref{eq.def.polynomial-univar.finite}).) Similarly, we can define $p\left( -\alpha\right) $ when $\alpha\in\mathbb{R}$ (but in this case, $p\left( -\alpha\right) $ will be an element of $\mathbb{R}$) or when $\alpha -\in\mathbb{C}$ (in this case, $p\left( \alpha\right) \in\mathbb{C}$) or when -$\alpha$ is a square matrix with rational entries (in this case, $p\left( -\alpha\right) $ will also be such a matrix) or when $\alpha$ is another -polynomial (in this case, $p\left( \alpha\right) $ is such a polynomial as well). +We first notice that the statement $\mathcal{A}\left( 0\right) $ +holds\footnote{\textit{Proof.} We have $\sum_{i=1}^{2\cdot0}\dfrac{\left( +-1\right) ^{i-1}}{i}=\left( \text{empty sum}\right) =0$. Comparing this +with $\sum_{i=0+1}^{2\cdot0}\dfrac{1}{i}=\left( \text{empty sum}\right) =0$, +we obtain $\sum_{i=1}^{2\cdot0}\dfrac{\left( -1\right) ^{i-1}}{i}% +=\sum_{i=0+1}^{2\cdot0}\dfrac{1}{i}$. But this is precisely the statement +$\mathcal{A}\left( 0\right) $ (since $\mathcal{A}\left( 0\right) $ is +defined to be the statement $\left( \sum_{i=1}^{2\cdot0}\dfrac{\left( +-1\right) ^{i-1}}{i}=\sum_{i=0+1}^{2\cdot0}\dfrac{1}{i}\right) $). Hence, +the statement $\mathcal{A}\left( 0\right) $ holds.}. -For example, if $p=\left( 1,-2,0,3,0,0,0,\ldots\right) =1-2X+3X^{3}$, then -$p\left( \alpha\right) =1-2\alpha+3\alpha^{3}$ for every $\alpha$. 
+Now, we claim that +\begin{equation} +\text{if }k\in\mathbb{Z}_{\geq0+1}\text{ is such that }\mathcal{A}\left( +k-1\right) \text{ holds, then }\mathcal{A}\left( k\right) \text{ also +holds.} \label{pf.prop.ind.alt-harm.step}% +\end{equation} -The map $\mathbb{Q}\rightarrow\mathbb{Q},\ \alpha\mapsto p\left( -\alpha\right) $ is called the \textit{polynomial function described by }$p$. -As we said above, this function is not $p$, and it is not a good idea to -equate it with $p$. -If $\alpha$ is a number (or a square matrix, or another polynomial), then -$p\left( \alpha\right) $ is called the result of \textit{evaluating }$p$ -\textit{at }$X=\alpha$ (or, simply, evaluating $p$ at $\alpha$), or the result -of \textit{substituting }$\alpha$\textit{ for }$X$\textit{ in }$p$. This -notation, of course, reminds of functions; nevertheless, (as we already said a -few times) $p$ is \textbf{not a function}. +[\textit{Proof of (\ref{pf.prop.ind.alt-harm.step}):} Let $k\in\mathbb{Z}% +_{\geq0+1}$ be such that $\mathcal{A}\left( k-1\right) $ holds. We must show +that $\mathcal{A}\left( k\right) $ also holds. -Probably the simplest three cases of evaluation are the following ones: +We have $k\in\mathbb{Z}_{\geq0+1}$. Thus, $k$ is an integer and satisfies +$k\geq0+1=1$. -\begin{itemize} -\item We have $p\left( 0\right) =p_{0}+p_{1}0^{1}+p_{2}0^{2}+\cdots=p_{0}$. -In other words, evaluating $p$ at $X=0$ yields the constant term of $p$. +We have assumed that $\mathcal{A}\left( k-1\right) $ holds. In other words, +\begin{equation} +\sum_{i=1}^{2\left( k-1\right) }\dfrac{\left( -1\right) ^{i-1}}{i}% +=\sum_{i=\left( k-1\right) +1}^{2\left( k-1\right) }\dfrac{1}{i} +\label{pf.prop.ind.alt-harm.IH}% +\end{equation} +holds\footnote{because $\mathcal{A}\left( k-1\right) $ is defined to be the +statement $\left( \sum_{i=1}^{2\left( k-1\right) }\dfrac{\left( -1\right) +^{i-1}}{i}=\sum_{i=\left( k-1\right) +1}^{2\left( k-1\right) }\dfrac{1}% +{i}\right) $}. 
-\item We have $p\left( 1\right) =p_{0}+p_{1}1^{1}+p_{2}1^{2}+\cdots -=p_{0}+p_{1}+p_{2}+\cdots$. In other words, evaluating $p$ at $X=1$ yields the -sum of all coefficients of $p$. +We have $\left( -1\right) ^{2\left( k-1\right) }=\left( +\underbrace{\left( -1\right) ^{2}}_{=1}\right) ^{k-1}=1^{k-1}=1$. But +$2k-1=2\left( k-1\right) +1$. Thus, $\left( -1\right) ^{2k-1}=\left( +-1\right) ^{2\left( k-1\right) +1}=\underbrace{\left( -1\right) +^{2\left( k-1\right) }}_{=1}\underbrace{\left( -1\right) ^{1}}_{=-1}=-1$. -\item We have $p\left( X\right) =p_{0}+p_{1}X^{1}+p_{2}X^{2}+\cdots -=p_{0}+p_{1}X+p_{2}X^{2}+\cdots=p$. In other words, evaluating $p$ at $X=X$ -yields $p$ itself. This allows us to write $p\left( X\right) $ for $p$. Many -authors do so, just in order to stress that $p$ is a polynomial and that the -indeterminate is called $X$. It should be kept in mind that $X$ is \textbf{not -a variable} (just as $p$ is \textbf{not a function}); it is the (fixed!) -sequence $\left( 0,1,0,0,0,\ldots\right) \in\mathbb{Q}\left[ X\right] $ -which serves as the indeterminate for polynomials in $\mathbb{Q}\left[ -X\right] $. -\end{itemize} +Now, $k\geq1$, so that $2k\geq2$ and therefore $2k-1\geq1$. Hence, we can +split off the addend for $i=2k-1$ from the sum $\sum_{i=1}^{2k-1}% +\dfrac{\left( -1\right) ^{i-1}}{i}$. 
We thus obtain% +\begin{align} +\sum_{i=1}^{2k-1}\dfrac{\left( -1\right) ^{i-1}}{i} & =\sum_{i=1}^{\left( +2k-1\right) -1}\dfrac{\left( -1\right) ^{i-1}}{i}+\dfrac{\left( -1\right) +^{\left( 2k-1\right) -1}}{2k-1}\nonumber\\ +& =\underbrace{\sum_{i=1}^{2\left( k-1\right) }\dfrac{\left( -1\right) +^{i-1}}{i}}_{\substack{=\sum_{i=\left( k-1\right) +1}^{2\left( k-1\right) +}\dfrac{1}{i}\\\text{(by (\ref{pf.prop.ind.alt-harm.IH}))}}% +}+\underbrace{\dfrac{\left( -1\right) ^{2\left( k-1\right) }}{2k-1}% +}_{\substack{=\dfrac{1}{2k-1}\\\text{(since }\left( -1\right) ^{2\left( +k-1\right) }=1\text{)}}}\nonumber\\ +& \ \ \ \ \ \ \ \ \ \ \left( \text{since }\left( 2k-1\right) -1=2\left( +k-1\right) \right) \nonumber\\ +& =\sum_{i=\left( k-1\right) +1}^{2\left( k-1\right) }\dfrac{1}{i}% ++\dfrac{1}{2k-1}=\sum_{i=k}^{2k-2}\dfrac{1}{i}+\dfrac{1}{2k-1} +\label{pf.prop.ind.alt-harm.0}% +\end{align} +(since $\left( k-1\right) +1=k$ and $2\left( k-1\right) =2k-2$). -\textbf{(l)} Often, one wants (or is required) to give an indeterminate a name -other than $X$. (For instance, instead of polynomials with rational -coefficients, we could be considering polynomials whose coefficients -themselves are polynomials in $\mathbb{Q}\left[ X\right] $; and then, we -would not be allowed to use the letter $X$ for the \textquotedblleft -new\textquotedblright\ indeterminate anymore, as it already means the -indeterminate of $\mathbb{Q}\left[ X\right] $ !) This can be done, and the -rules are the following: Any letter (that does not already have a meaning) can -be used to denote the indeterminate; but then, the set of all polynomials has -to be renamed as $\mathbb{Q}\left[ \eta\right] $, where $\eta$ is this -letter. For instance, if we want to denote the indeterminate as $x$, then we -have to denote the set by $\mathbb{Q}\left[ x\right] $. +On the other hand, $2k\geq2\geq1$. Hence, we can split off the addend for +$i=2k$ from the sum $\sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}{i}$. 
We +thus obtain% +\begin{align} +\sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}{i} & =\underbrace{\sum +_{i=1}^{2k-1}\dfrac{\left( -1\right) ^{i-1}}{i}}_{\substack{=\sum +_{i=k}^{2k-2}\dfrac{1}{i}+\dfrac{1}{2k-1}\\\text{(by +(\ref{pf.prop.ind.alt-harm.0}))}}}+\underbrace{\dfrac{\left( -1\right) +^{2k-1}}{2k}}_{\substack{=\dfrac{-1}{2k}\\\text{(since }\left( -1\right) +^{2k-1}=-1\text{)}}}\nonumber\\ +& =\sum_{i=k}^{2k-2}\dfrac{1}{i}+\dfrac{1}{2k-1}+\dfrac{-1}{2k}. +\label{pf.prop.ind.alt-harm.1}% +\end{align} -It is furthermore convenient to regard the sets $\mathbb{Q}\left[ -\eta\right] $ for different letters $\eta$ as distinct. Thus, for example, -the polynomial $3X^{2}+1$ is not the same as the polynomial $3Y^{2}+1$. (The -reason for doing so is that one sometimes wishes to view both of these -polynomials as polynomials in the two variables $X$ and $Y$.) Formally -speaking, this means that we should define a polynomial in $\mathbb{Q}\left[ -\eta\right] $ to be not just a sequence $\left( p_{0},p_{1},p_{2}% -,\ldots\right) $ of rational numbers, but actually a pair $\left( \left( -p_{0},p_{1},p_{2},\ldots\right) ,\text{\textquotedblleft}\eta -\text{\textquotedblright}\right) $ of a sequence of rational numbers and the -letter $\eta$. (Here, \textquotedblleft$\eta$\textquotedblright\ really means -the letter $\eta$, not the sequence $\left( 0,1,0,0,0,\ldots\right) $.) This -is, of course, a very technical point which is of little relevance to most of -mathematics; it becomes important when one tries to implement polynomials in a -programming language. -\textbf{(m)} As already explained, we can replace $\mathbb{Q}$ by $\mathbb{Z}% -$, $\mathbb{R}$, $\mathbb{C}$ or any other commutative ring $\mathbb{K}$ in -the above definition. (See Definition \ref{def.commring} for the definition of -a commutative ring.) 
When $\mathbb{Q}$ is replaced by a commutative ring -$\mathbb{K}$, the notion of \textquotedblleft univariate polynomials with -rational coefficients\textquotedblright\ becomes \textquotedblleft univariate -polynomials with coefficients in $\mathbb{K}$\textquotedblright\ (also known -as \textquotedblleft univariate polynomials over $\mathbb{K}$% -\textquotedblright), and the set of such polynomials is denoted by -$\mathbb{K}\left[ X\right] $ rather than $\mathbb{Q}\left[ X\right] $. -\end{definition} +But we have $\left( 2k-1\right) -k=k-1\geq0$ (since $k\geq1$). Thus, +$2k-1\geq k$. Thus, we can split off the addend for $i=2k-1$ from the sum +$\sum_{i=k}^{2k-1}\dfrac{1}{i}$. We thus obtain% +\begin{equation} +\sum_{i=k}^{2k-1}\dfrac{1}{i}=\sum_{i=k}^{\left( 2k-1\right) -1}\dfrac{1}% +{i}+\dfrac{1}{2k-1}=\sum_{i=k}^{2k-2}\dfrac{1}{i}+\dfrac{1}{2k-1} +\label{pf.prop.ind.alt-harm.2}% +\end{equation} +(since $\left( 2k-1\right) -1=2k-2$). Hence, (\ref{pf.prop.ind.alt-harm.1}) +becomes% +\begin{equation} +\sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}{i}=\underbrace{\sum +_{i=k}^{2k-2}\dfrac{1}{i}+\dfrac{1}{2k-1}}_{\substack{=\sum_{i=k}^{2k-1}% +\dfrac{1}{i}\\\text{(by (\ref{pf.prop.ind.alt-harm.2}))}}}+\dfrac{-1}{2k}% +=\sum_{i=k}^{2k-1}\dfrac{1}{i}+\dfrac{-1}{2k}. \label{pf.prop.ind.alt-harm.3}% +\end{equation} -So much for univariate polynomials. -Polynomials in multiple variables are (in my opinion) treated the best in -\cite[Chapter II, \S 3]{Lang02}, where they are introduced as elements of a -monoid ring. However, this treatment is rather abstract and uses a good deal -of algebraic language\footnote{Also, the book \cite{Lang02} is notorious for -its unpolished writing; it is best read with Bergman's companion -\cite{Bergman-Lang} at hand.}. The treatments in \cite[\S 4.5]{Walker87}, in -\cite[Chapter A-3]{Rotman15} and in \cite[Chapter IV, \S 4]{BirkMac} use the -above-mentioned recursive shortcut that makes them inferior (in my opinion). 
A -neat (and rather elementary) treatment of polynomials in $n$ variables (for -finite $n$) can be found in \cite[Chapter III, \S 5]{Hungerford-03} and in -\cite[\S 8]{AmaEsc05}; it generalizes the viewpoint we used in Definition -\ref{def.polynomial-univar} for univariate polynomials above\footnote{You are -reading right: The analysis textbook \cite{AmaEsc05} is one of the few sources -I am aware of to define the (algebraic!) notion of polynomials precisely and -well.}. +But we have $k+1\leq2k$ (since $2k-\left( k+1\right) =k-1\geq0$). Thus, we +can split off the addend for $i=2k$ from the sum $\sum_{i=k+1}^{2k}\dfrac +{1}{i}$. We thus obtain% +\[ +\sum_{i=k+1}^{2k}\dfrac{1}{i}=\sum_{i=k+1}^{2k-1}\dfrac{1}{i}+\dfrac{1}{2k}. +\] +Hence,% +\begin{equation} +\sum_{i=k+1}^{2k-1}\dfrac{1}{i}=\sum_{i=k+1}^{2k}\dfrac{1}{i}-\dfrac{1}{2k}. +\label{pf.prop.ind.alt-harm.4}% +\end{equation} + + +Also, $k\leq2k-1$ (since $\left( 2k-1\right) -k=k-1\geq0$). Thus, we can +split off the addend for $i=k$ from the sum $\sum_{i=k}^{2k-1}\dfrac{1}{i}$. +We thus obtain% +\[ +\sum_{i=k}^{2k-1}\dfrac{1}{i}=\dfrac{1}{k}+\underbrace{\sum_{i=k+1}% +^{2k-1}\dfrac{1}{i}}_{\substack{=\sum_{i=k+1}^{2k}\dfrac{1}{i}-\dfrac{1}% +{2k}\\\text{(by (\ref{pf.prop.ind.alt-harm.4}))}}}=\dfrac{1}{k}+\sum +_{i=k+1}^{2k}\dfrac{1}{i}-\dfrac{1}{2k}=\sum_{i=k+1}^{2k}\dfrac{1}% +{i}+\underbrace{\dfrac{1}{k}-\dfrac{1}{2k}}_{=\dfrac{1}{2k}}=\sum_{i=k+1}% +^{2k}\dfrac{1}{i}+\dfrac{1}{2k}. +\] +Subtracting $\dfrac{1}{2k}$ from this equality, we obtain% +\[ +\sum_{i=k}^{2k-1}\dfrac{1}{i}-\dfrac{1}{2k}=\sum_{i=k+1}^{2k}\dfrac{1}{i}. +\] +Hence,% +\[ +\sum_{i=k+1}^{2k}\dfrac{1}{i}=\sum_{i=k}^{2k-1}\dfrac{1}{i}-\dfrac{1}{2k}% +=\sum_{i=k}^{2k-1}\dfrac{1}{i}+\dfrac{-1}{2k}. +\] +Comparing this with (\ref{pf.prop.ind.alt-harm.3}), we obtain% +\begin{equation} +\sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}{i}=\sum_{i=k+1}^{2k}\dfrac +{1}{i}. 
\label{pf.prop.ind.alt-harm.8}% +\end{equation} +But this is precisely the statement $\mathcal{A}\left( k\right) +$\ \ \ \ \footnote{because $\mathcal{A}\left( k\right) $ is defined to be +the statement $\left( \sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}% +{i}=\sum_{i=k+1}^{2k}\dfrac{1}{i}\right) $}. Thus, the statement +$\mathcal{A}\left( k\right) $ holds. + +Now, forget that we fixed $k$. We thus have shown that if $k\in\mathbb{Z}% +_{\geq0+1}$ is such that $\mathcal{A}\left( k-1\right) $ holds, then +$\mathcal{A}\left( k\right) $ also holds. This proves +(\ref{pf.prop.ind.alt-harm.step}).] + +Now, both assumptions of Theorem \ref{thm.ind.IPg-1} (applied to $g=0$) are +satisfied (indeed, Assumption 1 holds because the statement $\mathcal{A}% +\left( 0\right) $ holds, whereas Assumption 2 holds because of +(\ref{pf.prop.ind.alt-harm.step})). Thus, Theorem \ref{thm.ind.IPg-1} (applied +to $g=0$) shows that $\mathcal{A}\left( n\right) $ holds for each +$n\in\mathbb{Z}_{\geq0}$. In other words, $\sum_{i=1}^{2n}\dfrac{\left( +-1\right) ^{i-1}}{i}=\sum_{i=n+1}^{2n}\dfrac{1}{i}$ holds for each +$n\in\mathbb{Z}_{\geq0}$ (since $\mathcal{A}\left( n\right) $ is the +statement $\left( \sum_{i=1}^{2n}\dfrac{\left( -1\right) ^{i-1}}{i}% +=\sum_{i=n+1}^{2n}\dfrac{1}{i}\right) $). In other words, $\sum_{i=1}% +^{2n}\dfrac{\left( -1\right) ^{i-1}}{i}=\sum_{i=n+1}^{2n}\dfrac{1}{i}$ holds +for each $n\in\mathbb{N}$ (because $\mathbb{Z}_{\geq0}=\mathbb{N}$). This +proves Proposition \ref{prop.ind.alt-harm}. +\end{proof} + +\subsubsection{Conventions for writing proofs using \textquotedblleft$k-1$ to +$k$\textquotedblright\ induction} + +Just like most of the other induction principles that we have so far +introduced, Theorem \ref{thm.ind.IPg-1} is not usually invoked explicitly when +it is used; instead, its use is signalled by certain words: + +\begin{convention} +\label{conv.ind.IPg-1lang}Let $g\in\mathbb{Z}$. 
For each $n\in\mathbb{Z}_{\geq +g}$, let $\mathcal{A}\left( n\right) $ be a logical statement. Assume that +you want to prove that $\mathcal{A}\left( n\right) $ holds for each +$n\in\mathbb{Z}_{\geq g}$. + +Theorem \ref{thm.ind.IPg-1} offers the following strategy for proving this: +First show that Assumption 1 of Theorem \ref{thm.ind.IPg-1} is satisfied; +then, show that Assumption 2 of Theorem \ref{thm.ind.IPg-1} is satisfied; +then, Theorem \ref{thm.ind.IPg-1} automatically completes your proof. + +A proof that follows this strategy is called a \textit{proof by induction on +}$n$ (or \textit{proof by induction over }$n$) \textit{starting at }$g$ or +(less precisely) an \textit{inductive proof}. Most of the time, the words +\textquotedblleft starting at $g$\textquotedblright\ are omitted, since the +value of $g$ is usually clear from the statement that is being proven. +Usually, the statements $\mathcal{A}\left( n\right) $ are not explicitly +stated in the proof either, since they can also be inferred from the context. + +The proof that Assumption 1 is satisfied is called the \textit{induction base} +(or \textit{base case}) of the proof. The proof that Assumption 2 is satisfied +is called the \textit{induction step} of the proof. + +In order to prove that Assumption 2 is satisfied, you will usually want to fix +a $k\in\mathbb{Z}_{\geq g+1}$ such that $\mathcal{A}\left( k-1\right) $ +holds, and then prove that $\mathcal{A}\left( k\right) $ holds. In other +words, you will usually want to fix $k\in\mathbb{Z}_{\geq g+1}$, assume that +$\mathcal{A}\left( k-1\right) $ holds, and then prove that $\mathcal{A}% +\left( k\right) $ holds. When doing so, it is common to refer to the +assumption that $\mathcal{A}\left( k-1\right) $ holds as the +\textit{induction hypothesis} (or \textit{induction assumption}). 
+\end{convention}
+
+This language is exactly the same as the one that was introduced in Convention
+\ref{conv.ind.IPglang} for proofs by \textquotedblleft
+standard\textquotedblright\ induction starting at $g$. The only difference
+between proofs that use Theorem \ref{thm.ind.IPg} and proofs that use Theorem
+\ref{thm.ind.IPg-1} is that the induction step in the former proofs assumes
+$\mathcal{A}\left( m\right) $ and proves $\mathcal{A}\left( m+1\right) $,
+whereas the induction step in the latter proofs assumes $\mathcal{A}\left(
+k-1\right) $ and proves $\mathcal{A}\left( k\right) $. (Of course, the
+letters \textquotedblleft$m$\textquotedblright\ and \textquotedblleft%
+$k$\textquotedblright\ are not set in stone; any otherwise unused letters can
+be used in their stead. Thus, what distinguishes proofs that use Theorem
+\ref{thm.ind.IPg} from proofs that use Theorem \ref{thm.ind.IPg-1} is not the
+letter they use, but the \textquotedblleft$+1$\textquotedblright\ versus the
+\textquotedblleft$-1$\textquotedblright.)
+
+Let us repeat the above proof of Proposition \ref{prop.ind.alt-harm} (or, more
+precisely, its non-computational part) using this language:
+
+\begin{proof}
+[Proof of Proposition \ref{prop.ind.alt-harm} (second version).]We must prove
+(\ref{eq.prop.ind.alt-harm.claim}) for every $n\in\mathbb{N}$. In other words,
+we must prove (\ref{eq.prop.ind.alt-harm.claim}) for every $n\in
+\mathbb{Z}_{\geq0}$ (since $\mathbb{N}=\mathbb{Z}_{\geq0}$). We shall prove
+this by induction on $n$ starting at $0$:
+
+\textit{Induction base:} We have $\sum_{i=1}^{2\cdot0}\dfrac{\left(
+-1\right) ^{i-1}}{i}=\left( \text{empty sum}\right) =0$. Comparing this
+with $\sum_{i=0+1}^{2\cdot0}\dfrac{1}{i}=\left( \text{empty sum}\right) =0$,
+we obtain $\sum_{i=1}^{2\cdot0}\dfrac{\left( -1\right) ^{i-1}}{i}%
+=\sum_{i=0+1}^{2\cdot0}\dfrac{1}{i}$. In other words,
+(\ref{eq.prop.ind.alt-harm.claim}) holds for $n=0$. This completes the
+induction base. 
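+
+(Before we come to the induction step, here is a quick sanity check, which is
+not part of the proof: For $n=2$, the claim
+(\ref{eq.prop.ind.alt-harm.claim}) states that%
+\[
+1-\dfrac{1}{2}+\dfrac{1}{3}-\dfrac{1}{4}=\dfrac{1}{3}+\dfrac{1}{4};
+\]
+and indeed, both sides of this equality are $\dfrac{7}{12}$.)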
+ +\textit{Induction step:} Let $k\in\mathbb{Z}_{\geq1}$. Assume that +(\ref{eq.prop.ind.alt-harm.claim}) holds for $n=k-1$. We must show that +(\ref{eq.prop.ind.alt-harm.claim}) holds for $n=k$. + +We have $k\in\mathbb{Z}_{\geq1}$. In other words, $k$ is an integer and +satisfies $k\geq1$. + +We have assumed that (\ref{eq.prop.ind.alt-harm.claim}) holds for $n=k-1$. In +other words, +\begin{equation} +\sum_{i=1}^{2\left( k-1\right) }\dfrac{\left( -1\right) ^{i-1}}{i}% +=\sum_{i=\left( k-1\right) +1}^{2\left( k-1\right) }\dfrac{1}{i}. +\label{pf.prop.ind.alt-harm.ver2.1}% +\end{equation} +From here, we can obtain% +\begin{equation} +\sum_{i=1}^{2k}\dfrac{\left( -1\right) ^{i-1}}{i}=\sum_{i=k+1}^{2k}\dfrac +{1}{i}. \label{pf.prop.ind.alt-harm.ver2.4}% +\end{equation} +(Indeed, we can derive (\ref{pf.prop.ind.alt-harm.ver2.4}) from +(\ref{pf.prop.ind.alt-harm.ver2.1}) in exactly the same way as we derived +(\ref{pf.prop.ind.alt-harm.8}) from (\ref{pf.prop.ind.alt-harm.IH}) in the +above first version of the proof of Proposition \ref{prop.ind.alt-harm}; +nothing about this argument needs to be changed, so we have no reason to +repeat it.) + +But the equality (\ref{pf.prop.ind.alt-harm.ver2.4}) shows that +(\ref{eq.prop.ind.alt-harm.claim}) holds for $n=k$. This completes the +induction step. Hence, (\ref{eq.prop.ind.alt-harm.claim}) is proven by +induction. This proves Proposition \ref{prop.ind.alt-harm}. +\end{proof} \section{\label{chp.binom}On binomial coefficients} @@ -3283,21 +15826,17 @@ \subsubsection{The combinatorial interpretation of binomial coefficients} does have such an interpretation.) \end{remark} -\footnotetext{A mathematical statement of the form \textquotedblleft if -$\mathcal{A}$, then $\mathcal{B}$\textquotedblright\ is said to be -\textit{vacuously true} if $\mathcal{A}$ never holds. 
For example, the -statement \textquotedblleft if $0=1$, then every integer is +\footnotetext{Recall that a mathematical statement of the form +\textquotedblleft if $\mathcal{A}$, then $\mathcal{B}$\textquotedblright\ is +said to be \textit{vacuously true} if $\mathcal{A}$ never holds. For example, +the statement \textquotedblleft if $0=1$, then every integer is odd\textquotedblright\ is vacuously true, because $0=1$ is false. Proposition \ref{prop.binom.subsets} is vacuously true when $m$ is negative, because the condition \textquotedblleft$S$ is an $m$-element set\textquotedblright\ never holds when $m$ is negative. \par -By the laws of logic, a vacuously true statement is always true! This may -sound counterintuitive, but actually makes a lot of sense: A statement -\textquotedblleft if $\mathcal{A}$, then $\mathcal{B}$\textquotedblright\ only -says anything about situations where $\mathcal{A}$ holds. If $\mathcal{A}$ -never holds, then it therefore says nothing. And saying nothing is a safe way -to remain truthful.} +By the laws of logic, a vacuously true statement is always true! See +Convention \ref{conv.logic.vacuous} for a discussion of this principle.} \begin{remark} Some authors (for example, those of \cite{LeLeMe16} and of \cite{Galvin}) use @@ -3444,6 +15983,10 @@ \subsubsection{Binomial coefficients of integers are integers} $. This proves Lemma \ref{lem.binom.intN}. \end{proof} +It is also easy to prove Lemma \ref{lem.binom.intN} by induction on $m$, using +\eqref{eq.binom.00} and \eqref{eq.binom.0} in the induction base and using +(\ref{eq.binom.rec.m}) in the induction step. + \begin{proposition} \label{prop.binom.int}Let $m\in\mathbb{Z}$ and $n\in\mathbb{N}$. Then,% \begin{equation} @@ -3481,13 +16024,12 @@ \subsubsection{Binomial coefficients of integers are integers} \end{proof} The above proof of Proposition \ref{prop.binom.int} may well be the simplest -one. 
There is another which proceeds by induction on $m$ (using -(\ref{eq.binom.rec.m})), but this induction needs two induction steps -($m\rightarrow m+1$ and $m\rightarrow m-1$) in order to reach all integers -(positive and negative). There is yet another proof using basic number theory -(specifically, checking how often a prime $p$ appears in the numerator and the -denominator of $\dbinom{m}{n}=\dfrac{m\left( m-1\right) \cdots\left( -m-n+1\right) }{n!}$), but this is not quite easy. +one. There is another proof, which uses Theorem \ref{thm.ind.IPg+-}, but it is +more complicated\footnote{It requires an induction on $n$ nested inside the +induction step of the induction on $m$.}. There is yet another proof using +basic number theory (specifically, checking how often a prime $p$ appears in +the numerator and the denominator of $\dbinom{m}{n}=\dfrac{m\left( +m-1\right) \cdots\left( m-n+1\right) }{n!}$), but this is not quite easy. \subsubsection{The binomial formula} @@ -4076,9 +16618,9 @@ \subsubsection{An algebraic proof} \end{align*} Compared with% \begin{align*} -& \sum_{k=0}^{N}\dfrac{k}{N}\dbinom{x}{k}\dbinom{y}{N-k}\\ -& =\underbrace{\dfrac{0}{N}\dbinom{x}{0}\dbinom{y}{N-0}}_{=0}+\sum_{k=1}% -^{N}\dfrac{k}{N}\dbinom{x}{k}\dbinom{y}{N-k}\\ +\sum_{k=0}^{N}\dfrac{k}{N}\dbinom{x}{k}\dbinom{y}{N-k} & =\underbrace{\dfrac +{0}{N}\dbinom{x}{0}\dbinom{y}{N-0}}_{=0}+\sum_{k=1}^{N}\dfrac{k}{N}\dbinom +{x}{k}\dbinom{y}{N-k}\\ & \ \ \ \ \ \ \ \ \ \ \left( \text{here, we have split off the addend for }k=0\right) \\ & =\sum_{k=1}^{N}\dfrac{k}{N}\dbinom{x}{k}\dbinom{y}{N-k}, @@ -4098,8 +16640,8 @@ \subsubsection{An algebraic proof} \dbinom{x}{k}\dbinom{y-1}{N-k-1}\\ & =\sum_{k=0}^{N-1}\dbinom{x}{k}\underbrace{\dfrac{y}{N}\dbinom{y-1}{N-k-1}% }_{\substack{=\dfrac{N-k}{N}\dbinom{y}{N-k}\\\text{(by -(\ref{pf.thm.vandermonde.pf.2.8}),}\\\text{applied to }a=N-k\\\text{(since -}N-k\in\left\{ 1,2,3,\ldots\right\} \text{ (because }k\in\left\{ +(\ref{pf.thm.vandermonde.pf.2.8}), applied 
to }a=N-k\\\text{(since }% +N-k\in\left\{ 1,2,3,\ldots\right\} \text{ (because }k\in\left\{ 0,1,\ldots,N-1\right\} \text{)))}}}\\ & =\sum_{k=0}^{N-1}\dbinom{x}{k}\dfrac{N-k}{N}\dbinom{y}{N-k}=\sum _{k=0}^{N-1}\dfrac{N-k}{N}\dbinom{x}{k}\dbinom{y}{N-k}. @@ -6637,7 +19179,8 @@ \subsection{Basics} \textit{\href{https://en.wikipedia.org/wiki/Fibonacci_number}{Fibonacci sequence}} is the sequence $\left( f_{0},f_{1},f_{2},\ldots\right) $ of integers which is defined recursively by $f_{0}=0$, $f_{1}=1$, and -$f_{n}=f_{n-1}+f_{n-2}$ for all $n\geq2$. Its first terms are% +$f_{n}=f_{n-1}+f_{n-2}$ for all $n\geq2$. We have already introduced this +sequence in Example \ref{exa.rec-seq.fib}. Its first terms are% \begin{align*} f_{0} & =0,\ \ \ \ \ \ \ \ \ \ f_{1}=1,\ \ \ \ \ \ \ \ \ \ f_{2}% =1,\ \ \ \ \ \ \ \ \ \ f_{3}=2,\ \ \ \ \ \ \ \ \ \ f_{4}% @@ -6678,8 +19221,8 @@ \subsection{Basics} fact, the \textit{Binet formula} says that the $n$-th Fibonacci number $f_{n}$ can be computed by% \begin{equation} -f_{n}=\dfrac{1}{\sqrt{5}}\varphi^{n}-\dfrac{1}{\sqrt{5}}\psi^{n}% -,\label{eq.binet.f}% +f_{n}=\dfrac{1}{\sqrt{5}}\varphi^{n}-\dfrac{1}{\sqrt{5}}\psi^{n}, +\label{eq.binet.f}% \end{equation} where $\varphi=\dfrac{1+\sqrt{5}}{2}$ and $\psi=\dfrac{1-\sqrt{5}}{2}$ are the two solutions of the quadratic equation $X^{2}-X-1=0$. (The number $\varphi$ @@ -6687,7 +19230,7 @@ \subsection{Basics} it by $\psi=1-\varphi=-1/\varphi$.) A similar formula, using the very same numbers $\varphi$ and $\psi$, exists for the Lucas numbers:% \begin{equation} -\ell_{n}=\varphi^{n}+\psi^{n}.\label{eq.binet.l}% +\ell_{n}=\varphi^{n}+\psi^{n}. \label{eq.binet.l}% \end{equation} @@ -6758,6 +19301,9 @@ \subsection{Basics} values. Here are some further examples of $\left( a,b\right) $-recurrent sequences: \begin{itemize} +\item The sequence $\left( x_{0},x_{1},x_{2},\ldots\right) $ in Theorem +\ref{thm.rec-seq.fibx} is $\left( a,b\right) $-recurrent (by its very definition). 
+ \item A sequence $\left( x_{0},x_{1},x_{2},\ldots\right) $ is $\left( 2,-1\right) $-recurrent if and only if every $n\geq2$ satisfies $x_{n}=2x_{n-1}-x_{n-2}$. In other words, a sequence $\left( x_{0}% @@ -7357,19 +19903,6 @@ \subsection{Additional exercises} a,b\right) $-recurrent sequences (e.g., Lucas numbers). \end{remark} -\begin{exercise} -\label{exe.rec.addition}\textbf{(a)} Let $\left( f_{0},f_{1},f_{2}% -,\ldots\right) $ be the Fibonacci sequence. Prove that $f_{m+n}=f_{m}% -f_{n+1}+f_{m-1}f_{n}$ for any positive integer $m$ and any $n\in\mathbb{N}$. - -\textbf{(b)} Generalize to $\left( a,b\right) $-recurrent sequences with -arbitrary $a$ and $b$. - -\textbf{(c)} Let $\left( f_{0},f_{1},f_{2},\ldots\right) $ be the Fibonacci -sequence. Prove that $f_{m}\mid f_{mk}$ for any $m\in\mathbb{N}$ and -$k\in\mathbb{N}$. -\end{exercise} - \begin{exercise} \label{exe.rec.fibonomial}\textbf{(a)} Let $\left( f_{0},f_{1},f_{2}% ,\ldots\right) $ be the Fibonacci sequence. For every $n\in\mathbb{N}$ and @@ -9778,6 +22311,11 @@ \subsection{More on signs of permutations} give a new solution to Exercise \ref{exe.ps2.2.5} \textbf{(b)}. \end{exercise} +The next exercise relies on the notion of ``the list of all elements of $S$ in +increasing order (with no repetitions)'', where $S$ is a finite set of +integers. This notion means exactly what it says; it was rigorously defined in +Definition \ref{def.ind.inclist}. + \begin{exercise} \label{exe.Ialbe}Let $n\in\mathbb{N}$. Let $I$ be a subset of $\left\{ 1,2,\ldots,n\right\} $. Let $k=\left\vert I\right\vert $. Let $\left( @@ -11006,8 +23544,8 @@ \subsection{\label{sect.commring}Commutative rings} for the additive inverse of $a$. This is the same as $0_{\mathbb{K}} - a$. The intuition for commutative rings is essentially that all computations that -can be made with the operations $+$, $-$ and $\cdot$ on integers can be -similarly made in a commutative ring. 
For instance, if $a_{1},a_{2}% +can be performed with the operations $+$, $-$ and $\cdot$ on integers can be +similarly made in any commutative ring. For instance, if $a_{1},a_{2}% ,\ldots,a_{n}$ are $n$ elements of a commutative ring, then the sum $a_{1}+a_{2}+\cdots+a_{n}$ is well-defined, and can be computed by adding the elements $a_{1},a_{2},\ldots,a_{n}$ to each other in any order\footnote{For @@ -11015,9 +23553,21 @@ \subsection{\label{sect.commring}Commutative rings} ways: For example, we can first add $a$ and $b$, then add $c$ and $d$, and finally add the two results; alternatively, we can first add $a$ and $b$, then add $d$ to the result, then add $c$ to the result. In a commutative ring, all -such ways lead to the same result. To prove this is a slightly tedious -induction argument that uses commutativity and associativity.}. The same holds -for products. If $n$ is an integer and $a$ is an element of a commutative ring +such ways lead to the same result.}. More generally: If $S$ is a finite set, +if $\mathbb{K}$ is a commutative ring, and if $\left( a_{s}\right) _{s\in +S}$ is a $\mathbb{K}$-valued $S$-family\footnote{See Definition +\ref{def.ind.families.fams} for the definition of this notion.}, then the sum +$\sum_{s\in S}a_{s}$ is defined in the same way as finite sums of numbers were +defined in Section \ref{sect.sums-repetitorium} (but with $\mathbb{A}$ +replaced by $\mathbb{K}$, of course\footnote{and, consequently, $0$ replaced +by $0_{\mathbb{K}}$}); this definition is still legitimate\footnote{i.e., the +result does not depend on the choice of $t$ in (\ref{eq.sum.def.1})}, and +these finite sums of elements of $\mathbb{K}$ satisfy the same properties as +finite sums of numbers (see Section \ref{sect.sums-repetitorium} for these +properties). All this can be proven in the same way as it was proven for +numbers (in Section \ref{sect.ind.gen-com} and Section +\ref{sect.sums-repetitorium}). The same holds for finite products. 
+Furthermore, if $n$ is an integer and $a$ is an element of a commutative ring $\mathbb{K}$, then we define an element $na$ of $\mathbb{K}$ by% \[ na=% @@ -11210,8 +23760,18 @@ \subsection{Matrices} 1,2,\ldots,m\right\} $ to $\mathbb{K}$. We represent such a map as a rectangular table by writing the image of $\left( i,j\right) \in\left\{ 1,2,\ldots,n\right\} \times\left\{ 1,2,\ldots,m\right\} $ into the cell in -the $i$-th row and the $j$-th column.} For instance, when $\mathbb{K}% -=\mathbb{Q}$, the table $\left( +the $i$-th row and the $j$-th column. +\par +Thus, the notion of an $n\times m$-matrix is closely akin to what we called an +\textquotedblleft$n\times m$-table of elements of $\mathbb{K}$% +\textquotedblright\ in Definition \ref{def.ind.families.rectab}. The main +difference between these two notions is that an $n\times m$-matrix +\textquotedblleft knows\textquotedblright\ $\mathbb{K}$, whereas an $n\times +m$-table does not (i.e., two $n\times m$-matrices that have the same entries +in the same positions but are defined using different commutative rings +$\mathbb{K}$ are considered different, but two such $n\times m$-tables are +considered identical).} For instance, when $\mathbb{K}=\mathbb{Q}$, the table +$\left( \begin{array} [c]{ccc}% 1 & -2/5 & 4\\ @@ -26858,7 +39418,8 @@ \subsection{\label{sect.laplace}Laplace expansion in multiple rows/columns} \item If $I$ is a finite set of integers, then $w\left( I\right) $ shall denote the list of all elements of $I$ in increasing order (with no -repetitions). (For example, $w\left( \left\{ 3,4,8\right\} \right) +repetitions). (See Definition \ref{def.ind.inclist} for the formal definition +of this list.) (For example, $w\left( \left\{ 3,4,8\right\} \right) =\left( 3,4,8\right) $.) \end{itemize} @@ -28030,7 +40591,8 @@ \subsection{Additional exercises} I=\sum_{i\in I}i$.) \item Let $w\left( I\right) $ denote the list of all elements of $I$ in -increasing order (with no repetitions). 
(For example, $w\left( \left\{ +increasing order (with no repetitions). (See Definition \ref{def.ind.inclist} +for the formal definition of this list.) (For example, $w\left( \left\{ 3,4,8\right\} \right) =\left( 3,4,8\right) $.) \item Let $\left( A\mid B_{\bullet,I}\right) $ denote the $n\times\left( @@ -37970,7 +50532,8 @@ \subsubsection{First solution} \begin{equation} m\%c\equiv m\operatorname{mod}c. \label{pf.prop.sol.choose.a/b.lem1.rem.2}% \end{equation} - +Indeed, these two relations follow from Corollary \ref{cor.ind.quo-rem.remmod} +\textbf{(a)} (applied to $N=c$ and $n=m$). We shall now show that two of the $c+1$ integers $b^{0}\%c,b^{1}% \%c,\ldots,b^{c}\%c$ are equal. @@ -39438,14 +52001,16 @@ \subsection{Solution to Exercise \ref{exe.ps2.2.1}} $a=0$).}). \textit{Proof of (\ref{sol.ps2.2.1.claim}):} We shall prove -(\ref{sol.ps2.2.1.claim}) by strong induction on $n$. So we fix some -$N\in\mathbb{N}$, and we assume that (\ref{sol.ps2.2.1.claim}) is already +(\ref{sol.ps2.2.1.claim}) by strong induction\footnote{See Section +\ref{sect.ind.SIP} for an introduction to strong induction.} on $n$. So we fix +some $N\in\mathbb{N}$, and we assume that (\ref{sol.ps2.2.1.claim}) is already proven for every $n