Thoughts of a Programmer: Niven Achenjang's Personal Website (feed: https://nivent.github.io/feed.xml, generated by Jekyll, updated 2019-04-03T20:33:10+00:00)
Dedekind Domains Done Right (published 2019-03-26T00:00:00+00:00, https://nivent.github.io/blog/dedekind-domain)
<p>I’ve wanted to write this post for a long time now. Dedekind domains are, objectively, the best rings in existence, and their greatness stems from one fact: ideals in a Dedekind domain factor uniquely into (finite) products of prime ideals. However, I’ve never seen a proof of this fact that I liked (i.e. one that’s straightforward enough for me to actually remember) <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>, and so this post is my attempt to remedy this situation. Like (almost) always, I’ll start by introducing some background I’ll need in the proof, and then I’ll actually get into the good stuff in a separate section.</p>
<h1 id="background">Background</h1>
<p>I’ll start with a fact about localizations that I think I’ve used before on this blog, but never actually stated/proven.</p>
<p>First, observe that given a ring $R$, a multiplicative set $S\subset R$, and an $R$-module $M$, we can form an $\sinv R$-module $\sinv M$ which we call the localization of $M$ at $S$ (away from $S$? I can never remember what preposition to use). The construction is exactly what you expect: elements of $\sinv M$ are formal fractions $\frac ms$ with $m\in M$ and $s\in S$, and we say that</p>
<script type="math/tex; mode=display">\frac ms=\frac nt\iff\exists u\in S:u\cdot(t\cdot m-s\cdot n)=0,</script>
<p>where we’ve explicitly used $\cdot$ to emphasize that $m,n$ are module elements while $u,s,t$ are ring elements. If you know about tensor products, then you can show that we also have</p>
<script type="math/tex; mode=display">\sinv M\simeq M\otimes_R\sinv R,</script>
<p>so localization is really just extension of scalars. Now, onto the fact.</p>
<div class="proposition">
Let $M$ be an $R$-module. Then, $M=0\iff M_\mfm=0$ for all maximal ideals $\mfm\subset R$.
</div>
<div class="proof4">
One of these directions is easy, so we'll prove the other one. Assume that $M_\mfm=0$ for all maximal ideals $\mfm\subset R$, and suppose that $M\neq0$. Fix some nonzero $x\in M$, and let $\mfm$ be a maximal ideal containing
$$\ann(x)=\bracks{r\in R:r\cdot x=0}\subsetneq R.$$
Then, $\frac x1=\frac01\in M_\mfm$ so there exists some $u\in R\sm\mfm\subset R\sm\ann(x)$ such that $u\cdot x=0$. This means that $u\in(R\sm\ann(x))\cap\ann(x)=\emptyset$, which is just wonderful.
</div>
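<div class="example">
A quick sanity check of the proposition (my own illustrative example, not part of the argument above): take $R=\Z$ and $M=\Z/6\Z$. Localizing at the maximal ideals of $\Z$ gives
$$M_{(2)}\simeq\Z/2\Z,\qquad M_{(3)}\simeq\Z/3\Z,\qquad M_{(p)}=0\text{ for primes }p\ge5.$$
For instance, $\frac21=\frac01$ in $M_{(2)}$ since $u=3\in\Z\sm(2)$ satisfies $3\cdot2=0$ in $M$, and for $p\ge5$ the element $6\in\Z\sm(p)$ kills all of $M$. The module $M$ is nonzero and, as the proposition predicts, at least one of its localizations at a maximal ideal (here two of them) is nonzero too.
</div>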
<p>The real utility of this proposition comes from the fact that given an $R$-linear map $f:M\to N$, and a multiplicative set $S\subset R$, we can always form the $\sinv R$-linear map $\sinv f:\sinv M\to\sinv N$ given by $\sinv f(m/s)=f(m)/s$.</p>
<div class="exercise">
Show that we always have $\im(\sinv f)=\sinv\im(f)$ and $\ker(\sinv f)=\sinv\ker(f)$, so localization commutes with taking images and kernels. Also show localization commutes with forming quotients if you feel like it.
</div>
<div class="corollary">
Let $f:M\to N$ be a map between $R$-modules. Then, $f$ is an injection (or surjection or bijection) iff $f_\mfm$ is an injection (or surjection or bijection) for all maximal $\mfm\subset R$.
</div>
<div class="proof4">
Use the exercise to apply the proposition to $\ker(f)$ and $\coker(f)=N/\im(f)$.
</div>
<p>That’s one thing down. The second thing we’ll need is generalized Cayley-Hamilton.</p>
<div class="theorem" name="Cayley-Hamilton">
Let $M$ be an $R$-module generated by $n$ elements, and let $T:M\to M$ be some $R$-linear map. Then, there exists a monic polynomial $f\in R[x]$ of degree $n$ such that $f(T)=0\in\End_R(M)$.
</div>
<div class="proof4">
First, fix a surjection $\pi:R^n\to M$, and note that we can form a commutative square
<center>
<img src="https://nivent.github.io/images/blog/dedekind-domain/ch.png" width="150" height="100" />
</center>
and so lift $T$ to a map $\alpha:R^n\to R^n$. The top map comes from the fact that $R^n$ is free so we can construct it by choosing any lifts of $T(\pi(e_i))$ for $e_1,\dots,e_n$ a basis for $R^n$. Furthermore, this lift is nice in that $p(\alpha)\in\End_R(R^n)$ lifts $p(T)\in\End_R(M)$ for any $p\in R[x]$, so to prove the theorem, it suffices to find some $f\in R[x]$ s.t. $f(\alpha)=0$. Since $R^n$ is free, we have a matrix $(a_{ij})_{i,j=1}^n$ defined by writing $\alpha(e_j)=\sum_{i=1}^na_{ij} e_i$. With this in mind, let $A=\Z\sqbracks{\bracks{x_{ij}}_{i,j=1}^n}$, and consider the universal matrix
$$S=\Mat{x_{11}}\cdots{x_{1n}}\vdots\ddots\vdots{x_{n1}}\cdots{x_{nn}}\in\End_A(A^n).$$
Now, let $\pi_T:A\to R$ be the homomorphism determined by $\pi_T(x_{ij})=a_{ij}$ for all $i,j$. With this map, we are reduced to finding some monic $g\in A[x]$ of degree $n$ s.t. $g(S)=0$ since then applying $\pi_T$ gives some monic $f\in R[x]$ of degree $n$ s.t. $f(\alpha)=0$. Luckily for us, $A$ is a domain, so we can take $g(x)=\det(xI-S)$ where $I$ is the identity matrix. To see why this works, note that we can embed $\Frac(A)\into\C$ (since $\trdeg_{\Q}\Frac(A)<\trdeg_{\Q}\C$ and $\C$ is algebraically closed), and so we are reduced to Cayley-Hamilton over $\C$ where this choice of $g$ definitely works.
</div>
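<div class="example">
To see what the theorem says in the smallest interesting case, here is a sketch with $M=R^2$ (so $n=2$ and $T=\alpha$); the entries $a,b,c,d\in R$ are just placeholder names. If
$$\alpha=\begin{pmatrix}a&b\\c&d\end{pmatrix},\qquad g(x)=\det(xI-\alpha)=x^2-(a+d)x+(ad-bc),$$
then a direct computation gives
$$\alpha^2-(a+d)\alpha=\begin{pmatrix}a^2+bc&b(a+d)\\c(a+d)&d^2+bc\end{pmatrix}-\begin{pmatrix}a^2+ad&b(a+d)\\c(a+d)&ad+d^2\end{pmatrix}=-(ad-bc)I,$$
so $g(\alpha)=0$. This check only uses the ring axioms, which is the point of the universal matrix trick: the identity holds over $\Z[x_{11},x_{12},x_{21},x_{22}]$, hence over any commutative ring.
</div>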
<h1 id="the-good-stuff">The Good Stuff</h1>
<p>Now we’re here. Before getting into Dedekind domains, let’s briefly discuss dvrs (which we’ll see are just local Dedekind domains).</p>
<div class="definition">
Let $K$ be a field and $\Gamma$ be a totally ordered abelian group. A <b>valuation</b> $v:\units K\to\Gamma$ is a map such that
<ol>
<li> $v(ab)=v(a)+v(b)$ for all $a,b\in\units K$. </li>
<li> $v(a+b)\ge\min(v(a),v(b))$ with equality if $v(a)\neq v(b)$ for all $a,b\in\units K$.</li>
</ol>
We sometimes also say $v(0)=\infty$. A valuation is <b>discrete</b> if $\Gamma=\Z$.
</div>
<div class="definition">
Given a discrete valuation $v:\units K\to\Z$, its associated <b>discrete valuation ring (dvr)</b> is
$$R=\bracks{\alpha\in\units K:v(\alpha)\ge0}\cup\{0\}.$$
</div>
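<div class="example">
The standard example to keep in mind (not spelled out above, so treat the specific numbers as illustration) is the $p$-adic valuation on $K=\Q$: write a nonzero rational as $p^k\frac ab$ with $p\nmid ab$ and set $v_p\parens{p^k\frac ab}=k$. With $p=3$,
$$v_3(18)=2,\qquad v_3\parens{\tfrac59}=-2,\qquad v_3\parens{18+\tfrac59}=v_3\parens{\tfrac{167}9}=-2=\min(2,-2),$$
while for equal valuations the inequality can be strict: $v_3(3)=v_3(6)=1$ but $v_3(3+6)=2$. The associated dvr is $\Z_{(3)}$, the rationals whose denominator is prime to $3$.
</div>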
<p>Dvrs are very nice as detailed in the following theorem.</p>
<div class="theorem">
Let $R$ be a dvr with valuation $v:\units{\Frac(R)}\to\Z$. Then,
<ol>
<li> $R$ is a domain. </li>
<li> $\alpha\in\units R\iff v(\alpha)=0$. </li>
<li> $\mfm=\bracks{\alpha\in R:v(\alpha)>0}$ is maximal (this and 2. imply that $R$ is local) </li>
<li> $\mfm=(t)$ for any $t\in R$ with $v(t)=1$. </li>
<li> $R$ is a PID </li>
<li> Any nonzero $x\in R$ is of the form $x=ut^n$ with $u\in\units R$, $n\ge0$, and $v(t)=1$. (this and 5. imply that every nonzero ideal of $R$ is of the form $(t^n)$ for some $n$) </li>
</ol>
</div>
<div class="proof4">
<ol>
<li>
If $a,b\in R$ are nonzero, then $v(ab)=v(a)+v(b)$ is finite, so $ab\neq0$.
</li>
<li>
$ab=1$ implies that $v(a)+v(b)=v(1)=0$. Since $v(a),v(b)\ge0$, we must have $v(a)=v(b)=0$. Conversely, assume $v(a)=0$, and pick $b\in K$ s.t. $ab=1$. Then, $v(b)=v(a)+v(b)=v(1)=0$, so $b\in R$ and hence $a\in\units R$.
</li>
<li>
If $a,b\in\mfm$ and $r\in R$, then $v(rb)=v(r)+v(b)>0$ and $v(a+b)\ge\min(v(a),v(b))>0$ so $\mfm$ is an ideal. Since it's literally all the nonunits by 2., it's maximal and the only maximal ideal.
</li>
<li>
Fix $t\in R$ with $v(t)=1$, and fix any nonzero $r\in\mfm$. There exists $\alpha\in\Frac(R)$ such that $\alpha t=r$, so $v(\alpha)+1=v(\alpha)+v(t)=v(r)>0$. Thus, $v(\alpha)\ge0$, so $\alpha\in R$ and $r\in(t)$.
</li>
<li>
Let $I\subset R$ be a nonzero ideal and fix nonzero $t\in I$ with minimal valuation. Let $r\in I$ be any other nonzero element, and again fix $\alpha\in\Frac(R)$ s.t. $\alpha t=r$. Then, $v(\alpha)+v(t)=v(r)\ge v(t)$, so $v(\alpha)\ge 0$ which means $\alpha\in R$ and $r\in(t)$, so $I=(t)$.
</li>
<li>
Fix some nonzero $x\in R$, and let $n=v(x)$. There's some $u\in\Frac(R)$ such that $ut^n=x$, so $v(u)+n=v(u)+v(t^n)=v(x)=n$. Hence, $v(u)=0$, so $u\in\units R$.
</li>
</ol>
</div>
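<div class="example">
As an illustration of the theorem (the choice of ring is my own assumption for the sake of example; any dvr behaves the same way), take $R=\Z_{(3)}\subset\Q$, the rationals with denominator prime to $3$, with $v=v_3$ the exponent of $3$ in a rational number. The element $t=3$ generates $\mfm$, and, e.g.,
$$\frac{18}5=\frac25\cdot3^2,\qquad v_3\parens{\tfrac25}=0,$$
so $x=\frac{18}5$ has the form $ut^n$ with $u=\frac25\in\units R$ and $n=v(x)=2$. Accordingly, the nonzero ideals of $R$ form a single chain
$$R\supsetneq(3)\supsetneq(9)\supsetneq(27)\supsetneq\cdots.$$
</div>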
<p>And finally, what is a Dedekind domain?</p>
<div class="definition">
An integral domain $R$ is a <b>Dedekind domain</b> if it is 1-dimensional, integrally closed, and noetherian.
</div>
<div class="example">
Any PID is a Dedekind domain.
</div>
<div class="example">
Let $K/\Q$ be a finite extension, and let $\ints K\subset K$ be the integral closure of $\Z$ in $K$. Then, $\ints K$ is Dedekind (more generally, the integral closure of a Dedekind domain in a finite extension of its fraction field is Dedekind).
</div>
<div class="example">
Let $k$ be algebraically closed, and let $f\in k[x,y]$ be irreducible and "smooth," i.e. for all $c\in k^2$, (at least) one of $f(c),\pderiv fx(c),\pderiv fy(c)$ is nonzero. Then, $k[x,y]/(f)$ is a Dedekind domain (we need exactly two variables here so that the quotient is 1-dimensional).
</div>
<p>I’ll leave verifying these examples up to you. Our first lemma is that local Dedekind domains are dvrs.</p>
<div class="lemma">
Let $A$ be a local Dedekind domain. Then, $A$ is a dvr.
</div>
<div class="proof4">
Let $\mfm\subset A$ be its maximal ideal, and fix any $t\in\mfm\sm\mfm^2$. We claim that $\mfm=(t)$. First note that $\bar\mfm\subset A/(t)$ is the unique maximal ideal of the local 0-dimensional noetherian (hence local artinian) ring $A/(t)$ and so is nilpotent. Hence, $\mfm^n\subset(t)$ for some $n\ge1$. Suppose that $n>1$ and that $\mfm^{n-1}\not\subset(t)$; choose any $r\in\mfm^{n-1}\sm(t)$. Given any $m\in\mfm$, we have $rm\in\mfm^n\subset(t)$, say $rm=ta$ with $a\in A$. If $a$ were a unit, then $t=a^{-1}rm\in\mfm^n\subset\mfm^2$, contradicting the choice of $t$; hence $a\in\mfm$, and multiplication by $r/t$ gives a well-defined map
$$\frac rt:\mfm\to\mfm$$
which is $A$-linear. Since $\mfm$ is finitely generated ($A$ is noetherian) and $A$ is a domain, Cayley-Hamilton implies that $\frac rt\in\Frac(A)$ is integral over $A$. Since $A$ is integrally closed, we have $r/t\in A$ which contradicts $r\in\mfm^{n-1}\sm(t)$. Hence, $\mfm^{n-1}\subset(t)$, and induction then shows that in fact $\mfm\subset(t)\subsetneq A$, so $\mfm=(t)$. Now, given any $x\in A\sm\{0\}$, let $v(x)\in\Z_{\ge0}$ be the largest $n$ such that $t^n$ divides $x$. This is well-defined because if $t^n\mid x\neq0$ for all $n$, then we could form the infinite chain
$$\parens{x}\subsetneq\parens{\frac xt}\subsetneq\parens{\frac x{t^2}}\subsetneq\parens{\frac x{t^3}}\subsetneq\cdots,$$
which contradicts $A$ being noetherian. Furthermore, it is clear that $v(ab)=v(a)+v(b)$ always and that $v(a+b)\ge\min(v(a),v(b))$ with equality if $v(a)\neq v(b)$ because that's how division works. Now, note that $v$ extends to a map $\units{\Frac(A)}\to\Z$ via $v(x/y)=v(x)-v(y)$. To finish, we need to show that
$$A=\bracks{\alpha\in\units{\Frac(A)}:v(\alpha)\ge0}\cup\{0\}.$$
This is because if $v(\alpha)\ge0$, then we can write $\alpha=x/y$ for some $x,y\in A$ with $v(y)=0$. However, $v(y)=0\implies y\in A\sm(t)=A\sm\mfm=\units A$ so $\alpha=x/y\in A$.
</div>
<div class="corollary">
A ring $A$ is a dvr iff $A$ is a local Dedekind domain iff $A$ is a local PID that is not a field.
</div>
<p>The above was most of the work in proving this theorem about factoring ideals into primes. The rest of the proof is essentially the observation that localizing a Dedekind domain at a prime gives you a local Dedekind domain.</p>
<div class="theorem" name="Structure Theorem">
Let $A$ be a Dedekind domain, and let $(0)\neq I\subset A$ be an ideal. Then, $I$ factors uniquely into a finite product of prime ideals.
</div>
<div class="proof4">
Fix any nonzero prime (i.e. maximal) $\mfp\subset A$. Since localizations preserve noetherianness and integral closedness and since the only prime inside of $\mfp$ is $(0)$ (i.e. $\mfp$ has height 1), we see that $A_\mfp$ is a local Dedekind domain, and so a dvr. Hence, $IA_\mfp=(\mfp A_\mfp)^{v_{\mfp}(I)}$ for some $v_{\mfp}(I)\in\Z_{\ge0}$. Let
$$J=\prod_{(0)\neq\mfp\in\spec A}\mfp^{v_\mfp(I)}=\bigcap_{(0)\neq\mfp\in\spec A}\mfp^{v_\mfp(I)},$$
and note that $I\subseteq J$ since $I\subseteq IA_\mfp\cap A=\mfp^{v_\mfp(I)}$ for all $\mfp$. Since the inclusion $I\into J$ is (by construction) locally an isomorphism (i.e. $I_\mfp=J_\mfp$), we conclude that in fact $I=J$, so $I$ is a product of primes. This product is unique since the exponents are recoverable via localization, so we only need to show that this product is finite (i.e. that $v_\mfp(I)=0$ almost always). This is equivalent to the claim that $I$ is contained in only finitely many maximal ideals, but that's true since $A/I$ is Artinian and Artinian rings only have finitely many maximal ideals. Thus, we win.
</div>
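<div class="example">
Here is how the proof plays out in the simplest case, $A=\Z$ and $I=(12)$ (a sketch with numbers of my choosing). Localizing,
$$I\Z_{(2)}=(4)=(2\Z_{(2)})^2,\qquad I\Z_{(3)}=(3)=(3\Z_{(3)})^1,\qquad I\Z_{(p)}=\Z_{(p)}\text{ for }p\ge5,$$
so $v_{(2)}(I)=2$, $v_{(3)}(I)=1$, all other exponents vanish, and $J=(2)^2(3)=(12)=I$. For an example where the primes are not principal, one can check by hand that in $\Z[\sqrt{-5}]$
$$(6)=(2,1+\sqrt{-5})^2\,(3,1+\sqrt{-5})\,(3,1-\sqrt{-5}),$$
even though $6=2\cdot3=(1+\sqrt{-5})(1-\sqrt{-5})$ factors into irreducible elements in two genuinely different ways.
</div>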
<p>Well, that wasn’t so bad, was it? I feel like this is a nice proof because it doesn’t require many technical lemmas and you can even get away with not mentioning fractional ideals. To end this post, you can try proving a converse to the structure theorem.</p>
<div class="exercise">
Let $A$ be an integral domain such that every nonzero ideal $I\subset A$ factors uniquely into a finite product of prime ideals. Show that $A$ is a Dedekind domain.
</div>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Although Serre comes close in his “Local Fields” book (maybe I should admit that I’ve only seen two approaches aside from the one I’ll show here) <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
$\ell$-adic Representations of Elliptic Curves (published 2019-03-24T00:00:00+00:00, https://nivent.github.io/blog/elliptic-irrep)
<p><b>If you’re somehow seeing this right now, look away. It’s not finished, and I’m not sure when/if it will be</b></p>
<p>The title of this post is technically incorrect. We won’t be talking about representations of elliptic curves, but about representations attached to (associated with?) elliptic curves. In particular, the goal of this post is to prove that given an elliptic curve $E/\Q$ defined over the rationals, and a prime $\l$, the $\l$-adic representation $G_{\Q}\to\GL(T_\l(E))\iso\GL_2(\Z_\l)\subset\GL_2(\Q_\l)$ is irreducible where $G_{\Q}=\Gal(\Qbar/\Q)$ is the absolute Galois group of $\Q$. If you don’t know what some of these words mean, don’t worry; I’ll explain.</p>
<h1 id="prelim-on-elliptic-curves">Prelim on Elliptic Curves</h1>
Some Classic Affine Algebraic Geometry (published 2019-03-24T00:00:00+00:00, https://nivent.github.io/blog/comm-alg)
<p>This will serve as background for a post I wanna write on Dedekind domains. The idea here is to blaze through <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> some geometry so I can use words like “Artinian ring” and “Krull dimension” when discussing Dedekind domains. The plan is to talk about integral extensions of rings (+ other stuff), spectra of rings, dimensions of rings, and then Artinian rings. In a sense, this might be more commutative algebra than (classic) algebraic geometry, but what’s the difference? Note that all rings in this post are commutative with unity (and all ring maps preserve the unit).</p>
<h1 id="a-mess-of-words-i-dont-think-ive-defined-on-this-blog-before">A Mess of Words I Don’t Think I’ve Defined on this Blog Before</h1>
<p>For completeness, before getting into integral extensions I want to really quickly fill a couple gaps in this blog by saying what a module and an algebra are.</p>
<div class="definition">
Let $R$ be a ring. A <b>(left) $R$-module</b> $M$ is an abelian group with an $R$-action $R\by M\to M$ satisfying the following
<ol>
<li> $r(m_1+m_2)=rm_1+rm_2$ always.</li>
<li> $(r_2r_1)m=r_2(r_1m)$ always.</li>
<li> $(r_1+r_2)m=r_1m+r_2m$ always.</li>
<li> $1m=m$ where $1\in R$ is the multiplicative unit</li>
</ol>
</div>
<div class="remark">
An $R$-module structure on an abelian group $M$ is the same thing as a ring homomorphism $R\to\End(M)$ into $M$'s endomorphism ring.
</div>
<div class="remark">
If $R$ is a field, then an $R$-module is an $R$-vector space. If $R=\Z$, then an $R$-module is an abelian group.
</div>
<div class="definition">
Let $R$ be a ring. An <b>$R$-algebra $A$</b> is a ring with an $R$-module structure such that
$$r(a_1a_2)=(ra_1)a_2=a_1(ra_2).$$
</div>
<div class="remark">
An $R$-algebra is the same thing as a ring $A$ with a ring map $R\to Z(A)$ into the center of $A$.
</div>
<p>Modules, and in particular algebras, will feature heavily in this post. Modules in general can be rather poorly behaved, but we can maintain our sanity if we restrict ourselves to modules over a commutative ring that satisfies a nice finiteness property.</p>
<div class="definition">
An $R$-module $M$ is called <b>Noetherian</b> if it satisfies the <b>ascending chain condition</b> on submodules; that is, any chain
$$M_1\subseteq M_2\subseteq M_3\subseteq\cdots\subseteq M$$
of submodules eventually stabilizes (i.e. $M_n=M_{n+1}=\cdots$ for some $n\ge1$). If $R$ is Noetherian as an $R$-module, then we call it a <b>Noetherian ring</b>.
</div>
<div class="remark">
I never defined what a submodule is, but I think it's not that hard to figure out. However, while I'm on the subject of submodules, two things: (1) quotient modules always exist (i.e. given any submodule $N\subseteq M$, the natural quotient $M/N$ can be given an $R$-module structure) and (2) an $R$-submodule of $R$ is the same thing as an ideal.
</div>
<p>The Noetherian property is a kind of natural generalization of being a PID. This is maybe not immediately obvious from the given definition, but this next proposition will help.</p>
<div class="proposition">
Let $M$ be an $R$-module. Then $M$ is Noetherian iff every submodule of $M$ is finitely generated.
</div>
<div class="proof4">
$(\to)$ Assume $M$ is Noetherian, and let $N\subset M$ be a submodule. Fix some $x_1\in N$, and let $N_1=Rx_1$ be the module it generates. Inductively choose $x_n\in M$ ($n>1$) such that $x_n\in N\sm N_{n-1}$ if $N_{n-1}\neq N$, and $x_n=x_{n-1}$ otherwise; set $N_n=\sum_{i=1}^nRx_i$. This gives an ascending chain
$$N_1\subseteq N_2\subseteq N_3\subseteq\cdots\subseteq M$$
of submodules of $M$ (submodules of $N$ even), and so must stabilize at the $m$th step for some $m$. Then, $N_m=N$ (otherwise, $N_{m+1}$ would be bigger by construction), so $N$ is generated by $m<\infty$ elements.
<br />
$(\from)$ Exercise.
</div>
<div class="corollary">
A ring is Noetherian iff every ideal is finitely generated.
</div>
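<div class="example">
Two illustrations (standard examples, not taken from the text above). In $\Z$, every ideal is generated by a single element, and an ascending chain such as
$$(12)\subsetneq(6)\subsetneq(3)\subsetneq\Z$$
has to stop because each strict step passes to a proper divisor of the previous generator. By contrast, the polynomial ring $k[x_1,x_2,x_3,\dots]$ in infinitely many variables over a field $k$ is not Noetherian: the chain
$$(x_1)\subsetneq(x_1,x_2)\subsetneq(x_1,x_2,x_3)\subsetneq\cdots$$
never stabilizes, and correspondingly the ideal $(x_1,x_2,\dots)$ is not finitely generated.
</div>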
<div class="remark">
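This proposition makes it clear that submodules and quotients of a Noetherian module are also Noetherian (since submodules of $N\subset M$ are still submodules of $M$, and submodules of $M/N$ correspond to submodules of $M$ containing $N$). Beware that the analogous statement for subrings is false: a subring of a Noetherian ring need not be Noetherian.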
</div>
<p>Another nice thing to know about the Noetherian property is that almost every ring/module you will ever care about is Noetherian.</p>
<div class="theorem">
Let $R$ be a noetherian ring. Then every finitely generated (f.g.) $R$-module is noetherian.
</div>
<div class="proof4">
The main part of the proof is to show that noetherianness is preserved under extensions (i.e. if $N\subset M$ and $M/N$ are noetherian, then so is $M$), so we only prove this (to finish, show f.g. free modules $R^{\oplus m}$ are noetherian, and every f.g. module is a quotient of a f.g. free module). Consider a short exact sequence of $R$-modules
$$0\too N\too M\too M/N\too 0$$
where $N$ and $M/N$ are Noetherian. Let $M'\subset M$ be any $R$-submodule, so we get another short exact sequence
$$0\too N\cap M'\xtoo fM'\xtoo gM'/(N\cap M')\too0$$
where $N\cap M'\subseteq N$ and $M'/(N\cap M')\simeq (N+M')/N\subseteq M/N$. Hence, these modules are finitely generated, say by $\{e_1,\dots,e_n\}\subseteq N\cap M'$ and $\{\bar f_1,\dots,\bar f_m\}\subseteq M'/(N\cap M')$, respectively. Because $g$ is surjective, we can pick (non-unique) lifts $f_1,\dots,f_m\in M'$ of $\bar f_1,\dots,\bar f_m$. Now, pick any $x\in M'$, and write (non-uniquely) $g(x)=r_1\bar f_1+\dots+r_m\bar f_m$ with $r_i\in R$, so $x-(r_1f_1+\dots+r_mf_m)\in\ker g=\im f$. Hence,
$$x=(r_1f_1+\dots+r_mf_m)+(s_1e_1+\dots+s_ne_n)$$
for some $s_j\in R$ (where we've identified $e_i\in N\cap M'$ with $f(e_i)\in M'$ since $f$ is injective). Thus, $M'$ is generated by $\{e_1,\dots,e_n\}\cup\{f_1,\dots,f_m\}$, so every submodule of $M$ is finitely generated.
</div>
<div class="theorem" name="Hilbert Basis">
Let $R$ be a Noetherian ring. Then, $R[x]$ is Noetherian as well.
</div>
<div class="proof4">
Exercise (hint: given an ideal $I\subseteq R[x]$, first consider the ideal of leading coefficients of elements of $I$)
</div>
<div class="corollary">
Let $R$ be a Noetherian ring. Then every finitely generated $R$-algebra is Noetherian.
</div>
<p>I think that’s all the highlights about Noetherianness. Let’s get into integral extensions.</p>
<div class="definition">
Let $\phi:R\to S$ be a ring map (e.g. an injection of rings). Then, $s\in S$ is <b>integral over $R$</b> if it is the root of a monic polynomial in $R[x]$. If all $s\in S$ are integral over $R$, then $S$ is <b>integral over $R$</b> or <b>an integral extension of $\phi(R)$</b>.
</div>
<div class="definition">
Let $\phi:R\to S$ be a ring map. The <b>integral closure of $R$ in $S$</b> is the set
$$\bar R=\{s\in S:s\text{ is integral over }R\}.$$
We say that $R$ is <b>integrally closed in $S$</b> if $\bar R=R$, and we say that a domain $R$ is <b>integrally closed</b> if it is integrally closed in $\Frac(R)$.
</div>
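<div class="example">
A small worked example (my own choice of numbers): the element $\varphi=\frac{1+\sqrt5}2\in\Q(\sqrt5)$ is integral over $\Z$, since
$$\varphi^2-\varphi-1=\frac{3+\sqrt5}2-\frac{1+\sqrt5}2-1=0.$$
Because $\varphi\notin\Z[\sqrt5]$, the domain $\Z[\sqrt5]$ is not integrally closed. On the other hand, $\frac12\in\Q$ is not integral over $\Z$: if $\parens{\frac12}^n+a_{n-1}\parens{\frac12}^{n-1}+\dots+a_0=0$ with $a_i\in\Z$, then multiplying by $2^n$ gives $1+2(\cdots)=0$ with $(\cdots)\in\Z$, which is impossible.
</div>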
<div class="proposition">
Integral closures are rings.
</div>
<div class="proof4">
Let $\phi:R\to S$ be a ring map, and let $a,b\in S$ be integral over $R$. We need to show that $ab$ and $a+b$ are integral over $R$ as well (I guess we also need that $-a$ is integral over $R$). This is a consequence of the following lemma (consider the ring $R[a,b]$).
</div>
<div class="lemma">
Let $\phi:R\to S$ be a ring map. Then, $s\in S$ is integral over $R$ iff there exists a subring $S'\subseteq S$ containing $s$ which is finitely generated as an $R$-module.
</div>
<div class="proof4">
$(\to)$ Assume that $s\in S$ is integral over $R$, and let $d$ be the degree of a monic polynomial in $R[x]$ vanishing at $s$. Then, $R[s]\subseteq S$ is a ring generated as an $R$-module by $\{s^k\}_{k=0}^{d-1}$.
<br />
$(\from)$ Let $S'\subseteq S$ be a subring containing $s$ which is finitely generated as an $R$-module. Write $S'=\sum_{i=1}^nRe_i$ with $e_i\in S'$. Note that the multiplication by $s$ map $m_s:S'\to S', x\mapsto sx$ is $R$-linear, and so can be represented (non-uniquely) by some matrix. That is, $m_s(e_i)=\sum_ja_{ij}e_j$ for some $a_{ij}\in R$, so
$$\Mat s\dots0\vdots\ddots\vdots0\dots s\vVec{e_1}\vdots{e_n}=\Mat{a_{11}}\dots{a_{1n}}\vdots\ddots\vdots{a_{n1}}\dots{a_{nn}}\vVec{e_1}\vdots{e_n}.$$
Subtracting now gives
$$\Mat {s-a_{11}}\dots{-a_{1n}}\vdots\ddots\vdots{-a_{n1}}\dots{s-a_{nn}}\vVec{e_1}\vdots{e_n}=\vVec0\vdots0.$$
Call the matrix on the left $M$. By Cramer's rule, we can form its adjugate $M^{\mrm{adj}}$ so that $M^{\mrm{adj}}M=\det M\cdot I$ where $I$ is the identity matrix. Multiplying the above equation by the adjugate gives $(\det M)e_i=0$ for all $i$. Since the $e_i$'s generate $S'$, we conclude $(\det M)x=0$ for all $x\in S'$. Taking $x=1$ gives $\det M=0$. Finally, $\det M$ literally is a monic polynomial with $R$-coefficients evaluated at $s$, so we win.
</div>
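<div class="example">
To see the determinant trick in action, here is a sketch with $R=\Z$, $S'=\Z[\sqrt2]$ generated by $e_1=1$, $e_2=\sqrt2$, and $s=1+\sqrt2$ (all illustrative choices). Multiplication by $s$ gives
$$s\cdot e_1=1\cdot e_1+1\cdot e_2,\qquad s\cdot e_2=2\cdot e_1+1\cdot e_2,$$
so the matrix $M$ of the proof has determinant
$$\det\begin{pmatrix}s-1&-1\\-2&s-1\end{pmatrix}=(s-1)^2-2=s^2-2s-1.$$
Indeed, $s^2-2s-1=(3+2\sqrt2)-(2+2\sqrt2)-1=0$, so $s$ is a root of the monic polynomial $x^2-2x-1\in\Z[x]$.
</div>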
<div class="remark">
A finitely generated $R$-algebra $S$ is integral over $R$ iff $S$ is finitely generated as an $R$-module.
</div>
<p>Integral extensions are the ring-theoretic analogue of algebraic extensions from field theory. Indeed, if $R$ is a field, then $x$ is integral over $R$ iff it is algebraic over $R$. Integral extensions have many nice properties, some of which are collected below.</p>
<div class="proposition">
Integral closures are integrally closed.
</div>
<div class="proof4">
Exercise.
</div>
<div class="proposition">
UFDs are integrally closed.
</div>
<div class="proof4">
Exercise (Hint: this is basically the rational root theorem).
</div>
<div class="proposition">
Let $\phi:A\to B$ be a ring map, and let $S\subset A$ be a multiplicative set. If $B$ is integral over $A$, then $\phi(S)^{-1}B$ is integral over $\sinv A$.
</div>
<div class="proof4">
Exercise.
</div>
<div class="corollary">
Let $S\subset R$ be a multiplicative set, and suppose that $R$ is integrally closed. Then, $\sinv R$ is integrally closed as well.
</div>
<div class="proposition">
Let $A\subset B$ be an integral extension of domains. Then, $A$ is a field iff $B$ is a field.
</div>
<div class="proof4">
$(\to)$ Suppose that $A$ is a field, and fix any nonzero $b\in B$. Then, $b$ is the root of some monic
$$f(x)=x^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0\in A[x],$$
which we take to be of minimal degree. This means that
$$-a_0=b(b^{n-1}+a_{n-1}b^{n-2}+\cdots+a_1),$$
where $a_0\neq0$ (if $a_0=0$, then since $b\neq0$ and $B$ is a domain the second factor vanishes, contradicting minimality of $n$). Hence $\inv b=\frac{-1}{a_0}(b^{n-1}+a_{n-1}b^{n-2}+\dots+a_1)\in B$.
<br />
($\from$) This direction is proved similarly using that $\inv a\in B$ for any nonzero $a\in A$.
</div>
<div class="proposition">
Let $A\subset B$ be an integral extension of rings. Let $\mfa\subsetneq B$ be an ideal, and let $\mfa'=\mfa\cap A$. Then, $B/\mfa$ is integral over $A/\mfa'$.
</div>
<div class="proof4">
Just take a monic polynomial and reduce it mod $\mfa'$.
</div>
<div class="proposition">
Let $A\subset B$ be an integral ring extension, and let $\mfp\subset B$ be a prime ideal. If $\mfp\cap A$ is maximal, then so is $\mfp$.
</div>
<div class="proof4">
$B/\mfp$ is integral over the field $A/(\mfp\cap A)$.
</div>
<div class="theorem" name="Going up">
Let $A\subset B$ be an integral ring extension, and let $\mfp\subset A$ be prime. Then, there's some prime ideal $\mfp'\subset B$ with $\mfp'\cap A=\mfp$.
</div>
<div class="proof4">
Let $S=A\sm\mfp$, so $\sinv B$ is integral over the local ring $\sinv A=A_{\mfp}$ with unique maximal ideal $\sinv\mfp$. Let $\mfm\subset\sinv B$ be any maximal ideal. Then, $\mfm'=\mfm\cap\sinv A$ is prime, so $\sinv B/\mfm$ is a field integral over $\sinv A/\mfm'$; hence, $\mfm'$ is maximal and so must be $\sinv\mfp$. Thus, $\mfp'=\mfm\cap B$ is a prime of $B$ lying over $\mfp$ since
$$(\mfm\cap B)\cap A=\mfm\cap A=(\mfm\cap\sinv A)\cap A=\sinv\mfp\cap A=\mfp.$$
</div>
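<div class="example">
For a concrete instance (standard facts about the Gaussian integers, stated without proof), take $A=\Z\subset B=\Z[i]$, which is integral since $i^2+1=0$. Over $\mfp=(5)$ there are two primes of $B$, coming from $5=(2+i)(2-i)$:
$$(2+i)\cap\Z=(5)=(2-i)\cap\Z,$$
while over $\mfp=(3)$ there is exactly one, namely $3\Z[i]$, which is still prime. Either way the set of primes of $B$ lying over $\mfp$ is nonempty, as the theorem promises.
</div>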
<p>Alright. I think that’s more than enough about integral extensions for now. Bottom line: they’re preserved by reasonable operations and they play nicely with primes.</p>
<h1 id="finally-some-geometry">Finally, Some Geometry</h1>
<p>Let’s actually justify the use of the word geometry in the title of this post by talking about $\spec$. Hilbert’s Nullstellensatz (which we’ll prove later) shows that, for an algebraically closed field $k$, points in $k^n$ are in bijection with maximal ideals of $k[x_1,\dots,x_n]$. Since we’d like to do geometry in purely algebraic settings (i.e. over an arbitrary ring $R$), we may think that a good replacement for $k^n$ is the set of maximal ideals of $R$. However, there’s a better choice: the set of prime ideals of $R$.</p>
<div class="definition">
Let $R$ be a ring. Its <b>spectrum</b> is
$$\spec R=\bracks{\mfp\subset R:\mfp\text{ is prime}}.$$
</div>
<p>Now, these spectres or whatever are supposed to be our replacements for things like $\C^n$, so they better be geometric in some sense. At the very least, they better have a topology.</p>
<div class="definition">
Given an ideal $I\subseteq R$, its "vanishing set" is $V(I)=\bracks{\mfp\in\spec R:\mfp\supseteq I}$. The <b>Zariski topology</b> on $\spec R$ is given by having closed sets be $V(I)$ for $I$ an ideal.
</div>
<div class="exercise">
Prove that this is an actual topology. Also prove that $V(I)$ is homeomorphic to $\spec R/I$. If you haven't had enough of this exercise, show that the open set $\spec R\sm V(r)$ (where $r\in R$) is homeomorphic to $\spec R_r=\spec R[1/r]$.
</div>
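<div class="example">
As a sanity check on the definitions (an example of my choosing), take $R=\Z$. The points of $\spec\Z$ are $(0)$ and $(p)$ for $p$ prime, and
$$V\big((12)\big)=\bracks{(2),(3)},\qquad V\big((0)\big)=\spec\Z,\qquad V\big((1)\big)=\emptyset.$$
In general $V((n))$ for $n\neq0$ is the finite set of primes dividing $n$, so the proper closed sets are exactly the finite sets of maximal ideals. In particular, the point $(0)$ is not closed; its closure is all of $\spec\Z$.
</div>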
<div class="remark">
Given a ring map $f:R\to S$, we get an induced map on spectra $\spec S\to\spec R$ given by sending $\mfp\subset S$ to $\inv f(\mfp)\subset R$. This map is easily seen to be continuous, and so $\spec$ is a contravariant functor from Rng to Top.
</div>
<p>The idea behind the Zariski topology is that “zero sets of polynomials” should be closed <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>, so how do we get from that to this? Well, think of $\spec R$ as some variety (e.g. if $R=\C[x,y]/(y^2-x^3-x)$, then $\spec R$ is the curve given by $y^2=x^3+x$) and we want elements of $R$ to be functions on $\spec R$ (i.e. intuitively, elements of $R$ should give well-defined polynomials on $\spec R$ (see previous parenthetical)). The natural way of realizing this is to say that $r(\mfp)= r\pmod\mfp$ where $r\in R$ and $\mfp\in\spec R$ <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>, so $r(\mfp)=0$ precisely when $r\in\mfp$. With this in mind, the points of $\spec R$ at which every function in an ideal $I$ vanishes <sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup> are exactly those that contain $I$.</p>
<p>We now have a topological space associated to an arbitrary ring $R$. How should we think about it geometrically? Well, motivated by Hilbert’s Nullstellensatz, we should think of the maximal ideals of $R$ as the (closed) points of $\spec R$ (point here used geometrically as a 0-dimensional thing. Any element of $\spec R$ could be reasonably called a point since it’s a space). Motivated by the exercise, the non-maximal prime ideals should correspond to higher-dimensional subvarieties: curves and hyperplanes and whatnot. This does raise the question though: what do I mean by dimension here? To answer that, I first need to define irreducible sets. <sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup></p>
<div class="definition">
A subvariety $V(I)\subset\spec R$ (note: $I$ arbitrary) is called <b>irreducible</b> if it is nonempty and writing $V(I)=V(J_1)\cup V(J_2)$ requires that $V(I)=V(J_1)$ or $V(I)=V(J_2)$.
</div>
<div class="exercise">
Show that $V(I)$ is irreducible iff $V(I)=V(\mfp)$ with $\mfp\subset R$ prime.
</div>
<p>Irreducible varieties are the ones that we care about; these are things like 1 point, 1 curve, 1 plane, etc. Intuitively, an irreducible variety cannot contain two things of the same dimension (otherwise you could write it as the union of those two things), so if $V(I)\subsetneq V(J)$ with both irreducible, you would expect that $\dim V(I)<\dim V(J)$.</p>
<div class="definition">
Let $X=\spec R$. Then, $\dim X$ is defined to be the largest $n$ for which there is a chain $X_0\subsetneq X_1\subsetneq\dots\subsetneq X_n\subseteq X$ of irreducible subvarieties of $X$. Similarly, $\dim R$ is defined to be $\dim(\spec R)$.
</div>
<div class="remark">
By the previous exercise, the dimension of a ring is the length of its longest chain of prime ideals.
</div>
<div class="remark">
It is fairly easy to see that $\dim A\ge\dim A_\mfp+\dim A/\mfp$ for all $\mfp\in\spec A$. It turns out that this is an equality when $A$ is a domain finitely generated over a field (it can fail for general rings). I won't prove this in this post, but after finishing this section, you should be able to do this by inducting on $\dim A$ and making use of Noether normalization.
</div>
<div class="example">
Let $\A_k^n=\spec k[x_1,\dots,x_n]$ where $k$ is a field. Then, $\dim\A_k^n=n$. In general, $\dim R[x]=1+\dim R$ for $R$ Noetherian.
</div>
<div class="example">
$\dim\Z=1$. In general, $\dim R\le1$ if $R$ is a PID.
</div>
<div class="example">
$\dim k=0$ if $k$ is a field (this is not an iff!).
</div>
<div class="remark">
$\dim R=0$ iff all prime ideals are maximal. For $R$ a domain, $\dim R=1$ iff all nonzero prime ideals are maximal.
</div>
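<div class="example">
To make the chain definition concrete (a sketch, with $k$ any field): in $k[x,y]$ there is the chain of primes
$$(0)\subsetneq(x)\subsetneq(x,y),$$
which corresponds to the chain of irreducible closed sets $V(x,y)\subsetneq V(x)\subsetneq V(0)$, i.e. a point inside a line inside the plane. This shows $\dim k[x,y]\ge2$; the content of the claim $\dim k[x,y]=2$ is that no longer chain exists.
</div>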
<p>For the rest of this section, I’ll discuss some alternative definitions of dimension when your ring $R$ is niceish. Fix a field $k$. Unless otherwise stated, assume that $R$ is a finitely generated $k$-algebra for the rest of the section. This means that we can write $R=k[x_1,\dots,x_n]/I$.</p>
<div class="definition">
An <b>affine irreducible variety</b> is an irreducible set $\spec k[x_1,\dots,x_n]/I=V(I)\subset\A_k^n$.
</div>
<div class="definition">
Let $\phi:\spec B\to\spec A$ be a morphism induced by a $k$-algebra homomorphism $f:A\to B$. We say that $\phi$ is <b>finite</b> if $B$ is integral over $f(A)$.
</div>
<div class="theorem">
If $\phi:\spec B\to\spec A$ is finite, then
<ol>
<li> $\phi$ is a closed map </li>
<li> for any $\mfp\in\spec A$, $\inv\phi(\mfp)$ is finite. </li>
<li> $\phi$ is surjective if it was induced by an injective map of rings (and conversely when $A$ is reduced) </li>
</ol>
</div>
<div class="proof4">
I'll only prove 2. Assume $\phi$ is finite and induced by $f:A\to B$. Fix any $\mfp\in\spec A$, and let $B_\mfp=f(A\sm\mfp)^{-1}B$. Then, $B_\mfp$ is integral over the image of $A_\mfp=(A\sm\mfp)^{-1}A$ and finitely generated as an algebra over it, and so is a finitely generated $A_\mfp$-module. Note that primes of $B_\mfp$ are in bijection with primes of $B$ away (i.e. disjoint) from $f(A\sm\mfp)$, so in particular, any $\mfq\in\inv\phi(\mfp)$ corresponds to a unique prime of $B_\mfp$. Now, let $\mfm=\mfp A_\mfp$ be the unique maximal ideal of $A_\mfp$, and consider the $A_\mfp/\mfm$-algebra $B_\mfp/f(\mfm)B_\mfp$. Note that $B_\mfp/f(\mfm)B_\mfp$ is a finitely generated $A_\mfp/\mfm$-module (i.e. a finite-dimensional vector space), and so has only finitely many prime ideals (we'll see this when we talk about Artinian rings). However, prime ideals of $B_\mfp/f(\mfm)B_\mfp$ are prime ideals of $B_\mfp$ containing $f(\mfm)$, and so are a superset of $\inv\phi(\mfp)$; thus, we win.
</div>
<div class="lemma" name="Noether normalization">
Let $X\subset\A_k^n$ be an irreducible subvariety. Then, there's a map $\pi:\A_k^n\onto\A_k^d$ such that the composition $X\to\A_k^n\onto\A_k^d$ is finite and surjective.
</div>
<div class="proof4">
We'll prove this assuming $k$ infinite (the general case is similar but with a non-linear transformation). Note that, by induction, it suffices to prove this in the case that $X=\spec k[x_1,\dots,x_n]/(f)$ is an irreducible hypersurface, so that's what we'll do. For an arbitrary $c=(c_1,c_2,\dots,c_{n-1})\in k^{n-1}$, consider the projection $\A_k^n\to\A_k^{n-1}$ induced by $y_i=c_ix_n+x_i$ for $i=1,\dots,n-1$. Write $f=f_d+f_{d-1}+\dots+f_0$ where $d=\deg f$ and $f_i$ is homogeneous of degree $i$ for all $i$. Then, $x_n$ satisfies the polynomial
$$g(T)=f(y_1-c_1T,y_2-c_2T,\dots,y_{n-1}-c_{n-1}T,T)\in k[y_1,\dots,y_{n-1}][T].$$
Furthermore, the leading term of $g(T)$ is $f_d(-c_1T,-c_2T,\dots,-c_{n-1}T,T)=T^df_d(-c_1,-c_2,\dots,-c_{n-1},1)$. Since $f_d\neq0\in k[x_1,\dots,x_n]$ and $k$ is infinite, we can fix a choice of $r=(r_1,\dots,r_n)\in k^n$ s.t. $f_d(r)\neq0$. Possibly after reordering, we may assume $r_n\neq0$, so $f_d(r/r_n)=f_d(r)/r_n^d\neq0$. Thus, choosing $c=(-r_1/r_n,\dots,-r_{n-1}/r_n)\in k^{n-1}$ causes the composite $\pi_c:X\to\A_k^n\to\A_k^{n-1}$ induced by the map
$$\mapdesc\phi{k[y_1,\dots,y_{n-1}]}{k[x_1,\dots,x_n]/(f)}{y_i}{c_ix_n+x_i+(f)}$$
to be finite; this choice of $c$ makes $x_n$ integral over the image, and the other $x_i$ are also integral over the image since $x_i=\phi(y_i)-c_ix_n$ and being integral is closed under ring operations. Hence, we only need to show that $\pi_c$ is surjective (i.e. that $\phi$ is injective). Fix some $p(y_1,\dots,y_{n-1})\in\ker\phi$ so
$$p(c_1x_n+x_1,\dots,c_{n-1}x_n+x_{n-1})=f(x_1,\dots,x_n)q(x_1,\dots,x_n).$$
Now, divide $p(y_1,\dots,y_{n-1})$ by $f(y_1-c_1y_n,\dots,y_{n-1}-c_{n-1}y_n,y_n)$ in $k(y_1,\dots,y_{n-1})[y_n]$ to get
$$p(y_1,\dots,y_{n-1})=\st q(y_1,\dots,y_n)f(y_1-c_1y_n,\dots,y_{n-1}-c_{n-1}y_n,y_n)+r(y_1,\dots,y_n).$$
Suppose that $p\neq0$; we can apply $\phi$ to $y_i$ ($i < n$) and send $y_n\mapsto x_n$, resulting in
$$p(c_1x_n+x_1,\dots,c_{n-1}x_n+x_{n-1})=r(c_1x_n+x_1,\dots,c_{n-1}x_n+x_{n-1},x_n),$$
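so, written in terms of $y_1,\dots,y_{n-1},x_n$, the degree of $p$ in $x_n$ is the degree of $r$ in $x_n$, which is less than $d$ (in fact, $p$ does not involve $x_n$ at all). On the other hand, in these coordinates $f$ has degree exactly $d$ in $x_n$ with a nonzero constant as its leading coefficient, so any nonzero multiple of $f$ has degree at least $d$ in $x_n$. This contradicts $p(c_1x_n+x_1,\dots,c_{n-1}x_n+x_{n-1})=f(x_1,\dots,x_n)q(x_1,\dots,x_n)$, so we must have $p=0$. Hence, $\phi$ is injective, and $\pi_c$ is surjective.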
</div>
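<p>Here is how the proof plays out in about the smallest interesting case. This is just a sketch (the hyperbola is my choice of example, not something used elsewhere in this post), but it shows why the change of coordinates is needed.</p>
<div class="example">
Take $n=2$ and $f=x_1x_2-1$, so $X$ is a hyperbola. With $y_1=cx_2+x_1$, we get
$$g(T)=f(y_1-cT,T)=(y_1-cT)T-1=-cT^2+y_1T-1\in k[y_1][T].$$
If $c=0$ (i.e. we project straight onto the $x_1$-axis), the leading coefficient vanishes, and indeed $x_2=1/x_1$ is not integral over $k[x_1]$; geometrically, the fiber over $x_1=0$ is empty, so this projection is not even surjective. For any $c\neq0$, dividing by $-c$ gives a monic polynomial satisfied by $x_2$, so $X\to\A_k^1$ is finite. For instance, when $c=1$, one checks directly that
$$x_2^2-(x_1+x_2)x_2+1=1-x_1x_2=0\in k[x_1,x_2]/(f).$$
</div>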
<p>This is a nice lemma. For example, we can use it to give another definition of dimension.</p>
<div class="proposition">
Let $R=k[x_1,\dots,x_n]/\mfp$ where $\mfp$ is prime, and let $\spec R\onto\A_k^d$ be the map from Noether normalization. Then, $d=\dim R$.
</div>
<div class="proof4">
We have that $R$ is integral over $k[x_1,\dots,x_d]$. Hence, any chain of primes in $R$ restricts to a chain of primes of the same length in $k[x_1,\dots,x_d]$ (Incomparability), and any chain of primes in $k[x_1,\dots,x_d]$ lifts to a chain of primes in $R$ (Going up), so $\dim R=\dim k[x_1,\dots,x_d]=d$.
</div>
<div class="corollary">
Let $R=k[x_1,\dots,x_n]/\mfp$ where $\mfp$ is prime. Then, $\trdeg_k\Frac R=\dim R$.
</div>
<div class="proof4">
Noether normalization gives the existence of $d=\dim R$ algebraically independent elements $y_1,\dots,y_d\in R$ such that $R$ is integral over $S=k[y_1,\dots,y_d]$. Let $F=\Frac(R)$ and $K=\Frac(S)$. Since $R$ is integral over $S$, $F/K$ is algebraic so $\trdeg_kF=\trdeg_kK=d$.
</div>
<p>We also (finally) get a nice proof of the Nullstellensatz.</p>
<div class="lemma" name="Zariski's">
Let $k$ be a field, $A$ a finitely generated $k$-algebra, and $\mfm$ a maximal ideal of $A$. Then, $A/\mfm$ is a finite degree field extension of $k$.
</div>
<div class="proof4">
Note that $\dim(A/\mfm)=0$ since it's a field, so Noether normalization gives the existence of a surjective, finite map $\spec(A/\mfm)\to\spec k$. This means that $A/\mfm$ is integral over $k$ and hence an algebraic extension. Since $A$ was finitely generated as a $k$-algebra, we conclude that $A/\mfm$ is a finitely generated algebraic extension of $k$ (i.e. a finite extension of $k$).
</div>
<div class="definition">
Let $k$ be a field. For $S\subset k^n$, let
$$I(S)=\bracks{f\in k[x_1,\dots,x_n]:\forall x\in S,f(x)=0}.$$
Conversely, for $J\subset k[x_1,\dots,x_n]$, let
$$V(J)=\bracks{x\in k^n:\forall f\in J,f(x)=0}.$$
</div>
<div class="theorem" name="Hilbert's Nullstellensatz">
Let $k$ be an algebraically closed field. Then, maximal ideals of $k[x_1,\dots,x_n]$ are in bijection with points of $k^n$: $(a_1,\dots,a_n)\in k^n\mapsto\mfm_a=(x_1-a_1,\dots,x_n-a_n)$.
</div>
<div class="proof4">
Let $\mfm$ be a maximal ideal of $k[x_1,\dots,x_n]$. Then, $k[x_1,\dots,x_n]/\mfm$ is a finite degree extension of the algebraically closed field $k$, and so must itself be $k$. Pick a $k$-isomorphism $\phi:k[x_1,\dots,x_n]/\mfm\to k$, and let $a_i=\phi(x_i)$. Then, $\mfm\supseteq\mfm_a$ where $a=(a_1,\dots,a_n)$, so $\mfm=\mfm_a$ since $\mfm_a$ is maximal and $\mfm$ is proper.
</div>
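<p>It's worth seeing what goes wrong without the algebraically closed hypothesis. The following standard example is only an illustration, with $k=\mathbb R$ chosen for convenience.</p>
<div class="example">
Let $k=\mathbb R$ and $\mfm=(x^2+1)\subset\mathbb R[x]$. Since $x^2+1$ is irreducible, $\mfm$ is maximal, and
$$\mathbb R[x]/(x^2+1)\simeq\mathbb C$$
is a degree 2 extension of $\mathbb R$, consistent with Zariski's lemma. However, $\mfm\neq\mfm_a=(x-a)$ for every $a\in\mathbb R$ since $x^2+1$ has no real roots, so $\mfm$ does not correspond to a point of $\mathbb R^1$.
</div>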
<h1 id="artinian-rings">Artinian Rings</h1>
<p>I think at this point, we’ve had a nice little introduction to algebraic geometry. I mentioned at the beginning that this post was motivated primarily by my desire to give a nice treatment of Dedekind domains. For that, I’ll need to make use of some facts about Artinian rings, so this is what we’ll end on. Artinian rings are dual to Noetherian ones, but as we’ll see, they are much more constrained.</p>
<div class="definition">
A ring $A$ is called <b>Artinian</b> if it satisfies the <b>descending chain condition</b> on ideals; that is, any chain
$$A\supseteq I_0\supseteq I_1\supseteq\dots$$
of ideals stabilizes.
</div>
<div class="example">
If $k$ is a field, then $k[x]/(x^n)$ is Artinian.
</div>
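<p>To see the descending chain condition in action, it may help to spell out this example and compare it with a ring that fails the condition. The comparison with $\Z$ is my addition.</p>
<div class="example">
Since $k[x]$ is a PID, every ideal of $k[x]/(x^n)$ is of the form $(x^i)$ with $0\le i\le n$, so every strictly descending chain of ideals is a piece of
$$(1)\supsetneq(x)\supsetneq(x^2)\supsetneq\cdots\supsetneq(x^n)=0.$$
On the other hand, $\Z$ is Noetherian but not Artinian, since the chain
$$(2)\supsetneq(4)\supsetneq(8)\supsetneq\cdots$$
never stabilizes.
</div>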
<p>We want to understand the structure of Artinian rings. We first observe that they only have finitely many maximal ideals.</p>
<div class="proposition">
Let $A$ be an Artinian ring. Then, $A$ has finitely many maximal ideals.
</div>
<div class="proof4">
Suppose otherwise and let $\mfm_1,\mfm_2,\dots$ be an infinite list of distinct maximal ideals. Then,
$$\mfm_1\supseteq\mfm_1\cap\mfm_2\supseteq\mfm_1\cap\mfm_2\cap\mfm_3\supseteq\cdots$$
is a descending chain of ideals, so $\mfm_1\cap\cdots\cap\mfm_n=\mfm_1\cap\cdots\cap\mfm_n\cap\mfm_{n+1}$ for some $n$. Then, Chinese remainder theorem gives
$$\prod_{i=1}^n\frac A{\mfm_i}\simeq\prod_{i=1}^{n+1}\frac A{\mfm_i},$$
which is absurd.
</div>
<p>This proposition is actually stronger than it may seem at first, since in fact all prime ideals of $A$ are maximal.</p>
<div class="proposition">
Let $A$ be an Artinian ring. Then, $\dim A=0$.
</div>
<div class="proof4">
Let $\mfp\subset A$ be prime. Then, $A/\mfp$ is an Artinian domain. Pick any nonzero $x\in A/\mfp$ and consider the chain $(x)\supseteq(x^2)\supseteq(x^3)\supseteq\dots$. This stabilizes, so $x^n=ux^{n+1}$ for some $n$ and some $u\in A/\mfp$. Thus,
$$0=x^n(ux-1).$$
Since $A/\mfp$ is a domain and $x\neq0$, we must have $ux=1$, so $x$ is a unit. Hence, $A/\mfp$ is a field, so $\mfp$ is maximal.
</div>
<div class="corollary">
$A$ Artinian $\implies\spec A$ finite and discrete.
</div>
<p>We’ll next show one of the stranger properties of Artinian rings: they’re isomorphic to the product of their localizations. To do this, we’ll need to make use of Nakayama’s lemma which I’ll state without proof <sup id="fnref:6"><a href="#fn:6" class="footnote">6</a></sup>. We only need the first version, but the second is also nice.</p>
<div class="lemma" name="Nakayama's">
Let $R$ be a commutative ring, and let $\mf M=\cap\mfm$ where $\mfm$ ranges over maximal ideals of $R$. Let $M$ be a finitely generated $R$-module. Then, $\mf M\cdot M=M\implies M=0$.
</div>
<div class="corollary">
Let $R$ be a local ring with maximal ideal $\mfm$, and let $M$ be a finitely generated $R$-module. Then, $M/\mfm M$ is a finite dimensional $R/\mfm$-vector space. A subset $X\subset M$ generates $M$ iff its image $\bar X\subset M/\mfm M$ generates $M/\mfm M$.
</div>
<div class="proposition">
Let $A$ be a local Artinian ring. Then, its maximal ideal is nilpotent.
</div>
<div class="proof4">
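Let $\mfm\subset A$ be maximal. The chain $\mfm\supseteq\mfm^2\supseteq\cdots$ tells us that $\mfm^n=\mfm^{n+1}$ for some $n$; write $I=\mfm^n$, so $I^2=I$. Suppose that $I\neq0$. We don't yet know that $I$ is finitely generated, so we can't apply Nakayama to it directly. Instead, use the descending chain condition to pick an ideal $J$ minimal among those with $IJ\neq0$. Picking $x\in J$ with $xI\neq0$, minimality gives $J=(x)$. Furthermore, $(xI)I=xI^2=xI\neq0$ and $xI\subseteq(x)$, so minimality again forces $xI=(x)$. Now, $(x)$ is finitely generated and $\mfm\cdot(x)\supseteq I\cdot(x)=(x)$. Since $\mfm$ is the only maximal ideal, Nakayama says that $(x)=0$, a contradiction. Hence, $\mfm^n=0$.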
</div>
<div class="proposition">
Let $A$ be Artinian. Then,
$$A\simeq\prod_{\mfm\in\spec A}A_\mfm.$$
</div>
<div class="proof4">
Since $\spec A$ is finite, we can pick some $N$ large enough that $(\mfm A_\mfm)^N=\mfm^NA_\mfm=0$ for all $\mfm\in\spec A$ (each $A_\mfm$ is a local Artinian ring, so its maximal ideal is nilpotent). Fix a prime $\mfm\in\spec A$, and consider the natural map $f:A\to A_\mfm$. It's clear that $\ker f\supseteq\mfm^N$, but we claim that this is an equality. Pick any $a\in\ker f$. Then, $a/1=0/1$ so there's some $s\not\in\mfm$ s.t. $sa=0\in\mfm^N$. As $s\not\in\mfm$, an easy induction argument shows that $a\in\mfm^N$, so $\ker f=\mfm^N$. We also claim that $f$ is surjective. Pick some $s\not\in\mfm$. The image of $s$ in $A/\mfm^N$ is invertible (one stupid way to see this is that $(s)+\mfm^N=A$ by checking locally), so there's some $a\in A$ s.t. $sa-1\in\mfm^N$. Hence,
$$sa-1\in\mfm^NA_\mfm\implies sa-1=0\in A_\mfm\implies\frac a1=\frac1s,$$
so $f$ is surjective (for elements with non-unit numerator, use that $f$ is multiplicative). Thus, $A/\mfm^N\simeq A_\mfm$. To finish, let $J=\prod_{\mfm\in\spec A}\mfm$ be the nilradical of $A$. Then, $J^N=0$ since locally, $J^N_\mfm=\mfm^NA_\mfm=0$, and so Chinese remainder theorem gives
$$A\simeq A/J^N\simeq\prod_{\mfm\in\spec A}\frac A{\mfm^N}\simeq\prod_{\mfm\in\spec A}A_\mfm$$
as desired.
</div>
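<p>A small sanity check of this proposition, using a ring where everything can be computed by hand. The choice of $\Z/12\Z$ is just for illustration.</p>
<div class="example">
Let $A=\Z/12\Z$, which is Artinian because it is finite. Its primes are $(2)$ and $(3)$, and we can take $N=2$ since
$$(2)^2=(4),\qquad(3)^2=(9)=(3),\qquad J^2=(6)^2=(36)=0.$$
The proposition then recovers
$$\Z/12\Z\simeq\frac A{(4)}\by\frac A{(3)}\simeq\Z/4\Z\by\Z/3\Z,$$
where the two factors are the localizations $A_{(2)}$ and $A_{(3)}$ (e.g. in $A_{(2)}$, the element $3$ becomes a unit, so $3\cdot4=0$ forces $4=0$).
</div>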
<div class="corollary">
Let $A$ be Artinian. Then, $A$ is Noetherian.
</div>
<div class="proof4">
By the proposition (and the easy fact that finite products of noetherian rings are noetherian), it suffices to prove this when $A$ is local. Let $\mfm\subset A$ be its unique maximal ideal, and write $\mfm^N=0$. Then,
$$0=\mfm^N\subseteq\mfm^{N-1}\subseteq\cdots\subseteq\mfm\subseteq A$$
is a finite filtration of $A$ giving rise to the short exact sequences ($0\le n < N$)
$$0\too\mfm^{n+1}\too\mfm^n\too\mfm^n/\mfm^{n+1}\too0.$$
Since $0$ is Noetherian and extensions of noetherian modules are noetherian (I don't think I proved this either but it's also easy), it suffices to show that $\mfm^n/\mfm^{n+1}$ is a noetherian $A$-module for all $n$. However, $\mfm^n/\mfm^{n+1}$ is a vector space over $A/\mfm$, and is finite-dimensional because otherwise you could get an infinite descending chain (which you can then pull back to $A$). Since it's a finite dimensional vector space, it's also noetherian (e.g. because it's a free $A/\mfm$-module of finite rank and so noetherian as an $A/\mfm$-module, but every $A$-submodule is an $A/\mfm$-submodule so an ascending chain of $A$-submodules is an ascending chain of $A/\mfm$-submodules which then must stabilize). Thus, we win.
</div>
<p>In the end, we’ve shown that every Artinian ring is a $0$-dimensional, Noetherian ring. One can in fact show the converse, so a ring is Artinian precisely when it’s 0-dimensional and Noetherian. I’ll leave the reverse direction as an exercise <sup id="fnref:7"><a href="#fn:7" class="footnote">7</a></sup>.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>This means there will be omitted definitions/examples and probably a lot of things left as exercises <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>The connection to polynomials comes from the fact that classically people tend to work with finitely generated $k$-algebras, so your ring $R$ looks like $R=k[x_1,\dots,x_n]/I$. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Note that this is weird because $r$ doesn’t have a (nice) well-defined domain. The image of every $\mfp\in\spec R$ lies in a different ring. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>Equivalently, on each of the generators of $I$ <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>I should mention that I think the term variety is usually reserved for irreducible sets (and general $V(I)$ may instead be called algebraic sets), but oh well. <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>Exercise: prove it <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
<li id="fn:7">
<p>It should be possible to show that the nilradical is nilpotent, and then $A$ is the product of its localizations (each with a nilpotent maximal ideal). From this, it’s not hard to conclude that $A$ is artinian. <a href="#fnref:7" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>The Duality Between Algebra and Geometry2019-03-24T00:00:00+00:002019-03-24T00:00:00+00:00https://nivent.github.io/blog/dual-alg-geo<p><b>If you’re somehow seeing this right now, look away. It’s not finished, and I’m not sure when/if it will be</b></p>
<p>I talked a little bit about the topic of this post’s title in a <a href="../comm-alg">recent post</a>, but I want to stress that this <a href="https://www.youtube.com/watch?v=F8mYLi3PGOc">duality between algebra and geometry</a> goes beyond <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> this $\spec$ business. In particular, I’m going to discuss a more “topological” setting where we see nice interplay between algebra and geometry <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>: relating a (compact, Hausdorff) topological space to its ring of (real-valued) continuous functions.</p>
<h1 id="prelim-on-separation-axioms">Prelim on Separation Axioms</h1>
<p>This might just be because there’s no point-set topology class at my school <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>, but I get the sense that too many people don’t know about the theory of separation axioms for topological spaces. Sadly, I do not think I have enough space in this post to develop this theory, but I can state some needed highlights.</p>
<h1 id="x-c0x">$X$ “=” $C^0(X)$</h1>
<h1 id="swans-theorem">Swan’s Theorem</h1>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>I don’t actually know that much about this area, so maybe what I’m gonna talk about in this post is somehow the same as this $\spec$ stuff. That would surprise me though. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>It really bothers me that topology and geometry are very similar in general character, but the only word I know for capturing both of them at once is “geometry”. Like, sometimes I say “geometry” and mean “topology/geometry” but other times I say it and mean just “geometry”. How’s anyone supposed to understand what I’m saying? <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Which is really a shame <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Covering Spaces, $\pi_1$-actions, and Locally Constant Sheaves2019-03-22T00:00:00+00:002019-03-22T00:00:00+00:00https://nivent.github.io/blog/cover-fundgrp-sheaf<p>It makes me happy to know that this post will be the one knocking the “Covering Spaces” post off of the front page. This one will cover a related topic but (hopefully) with the noticeable difference that while that post is trash, this one will be somewhat well-written. That being said, I’m going to be (mostly) stepping away from the (certain flavor of) number theory that I have been writing about, and make my next few posts more geometric/topological. <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> Kicking things off, this post will be about showing an equivalence between 3 seemingly different<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> kinds of objects. I’ll start off by briefly introducing categories and sheaves <sup id="fnref:7"><a href="#fn:7" class="footnote">3</a></sup>; then I’ll say some things about covers, and finally get into the good stuff.</p>
<h1 id="categories-and-sheaves-sheafs">Categories and Sheaves (Sheafs?)</h1>
<p>I’ve wanted to introduce categories on this blog for a long time now, but this is not the context in which I imagined I would do it. <sup id="fnref:3"><a href="#fn:3" class="footnote">4</a></sup> Anyways, what’s a category?</p>
<div class="definition">
A <b>(small) category</b> $\mc C$ is a collection (not necessarily a set) $\ob\mc C$ of <b>objects</b> and, for each pair $A,B\in\ob\mc C$ of objects in $\mc C$, a set $\mc C(A,B)=\Hom_{\mc C}(A,B)$ of <b>morphisms</b> such that
<ol>
<li> If $f\in\Hom_{\mc C}(A,B)$ and $g\in\Hom_{\mc C}(B,C)$ are morphisms, then there is a unique composite morphism $g\circ f\in\Hom_{\mc C}(A,C)$. Furthermore, composition of morphisms is associative so $h\circ(g\circ f)=(h\circ g)\circ f$ whenever either (and hence both) side is defined.</li>
<li> For all $A\in\ob\mc C$, there exists an identity morphism $1_A\in\Hom_{\mc C}(A,A)$ such that, for all $B\in\ob\mc C$, all $f\in\Hom_{\mc C}(A,B)$, and all $g\in\Hom_{\mc C}(B,A)$, we have $f=f\circ1_A$ and $g=1_A\circ g$. </li>
</ol>
If $f\in\mc C(A,B)$ we will often denote this by writing $f:A\to B$ like we do for normal functions. Finally, let $\Mor\mc C=\bigcup\Hom_{\mc C}(A,B)$ be the collection of all morphisms in $\mc C$.
</div>
<div class="remark">
Our categories are small because we require our hom-sets to be, well, sets instead of allowing (potentially proper) classes like we do for our collection of objects. (The more standard name for this condition is <b>locally small</b>.)
</div>
<p>The notion of a category is very general; think of your favorite type of mathematical object and you can probably form a category out of these things. The go-to first example of a category people see is $\mrm{Set}$, the category whose objects are sets and whose morphisms are set maps. However, while easy to understand, this is a terrible example because (1) nobody ever does math in the category $\mrm{Set}$ (sets are too unstructured) and (2) seeing this example makes it harder to internalize the fact that objects/morphisms are atomic as far as category theory is concerned (your objects don’t have to be sets, and your morphisms don’t have to be “structure-preserving” set maps). <sup id="fnref:4"><a href="#fn:4" class="footnote">5</a></sup></p>
<p>As far as fixing (1) above, better examples to have in mind are $\mrm{Top}$, the category of topological spaces with continuous maps as morphisms; $\mrm{Ab}$, the category of abelian groups with group homomorphisms as its morphisms; and $R\mrm{-Mod}$ (here, $R$ is some ring), the category of left $R$-modules <sup id="fnref:5"><a href="#fn:5" class="footnote">6</a></sup> with $R$-linear maps as its morphisms. For better examples as far as (2) is concerned, think about the following.</p>
<div class="exercise">
Let $G$ be a directed graph. Convince yourself that this gives rise to a category whose objects are vertices and whose morphisms are (directed) paths, including a length-zero path at each vertex.
</div>
<div class="definition">
Let $\mc C$ be a category, and let $f:A\rightleftarrows B:g$ be morphisms such that $f\circ g=1_B$ and $g\circ f=1_A$. Then, $f,g$ are called <b>isomorphisms</b>.
</div>
<div class="exercise">
Show that a group is the same thing as a category with one object in which every morphism is an isomorphism.
</div>
<div class="exercise">
Think of another category whose morphisms are not set-theoretic.
</div>
<p>Category theory is all about studying objects through their properties and their interactions (i.e. maps) with other objects of the same type instead of through their particular construction. So if we want to study categories, we should study maps between categories.</p>
<div class="definition">
Let $\mc C,\mc D$ be two categories. A <b>(covariant) functor</b> $F:\mc C\to\mc D$ is a choice of object $F(A)\in\mc D$ for each object $A\in\mc C$, and a collection of maps $\mc C(A,B)\xto F\mc D(F(A),F(B))$ such that $F(1_A)=1_{F(A)}$ (i.e. $F$ preserves identities) for all $A\in\mc C$ and $F(f\circ g)=F(f)\circ F(g)$ (i.e. $F$ preserves compositions) for all morphisms $f,g\in\Mor\mc C$ whenever both sides are defined.
</div>
<div class="remark">
If an assignment $F$ reverses arrows (i.e. gives maps $\mc C(A,B)\to\mc D(F(B),F(A))$), then we call it a <b>contravariant functor</b>. Letting $\mc C\op$ denote the category whose objects are the same as in $\mc C$ and whose morphisms are the same as in $\mc C$ except going the other way around, a contravariant functor $\mc C\to\mc D$ is the same thing as a covariant functor $\mc C\op\to\mc D$.
</div>
<div class="example">
Many (pairs of) categories have "forgetful functors" that just forget some structure. For example, there's a forgetful functor $\mrm{Ab}\to\mrm{Set}$ and also one from $R\mrm{-Mod}$ to $\mrm{Ab}$.
</div>
<div class="example">
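Less boringly, the fundamental group $\pi_1$ is a functor $\mrm{Top}_*\to\mrm{Grp}$, where $\mrm{Top}_*$ is the category of topological spaces with a chosen basepoint and basepoint-preserving continuous maps. Similarly, $n$th singular cohomology gives a contravariant functor $\mrm{Top}\to\mrm{Ab}$.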
</div>
<div class="example">
Given any $A\in\mc C$, we can form two <b>$\Hom$-functors</b> $\Hom(A,-)$ and $\Hom(-,A)$. It's often beneficial to try and understand an object by understanding its $\Hom$-functors. Note that $\Hom(A,-)$ is covariant while $\Hom(-,A)$ is contravariant. This is often a source of confusion.
</div>
<p>Now that we know what a functor is, we can form a (large) category $\mrm{Cat}$ whose objects are (small) categories and whose morphisms are functors, but why stop there? The real raison d’être of category theory is to look at categories whose objects are functors and whose morphisms are $\dots$<sup id="fnref:6"><a href="#fn:6" class="footnote">7</a></sup></p>
<div class="definition">
Given two (covariant) functors $F,G:\mc C\to\mc D$, a <b>natural transformation</b> $\eta$ is a collection of maps $\eta_A:F(A)\to G(A)$ such that for all $A,B\in\mc C$ and $f\in\Hom_{\mc C}(A,B)$, we have $G(f)\circ\eta_A=\eta_B\circ F(f)$. i.e. the following square commutes
<center>
<img src="https://nivent.github.io/images/blog/cover-fundgrp-sheaf/nat.png" width="200" height="100" />
</center>
We denote this by $\eta:F\to G$ or $\eta:F\implies G$.
</div>
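<p>If the commuting square feels abstract, here is a standard example to keep in mind. The notation $\mrm{CRing}$ (commutative rings) and $\mrm{GL}_n$ is mine and not used elsewhere in this post.</p>
<div class="example">
Fix $n\ge1$. There are two functors $\mrm{CRing}\to\mrm{Grp}$: one sends a ring $R$ to the group $\mrm{GL}_n(R)$ of invertible $n\by n$ matrices (acting on morphisms entrywise), and the other sends $R$ to its group of units $R^\times$. The determinant gives maps $\det_R:\mrm{GL}_n(R)\to R^\times$, and for any ring map $\phi:R\to S$ and any $M\in\mrm{GL}_n(R)$, we have
$$\det\nolimits_S(\phi(M))=\phi(\det\nolimits_R(M))$$
because the determinant is a polynomial in the entries of $M$ with integer coefficients. This is exactly the statement that $\det:\mrm{GL}_n\implies(-)^\times$ is a natural transformation.
</div>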
<p>Now, given two categories $\mc C,\mc D$, we let $\mc D^{\mc C}$ denote the category of (covariant) functors $\mc C\to\mc D$ with natural transformations as morphisms, and let $\mc D^{\mc C\op}$ denote the category of contravariant functors $\mc C\to\mc D$ with natural transformations as morphisms.</p>
<p>I know I joked above about this stuff being abstract nonsense <sup id="fnref:8"><a href="#fn:8" class="footnote">8</a></sup>, but these functor categories are actually fairly natural and show up often; although, they’re not always presented as functor categories.</p>
<div class="example">
Let $G$ be a group and $k$ be a field. Recall that we can view $G$ itself as a category. With this in mind, the category of linear $G$-reps over $k$ is the functor category $k$-Mod$^G$.
</div>
<div class="example">
A directed graph is two sets - edges and vertices - with two maps from the edge set to the vertex set - source and target - so the category of directed graphs is the functor category Set$^{\mc C}$ where $\mc C$ is the category with 2 objects $A,B$ and 2 morphisms $A\rightrightarrows B$.
</div>
<div class="definition">
Let $X$ be a topological space, and let $\Open(X)$ be the category whose objects are open subsets of $X$ and whose morphisms are inclusion maps $U\into V$ (in particular, each Hom-set has cardinality at most 1). Fix a category $\mc C$. The category of <b>($\mc C$-valued) presheaves</b> on $X$ is the functor category $\mc C^{\Open(X)\op}$.
</div>
<div class="remark">
Let $X$ be a topological space. Unpacking the above definition, a presheaf $\msP$ on $X$ is, for every open $U\subseteq X$, a choice of $\msP(U)\in\ob\mc C$ such that if $U\subseteq V$ are both open, there is a "restriction" morphism $\rho_{UV}:\msP(V)\to\msP(U)$. We require that $\rho_{UU}=1_{\msP(U)}$ and $\rho_{UV}\circ\rho_{VW}=\rho_{UW}$ for all $U\subseteq V\subseteq W$ open in $X$. We usually write $f\mid_U$ for $\rho_{UV}(f)$ (This is because many important examples of (pre)sheaves are of the form "nice" functions on $U$ where "nice" might mean continuous, smooth, holomorphic, etc.).
<br /><br />
A morphism of presheaves $\msP,\msS$ on $X$ is a collection of maps $\msP(U)\to\msS(U)$ commuting with the restriction maps on $\msP,\msS$ respectively.
</div>
<p>We’ll say more about (pre)sheaves in a bit, but before that, there’s a little more category theory to introduce. Anytime you define a mathematical object, you have to ask yourself what equivalence relation you want to consider them up to. For categories, there are (at least) 2 choices. The first is fairly obvious.</p>
<div class="definition">
Two categories $\mc C,\mc D$ are called <b>isomorphic</b> if they are isomorphic in $\mrm{Cat}$. That is, if there exist functors $F:\mc C\rightleftarrows\mc D:G$ such that $F\circ G=1_{\mc D}$ and $G\circ F=1_{\mc C}$.
</div>
<p>This turns out to be too strong a condition most of the time, so people usually only care about a weaker condition which you can think of as the homotopy equivalence of categories.</p>
<div class="definition">
Two categories $\mc C,\mc D$ are called <b>equivalent</b> if there exist functors $F:\mc C\rightleftarrows\mc D:G$ with isomorphisms of functors (i.e. natural transformations with inverse natural transformations) $\eta:G\circ F\implies1_{\mc C}$ and $\eta':F\circ G\implies1_{\mc D}$.
</div>
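<p>To get a feel for the difference between isomorphic and equivalent, here is a classic example. The names $\mrm{Mat}_k$ and $\mrm{FinVect}_k$ are just labels I'm introducing for this sketch.</p>
<div class="example">
Fix a field $k$. Let $\mrm{FinVect}_k$ be the category of finite-dimensional $k$-vector spaces with linear maps, and let $\mrm{Mat}_k$ be the category whose objects are the natural numbers $n\ge0$ and whose morphisms $n\to m$ are $m\by n$ matrices over $k$, composed via matrix multiplication. There is a functor
$$F:\mrm{Mat}_k\to\mrm{FinVect}_k,\qquad n\mapsto k^n,$$
sending a matrix to the linear map it represents. Choosing a basis for every finite-dimensional vector space gives a functor $G$ in the other direction with $G(V)=\dim V$, and one can check that $F,G$ form an equivalence. However, these categories are not isomorphic: $\mrm{Mat}_k$ has only countably many objects, while the finite-dimensional vector spaces do not even form a set.
</div>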
<p>The goal of this post is to show that three certain categories are equivalent. One of these is the category of locally constant sheaves, so let us return to (pre)sheaves. <sup id="fnref:9"><a href="#fn:9" class="footnote">9</a></sup></p>
<div class="definition">
Let $\msP$ be a $\mc C$-valued presheaf on a topological space $X$ where $\mc C$ is a category with a forgetful functor to $\mrm{Set}$ (e.g. $\Ab$). Recall that this means that $\msP$ is a contravariant functor $\Open(X)\to\mc C$. We say that $\msP$ is a <b>sheaf</b> if it is "locally defined" in the sense that for any collection $\{U_i\}$ of open sets with union $U=\bigcup_iU_i$ and elements $f_i\in\msP(U_i)$ satisfying $f_i\mid_{U_i\cap U_j}=f_j\mid_{U_i\cap U_j}$ for all $i,j$, there is a unique $f\in\msP(U)$ with $f\mid_{U_i}=f_i$ for all $i$.
</div>
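<p>It can help to see a presheaf that fails this gluing condition. The following is a standard non-example, included here only as an illustration.</p>
<div class="example">
For $U\subseteq\mathbb R$ open, let $\msP(U)$ be the abelian group of bounded continuous functions $U\to\mathbb R$, with the usual restriction maps. This is a presheaf, but it is not a sheaf: take $U_i=(-i,i)$ for $i\ge1$ and $f_i(t)=t$. Each $f_i$ is bounded on $U_i$ and they agree on overlaps, but the only candidate for a gluing is
$$f(t)=t\quad\text{on}\quad\bigcup_iU_i=\mathbb R,$$
which is not bounded. Boundedness is not a local condition; dropping it fixes the problem, and continuous functions do form a sheaf.
</div>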
<p>That definition is a bit of a mouthful because I tried to make it general, but then ran into the issue that not all categories are built from sets. To stop further confusion, for the rest of this section assume all presheaves (at the very least) spit out abelian groups because this is (almost) always true in practice anyway <sup id="fnref:16"><a href="#fn:16" class="footnote">10</a></sup>. A sheaf is a presheaf where given a consistent choice of elements $f_i\in\msP(U_i)$, you can always (uniquely) glue these to get a global element $f\in\msP(\bigcup U_i)$. Now, sheaves are really good for studying local properties and studying their ability (or failure) to satisfy local-to-global principles. One notion that helps in studying local properties via sheaves is that of a stalk.</p>
<div class="definition">
Fix a presheaf $\msP$ on a space $X$, and choose some $x\in X$. Let $U,V\subseteq X$ be (open) neighborhoods of $x$. We say $f\in\msP(U)$ and $g\in\msP(V)$ have the same <b>germ</b> at $x$ if there's some open neighborhood $W\subseteq U\cap V$ of $x$ such that $f\mid_W=g\mid_W$. This is an equivalence relation. The <b>stalk</b> $\msP_x$ of $\msP$ at $x$ is the set of germs of sections defined near $x$ (i.e. of $f\in\msP(U)$ for any $U\ni x$).
</div>
<div class="remark">
More algebraically, the stalk of a presheaf is the direct limit
$$\msP_x=\dirlim_{x\in U}\msP(U)$$
taken over neighborhoods of $x$.
</div>
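<p>Here's a toy example of germs, assuming we work with the sheaf of continuous real-valued functions on $\mathbb R$ (which isn't otherwise used in this post).</p>
<div class="example">
Let $\msP(U)$ be the continuous functions $U\to\mathbb R$ for $U\subseteq\mathbb R$ open, and take the global sections
$$f(t)=\abs t\quad\text{and}\quad g(t)=t.$$
These have the same germ at $x=1$ since they agree on $W=(0,2)$, so $f=g$ in the stalk $\msP_1$. However, they have different germs at $x=0$: every neighborhood of $0$ contains some negative $t$, where $f(t)=-t\neq t=g(t)$.
</div>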
<p>This post is looking like it might get quite long <sup id="fnref:10"><a href="#fn:10" class="footnote">11</a></sup>, so I’m just gonna move on without discussing stalks further except to say that, for our needs, they will appear as fibers of certain topological covers.</p>
<p>Sheaves are nice to have and work with, but presheaves are easier to write down. Thus, it’s really nice that there is a (functorial) process called <b>sheafification</b> that, given any presheaf $\msP$, spits out a sheaf $\msP^+$ with a morphism $\msP\to\msP^+$ inducing an isomorphism on stalks <sup id="fnref:11"><a href="#fn:11" class="footnote">12</a></sup> such that any morphism $\msP\to\msS$ from $\msP$ to a sheaf $\msS$ factors uniquely as $\msP\to\msP^+\to\msS$. Let’s end this section with a simple example. Fix an abelian group $M$ and a topological space $X$. Let $M_X$ denote the <b>constant presheaf</b>, i.e. $M_X(U)=M$ for all $U\in\Open(X)$. <sup id="fnref:18"><a href="#fn:18" class="footnote">13</a></sup> Its sheafification $M_X^+$ is the locally constant sheaf (which, because sheaves > presheaves, we still denote $M_X$ and we refer to as a <b>constant sheaf</b>) whose value on $U\in\Open(X)$ is $M\oplus M\oplus\cdots\oplus M$ where the number of factors of $M$ equals the number of connected components of $U$ <sup id="fnref:12"><a href="#fn:12" class="footnote">14</a></sup>. It would not be a bad idea to pause and go through the trouble of verifying that the constant presheaf really is a presheaf whose sheafification really is the constant sheaf as I have defined them.</p>
<h1 id="covers">Covers</h1>
<p>Well, that last section felt like a lot of material to introduce all at once, so I really hope you’ve seen categories and/or presheaves before. <sup id="fnref:13"><a href="#fn:13" class="footnote">15</a></sup> I think this one will be more digestible <sup id="fnref:14"><a href="#fn:14" class="footnote">16</a></sup>.</p>
<div class="definition">
Let $X$ be a topological space. A <b>space over $X$</b> is a topological space $Y$ with a continuous map $p:Y\to X$. A morphism between two spaces $p_i:Y_i\to X$ ($i=1,2$) over $X$ is a continuous map $f:Y_1\to Y_2$ such that the following triangle commutes
<center>
<img src="https://nivent.github.io/images/blog/cover-fundgrp-sheaf/overmor.png" width="200" height="100" />
</center>
The set $\inv p(x)$ is called the <b>fiber</b> over $x$.
</div>
<div class="definition">
Let $X$ be a topological space. A <b>cover (or covering space) of $X$</b> is a space $p:Y\to X$ over $X$ such that every point $x\in X$ has a neighborhood $U\ni x$ (called an <b>elementary (or fundamental) neighborhood</b>) such that $\inv p(U)$ is a disjoint union of open sets $\wt U_i$ of $Y$ and $p\mid_{\wt U_i}:\wt U_i\to U$ is a homeomorphism for all $i$.
</div>
<div class="remark">
Given any discrete space $I$, there's a trivial cover $X\by I\to X$ given by projection onto the first factor. More generally, the existence of fundamental neighborhoods means that any cover $p:Y\to X$ is locally trivial in the sense that each point of $X$ has a neighborhood $U$ s.t. $\inv p(U)\simeq U\by I$ (here, this isomorphism is in the category of spaces over $U$, or equivalently, in the category of covers of $U$) for some discrete $I$.
</div>
<div class="remark">
Because covers $p:Y\to X$ are locally trivial, fibers are "locally homeomorphic" in the sense that each point $x\in X$ has a neighborhood in which every point's fiber is homeomorphic to $\inv p(x)$. This allows us to conclude that all fibers in any connected component of $X$ are homeomorphic. This remark does not use the discreteness of fibers and in fact holds for more general "fiber bundles" that I will not define here.
</div>
<div class="definition">
Let $p:Y\to X$ be a cover with $X$ connected. The size $\abs{\inv p(x)}$ of any fiber is called the <b>degree</b> of the cover.
</div>
<p>Let’s collect some facts about covers.</p>
<div class="definition">
Let $G$ be a group with a continuous (left) action on a topological space $Y$. We say its action is <b>even (or properly discontinuous)</b> if each point $y\in Y$ has some open neighborhood $U$ such that the sets $gU$ are pairwise disjoint for all $g\in G$.
</div>
<div class="remark">
If $G\actson Y$ evenly, then the natural projection $Y\to G\sm Y$ is a cover.
</div>
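<p>To see the definition of an even action in a familiar case, here is the standard example of $\mathbb Z$ acting on $\mathbb R$ by translation (the notation below is mine, not something set up earlier in the post). Around any $y\in\mathbb R$ take the open interval of length $1$ centered at $y$:</p>
<script type="math/tex; mode=display">n\cdot y=y+n,\qquad U=\left(y-\tfrac12,\,y+\tfrac12\right),\qquad (n\cdot U)\cap(m\cdot U)=\emptyset\ \text{ for }n\neq m.</script>
<p>The translates are open intervals of length $1$ centered at points at least $1$ apart, so they never overlap and the action is even. The remark then says that $\mathbb R\to\mathbb Z\backslash\mathbb R\cong S^1$, $t\mapsto e^{2\pi it}$, is a cover.</p>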
<div class="proposition">
Let $p:Y\to X$ be a cover and $Z$ a connected topological space. If two maps $f,g:Z\rightrightarrows Y$ satisfy $p\circ f=p\circ g$ (i.e. $f,g$ lift the same map $Z\to X$), then $f,g$ either agree everywhere or nowhere. That is, if there's some $z\in Z$ with $f(z)=g(z)$, then $f=g$.
</div>
<div class="proof4">
This was shown in a <a href="../covering-spaces">previous post</a>.
</div>
<div class="corollary">
If $p:Y\to X$ is a connected cover, the action of $\Aut(Y\mid X)$, the group of covering morphisms $Y\to Y$ with inverse covering morphisms, on $Y$ is even and free.
</div>
<div class="proof4">
Fix any $y\in Y$, and let $x=p(y)$. Let $V\ni x$ be fundamental, so $\inv p(V)$ is a disjoint union of open sets $U_i$, one of which, say $U_j$, contains $y$. Now, fix any $\phi\in\Aut(Y\mid X)$. Then, $p\circ\phi=p$ maps $U_j$ homeomorphically onto $V$, and so $\phi(U_j)$ must be one of the $U_i$'s. Hence, the translates $\phi(U_j)$ of $U_j$ are disjoint, so $\Aut(Y\mid X)$ acts evenly. Furthermore, since $p\circ\phi=p$, $\phi(U_j)=U_j$ implies that $\phi(y)=y$ which (by the proposition) implies that $\phi=1_Y$ is the identity map, so $\Aut(Y\mid X)$ acts freely as well.
</div>
<div class="exercise">
Let $f:Y\to Z$ be a map between covers of $X$ (i.e. $f$ fits into a commutative triangle with $Y,Z$ both projecting to $X$). Then, $f$ is a cover of $Z$.
</div>
<p>At this point, I should probably mention this section is about laying the groundwork to show an equivalence of categories between covers of $X$ and $\pi_1(X,x)$-sets for “nice enough” $X$. To define the relevant functor, we’ll need a way to recover a $\pi_1(X,x)$-action from a cover of $X$.</p>
<div class="proposition">
Let $p:\wt X\to X$ be a cover. Given a path $f:I\to X$ and a choice $\st x\in\inv p(f(0))$ of lift of the starting point, there exists a unique path $\st f:I\to\wt X$ starting at $\st x$ and lifting $f$ in the sense that $p\circ\st f=f$. Furthermore, if $g:I\to X$ is homotopic to $f$, then $\st g:I\to\wt X$ lifting $g$ and starting at $\st x$ is homotopic to $\st f$.
</div>
<div class="proof4">
This was shown in a <a href="../covering-spaces">previous post</a>.
</div>
<div class="corollary">
Let $p:Y\to X$ be a cover and fix some $x\in X$. Then, there is a well-defined action of $\pi_1(X,x)$ on the fiber $\inv p(x)$ given by lifting a loop to a path in $Y$ and then taking its right endpoint.
</div>
<p>$\DeclareMathOperator{\Cov}{Cov}\DeclareMathOperator{\Fib}{Fib}$
For concreteness, let $\Cov(X)$ denote the category of coverings of $X$, and let $\pi_1(X,x)\mrm{-Set}=\mrm{Set}^{\pi_1(X,x)}$ denote the category of left $\pi_1(X,x)$-sets. We want to say that the above corollary gives the existence of a functor $F=\Fib_x:\Cov(X)\to\pi_1(X,x)\mrm{-Set}$ such that $F(p)=\inv p(x)$. One may worry that it is not obvious that this construction comes with $\pi_1(X,x)$-equivariant induced maps (i.e. natural transformations) $F(f):\inv p_1(x)\to \inv p_2(x)$ where $f:Y_1\to Y_2$ is a cover morphism from $p_1:Y_1\to X$ to $p_2:Y_2\to X$. Fortunately, there is nothing to worry about: the obvious choice of $F(f):\inv p_1(x)\to\inv p_2(x)$, namely the restriction of $f$ to the fiber, is $\pi_1(X,x)$-equivariant because of uniqueness of path lifting.</p>
<div class="exercise">
Convince yourself that $\Fib_x$ defined above actually is a functor.
</div>
<p>We’ll show that $\Fib_x$ is an equivalence of categories in the next section. To facilitate this, we’ll show that $\Fib_x$ is <b>representable</b> in the sense that $\Fib_x\cong\Hom_{\Cov(X)}(C,-)$ (this isomorphism taking place in the functor category $(\pi_1(X,x)\mrm{-Set})^{\Cov(X)}$) for some $C\in\Cov(X)$. To make $\Hom_{\Cov(X)}(C,D)$ a $\pi_1(X,x)$-set, let each $\gamma\in\pi_1(X,x)$ act by post-composition with (the action of) $\gamma$.</p>
<div class="theorem">
Let $X$ be path-connected and semilocally simply connected, and fix a base point $x\in X$. Then, the functor $\Fib_x$ is representable by a cover $\wt X_x\to X$.
</div>
<p>I think the construction of $\wt X_x$ is one of those things that becomes obviously the right choice after you see it and digest it, but that can be hard to succinctly motivate beforehand, so I won’t try to. As a general rule of thumb, let $I=[0,1]$ denote the unit interval.</p>
<div class="proof4">
Let $\wt X_x=[(I,0)\to(X,x)]$ be the set of homotopy classes (rel endpoints) of paths starting from $x$, and let $p:\wt X_x\to X$ be projection onto the right endpoint. To turn $\wt X_x$ into a space, we endow it with the following topology: given a point $[f]\in\wt X_x$, its neighborhood base consists of sets of the form
$$\st U_f:=\bracks{[g\circ f]:g:I\to X, g(0)=f(1), g([0,1])\subset U}$$
where $U\subseteq X$ is an open neighborhood of $p([f])=f(1)$ such that the injection $i:U\into X$ induces the zero map $\push i:\pi_1(U,p([f]))\to\pi_1(X,p([f]))$ (i.e. every loop contained in $U$ is contractible via some homotopy in $X$). This is to say that $\st U_f$ consists of (homotopy classes of) paths that continue $f$ within $U$. Since we are allowed to take homotopies in $X$, an element of $\st U_f$ is determined by its endpoint. Now, one can easily verify that these sets form a basis for a topology and that $p$ is continuous with respect to this topology. Note that given an open subspace $i:U\into X$ with $\push i\pi_1(U,u)=0$ for $u\in U$, $\inv p(U)$ is a union of sets of the form $\st U_f$, with $[f]$ ranging over homotopy classes of paths from $x$ to points of $U$, and so $p$ really is a cover.
<br />
Now, fix the point $\st x\in\wt X_x$ to be the homotopy class of the constant loop at $x$. We will show that the cover $p:\wt X_x\to X$ represents the functor $\Fib_x$. This means we need a functorial isomorphism $\inv q(x)\iso\Hom_{\Cov(X)}(\wt X_x,Y)$ for any cover $q:Y\to X$. For any such cover and any choice of $y\in\inv q(x)$, let $\pi_y:\wt X_x\to Y$ be the morphism taking a point $[f]\in\wt X_x$ to $\st f(1)$ where $\st f:[0,1]\to Y$ is the unique path lifting $f$ with $\st f(0)=y$. Note that this is $\pi_1(X,x)$-equivariant by uniqueness of path lifting. To see that this is an isomorphism, observe that the map $\phi\mapsto\phi(\st x)$ is an explicit inverse. Finally, this map is functorial since given a morphism $Y\to Y'$ of covers of $X$ taking $y\in Y$ to $y'\in Y'$, the induced map $\Hom_{\Cov(X)}(\wt X_x,Y)\to\Hom_{\Cov(X)}(\wt X_x,Y')$ takes $\pi_y$ to $\pi_{y'}$ as these are the maps sending $\st x$ to $y$ and $y'$, respectively.
</div>
<p>To make matters simpler, fix some path-connected and semilocally simply connected space $X$ with a choice of basepoint $x\in X$. The representing cover $\wt X_x$ coming from the above theorem is called the <b>universal cover</b> of $X$.</p>
<div class="theorem">
The universal cover $\wt X_x$ is path-connected with automorphism group $\Aut(\wt X_x\mid X)\simeq\pi_1(X,x)$ which acts transitively on the fiber above $x$.
</div>
<div class="proof4">
Exercise.
</div>
<p>At this point, I think we have (almost) everything we need to show that covers of $X$ are the same thing as sets with a $\pi_1(X,x)$-action.</p>
<h1 id="the-first-equivalence">The First Equivalence</h1>
<p>You may have noticed the “(almost)” in the previous sentence. That’s there because directly proving a functor gives an equivalence of categories is annoying since you need to give an inverse functor and two natural transformations. To help ease our pain, we’ll prove a lemma which says that the existence of one “nice” functor suffices to prove an equivalence of categories.</p>
<div class="definition">
A functor $F:\mc C_1\to\mc C_2$ is <b>faithful</b> if the induced map $\mc C_1(A,B)\to\mc C_2(F(A),F(B))$ is injective for all $A,B\in\mc C_1$. If these maps are always bijective, then $F$ is <b>fully faithful</b>.
</div>
<div class="definition">
A functor $F:\mc C_1\to\mc C_2$ is <b>essentially surjective</b> if every $A_2\in\mc C_2$ is isomorphic to $F(A_1)$ for some $A_1\in\mc C_1$.
</div>
<div class="lemma">
Two categories $\mc C_1,\mc C_2$ are equivalent if and only if there exists a functor $F:\mc C_1\to\mc C_2$ which is fully faithful and essentially surjective.
</div>
<p>We will give a proof of the “if” direction, but leave the “only if” direction as an exercise.</p>
<div class="proof4">
$(\leftarrow)$ Assume $F:\mc C_1\to\mc C_2$ is a fully faithful, essentially surjective functor. For each $X\in\mc C_2$, fix an object $A\in\mc C_1$ and an isomorphism $i_X:F(A)\iso X$. Let $G:\mc C_2\to\mc C_1$ be the functor sending each $X\in\mc C_2$ to the $A$ chosen above, and each morphism $\phi:X\to Y$ to
$$G(\phi)=\inv F(\inv i_Y\circ\phi\circ i_X)$$
which is well-defined since $F$ is fully faithful so its induced map $\Hom(A,B)\to\Hom(F(A),F(B))$ is bijective. The maps $i_X:F(G(X))\iso X$ give an isomorphism $\eta:F\circ G\iso1_{\mc C_2}$ by construction. To finish, we need to give an isomorphism $\eta':G\circ F\iso1_{\mc C_1}$. Hence, we need functorial isomorphisms $\eta'_A:G(F(A))\to A$ for each $A\in\mc C_1$. Since $F$ is fully faithful, it suffices to construct maps
$$F(\eta'_A):F(G(F(A)))\to F(A).$$
We take $F(\eta'_A)=\eta_{F(A)}$ which is indeed a functorial isomorphism because $\eta$ is a natural isomorphism. Thus, we win.
</div>
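<p>To see why the lemma is worth having, here is a standard example (not one the rest of this post depends on). Fix a field $k$. Let $\mathrm{Mat}_k$ be the category whose objects are the natural numbers and whose morphisms $n\to m$ are the $m\times n$ matrices over $k$, and let $\mathrm{FinVect}_k$ be the category of finite-dimensional $k$-vector spaces; both names are my own notation. Consider the functor</p>
<script type="math/tex; mode=display">F:\mathrm{Mat}_k\to\mathrm{FinVect}_k,\qquad F(n)=k^n,\qquad F(A)=(v\mapsto Av).</script>
<p>It is fully faithful because linear maps $k^n\to k^m$ are exactly the $m\times n$ matrices, and essentially surjective because every $V$ of dimension $n$ is isomorphic to $k^n$. The lemma therefore gives an equivalence, even though writing down the inverse functor would require choosing an isomorphism $k^{\dim V}\xrightarrow{\sim}V$ (i.e. a basis) for every single $V$. These choices play the role of the isomorphisms $i_X$ in the proof.</p>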
<p>I bet you can guess what comes next: to show that $\Fib_x$ gives an equivalence of categories, we’ll show that it is fully faithful and essentially surjective. We’ll prove each of these as a lemma, and then conclude what we want. Fix some path-connected and semilocally simply connected topological space $X$ with a chosen basepoint $x\in X$.</p>
<div class="lemma">
$\Fib_x$ is fully faithful.
</div>
<div class="proof4">
Let $p:Y\to X$ and $q:Z\to X$ be two covers of $X$. We need to show that each map $\phi:\Fib_x(Y)\to\Fib_x(Z)$ of $\pi_1(X,x)$-sets comes from a unique map $Y\to Z$ of covers of $X$. We can assume that $Y,Z$ are connected (else run this argument on pairs of connected components). Let $\wt X_x$ be the universal cover of $X$, and recall the maps $\pi_y:\wt X_x\to Y$ defined for each $y\in\inv p(x)$. Fix such a $y$, let $U_y=\Aut(\wt X_x\mid Y)$ be the stabilizer of $y$, and note that $\pi_y$ descends to a map $U_y\sm\wt X_x\to Y$ since the action of $U_y$ preserves the fibers of $\pi_y$. This map is a homeomorphism (and hence a cover isomorphism once $U_y\sm\wt X_x$ is given the appropriate projection map to $X$) since you can define an inverse map $\phi_y:Y\to U_y\sm\wt X_x$ by taking $y'\in Y$ to any lift in $\wt X_x$ and then projecting that lift into $U_y\sm\wt X_x$ (This map is obviously well-defined and a set-theoretic inverse to the map induced by $\pi_y$. Showing that it's continuous is left as an exercise). Now, since $\phi$ is equivariant, $U_y$ is contained in the stabilizer $U_{\phi(y)}$ of $\phi(y)$, so the natural map $\pi_{\phi(y)}:\wt X_x\to Z$ induces a map
$$U_y\sm\wt X_x\to U_{\phi(y)}\sm\wt X_x\iso Z.$$
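Composing this with the map $\phi_y:Y\iso U_y\sm\wt X_x$ gives the desired map $Y\to Z$. It is unique by the earlier proposition: two maps $Y\to Z$ of covers of $X$ that agree at $y$ must agree everywhere since $Y$ is connected.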
</div>
<div class="lemma">
$\Fib_x$ is essentially surjective.
</div>
<div class="proof4">
We need to show that each left $\pi_1(X,x)$-set $S$ is isomorphic to the fiber of some cover of $X$. We can assume that $\pi_1(X,x)$ acts transitively on $S$ (else take the disjoint union of the covers corresponding to each orbit), so let's assume that. Fix any $s\in S$, and let $H=\pi_1(X,x)_s$ be its stabilizer. Since $H\le\pi_1(X,x)\simeq\Aut(\wt X_x\mid X)$, we can consider the quotient $H\sm\wt X_x$. This is our desired cover of $X$.
</div>
<div class="theorem">
The categories $\Cov(X)$ of covers of $X$ and $\mrm{Set}^{\pi_1(X,x)}$ of sets with a $\pi_1(X,x)$-action are equivalent.
</div>
<div class="exercise">
Let $\wh{\pi_1(X,x)}$ denote the profinite completion of $\pi_1(X,x)$ (i.e. $\invlim\pi_1(X,x)/N$ where $N$ ranges over normal subgroups of finite index partially ordered by inclusion). Prove that $\Fib_x$ induces an equivalence of categories between the category of finite covers of $X$ (i.e. covers with finite fibers) and sets with a continuous left $\wh{\pi_1(X,x)}$-action.
</div>
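<p>For orientation, here is what the exercise says in the simplest case, assuming $X=S^1$ so that $\pi_1(X,x)\cong\mathbb Z$ (a standard fact that I am taking for granted here). The profinite completion is</p>
<script type="math/tex; mode=display">\widehat{\mathbb Z}=\varprojlim_{n}\mathbb Z/n\mathbb Z\cong\prod_{p\text{ prime}}\mathbb Z_p.</script>
<p>A finite set with a continuous $\widehat{\mathbb Z}$-action is the same thing as a finite set $S$ together with a single permutation $\sigma$ of $S$ (the action of $1\in\mathbb Z$), since any such $\sigma$ has finite order $n$ and so the action factors through $\mathbb Z/n\mathbb Z$. So the exercise predicts that finite covers of the circle correspond to finite sets equipped with a permutation, with the connected covers corresponding to the permutations that are a single cycle.</p>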
<p>Whelp. I hope the journey was worth it. Probably if you’ve seen covering spaces before, this isn’t too surprising, but I think it’s still nice to see things from a more categorical perspective. I suspect that the next result will be more surprising even to someone who has seen sheaves before (unless, of course, they’ve also seen this result before) <sup id="fnref:15"><a href="#fn:15" class="footnote">17</a></sup>: the category of (left) $\pi_1(X,x)$-sets is equivalent to the category of locally constant sheaves on $X$.</p>
<h1 id="the-second-equivalence">The Second Equivalence</h1>
<p>Let’s get started. First off, I don’t think I mentioned this before (but I did use this terminology earlier), but if $\ms P$ is a presheaf and $U\subseteq X$ is open, then any $s\in\msP(U)$ is called a <b>section</b> of $\msP$ (defined) over $U$. This is because “sheaves of sections” (this use of “section” referring to (local) right inverses) show up fairly regularly in math <sup id="fnref:17"><a href="#fn:17" class="footnote">18</a></sup>. As an example:</p>
<div class="definition">
Let $p:Y\to X$ be a space over $X$ (note: not necessarily a cover), and $U\subset X$ an open set. A <b>section</b> of $p$ over $U$ is a continuous map $s:U\to Y$ such that $p\circ s=1_U$.
</div>
<div class="example">
Let $p:Y\to X$ be a space over $X$, and let $\ms F_Y$ be the $\mrm{Set}$-valued presheaf on $X$ of sections of $p$. i.e.
$$\ms F_Y(U)=\bracks{s:U\to Y:p\circ s=1_U},$$
and the restriction maps $\ms F_Y(U)\to\ms F_Y(V)$ for an inclusion $V\into U$ are literally given by restricting a section to a subspace. This presheaf is actually a sheaf essentially because continuous functions are locally definable (use the pasting lemma).
</div>
<p>Now, recall that a constant sheaf $\msS=S_X$ is the sheafification of the presheaf of constant functions to some fixed set $S$ (i.e. $\msS(U)$ is the set of locally constant functions $U\to S$). You may object that these should really be called locally constant sheaves; I hear that objection, ignore it, and make the following definition just to make the situation more confusing.</p>
<div class="definition">
A sheaf $\msS$ on a topological space $X$ is <b>locally constant</b> if each point of $X$ has an open neighborhood $U$ such that $\msS\mid_U$ (the restriction of $\msS$ to $U$ which is a sheaf on $U$) is isomorphic (in the category of sheaves on $U$) to a constant sheaf.
</div>
<div class="proposition">
Let $p:Y\to X$ be a cover with $X$ connected. Then, the sheaf $\ms F_Y$ of sections is locally constant, and is constant if $p$ is trivial (i.e. isomorphic to $X\by I\to X$ for some discrete $I$).
</div>
<div class="proof4">
Given some $x\in X$, let $V$ be one of its fundamental neighborhoods. The image of a section $V\to Y$ must be one of the connected components of $\inv p(V)$, so sections over $V$ correspond bijectively to points of the fiber $\inv p(x)$. Hence, $\ms F_Y\mid_V$ is (isomorphic to) the constant sheaf $F_V$ defined by $F=\inv p(x)$. $\ms F_Y$ itself is constant if and only if we may take $V=X$ if and only if $p$ is trivial.
</div>
<div class="corollary">
Let $p:Y\to X$ be a cover and fix any $x\in X$. Then the stalk $\ms F_{Y,x}$ of $\ms F_Y$ at $x$ is (isomorphic to) the fiber $\inv p(x)$.
</div>
<div class="example">
I don't think I ever wrote down an example of a cover, so let's do that now. View $S^1\subseteq\C$ as the unit circle. Then, the $n$th-power map $z\mapsto z^n$ gives a degree $n$ cover $S^1\to S^1$. For all $n>1$, this cover is nontrivial and hence, by the above, gives rise to a locally constant but non-constant sheaf.
</div>
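<p>It may help to see the sheaf of sections of this cover concretely. The following is a sketch using the "branches of the $n$th root" description of local sections; the symbols $U$, $s_k$, and $\zeta_n$ are my own notation. Let $\zeta_n=e^{2\pi i/n}$ and let $U=S^1\setminus\{-1\}$, whose points we write as $e^{i\theta}$ with $\theta$ strictly between $-\pi$ and $\pi$. Then</p>
<script type="math/tex; mode=display">s_k\left(e^{i\theta}\right)=\zeta_n^k\,e^{i\theta/n}\quad(k=0,1,\dots,n-1),\qquad\ms F_Y(U)=\{s_0,\dots,s_{n-1}\},\qquad\ms F_Y(S^1)=\emptyset\ \text{ for }n>1.</script>
<p>Each $s_k$ is a section since $\left(\zeta_n^ke^{i\theta/n}\right)^n=e^{i\theta}$, and there are no others because $U$ is a connected fundamental neighborhood and the fiber has $n$ points. There is no global section for $n>1$ because the $n$th-power map induces multiplication by $n$ on $\pi_1(S^1)\cong\mathbb Z$, which has no right inverse. A constant sheaf with an $n$-element fiber on the connected space $S^1$ would have exactly $n$ global sections, so this confirms that $\ms F_Y$ is not constant. In terms of the first equivalence, lifting the loop $t\mapsto e^{2\pi it}$ starting at a point $\zeta$ of the fiber over $1$ gives the path $t\mapsto\zeta e^{2\pi it/n}$, so the generator of $\pi_1(S^1,1)$ acts on the fiber by multiplication by $\zeta_n$.</p>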
<p>$\DeclareMathOperator{\LCS}{LCS}$
Note that a morphism $\phi:Y\to Z$ of covers of $X$ induces a morphism $\ms F_Y\to\ms F_Z$ of locally constant sheaves by taking the local section $s:U\to Y$ to $\phi\circ s$ ($\phi\circ s$ is a local section precisely because $\phi$ is a cover morphism) <sup id="fnref:19"><a href="#fn:19" class="footnote">19</a></sup>. Hence, we get a functor $S:Y\mapsto\ms F_Y$ from the category $\Cov(X)$ of covers of $X$ to the category $\LCS(X)$ of $\mrm{Set}$-valued locally constant sheaves on $X$. To show that this functor gives an equivalence of categories we’ll just construct an inverse this time. This is where we get to make use of stalks.</p>
<p>Let $X$ be a topological space, and let $\ms F$ be a presheaf (of sets) on $X$. Let
<script type="math/tex">X_{\ms F}=\bigsqcup_{x\in X}\ms F_x</script>
be the disjoint union of the stalks of $\ms F$. We want to give this set a topology. Note that for any open $U\subset X$, a section $s\in\ms F(U)$ gives rise to a map $i_s:U\to X_{\ms F}$ sending each $x\in U$ to the germ $s_x\in\ms F_x$ of $s$ over $x$. Give $X_{\ms F}$ the coarsest (i.e. smallest) <sup id="fnref:20"><a href="#fn:20" class="footnote">20</a></sup> topology in which the sets $i_s(U)$ are open for all $U$ and $s$. Let $p_{\ms F}:X_{\ms F}\to X$ be the map sending a germ to the point over which it is defined (i.e. $p_{\ms F}\mid_{\ms F_x}=x$ for all $x\in X$). This map is continuous because
<script type="math/tex">\inv p_{\ms F}(U)=\bigcup_{s\in\ms F(U)}i_s(U)</script>
is open. Furthermore, the maps $i_s:U\to X_{\ms F}$ are continuous as well, and I’ll leave this for you to check. Note that a morphism $\phi:\ms F\to\ms G$ of presheaves induces maps $\ms F_x\to\ms G_x$ for each $x\in X$, and hence a set map $\phi:X_{\ms F}\to X_{\ms G}$ compatible with the projections onto $X$. Thus, the assignment $\ms F\mapsto X_{\ms F}$ gives a functor from the category of sheaves on $X$ to the category of spaces over $X$ (Technically, we need to show that $\phi$ is continuous, but this is easy).</p>
<div class="proposition">
The assignment $C:\ms F\mapsto X_{\ms F}$ restricts to a functor $\LCS(X)\to\Cov(X)$.
</div>
<div class="proof4">
By the discussion above the claim, it suffices to show that $p_{\ms F}:X_{\ms F}\to X$ is a cover when $\ms F$ is locally constant. Fix any $x\in X$, and let $U\ni x$ be a neighborhood such that $\ms F\mid_U$ is constant. Say $\ms F\mid_U\cong F_U$ for some set $F$. Then, $\ms F_y=(\ms F\mid_U)_y=(F_U)_y=F$ for all $y\in U$, so $\inv p_{\ms F}(U)\cong U\by F$ where $F$ is given the discrete topology.
</div>
<p>Now that we have our inverse functor, we prove the following.</p>
<div class="theorem">
Let $X$ be a topological space. Then the functors $S,C$ induce an equivalence of categories between $\Cov(X)$ and $\LCS(X)$.
</div>
<div class="proof4">
Let $\ms F$ be a locally constant sheaf, and let $p:Y\to X$ be a cover. We need to show that we have $\ms F_{X_{\ms F}}\cong\ms F$ and $X_{\ms F_Y}\cong Y$ both functorially in $\ms F,Y$ respectively. The map $\ms F\to\ms F_{X_{\ms F}}$ is given by sending a section $s\in\ms F(U)$ to the local section $i_s:U\to X_{\ms F}$ while the map $Y\to X_{\ms F_Y}$ is given by sending a point $y\in Y$ to the corresponding point of the stalk/fiber $\ms F_{Y,p(y)}=\inv p(p(y))$ (Secretly, we fixed some choice of isomorphisms at the beginning of this argument) over $p(y)$. To show that these maps are isomorphisms, it suffices to show that their restrictions over a suitable open covering of $X$ are, so choose a cover $\{U_i\}_{i\in I}$ such that $\ms F\mid_{U_i}$ is constant for each $i$. Replacing $X$ by $U_i$, we may henceforth (not sure if I've ever used this word in a proof before) assume $\ms F\cong F_X$ is a constant sheaf. In this case, $X_{\ms F}\cong X_{F_X}\cong X\by F$, and the sheaf of local sections of the trivial cover $X\by F\to X$ is the constant sheaf $F_X$, so the maps $\ms F\to\ms F_{X_{\ms F}}$ (i.e. $F_X\to\ms F_{X\by F}$) and $Y\to X_{\ms F_Y}$ (i.e. $X\by F\to X_{F_X}$) are isomorphisms, and we win.
</div>
<div class="corollary">
Let $X$ be a path connected and locally simply connected topological space, and let $x$ be a point in $X$. Then, $\pi_1(X,x)\mrm{-Set}$ is equivalent to $\LCS(X)$, and this equivalence is induced by the functor $\ms F\mapsto\ms F_x$.
</div>
<p>There you have it. A locally constant sheaf on a space $X$ is nothing more than a cover of that space or, if $X$ is nice enough, a (left) $\pi_1(X,x)$-set. To end this post, try proving the following.</p>
<div class="exercise">
Let $X$ be path connected and locally simply connected, and fix some basepoint $x\in X$. Let $R$ be a commutative ring. Then, the category of locally constant sheaves of $R$-modules on $X$ is equivalent to the category of (left) modules over the group ring $R[\pi_1(X,x)]$.
</div>
<p>For motivation for why you might care about this, fix a field $k$, and recall that a linear representation $\rho:\pi_1(X,x)\to\GL_n(k)$ of the fundamental group of $X$ is the same thing as a choice of a $k[\pi_1(X,x)]$-module $M$ (i.e. there’s an equivalence of categories between linear representations and modules over the group ring). With this in mind, the above exercise asks you to show that a locally constant sheaf of $k$-modules on a (sufficiently nice) space $X$ is the same thing as a representation of its fundamental group! <sup id="fnref:21"><a href="#fn:21" class="footnote">21</a></sup></p>
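<p>To make this concrete, here is the simplest case, again assuming $X=S^1$ with $\pi_1(X,x)\cong\mathbb Z$ generated by a single loop (an illustration of what the exercise predicts, not a proof of it). The group ring is a Laurent polynomial ring, and a module over it is just a vector space with a chosen automorphism $T$, the action of $t$:</p>
<script type="math/tex; mode=display">k[\pi_1(S^1,x)]\cong k[t,t^{-1}],\qquad\{k[t,t^{-1}]\text{-modules}\}\longleftrightarrow\{(V,T):V\text{ a }k\text{-vector space},\ T\in\mathrm{GL}(V)\}.</script>
<p>In particular, a locally constant sheaf of one-dimensional $k$-vector spaces on the circle should be determined up to isomorphism by a single scalar $\lambda\in k^\times$, namely the factor by which a germ gets multiplied after being carried once around the circle, with $\lambda=1$ corresponding to the constant sheaf.</p>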
<div class="footnotes">
<ol>
<li id="fn:1">
<p>As a rule of thumb, <a href="../fourier">if</a> <a href="../group-intro">I</a> <a href="../Modular-Arithmetic">ever</a> <a href="../interesting-equation-ii">say</a> that I will write a post about something, you probably shouldn’t believe that I will actually follow through with that promise. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Admittedly, the first two are unsurprisingly related <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:7">
<p>If you know what these are, just skip the first section or two <a href="#fnref:7" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Always thought it would be in some blog post about $R$-modules and I thought I would not be doing you the disservice of talking about categories without mentioning universal properties. Oh well… it just goes to show <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>The irony (or not irony? What does this word even technically mean?) of $\mrm{Set}$ technically being the first example I give is not lost on me. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>The category of right $R$-modules is written $\mrm{Mod}-R$. <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>There’s a reason people call this stuff abstract nonsense <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
<li id="fn:8">
<p>If you don’t remember me doing this, then you really gotta start reading these footnotes <a href="#fnref:8" class="reversefootnote">↩</a></p>
</li>
<li id="fn:9">
<p>You might be wondering if one can define sheaves for categories without forgetful functors to Set. I imagine that the answer to this is yes and that one way to do this is by viewing the elements of an object $A$ in your category as the morphisms $B\to A$ for various $B$ a la <a href="https://www.maths.ed.ac.uk/~tl/elements.pdf">this</a>. Alternatively, maybe you can do something like say $\msP$ is a sheaf if given any collection $\bracks{U_i}$ for $i\in I$ of open sets, the category whose objects are $\msP\parens{\bigcap_{j\in J}U_j}$ for finite $J\subseteq I$ has all colimits (in the categorical sense). I haven’t thought about this enough to know if either of these work (or if they recover the usual definition when the objects in your category secretly are sets) <a href="#fnref:9" class="reversefootnote">↩</a></p>
</li>
<li id="fn:16">
<p>When we get into the meat of things, we’ll actually be looking at $\mrm{Set}$-valued sheaves <a href="#fnref:16" class="reversefootnote">↩</a></p>
</li>
<li id="fn:10">
<p>We’re still in the prelims (!) <a href="#fnref:10" class="reversefootnote">↩</a></p>
</li>
<li id="fn:11">
<p>Exercise: convince yourself that any morphism $\ms A\to\ms B$ between presheaves induces a morphism $\ms A_x\to\ms B_x$ on stalks. <a href="#fnref:11" class="reversefootnote">↩</a></p>
</li>
<li id="fn:18">
<p>Think of this as the presheaf of constant functions $X\to M$. <a href="#fnref:18" class="reversefootnote">↩</a></p>
</li>
<li id="fn:12">
<p>i.e. $M_X(U)=M^{\oplus\dim\hom_0(U,\R)}$ <a href="#fnref:12" class="reversefootnote">↩</a></p>
</li>
<li id="fn:13">
<p>I’m starting to understand why people write textbooks instead of just trying to jam everything into individual blog posts. Maybe I should take a page from <a href="https://jeremykun.com/">Jeremy Kun’s</a> playbook and start separating all the background material into their own separate posts. <a href="#fnref:13" class="reversefootnote">↩</a></p>
</li>
<li id="fn:14">
<p>I’m assuming you’ve seen fundamental groups before, so probably you’ve also seen covering spaces and this section is mostly review <a href="#fnref:14" class="reversefootnote">↩</a></p>
</li>
<li id="fn:15">
<p>My introduction to sheaf theory was somewhat nonstandard, so it’s possible that this is a commonly taught/known result and I just happened to be out of the loop until recently <a href="#fnref:15" class="reversefootnote">↩</a></p>
</li>
<li id="fn:17">
<p>e.g. when proving that the category of vector bundles on a space is equivalent to the category of locally free sheaves on that space <a href="#fnref:17" class="reversefootnote">↩</a></p>
</li>
<li id="fn:19">
<p>Show rigorously that this is a well-defined map of sheaves (hint: because these things are sheaves to show that $\phi\circ s$ is a section, it suffices to show that it restricts to a section on each set in an open cover of $U$) <a href="#fnref:19" class="reversefootnote">↩</a></p>
</li>
<li id="fn:20">
<p>I don’t think I’ll ever be able to remember which of “coarser” and “finer” means smaller without pulling up Wikipedia. <a href="#fnref:20" class="reversefootnote">↩</a></p>
</li>
<li id="fn:21">
<p>(I’m pretty sure that) It is possible to recover a group from its category of $k$-linear representations, so perhaps if you wanted to define a version of the fundamental group in algebraic contexts where you have a not-so-nice topology, you should try giving a definition in terms of locally constant sheaves of that space (or in terms of covers of that space). <a href="#fnref:21" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Adeles2019-03-19T00:04:00+00:002019-03-19T00:04:00+00:00https://nivent.github.io/blog/adeles<p>Let’s step away from $\zeta$-function stuff for a bit, and talk about something different. <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> In an <a href="../abs-val-p-adic">earlier post</a>, I mentioned these local fields like $\Q_p$ that are useful for studying things “one prime at a time” <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> (whatever that means). Corresponding to these local fields, one also has global objects (e.g. $\Q$) from which they arise, but in some sense, these global objects (i.e. global fields) don’t have all the information of the local objects readily available <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>. Because of this, it may be nice to consider different global objects that combine all the local ones in a more straightforward manner.</p>
<h1 id="definitions">Definitions</h1>
<p>Let $K$ be a global field. Technically, this means that $K$ is a number field or a function field of a curve over $\F_q$ (i.e. $K/\F_q(t)$ is finite), but the example I’ll have in mind is that of $K$ being a number field. At some point I may explicitly say to let $K$ be a number field to simplify things, but know that most (all?) of what I do can be done for a general global field.</p>
<p>We want to construct the adele ring of $K$ which morally is just the topological ring</p>
<script type="math/tex; mode=display">\prod_vK_v</script>
<p>where $v$ ranges over all places of $K$, and $K_v$ is the completion of $K$ at $v$. However, this product is stupid-big (so maybe not the easiest thing to work with), and doesn’t reflect some of the nice finiteness properties of global fields (e.g. the valuation of a nonzero $x\in K$ is zero for almost all places). Because of this, we’ll replace it with a so-called restricted product.</p>
<div class="definition">
Fix some set $I$ of indices, and some locally compact Hausdorff groups (or rings or fields) $G_i$ with compact, open (hence closed (!)) subgroups $H_i\le G_i$. The <b>restricted (direct) product</b> of the $G_i$ with respect to the $H_i$ is
$$\prodp_{i\in I}(G_i,H_i):=\bracks{(g_i)\in\prod_{i\in I}G_i:g_i\in H_i\text{ for all but finitely many }i}.$$
We make this a topological group, but not by giving it the subspace topology it inherits from the direct product of the $G_i$. Instead, a neighborhood base of the identity consists of sets of the form $\prod N_i$ where $N_i$ is a neighborhood of $1\in G_i$ for all $i$, and $N_i=H_i$ for all but finitely many $i$.
</div>
<p>Given this, we define the <b>finite adele ring of $K$</b> (or <b>ring of finite adeles</b>) to be the (topological) ring</p>
<script type="math/tex; mode=display">\A_{K,\mrm{fin}}:=\prodp_{v\nmid\infty}(K_v,\ints v),</script>
<p>where the notation $v\nmid\infty$ means we’re ranging only over finite (i.e. non-archimedean) places, and $\ints v\subset K_v$ is the ring of integers (i.e. elements of norm at most 1). The <b>adele ring of $K$</b> is obtained from the finite adeles by throwing in the infinite places. In other words, it is</p>
<script type="math/tex; mode=display">\A_K:=\A_{K,\mrm{fin}}\by\prod_{v\mid\infty}K_v.</script>
<p>Since there are only finitely many infinite places <sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup>, if we let $\ints v=K_v$ for $v\mid\infty$, we could have just defined <sup id="fnref:6"><a href="#fn:6" class="footnote">5</a></sup></p>
<script type="math/tex; mode=display">\A_K=\prodp_v(K_v,\ints v).</script>
<p>Now that we know what the adele ring is, a few remarks about why this restricted direct product is nicer than the ordinary direct product. First, topologically speaking, $\prod_vK_v$ is not locally compact <sup id="fnref:5"><a href="#fn:5" class="footnote">6</a></sup> essentially because the product topology requires open sets to be entire spaces in all but finitely many factors. However,$\dots$</p>
<div class="proposition">
Any restricted product
$$\prodp_{i\in I}(G_i,H_i)$$
is locally compact.
</div>
<div class="proof4">
Let $S\subset I$ be finite, and consider the subgroup
$$G_S=\prod_{i\in S}G_i\by\prod_{i\not\in S}H_i.$$
Finite products of locally compact spaces are locally compact (and arbitrary products of compact spaces are compact), so $G_S$ is locally compact in the product topology. However, since $S$ is finite, the product topology on $G_S$ coincides with the subspace topology it inherits from the restricted direct product, so $G_S$ is locally compact there as well. Moreover, $G_S$ is open in the restricted product, since it is a subgroup containing a basic neighborhood of the identity. Since every element of the restricted product belongs to a set of this form, it is locally compact.
</div>
<p>This means that $\A_K$ is the product of finitely many locally compact spaces, so it is itself locally compact. In this way, it is not as stupid big as the ordinary direct product even though it looks massive at a glance. It’s also worth noting that $K\into\A_K$ as a discrete subgroup via the diagonal (algebraic) embedding</p>
<script type="math/tex; mode=display">x\mapsto(x,x,x,x,\cdots)</script>
<p>since $x$ has zero valuation at all but finitely many places <sup id="fnref:7"><a href="#fn:7" class="footnote">7</a></sup>. Elements of the image of this embedding are sometimes referred to as “principal adeles.” The embedding $K\into\A_K$ should be thought of as an analogue of $\Z\into\R$ (e.g. it’s discrete <sup id="fnref:8"><a href="#fn:8" class="footnote">8</a></sup> and we’ll later see that $\A_K/K$ is compact). The unit group $\units\A_K$ of the adele ring is also important.</p>
<div class="exercise">
Prove that
$$\units\A_K=\prod_{v\mid\infty}\units K_v\by\prodp_{v\nmid\infty}(\units K_v,\units{\ints v})$$
as groups. The right hand side is denoted $\I_K$, and is called the <b>idele group of $K$</b>. Its topology as a restricted direct product is different from (stronger than) its topology as a subspace of $\A_K$.
</div>
<p>We similarly have a diagonal embedding $\units K\into\I_K$. Finally, we can extend the absolute value $\nabs_v$ on $K_v$ to $\A_K$ via $\abs{x}_v=\abs{x_v}_v$ for $x\in\A_K$, and then we can combine these to define a global absolute value <sup id="fnref:15"><a href="#fn:15" class="footnote">9</a></sup></p>
<script type="math/tex; mode=display">\abs x=\prod_v\abs{x_v}_v</script>
<p>on $\A_K$ which converges since $\abs{x_v}_v\le1$ for all but finitely many $v$.</p>
<h1 id="basic-properties">Basic Properties</h1>
<p>Let’s “prove” <sup id="fnref:11"><a href="#fn:11" class="footnote">10</a></sup> some things about adeles.</p>
<div class="theorem" name="Approximation Theorem">
Fix a global field $K$. Let $\A_\omega=\prod_{v\mid\infty}K_v\by\prod_{v\nmid\infty}\ints v$. Then,
$$\A_K=K+\A_\omega\text{, and }K\cap\A_\omega=\ints K$$
where $\ints K$ is the ring of integers of $K$ (elements with absolute value $\le1$ at all finite places).
</div>
<div class="proof4">
Here, $K$ is embedded diagonally into $\A_K$. We want to show that given any $x\in\A_K$, there's some $\mu\in K$ such that each finite component of the difference $x-\mu$ is a local integer. Let $\mfp\subset\ints K$ be prime, and write $\mfp\cap F=(p)$ where $F\subseteq K$ is $\Z$ or $\F_q[t]$. Then, multiplying by $p$ will reduce the $\mfp$-adic absolute value of $x$. Since $\abs x_v\le1$ for all but finitely many places, this means that there's some $m\in F$ such that $mx$ is integral at all finite primes. Let $\{\mfp_1,\dots,\mfp_r\}$ be the set of primes of $K$ (i.e. prime ideals of $\ints K$) that divide (i.e. contain) $m$, and let $n_1,\dots,n_r$ be naturals such that $\mfp_j^{n_j}\nmid(m)$ (i.e. $n_j>v_{\mfp_j}(m)$). Then the Chinese remainder theorem lets us find some $\lambda\in\ints K$ such that
$$\lambda\equiv mx_j\pmod{\mfp_j^{n_j}},$$
where $x_j$ is the component of the adele $x$ corresponding to $\mfp_j$. Let $\mu=\lambda/m$, and note that $x-\mu=\inv m(mx-\lambda)$ is integral at each of the primes $\mfp_j$ (since we chose $n_j$ large). At other primes, its absolute value is the same as $mx-\lambda$'s so it's integral everywhere. Hence, we win.
</div>
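<p>To make the proof above concrete, here is a small worked instance for $K=\Q$. The particular adele is just an illustrative choice of mine, not something taken from the argument. Take $x\in\A_{\Q}$ with $x_2=\frac12$, $x_3=\frac13$, and $x_q\in\Z_q$ for every other prime $q$. Then $m=6$ makes $mx$ integral at all finite places, the primes dividing $m$ are $2$ and $3$, and we may take $n_1=n_2=2$. The Chinese remainder theorem asks for some $\lambda\in\Z$ with</p>
<script type="math/tex; mode=display">\lambda\equiv mx_2=3\pmod{2^2},\qquad\lambda\equiv mx_3=2\pmod{3^2},</script>
<p>and $\lambda=11$ works. Setting $\mu=\lambda/m=\frac{11}6$, we get</p>
<script type="math/tex; mode=display">x_2-\mu=\frac12-\frac{11}6=-\frac43\in\Z_2,\qquad x_3-\mu=\frac13-\frac{11}6=-\frac32\in\Z_3,</script>
<p>while $\mu\in\Z_q$ for every prime $q\neq2,3$, so $x_q-\mu\in\Z_q$ there as well. Hence every finite component of $x-\mu$ is a local integer, as the theorem predicts.</p>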
<p>One can also prove</p>
<div class="theorem" name="Strong Approximation">
Fix a finite place $v_0$ on a global field $K$, and let $\A_{K,v_0}=\prodp_{v\neq v_0}(K_v,\ints v)$ where $v$ ranges over all finite places. Then, $K$ is dense in $\A_{K,v_0}$.
</div>
<p>but doing so requires saying the words “Haar measure,” and I’d rather not get into that, so I’ll skip the proof of this fact <sup id="fnref:9"><a href="#fn:9" class="footnote">11</a></sup>. If you recall from before, $K$ is discrete in $\A_K$, and so certainly not dense. This result says that if we remove a single (finite) place from $\A_K$, then $K$ goes from being discrete to being dense!</p>
<div class="remark">
For an arbitrary Dedekind domain $A$, the <b>weak approximation theorem</b> says that for any finite set of primes $\mfp_i$ along with integers $e_i\ge0$, there exists some $a\in A$ such that $v_{\mfp_i}(a)=e_i$ for all $i$. Here, $v_{\mfp_i}$ is the $\mfp_i$-adic valuation. Since rings of integers of global fields are Dedekind domains, you may want to take a minute to think about how strong approximation compares/relates to weak approximation.
</div>
<p>We’ll next take a look at how $\A_E$ is related to $\A_K$ when $E/K$ is a(n) (finite) <sup id="fnref:10"><a href="#fn:10" class="footnote">12</a></sup> extension of global fields.</p>
<p>Fix a (finite, separable) extension $E/K$, and fix a place $v$ on $K$. Let</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
E_v:=\prod_{w\mid v}E_w, &&\wt{\ints v}:=\prod_{w\mid v}\ints w,
\end{align*} %]]></script>
<p>where $w\mid v$ means that $w$ is a place on $E$ which restricts to $v$ on $K$. We can use this notation to build $\A_E$ from the places of $K$, as the following lemma illustrates.</p>
<div class="lemma">
Let $E/K$ be a finite, separable extension. Then,
$$\A_E\cong\prod_{v\mid\infty}E_v\by\prodp_{v\nmid\infty}(E_v,\wt{\ints v}),$$
where $v$ ranges over places of $K$.
</div>
<div class="proof4">
It is clear that
$$\prod_{v\mid\infty}E_v\cong\prod_{w\mid\infty}E_w,$$
where the left product ranges over (archimedean) places of $K$ and the right ranges over (archimedean) places of $E$ just because the LHS expands out into exactly the RHS. Hence, to prove the claim, it suffices to show that the finite adeles on both sides agree. First note that the natural map
$$\mapdesc f{\prodp_{v\nmid\infty}(E_v,\wt{\ints v})}{\A_{E,\mrm{fin}}}{((x_w)_{w\mid v})_v}{(x_w)_w}$$
is well-defined because both the RHS and the LHS allow only finitely many non-integral components, and it is visibly bijective. Furthermore, because open sets in $E_v$ are products of open sets in $E_w$ for $w\mid v$, and because $\wt{\ints v}$ literally is the product of $\ints w$ for $w\mid v$, this map is clearly continuous and open. Hence, $f$ is a homeomorphism. It is even easy to see that $f$ is a ring map, so $f$ really is an isomorphism of topological rings.
</div>
<p>The above isn’t the only way to get from $K$ (or $\A_K$) to $\A_E$, however.<sup id="fnref:12"><a href="#fn:12" class="footnote">13</a></sup></p>
<div class="proposition">
Let $E/K$ be a finite, separable extension. Then, $\A_E\cong\A_K\otimes E$ where this time the isomorphism is only algebraic.
</div>
<div class="proof4">
Like before, it suffices to only give an isomorphism between the finite adeles on both sides. That is, we only need to show that $\A_{E,\mrm{fin}}\cong\A_{K,\mrm{fin}}\otimes E$. Let $e_1,\dots,e_n$ (here, $n=[E:K]$) be an integral (i.e. $e_i\in\ints E$) basis for $E/K$. It is then also a basis for $E_v$ over $K_v$ for all places $v$ on $K$. Tensor products behave as you would expect with restricted direct products, so
$$\A_{K,\mrm{fin}}\otimes E=\prodp_{v\nmid\infty}(K_v,\ints v)\otimes E\cong\prodp_{v\nmid\infty}(K_v\otimes E,\ints v\otimes E)\cong\prodp_{v\nmid\infty}(E_v,\ints ve_1\oplus\dots\oplus\ints ve_n).$$
Finally, $\ints ve_1\oplus\cdots\oplus\ints ve_n\cong\wt{\ints v}$ for almost all $v$, so the above is also isomorphic to $\prodp_{v\nmid\infty}(E_v,\wt{\ints v})\cong\A_{E,\mrm{fin}}$.
</div>
<div class="remark">
We can easily upgrade the algebraic isomorphism above to a topological one by giving $\A_K\otimes E$ the topology induced from this isomorphism.
</div>
<div class="corollary">
Let $E/K$ be a degree $n$, separable extension of global fields. As (additive) topological groups, we have
$$\A_E\cong\prod_{i=1}^n\A_K.$$
</div>
<p>Moving on, we claimed that the diagonal embedding $K\into\A_K$ turns $K$ into a discrete subgroup of $\A_K$. Let’s actually prove this in the case that $K$ is a number field. We’ll leave the function field case as an exercise. <sup id="fnref:13"><a href="#fn:13" class="footnote">14</a></sup></p>
<div class="proposition">
Let $K$ be a number field. The image of the diagonal embedding $K\into\A_K$ is discrete.
</div>
<div class="proof4">
Let $\sigma_1,\dots,\sigma_n:K\to\C$ be the $n=[K:\Q]$ (algebraic) embeddings of $K$ into $\C$. Take $x\mapsto\abs{\sigma_i(x)}$ to be the representatives for the $n$ archimedean places of $K$. Note that this means that
$$\prod_{v\mid\infty}\abs x_v=\abs{\knorm(x)}$$
for all $x\in K$. Let
$$U=\prod_{v\mid\infty}B_v(0,2^{-1})\by\prod_{v\nmid\infty}\ints v,$$
where $B_v(0,2^{-1})=\bracks{x\in K_v:\abs x_v<2^{-1}}$. Note that $U\subset\A_K$ is open, and consider any $x\in U\cap K$. Then, $x\in\ints v$ for all finite places $v$, so $x\in\ints K$ and hence $\knorm(x)\in\ints{\Q}=\Z$. Furthermore, $x\in U$ implies
$$\abs{\knorm(x)}=\prod_{v\mid\infty}\abs x_v\le\prod_{v\mid\infty}2^{-1}=2^{-n}<1.$$
Since $\knorm(x)\in\Z$, this implies that $\knorm(x)=0$ so $x=0$. Thus, $U\cap K=\{0\}$ and the claim follows.
</div>
<p>With that proven, let’s complete the second part of the $\Z\into\R$ analogy by showing that $K$ is cocompact in $\A_K$. Note that, if $K$ is a number field, then $\A_K\cong(\A_{\Q})^n$ as topological groups where $n=[K:\Q]$. Hence, $(\A_K/K)\cong(\A_{\Q}/\Q)^n$ is compact iff $\A_{\Q}/\Q$ is. Thus, we can restrict our attention to $K=\Q$ for the next proof (we could have used the same trick with discreteness if we wanted).</p>
<div class="theorem">
Let $K$ be a number field. Then, $\A_K/K$ is compact.
</div>
<div class="proof4">
By the discussion above the theorem statement, it suffices to prove this for $K=\Q$, so that's what we'll do. Consider the compact set
$$U=\prod_{p<\infty}\Z_p\by\sqbracks{-\frac12,\frac12}\subseteq\A_{\Q}.$$
We will show that the quotient map $\A_{\Q}\to\A_{\Q}/\Q$ restricted to $U$ is surjective (i.e. for all $\alpha\in\A_{\Q}$, there's some $\beta\in\Q$ s.t. $\alpha-\beta\in U$), which will prove the result. Fix any $\alpha=(\alpha_p)_p\in\A_{\Q}$, and let $p$ be a prime for which $\abs{\alpha_p}_p>1$, so $\alpha_p=z_pp^{-k}$ for some $z_p\in\Z_p$ and $k>0$. Fix an integer $z_p'$ such that $z_p'\equiv z_p\pmod{p^k}$ and let $r_p=z_p'p^{-k}\in\Q$. Then,
$$\abs{\alpha_p-r_p}_p=\abs{\frac{z_p-z_p'}{p^k}}_p=\abs{\frac{dp^k}{p^k}}_p=\abs d_p\le1,$$
for some $d\in\Z_p$. Furthermore, for any prime $q\neq p$, we have
$$\abs{\alpha_q-r_p}_q\le\max(\abs{\alpha_q}_q,\abs{r_p}_q)\le\max(\abs{\alpha_q}_q,1),$$
so $\abs{\alpha_q-r_p}_q\le1$ if $\abs{\alpha_q}_q\le1$. This means that we can replace $\alpha$ by $\alpha-r_p$ to reduce the number of places it's nonintegral at by 1. After finitely many such replacements, we can assume that $\alpha$ is integral at all finite places. To finish, we observe that there exists some $s\in\Z$ such that $\alpha_\infty-s\in[-1/2,1/2]$, and hence $(\alpha-s)\mapsto\alpha+\Q\in\A_{\Q}/\Q$. Since $\alpha$ was arbitrary (really, sufficiently arbitrary since we added the assumption that it's integral at all finite places), we win.
</div>
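<p>The reduction in this proof is entirely algorithmic, so here is a minimal Python sketch of it. It works under simplifying assumptions of my own: the finitely many "bad" components are given as rational numbers (viewed inside $\Q_p$), every prime not listed is assumed to already have an integral component, and the real component is rational too. The function names <code>valuation</code>, <code>principal_part</code>, and <code>reduce_mod_Q</code> are hypothetical, chosen only for this illustration.</p>

```python
from fractions import Fraction
from math import floor


def valuation(x: Fraction, p: int) -> int:
    """p-adic valuation of a nonzero rational x."""
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def principal_part(x: Fraction, p: int) -> Fraction:
    """Return r in Z[1/p] such that x - r lies in Z_p (this is r_p in the proof)."""
    if x == 0 or valuation(x, p) >= 0:
        return Fraction(0)
    k = -valuation(x, p)
    z = x * p**k  # a p-adic unit whose denominator is coprime to p
    # An ordinary integer congruent to z modulo p^k (needs Python 3.8+ for pow(..., -1, m)).
    z_prime = z.numerator * pow(z.denominator, -1, p**k) % p**k
    return Fraction(z_prime, p**k)


def reduce_mod_Q(finite: dict, real: Fraction) -> Fraction:
    """Return beta in Q with finite[p] - beta in Z_p for all listed p and |real - beta| <= 1/2."""
    beta = Fraction(0)
    for p, alpha_p in finite.items():
        # r_p only has p in its denominator, so it does not disturb the other primes.
        beta += principal_part(alpha_p - beta, p)
    # Finally shift by an integer to push the real component into [-1/2, 1/2].
    beta += floor(real - beta + Fraction(1, 2))
    return beta


if __name__ == "__main__":
    finite = {2: Fraction(3, 4), 3: Fraction(5, 9), 5: Fraction(7, 10)}
    real = Fraction(22, 7)
    beta = reduce_mod_Q(finite, real)
    print("beta =", beta)
    for p, a in finite.items():
        print(p, a - beta)
    print("real part:", real - beta)
```

<p>Each pass of the loop removes one non-integral place without creating new ones, which mirrors the "replace $\alpha$ by $\alpha-r_p$" step, and the final integer shift is the choice of $s$.</p>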
<p>To wrap up this section, we give a proof of the <b>Artin product formula</b> which says that</p>
<script type="math/tex; mode=display">\prod_v\abs a_v=1</script>
<p>for all $a\in\units K$. We’ll continue our trend of only proving things for number fields, so assume $K$ is one of those. Note that</p>
<script type="math/tex; mode=display">\prod_v\abs a_v=\prod_{p\le\infty}\prod_{v\mid p}\abs a_v=\prod_{p\le\infty}\prod_{v\mid p}\abs{\norm_{K_v/\Q_p}(a)}_p=\prod_{p\le\infty}\abs{\knorm(a)}_p,</script>
<p>so it suffices to prove this for $K=\Q$. Since $\nabs:\A_K\to\R_{\ge0}$ is multiplicative, we can further simplify to the case that $a=p$ is prime (the case $a=-1$ being clear). Here, there are only two absolute values different from $1$, namely $\abs p_p=\frac1p$ and $\abs p_\infty=p$. Thus, we win.</p>
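<p>For $K=\Q$ the product formula is easy to check by machine. The following is a small sketch using exact rational arithmetic; the helper names are my own, and the normalization $\abs a_p=p^{-v_p(a)}$ is the usual one assumed throughout.</p>

```python
from fractions import Fraction


def p_adic_valuation(x: Fraction, p: int) -> int:
    """Exponent of the prime p in the factorization of a nonzero rational x."""
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def prime_factors(n: int) -> set:
    """Set of primes dividing the nonzero integer n (plain trial division)."""
    n, p, found = abs(n), 2, set()
    while p * p <= n:
        while n % p == 0:
            found.add(p)
            n //= p
        p += 1
    if n > 1:
        found.add(n)
    return found


def product_of_absolute_values(a: Fraction) -> Fraction:
    """Product of |a|_v over all places v of Q, for a nonzero rational a."""
    total = abs(a)  # the archimedean place
    # Only primes dividing the numerator or denominator contribute a factor other than 1.
    for p in prime_factors(a.numerator) | prime_factors(a.denominator):
        total *= Fraction(p) ** (-p_adic_valuation(a, p))  # |a|_p = p^(-v_p(a))
    return total


if __name__ == "__main__":
    for a in (Fraction(-360, 7), Fraction(5), Fraction(-1), Fraction(22, 15)):
        print(a, product_of_absolute_values(a))
```

<p>Every line printed should end in $1$: the archimedean factor is exactly cancelled by the finitely many nontrivial $p$-adic ones.</p>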
<h1 id="class-groups">Class Groups</h1>
<p>At this point we know a thing or two about adeles, but maybe we don’t know what they’re good for. One of the classic reasons for studying adeles is to give a more memorable proof of things like the finiteness of the class group of a number field. I love the geometry of numbers as much as the next guy <sup id="fnref:14"><a href="#fn:14" class="footnote">15</a></sup>, but the topologist in me refuses to believe in any finiteness proof that doesn’t end with “This space is both compact and discrete and hence finite.”</p>
<p>It turns out that there are many “class groups” one can define using adeles. We won’t bother looking at all (most?) of them. For the remainder of this section, fix a number field $K$.</p>
<div class="definition">
Recall that $\units K\into\I_K$ embeds as a discrete subgroup. The <b>idele class group</b> of $K$ is the quotient $C_K:=\I_K/\units K$.
</div>
<p>It turns out that $C_K$ is not necessarily compact, but the so-called norm-one idele class group of $K$ is.</p>
<div class="definition">
Let $\I_K^1:=\ker(\nabs_{\A_K})$. The <b>norm-one idele class group</b> of $K$ is the quotient $C_K^1=\I_K^1/\units K$.
</div>
<div class="theorem">
For a number field $K$, $C_K^1$ is compact.
</div>
<div class="proof4">
Omitted. The proof technique is similar to the one used to show that $\A_K/K$ is compact in that you have an explicit compact subset of $\I_K^1$ which surjects onto $C_K^1$.
</div>
<p><script type="math/tex">\newcommand{\tints}[1]{\wh{\mathscr O}_{#1}}</script>
We next relate these to the traditional class group $\Cl_K$ of $K$. Let $J_K$ denote its group of fractional ideals (nonzero finitely generated $\ints K$-submodules of $K$), so $\Cl_K=J_K/\units K$. Recall that there is a one-to-one correspondence $v\mapsto\mfp_v$ between the finite places of $K$ and the nonzero prime ideals of $\ints K$. Let $C_{K,\mrm{fin}}=\I_{K,\mrm{fin}}/\units K$, where $\I_{K,\mrm{fin}}=\prodp_{v\nmid\infty}(\units K_v,\units{\ints v})$ is the group of finite ideles. Finally, let $\tints K=\prod_{v\nmid\infty}\ints v$ and $\units{\tints K}=\prod_{v\nmid\infty}\units{\ints v}$.</p>
<div class="theorem">
$$\begin{align*}
J_K\cong\I_{K,\mrm{fin}}/\units{\tints K}&&\Cl_K\cong C_{K,\mrm{fin}}/\units{\tints K}\units K
\end{align*}$$
</div>
<div class="proof4">
First, consider the map
$$\mapdesc\phi{\I_{K,\mrm{fin}}}{J_K}{(\alpha_v)_v}{\prod_{v\nmid\infty}\mfp_v^{v(\alpha_v)}}$$
which visibly has kernel $\units{\tints K}$ and is visibly surjective. Hence, we get the first isomorphism. Now, note that, for $\alpha\in\units K$, we have
$$\phi(\alpha)=\phi(\alpha,\alpha,\dots)=\prod_{v\nmid\infty}\mfp_v^{v(\alpha)}=\alpha\ints K$$
so $\phi$ descends to a map $\psi:C_{K,\mrm{fin}}\to\Cl_K,\alpha\units K\mapsto\phi(\alpha)\units K$. We claim that $\ker(\psi)=\units{\tints K}\units K$. It is clear that $\units{\tints K}\units K\subseteq\ker\psi$, so we focus on the reverse direction. Pick some $\xi\units K\in C_{K,\mrm{fin}}$ with $\psi(\xi\units K)=\ints K\units K$, so $\prod_v\mfp_v^{v(\xi_v')}=\ints K$ for some representative $\xi'\in\xi\units K$. Hence, $v(\xi_v')=0$ for all $v$, i.e. $\xi'\in\units{\tints K}$, and so $\xi\units K=\xi'\units K\in\units{\tints K}\units K$. This gives the second isomorphism.
</div>
<div class="remark">
It is a general fact about topological groups that, for $H\le G$, the quotient map $G\to G/H$ is open and that the quotient space $G/H$ is discrete iff $H$ is an open subgroup. In the above theorem, $\units{\tints K}$ is visibly an open subgroup of $\I_{K,\mrm{fin}}$, and so it descends to an open subgroup $\units{\tints K}\units K$ of $C_{K,\mrm{fin}}$. This means that $\Cl_K$ is discrete with the topology given by the above isomorphism.
</div>
<div class="corollary">
The class group $\Cl_K$ of a number field is finite.
</div>
<div class="proof4">
Consider the map
$$\mapdesc f{\I_K^1}{J_K}{(\alpha_v)_v}{\prod_{v<\infty}\mfp_v^{v(\alpha_v)}}.$$
This map is visibly surjective, and similarly to last time, descends to a map $C_K^1\to\Cl_K$. Thus, $\Cl_K$ is the continuous image of a compact set, and hence compact. However, we saw in the previous remark that it was discrete, so it must be finite.
</div>
<p>Another classic use of adeles is proving Dirichlet’s theorem about the rank of the unit group of the ring of integers of a number field. It’s also worth mentioning that both this and the finiteness of the class group have generalizations to $S$-integers which can be proven with adelic methods. However, instead of covering these, I think I will stop here.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Secretly, we’re not stepping that far away from it. One does Fourier analysis on R to prove nice things (i.e. existence of a functional equation) about the Riemann zeta function. Analogously, one does Fourier analysis on these adelic rings to prove nice things about more general L-functions. We won’t really touch on this here, but it’s lurking in the background. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Better put, “one place at a time” <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Of course, given e.g. Q, you can complete it at various places to obtain all the Q_p’s you could want. However, if you want to study e.g. Q_2 and Q_5 at the same time, then you can’t complete Q because this will kill valuable information, but Q itself is somehow not the best place to work to understand Q_2 and Q_5 simultaneously. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>Exercise: prove this (hint: infinite places on number fields come from embeddings into C) <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>If we did this, we would need to amend our definition of restricted products to not require all H_i to be compact. Instead, we’d only require all but finitely many H_i are compact. I’ll leave it up to the reader to figure out how to modify arguments for this slightly more general definition. <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>You want this to be able to do analysis-y type stuff. Locally compact topological groups have Haar measures which let you do Fourier Analysis (I guess in this setting it’s typically called harmonic analysis) on them (maybe just when they’re abelian). <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:7">
<p>Exercise: prove this (hint: factor (x)) <a href="#fnref:7" class="reversefootnote">↩</a></p>
</li>
<li id="fn:8">
<p>Exercise: prove this (hint: suffices to find a neighborhood around 1 containing no other principal adele) (hint2: It’s possible you want to hold off on proving this until you see the product formula) <a href="#fnref:8" class="reversefootnote">↩</a></p>
</li>
<li id="fn:15">
<p>Technically, for this to make sense we need to choose a representative of each place on K. Just choose the normal ones (e.g. for v finite, choose |p|=1/q where p is a uniformizer and q is the size of the residue field) <a href="#fnref:15" class="reversefootnote">↩</a></p>
</li>
<li id="fn:11">
<p>Quotes because I won’t give all the details for most (any?) of the things here <a href="#fnref:11" class="reversefootnote">↩</a></p>
</li>
<li id="fn:9">
<p>I don’t think I’ll need it for anything. If I do and it bothers you that I haven’t proved it, you can find a proof in <a href="http://math.mit.edu/classes/18.785/2017fa/LectureNotes25.pdf">these notes</a> <a href="#fnref:9" class="reversefootnote">↩</a></p>
</li>
<li id="fn:10">
<p>What’s the correct way to notate that the two choices are “a finite extension” and “an extension”? <a href="#fnref:10" class="reversefootnote">↩</a></p>
</li>
<li id="fn:12">
<p>TODO: double check that {e_i} gives a basis for E_v/K_v for all v and not just all but finitely many v (proof should be salvageable in either case) <a href="#fnref:12" class="reversefootnote">↩</a></p>
</li>
<li id="fn:13">
<p>hint: for F_q(t), the t-adic absolute value (I think this is usually considered the infinite place) should play the role of the archimedean places in the number field proof. I could be wrong about this; I really haven’t spent much time with function fields. <a href="#fnref:13" class="reversefootnote">↩</a></p>
</li>
<li id="fn:14">
<p>sarcasm: I’m not a fan of it <a href="#fnref:14" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>A Quick Note on Values of the Riemann Zeta Function2019-02-23T00:01:00+00:002019-02-23T00:01:00+00:00https://nivent.github.io/blog/quick-zeta-calc<p>I don’t know how much you’ll enjoy this post, but calculating values of the zeta function is one of those things that struck high-school me as being incredibly difficult and requiring mastery over the arcane arts of mathematics, so it’s pretty cool to know that I can do this now <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>. Without further ado, let’s find a formula for $\zeta(2k)\dots$ <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup></p>
<p>Let’s start with Parseval. Recall that we have a nice theory of Fourier analysis on the circle $S^1$ for the space $C^2(S^1)$ of twice differentiable (with continuous second derivative) functions. Note furthermore that viewing circle functions as functions $f:[0,1]\to\C$ with $f(0)=f(1)$ gives an embedding $C^2(S^1)\into C^2([0,1])$. Note that the set $\bracks{e_n}_{n\in\Z}$ where $e_n(x)=e^{-2\pi inx}$ is orthonormal with respect to the following inner product on $C^2([0,1])$:</p>
<script type="math/tex; mode=display">\angles{f,g}=\int_0^1f(x)\conj{g(x)}\dx.</script>
<p>We claim (without proof) that the (linear) span of $\bracks{e_n}$ is dense in $C^2([0,1])$, letting us make a limit argument to show that for any $f\in C^2([0,1])$, one has</p>
<script type="math/tex; mode=display">\sum_{n=-\infty}^\infty\abs{\angles{f,e_n}}^2=\|f\|^2:=\int_0^1\abs{f(x)}^2\dx.</script>
<p>The comment about embedding $C^2(S^1)$ into $C^2([0,1])$ is meant to help you recognize that $\angles{f,e_n}$ is exactly the $n$th Fourier coefficient of $f$. Our strategy for calculating $\zeta(2k)$ will be to find some function $f\in C^2([0,1])$ whose Fourier series has coefficients (roughly) of the form $c_n=\frac1{n^k}$ and then apply Parseval’s above identity.</p>
<p>The functions we’ll use are the <b>(Jacob) Bernoulli polynomials</b> $B_k(x)$ defined by the following identity <sup id="fnref:4"><a href="#fn:4" class="footnote">3</a></sup></p>
<script type="math/tex; mode=display">\int_x^{x+1}B_k(u)\d u=x^k.</script>
<p>In particular, $\int_0^1B_k(x)\dx=0^k$ (Here, $0^0=1$). Note that one can show that $B_k'(x)=kB_{k-1}(x)$ <sup id="fnref:3"><a href="#fn:3" class="footnote">4</a></sup>, and for good measure, one can calculate</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
B_0(x) &= 1\\
B_1(x) &= x - \frac12\\
B_2(x) &= x^2 - x + \frac16,
\end{align*} %]]></script>
<p>and so on. Note that this derivative relation gives</p>
<script type="math/tex; mode=display">B_k(1)-B_k(0)=\int_0^1B_k'(x)\dx=k\int_0^1B_{k-1}(x)\dx=0</script>
<p>for $k>1$ whereas $B_1(1)-B_1(0)=1$. Let the $k$th <b>Bernoulli number</b> $B_k$ be $B_k(0)$ (some authors use $B_k(1)$ instead). With this, we’ve done enough set up, so let’s get to the zeta stuff. <sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup></p>
<p>Define $c_k(n):=\int_0^1B_k(x)e^{-2\pi inx}\dx$, the $n$th Fourier coefficient of $B_k(x)$, and calculate</p>
<script type="math/tex; mode=display">c_k(n)=\int_0^1B_k(x)e^{-2\pi inx}\dx=\frac{-1}{2\pi in}\sqbracks{\left.B_k(x)e^{-2\pi inx}\right|_0^1-\int_0^1 B_k'(x)e^{-2\pi inx}\dx}=\frac{k c_{k-1}(n)}{2\pi in}</script>
<p>for $k>1$ (since the product vanishes) while $c_1(n)=-1/(2\pi in)$ (since the integral vanishes). This gives</p>
<script type="math/tex; mode=display">c_k(n) = \frac{k c_{k-1}(n)}{2\pi i n}=\frac{k(k-1)c_{k-2}(n)}{(2\pi in)^2}=\cdots=\frac{k!c_1(n)}{(2\pi in)^{k-1}}=-\frac{k!}{(2\pi in)^k}</script>
<p>where the $\cdots$ indicates that an induction argument is taking place behind the scenes. Now, the careful reader should be up in arms with the above equalities because they only hold for $n\neq0$ (this is needed for the integration by parts to work as claimed). By definition of the Bernoulli polynomials, we have $c_k(0)=0$. At this point, Parseval says</p>
<script type="math/tex; mode=display">\sum_{\substack{n=-\infty\\n\neq0}}^\infty \abs{c_k(n)}^2=\int_0^1B_k(x)^2\dx.</script>
<p>The left hand side simplifies as</p>
<script type="math/tex; mode=display">\sum_{\substack{n=-\infty\\n\neq0}}^\infty \abs{c_k(n)}^2=2\sum_{n=1}^\infty\frac{(k!)^2}{(2\pi n)^{2k}}=\frac{2(k!)^2}{(2\pi)^{2k}}\zeta(2k),</script>
<p>so our fabled $\zeta$ finally appears. To calculate the right hand side, integrate by parts (and induct) some more:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\int_0^1B_k(x)^2\dx
&= \frac1{k+1}\sqbracks{\left.B_k(x)B_{k+1}(x)\right|_0^1-k\int_0^1B_{k-1}(x)B_{k+1}(x)\dx}\\
&= -\frac k{k+1}\int_0^1B_{k-1}(x)B_{k+1}(x)\dx\\
&= -\frac k{(k+1)(k+2)}\sqbracks{\left.B_{k-1}(x)B_{k+2}(x)\right|_0^1-(k-1)\int_0^1B_{k-2}(x)B_{k+2}(x)\dx}\\
&= \frac{k(k-1)}{(k+1)(k+2)}\int_0^1B_{k-2}(x)B_{k+2}(x)\dx\\
&=\cdots\\
&= (-1)^{k-1}\frac{(k!)^2}{(2k-1)!}\int_0^1B_1(x)B_{2k-1}(x)\dx\\
&= (-1)^{k-1}\frac{(k!)^2}{(2k)!}\sqbracks{\left.B_1(x)B_{2k}(x)\right|_0^1-\int_0^1B_{2k}(x)\dx}\\
&= (-1)^{k-1}B_{2k}\frac{(k!)^2}{(2k)!}.
\end{align*} %]]></script>
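<p>Since that was a lot of integrating by parts, here is a short exact check of the end result in Python. It is only a sketch and leans on two standard facts that are not derived in this post: the recurrence $\sum_{j=0}^m\binom{m+1}jB_j=0$ (for $m\ge1$) for the Bernoulli numbers, and the closed form $B_k(x)=\sum_{j=0}^k\binom kjB_{k-j}x^j$ for the Bernoulli polynomials. The function names are my own.</p>

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(n: int) -> list:
    """[B_0, ..., B_n] with the convention B_k = B_k(0), so B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def bernoulli_poly(k: int, B: list) -> list:
    """Coefficients [c_0, ..., c_k] of B_k(x), lowest degree first."""
    return [comb(k, j) * B[k - j] for j in range(k + 1)]


def integral_of_square(coeffs: list) -> Fraction:
    """Exact integral over [0, 1] of p(x)^2 for a polynomial p given by its coefficients."""
    total = Fraction(0)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            total += a * b / (i + j + 1)  # integral of x^(i+j) over [0, 1]
    return total


if __name__ == "__main__":
    B = bernoulli_numbers(12)
    for k in range(1, 7):
        lhs = integral_of_square(bernoulli_poly(k, B))
        rhs = (-1) ** (k - 1) * B[2 * k] * Fraction(factorial(k) ** 2, factorial(2 * k))
        print(k, lhs, rhs, lhs == rhs)
```

<p>For instance, $k=1$ gives $\int_0^1(x-\frac12)^2\dx=\frac1{12}=B_2\cdot\frac{(1!)^2}{2!}$ on both sides.</p>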
<p>Putting these two sides together gives</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\sum_{n=-\infty}^\infty\abs{\angles{B_k(x),e^{-2\pi inx}}}^2 &= \int_0^1B_k(x)^2\dx\\
\frac{2(k!)^2}{(2\pi)^{2k}}\zeta(2k) &= (-1)^{k-1}\frac{B_{2k}(k!)^2}{(2k)!}\\
\zeta(2k) &= (-1)^{k-1}B_{2k}\frac{(2\pi)^{2k}}{2(2k)!}
\end{align*} %]]></script>
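<p>As a numerical sanity check of this formula, one can compare it against partial sums of the defining series. The sketch below assumes the standard recurrence $\sum_{j=0}^m\binom{m+1}jB_j=0$ (for $m\ge1$) for the Bernoulli numbers, which is not proven in this post, and the function names are made up for the illustration.</p>

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_number(n: int) -> Fraction:
    """B_n with the convention B_n = B_n(0), computed exactly."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B[n]


def zeta_even(k: int) -> float:
    """zeta(2k) via the closed form (-1)^(k-1) B_{2k} (2 pi)^(2k) / (2 (2k)!)."""
    b = float(bernoulli_number(2 * k))
    return (-1) ** (k - 1) * b * (2 * pi) ** (2 * k) / (2 * factorial(2 * k))


def zeta_partial_sum(s: int, terms: int = 100_000) -> float:
    """Truncation of the series sum 1/n^s, only meaningful for s > 1."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


if __name__ == "__main__":
    for k in range(1, 5):
        print(2 * k, zeta_even(k), zeta_partial_sum(2 * k))
```

<p>The two columns should agree to several decimal places; the agreement is worst for $\zeta(2)$, where the tail of the series only decays like $1/N$.</p>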
<p>In particular, plugging in $k=1$ recovers the classic</p>
<script type="math/tex; mode=display">\zeta(2)=(-1)^{1-1}B_2\frac{(2\pi)^2}{2(2!)}=\frac{\pi^2}6</script>
<p>which is a good sign <sup id="fnref:6"><a href="#fn:6" class="footnote">6</a></sup>. Furthermore, I haven’t actually checked this myself yet, but I’m pretty sure that using the functional equation for the $\zeta$ function along with this calculation lets you show that $\zeta(1-2k)\in\Q$ for all $k\ge1$ (and so $\zeta(1-k)\in\Q$ for all $k>1$ since it’s 0 when $k>1$ is odd) which is quite surprising. To end, I’ll really indulge a high-school fantasy by “proving” everyone’s favorite “theorem.”</p>
<div class="theorem" name="The Best One">
The sum of the naturals is $-1/12$. That is,
$$\sum_{n=1}^\infty n=-\frac1{12}.$$
</div>
<div class="proof4">
Recall the functional equation for $\zeta(s)$ which is
$$\zeta(s)=\frac{\pi^s\Gamma\parens{\frac{1-s}2}\zeta(1-s)}{\Gamma\parens{\frac s2}\sqrt\pi}$$
Thus (using $\Gamma(-1/2)=-2\sqrt\pi$ and $\Gamma(1)=1$),
$$\zeta(-1)=\frac{\Gamma(1)\zeta(2)}{\Gamma(-1/2)\pi\sqrt\pi}=\frac{\pi^2}{-12\pi^2}=-\frac1{12}.$$
Now recall that the Riemann zeta function is originally defined as
$$\zeta(s)=\sum_{n=1}^\infty\frac1{n^s},$$
and that $\frac1{n^{-1}}=n$, so
$$-\frac1{12}=\zeta(-1)=\sum_{n=1}^\infty n.$$
Boom! There you have it: the sum of the naturals is negative one twelfth. Checkmate, mathematicians; the internet wins this one.
</div>
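<p>For what it’s worth, the one honest step in that “proof,” the evaluation of $\zeta(-1)$ from the functional equation, is easy to double check numerically. This is just a sketch that plugs $s=-1$ and $\zeta(2)=\pi^2/6$ into the formula exactly as written above; the function name is my own.</p>

```python
from math import gamma, pi, sqrt


def zeta_via_functional_equation(s: float, zeta_at_one_minus_s: float) -> float:
    """Right hand side of the functional equation, given the value zeta(1 - s)."""
    return pi**s * gamma((1 - s) / 2) * zeta_at_one_minus_s / (gamma(s / 2) * sqrt(pi))


if __name__ == "__main__":
    value = zeta_via_functional_equation(-1, pi**2 / 6)
    print(value, -1 / 12)
```

<p>The step that fails is, of course, the last one: the series $\sum n^{-s}$ only represents $\zeta(s)$ when $\Re(s)>1$.</p>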
<div class="footnotes">
<ol>
<li id="fn:1">
<p>And also nice to know that I didn’t have to master any arcane arts to do so <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>using an argument that’s rigorous modulo me playing fast-and-loose with the definition of Bernoulli numbers/polynomials and with proving Parseval’s identity <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>Exercise: prove that this defines a unique (degree k) polynomial <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>A perhaps suggestive observation is that this relation is held by the polynomials B_k(x)=x^k <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>Read: Let’s do some GRE prep by integrating by parts <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>A better sign is that the general formula we calculated here is the same one appearing in Wikipedia <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Riemann, Dirichlet, and Their Favorite Letters2019-01-31T00:00:00+00:002019-01-31T00:00:00+00:00https://nivent.github.io/blog/riemann-dirichlet<p>In this post, I want to focus on the Riemann $\zeta$ function, and (one type of) its generalizations: the Dirichlet $L$-functions. We’ll prove some nice properties of these things (e.g. their meromorphic continuations, product formulas, and functional equations), and use them to prove some infinitude results involving primes (spoiler: we’ll show e.g. that there are infinitely many primes!).</p>
<h1 id="riemann">Riemann</h1>
<div class="definition">
The <b>Riemann Zeta Function</b> is defined as
$$\zeta(s)=\sum_{n\ge1}\frac1{n^s},$$
where $s\in\C$. Note that this converges absolutely when $\Re(s)>1$ (e.g. by the integral test).
</div>
<p>We want to show that, among other things, this function extends to a meromorphic function on the entire complex plane. To begin, we’ll show that it’s at least holomorphic in the half-plane $\Re(s)>1$ by appealing to a standard theorem from complex analysis <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>.</p>
<div class="theorem">
Let $\{f_n\}_{n=1}^\infty$ be a sequence of holomorphic functions that converges uniformly to a function $f$ in every compact subset of an open set $\Omega\subseteq\C$. Then, $f$ is holomorphic in $\Omega$. Furthermore, the sequence of derivatives $\{f_n'\}_{n=1}^\infty$ converges uniformly to $f'$ on every compact subset of $\Omega$.
</div>
<div class="corollary">
$\zeta(s)$ defines a holomorphic function in the half-plane $\Re(s)>1$.
</div>
<p>Next, we’ll derive the product formula for the $\zeta$ function. Morally, we want to perform the following manipulation (where $p$ always denotes a prime because we’re not savages)$\dots$</p>
<script type="math/tex; mode=display">\sum_{n\ge1}\frac1{n^s}=\sum_{n\ge1}\prod_{p\mid n}\frac1{p^{sv_p(n)}}=\prod_p\sum_{n\ge0}\frac1{p^{sn}}=\prod_p\parens{\frac1{1-p^{-s}}}.</script>
<p>Above, $v_p(n)$ is the number of times the prime $p$ divides the number $n$. The middle equality is (formally) justified by the fundamental theorem of arithmetic; every number has a unique factorization into primes and this corresponds to picking some exponent for each prime <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>. The above is called an <b>Euler product</b> because Euler was the first person to show this equality, but I think his argument was about as legitimate as what I wrote above, so let’s do one better and actually prove this.</p>
<div class="theorem">
The Euler product for the Riemann Zeta function is legit (at least for $\Re(s)>1$).
</div>
<div class="proof4">
We'll do the classic analysis thing and prove that the Euler product is at most $\zeta(s)$, and that it is at least $\zeta(s)$.
<br />
$(\le)$ Fix some positive integer $N\ge1$. Then, by the fundamental theorem of arithmetic, we have
$$\prod_{p\le N}\sum_{n=0}^N\frac1{p^{sn}}=\sum_{n\in S}\frac1{n^s}\le\sum_{n=1}^\infty\frac1{n^s}=\zeta(s),$$
where $S=\bracks{n:p\le N\implies v_p(n)\le N\text{ and }p>N\implies v_p(n)=0}$. Taking the limit as $N\to\infty$ gives
$$\prod_p\sum_{n\ge0}\frac1{p^{sn}}=\prod_p\parens{\frac1{1-p^{-s}}}\le\zeta(s),$$
which has the added benefit of showing the Euler product converges.
<br />
$(\ge)$ Fix some positive integer $N\ge1$. Then,
$$\sum_{n=1}^N\frac1{n^s}\le\prod_{p\le N}\sum_{n=0}^N\frac1{p^{ns}}\le\prod_p\sum_{n=0}^\infty\frac1{p^{ns}}.$$
Taking the limit as $N\to\infty$ gives
$$\zeta(s)\le\prod_p\parens{\frac1{1-p^{-s}}}$$
as desired.
</div>
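<p>If you’d rather see the Euler product with your own eyes, here’s a quick numerical sanity check (a sketch, not part of the proof). It compares a truncated sum and a truncated product at $s=2$ against the known value $\zeta(2)=\pi^2/6$. The helper names and the cutoff $10^5$ are arbitrary choices of mine.</p>

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (assumes limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def partial_zeta(s, terms):
    """Truncated Dirichlet series: sum of n^(-s) for n = 1..terms."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))

def partial_euler_product(s, limit):
    """Truncated Euler product: product of 1/(1 - p^(-s)) over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

exact = math.pi ** 2 / 6  # the classical value of zeta(2)
print("partial sum    :", partial_zeta(2.0, 100000))
print("partial product:", partial_euler_product(2.0, 100000))
print("pi^2 / 6       :", exact)
```

<p>Both truncations should agree with $\pi^2/6$ to several decimal places, with the product converging noticeably faster than the sum.</p>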
<div class="corollary">
There are infinitely many primes.
</div>
<div class="proof4">
$$\lim_{s\to1^+}\prod_p\parens{\frac1{1-p^{-s}}}=\lim_{s\to1^+}\zeta(s)=\lim_{s\to1^+}\sum_{n\ge1}\frac1{n^s}.$$
Now, if there are only finitely many primes, then the LHS obviously converges because it's just a finite product. On the other hand, the RHS obviously diverges since it approaches the harmonic series. Thus, there must be infinitely many primes.
</div>
<div class="aside">
This is largely unrelated to the rest of this post, but whatever; this is my blog so I can go on mini-rants whenever I want. The standard proof that there are infinitely many primes (the one due to Euclid) is not a proof by contradiction even though it's often presented this way. There's no reason to assume that there are finitely many primes in the beginning, because you're secretly just constructing an infinite sequence of prime numbers.
<br />
i.e. let $p_1=2$, and for $n>1$, let $p_n$ be the smallest prime factor of $p_1p_2\cdots p_{n-1}+1$. Then, $p_n$ is prime for all $n$, and $n\neq m\implies p_n\neq p_m$, so this gives an infinite sequence of primes. No contradiction necessary.
</div>
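<p>To drive the point home, the construction in the aside is literally an algorithm. Here’s a minimal sketch of it using naive trial division (the function names are mine, and trial division gets slow quickly, so I only ask for the first eight terms).</p>

```python
def smallest_prime_factor(n):
    """Smallest prime factor of n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def euclid_sequence(count):
    """p_1 = 2 and p_n = smallest prime factor of p_1 * ... * p_{n-1} + 1."""
    primes = []
    product = 1  # empty product, so the first term is the smallest prime factor of 2
    for _ in range(count):
        p = smallest_prime_factor(product + 1)
        primes.append(p)
        product *= p
    return primes

print(euclid_sequence(8))
```

<p>Each new term is coprime to the product of the earlier ones, so no prime ever repeats.</p>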
<p>While we’re on the subject of primes, we can actually do more than just count them.</p>
<div class="proposition">
$$\sum_p\frac1p=\infty.$$
</div>
<div class="proof4">
This is gonna be a little handwavy, but don't worry about that too much; Euler wouldn't (if it helps, mentally restrict $s$ to being real). First recall that $\log(1+x)=\sum_{n\ge1}(-1)^{n+1}x^n/n$ when $\abs x< 1$. Now, note that
$$\log\zeta(s)=\log\prod_p\parens{\frac1{1-p^{-s}}}=-\sum_p\log\parens{1-p^{-s}}=\sum_p\sum_{n\ge1}\frac1{np^{ns}}=\parens{\sum_p\frac1{p^s}}+\sum_p\sum_{n\ge2}\frac1{np^{ns}}.$$
Because
$$\sum_p\sum_{n\ge2}\frac1{p^{ns}}=\sum_p\sum_{n\ge2}\parens{p^{-s}}^n=\sum_p\frac{p^{-2s}}{1-p^{-s}}\le\frac1{1-2^{-s}}\sum_pp^{-2s}\le2\zeta(2)$$
is bounded as $s\to1^+$, but $\log\zeta(s)$ isn't bounded as $s\to1^+$, we must have
$$\sum_p\frac1p=\infty$$
as claimed.
</div>
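<p>The divergence here is painfully slow, which is easy to see numerically. The sketch below tabulates $\sum_{p\le x}1/p$ next to $\log\log x$. The comparison with $\log\log x$ is a standard fact (due to Mertens) that I’m quoting rather than proving, and the cutoffs are arbitrary.</p>

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (assumes limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def prime_reciprocal_sum(limit):
    """Sum of 1/p over all primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))

for limit in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    # The two columns should differ by a roughly constant amount.
    print(limit, prime_reciprocal_sum(limit), math.log(math.log(limit)))
```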
<p>We next move on to analytically continuing the $\zeta$ function. I wish I could give some good motivation for the argument, but sadly, I cannot. The main idea is to relate the $\zeta$ function to the theta function from last time</p>
<script type="math/tex; mode=display">\vartheta(s)=\sum_{n=-\infty}^\infty e^{-\pi n^2s}</script>
<p>and then translate the equality $\vartheta(s)=s^{-1/2}\vartheta(1/s)$ into a similar functional equation for the $\zeta$ function. How do you think up this approach? I don’t know.</p>
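<p>Since everything that follows leans on the identity $\vartheta(s)=s^{-1/2}\vartheta(1/s)$, it’s reassuring to check it numerically for a few real $s>0$. This is only a sketch: the series is truncated at 50 terms (my choice), which is far more than needed at these values of $s$.</p>

```python
import math

def theta(s, terms=50):
    """Truncated theta series: sum over n in Z of exp(-pi n^2 s), for real s > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * s) for n in range(1, terms + 1))

for s in (0.3, 1.0, 2.5):
    left = theta(s)
    right = s ** -0.5 * theta(1.0 / s)
    print(s, left, right)  # the last two columns should agree
```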
<p>Before the proof, recall the Gamma function</p>
<script type="math/tex; mode=display">\Gamma(s)=\int_0^\infty e^{-t}t^s\d t/t</script>
<p>which is initially defined for $\Re(s)>0$ but extends to a meromorphic function on $\C$ with simple poles at the non-positive integers. The idea here is that integration by parts gives $\Gamma(s+1)=s\Gamma(s)$, so</p>
<script type="math/tex; mode=display">F_m(s)=\frac{\Gamma(s+m)}{(s+m-1)(s+m-2)\cdots s}</script>
<p>gives a meromorphic extension of $\Gamma$ to the half plane $\Re(s)>-m$. We can relate $\zeta(s)$ to $\vartheta(s)$ through $\Gamma(s)$ via the following equality:</p>
<script type="math/tex; mode=display">\int_0^\infty e^{-\pi n^2u}u^{(s/2)-1}\d u=\pi^{-s/2}\Gamma(s/2)n^{-s},\text{ }\text{ }\text{ }n\ge1.</script>
<p>This is seen by making the change of variables $u=\frac t{\pi n^2}$ on the LHS. Motivated by this, call $\xi(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)$ the <b>xi function</b> <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>.</p>
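<p>As an aside, the extension of $\Gamma$ via $F_m$ described above is concrete enough to compute with. The sketch below only ever evaluates $\Gamma$ at positive arguments and divides by the product $(s+m-1)\cdots s$. I’m restricting to real $s$, and the name <code>gamma_extended</code> is made up for illustration.</p>

```python
import math

def gamma_extended(s, m):
    """F_m(s) = Gamma(s + m) / ((s + m - 1)(s + m - 2) ... s) for real s > -m, s not a pole."""
    denominator = 1.0
    for k in range(m):
        denominator *= (s + k)
    return math.gamma(s + m) / denominator  # math.gamma is only called at s + m > 0

# Gamma(-1/2) should equal -2 * sqrt(pi).
print(gamma_extended(-0.5, 1), -2.0 * math.sqrt(math.pi))
# Different values of m give the same answer wherever both are defined.
print(gamma_extended(-2.5, 3), gamma_extended(-2.5, 5))
```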
<div class="theorem">
$\xi$ has a meromorphic continuation to the entire complex plane, and satisfies the functional equation
$$\xi(s)=\xi(1-s).$$
</div>
<div class="proof4">
Below, the interchange of a sum and integral is (I think) justified because the summands are rapidly decreasing functions. First note that
$$\begin{align*}
\xi(s)
=\int_0^\infty\pi^{-s/2}e^{-t}t^{s/2}\zeta(s)\frac{\d t}t
&=\int_0^\infty\sum_{n\ge1}\frac{\pi^{-s/2}e^{-t}t^{s/2}}{n^s}\frac{\d t}t\\
&=\sum_{n\ge1}\int_0^\infty\frac{\pi^{-s/2}e^{-t}t^{s/2}}{n^s}\frac{\d t}t\\
&=\sum_{n\ge1}\pi^{-s/2}\Gamma(s/2)n^{-s}\\
&=\sum_{n\ge1}\int_0^\infty e^{-\pi n^2u}u^{(s/2)}\frac{\d u}u\\
&=\int_0^\infty u^{s/2}\sqbracks{\sum_{n\ge1}e^{-\pi n^2u}}\frac{\d u}u\\
&=\frac12\int_0^\infty u^{s/2}\sqbracks{\vartheta(u)-1}\frac{\d u}u
.
\end{align*}$$
Now let $\psi(u)=\frac12(\vartheta(u)-1)$, so
$$\psi(u)=\frac12\parens{u^{-1/2}\vartheta(1/u)-1}=\frac12\parens{u^{-1/2}\parens{2\psi(1/u)+1}-1}=u^{-1/2}\psi(1/u)+\frac12u^{-1/2}-\frac12.$$
This lets us calculate
$$\begin{align*}
\xi(s)
&=\int_0^\infty u^{s/2}\psi(u)\frac{\d u}u\\
&=\int_0^1u^{s/2}\psi(u)\frac{\d u}u+\int_1^\infty u^{s/2}\psi(u)\frac{\d u}u\\
&=\int_0^1u^{s/2}\sqbracks{u^{-1/2}\psi(1/u)+\frac12u^{-1/2}-\frac12}\frac{\d u}u+\int_1^\infty u^{s/2}\psi(u)\frac{\d u}u
.
\end{align*}$$
Now, we can make the substitution $u\mapsto1/u$ in the (leftmost summand of the) left integral to transform it into an integral from $1$ to $\infty$. Doing this and calculating the rest of that integral gives
$$\begin{align*}
\int_0^1u^{(s-1)/2}\psi(1/u)\frac{\d u}u
&&&=\int_1^\infty\frac{\psi(u)}{u^{(s-1)/2}}\frac{\d u}u\\
\frac12\int_0^1u^{(s-1)/2}\frac{\d u}u
&=\sqbracks{\frac{u^{(s-1)/2}}{s-1}}_0^1 &&= \frac1{s-1}\\
-\frac12\int_0^1u^{s/2}\frac{\d u}u
&=-\sqbracks{\frac{u^{s/2}}s}_0^1 &&= \frac{-1}s
.
\end{align*}$$
Combining this with our expression for $\xi(s)$ above gives
$$\xi(s) = \frac1{s-1}-\frac1s+\int_1^\infty\sqbracks{u^{(1-s)/2}+u^{s/2}}\psi(u)\frac{\d u}u.$$
Because $\psi$ decays rapidly (exponentially), the integral defines an entire function, so $\xi$ has a meromorphic continuation to all of $\C$ with simple poles at $s=0$ and $s=1$. Furthermore, the above expression is unchanged if we replace $s$ by $(1-s)$, so $\xi(s)=\xi(1-s)$ as desired.
</div>
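<p>The final formula for $\xi(s)$ in the proof is explicit enough to test. Here’s a rough numerical sketch for real $s$: it approximates the integral with composite Simpson’s rule on $[1,12]$ (both the rule and the cutoff are my choices, justified by the exponential decay of $\psi$) and compares against $\xi(2)=\pi^{-1}\Gamma(1)\zeta(2)=\pi/6$ and $\xi(4)=\pi^{-2}\Gamma(2)\zeta(4)=\pi^2/90$, using the known values $\zeta(2)=\pi^2/6$ and $\zeta(4)=\pi^4/90$.</p>

```python
import math

def psi(u, terms=20):
    """psi(u) = sum_{n >= 1} exp(-pi n^2 u), truncated; fine for u >= 1."""
    return sum(math.exp(-math.pi * n * n * u) for n in range(1, terms + 1))

def xi_from_integral(s, upper=12.0, steps=4000):
    """1/(s-1) - 1/s + integral_1^upper [u^((1-s)/2) + u^(s/2)] psi(u) du/u, via Simpson's rule."""
    def integrand(u):
        return (u ** ((1.0 - s) / 2.0) + u ** (s / 2.0)) * psi(u) / u
    h = (upper - 1.0) / steps  # steps must be even for Simpson's rule
    total = integrand(1.0) + integrand(upper)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * integrand(1.0 + k * h)
    return 1.0 / (s - 1.0) - 1.0 / s + total * h / 3.0

print(xi_from_integral(2.0), math.pi / 6)
print(xi_from_integral(4.0), math.pi ** 2 / 90)
print(xi_from_integral(-1.0), xi_from_integral(2.0))  # xi(s) = xi(1 - s)
```

<p>Note that the last line evaluates $\xi$ at $s=-1$, where the original series for $\zeta$ doesn’t converge at all; the integral formula doesn’t care.</p>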
<div class="corollary">
The zeta function has a meromorphic continuation to the entire plane with a single simple pole at $s=1$.
</div>
<div class="proof4">
We can meromorphically continue $\zeta$ via the equation
$$\zeta(s)=\pi^{s/2}\frac{\xi(s)}{\Gamma(s/2)}.$$
Since $1/\Gamma(s/2)$ is entire with simple zeros at the non-positive even integers, we see that the simple pole of $\xi(s)$ at $s=0$ cancels out, leaving $\zeta$ with only one simple pole at $s=1$. Furthermore, we see that $\zeta(-2n)=0$ for $n\in\Z_{>0}$.
</div>
<div class="remark">
Expanded out, the functional equation for $\zeta(s)$ is
$$\pi^{-s/2}\Gamma\parens{\frac s2}\zeta(s)=\pi^{\frac{s-1}2}\Gamma\parens{\frac{1-s}2}\zeta(1-s).$$
</div>
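<p>One fun consequence of the expanded functional equation is that it lets you read off values of $\zeta$ at negative numbers from values at positive ones. Here’s a small sketch for real $s$ (the function name is mine), fed with the known values $\zeta(2)=\pi^2/6$ and $\zeta(4)=\pi^4/90$.</p>

```python
import math

def zeta_reflected(s, zeta_s):
    """Given zeta(s) for real s, solve the functional equation for zeta(1 - s).

    Assumes s/2 and (1 - s)/2 are not poles of Gamma.
    """
    left = math.pi ** (-s / 2.0) * math.gamma(s / 2.0) * zeta_s
    right_factor = math.pi ** ((s - 1.0) / 2.0) * math.gamma((1.0 - s) / 2.0)
    return left / right_factor

print(zeta_reflected(2.0, math.pi ** 2 / 6))   # zeta(-1), which works out to -1/12
print(zeta_reflected(4.0, math.pi ** 4 / 90))  # zeta(-3), which works out to 1/120
```

<p>For $s=2$ the arithmetic is short enough to do by hand: the left side is $\pi/6$ and the factor on the right is $\sqrt\pi\,\Gamma(-1/2)=-2\pi$, giving $\zeta(-1)=-1/12$.</p>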
<p>It may be worth noting that the product formula shows that there are no zeros with $\Re(s)>1$, and combining this with the above functional equation shows that the only zeros with $\Re(s)<0$ are at the negative even integers $s=-2k$ with $k\in\Z_{\ge1}$. Thus, any “non-trivial” zero of the Zeta function must lie in the strip $0\le\Re(s)\le1$. This leads me to make the following totally 100% original conjecture:</p>
<div class="conj">
Ignoring the negative even integers, the only zeros of the Riemann zeta function lie on the line $\Re(s)=\frac12$.
</div>
<h1 id="dirichlet">Dirichlet</h1>
<p>Dirichlet studied his $L$-series with one specific application in mind: proving that for any coprime $a,n$, there are infinitely many primes $p\equiv a\pmod n$ <sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup>. At its core, the idea behind the proof is to follow in Euler’s footsteps by showing that</p>
<script type="math/tex; mode=display">\lim_{s\to1^+}\sum_{p\equiv a\pmod n}\frac1{p^s}=\infty.</script>
<p>Unsurprisingly, we will prove this by exploiting some analytic properties of functions that are reminiscent of $\zeta(s)$.</p>
<p>This summation is similar to $\zeta(1)$, so, letting $\mbf1_ a:\Z\to\bits$ be the characteristic function of congruence to $a\pmod n$, we may be tempted to consider the function $\sum_{m=1}^\infty\mbf1_a(m)m^{-s}$. However, $\mbf1_a$ is not multiplicative, so we’d have a hard time recovering an Euler product for this; since Euler products were useful in proving the infinitude of primes, we’d like to still have one of those. On the bright side, $\mbf1_a$ descends to a function $\units{(\zmod n)}\to\C$, and functions like this can be written in terms of homomorphisms.</p>
<div class="definition">
A <b>character</b> $\chi$ of the group $G=\units{(\zmod N)}$ is a homomorphism $\chi:G\to\units\C$.
</div>
<div class="remark">
Because of the condition that $\chi(x^{\phi(N)})=\chi(1)=1$, every character actually lands in $S^1\subset\units\C$. More conceptually, because the torsion elements of $\units\C$ are precisely the roots of unity, the characters (of any finite group) can equivalently be defined as homomorphisms $G\to S^1$.
</div>
<p>Our first goal is to relate these characters to $\mbf1_a$, our real function of interest. First, note that any character $\chi:\units{(\zmod N)}\to\C$ can be extended to a function $\Z\to\C$ via</p>
<script type="math/tex; mode=display">\chi(n)=\twocases{\chi(n\bmod N)}{\gcd(n,N)=1}0</script>
<p>Furthermore, these extensions are completely multiplicative in the sense that $\chi(nm)=\chi(n)\chi(m)$ for all $n,m\in\Z$. To relate this extension to $\mbf1_a:\Z\to\bits\subset\C$, we’ll take a shallow dive into general character theory <sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup>.</p>
<h2 id="shallow-dive">Shallow Dive</h2>
<p>Fix a finite abelian group $G$. By a character on $G$, we mean a homomorphism $G\to S^1$.</p>
<div class="definition">
The <b>dual group</b> of $G$ is the group $\hat G:=\Hom(G,S^1)$ of characters on $G$.
</div>
<div class="remark">
If $G$ is cyclic of order $n$, then $G\simeq\hat G$ where the isomorphism is non-canonical. This is because a character $\chi:G\to S^1$ is the same thing as a choice of $n$th root of unity in this case, so $\hat G$ is isomorphic to the group of $n$th roots of unity, which is cyclic of order $n$.
</div>
<div class="proposition">
Let $H\le G$. Then, every character of $H$ extends to a character of $G$.
</div>
<div class="proof4">
Induct on the index $[G:H]$. If $[G:H]=1$, then we're in business. Otherwise, pick some $x\in G\sm H$, and fix $n$ minimal such that $x^n\in H$. Let $t=\chi(x^n)$ and note that we can choose some $w\in S^1$ such that $w^n=t$. Now, let $H'$ be the subgroup of $G$ generated by $H$ and $x$, so any $h'\in H'$ can be written $h'=hx^a$ with $a\in\Z$ and $h\in H$. Set $\chi'(h')=\chi(h)w^a$ which gives a (well-defined) character $\chi':H'\to S^1$ extending $\chi$. Since $[G:H']<[G:H]$, we win by induction.
</div>
<div class="remark">
Restricting a character to a subgroup defines a homomorphism $\rho:\hat G\to\hat H$. The above proposition says that $\rho$ is surjective. Since the kernel of $\rho$, characters of $G$ which are trivial on $H$, is isomorphic to $\wh{G/H}$, we obtain a short exact sequence
$$1\to\wh{G/H}\to\hat G\to\hat H\to1.$$
</div>
<p>Combining the two remarks and the proposition above in a simple induction argument gives the following.</p>
<div class="corollary">
$G\simeq\hat G$ (non-canonically)
</div>
<div class="proof4">
Induct on $n$, the order of $G$. If $n=1$, we win, so suppose $n>1$. Let $H\le G$ be a nontrivial cyclic subgroup, so $H\simeq\hat H$ by a previous remark, and $G/H\simeq\wh{G/H}$ by induction. Fixing isomorphisms $\phi,\psi$, we obtain a commutative diagram with exact rows
<center>
<img src="https://nivent.github.io/images/blog/riemann-dirichlet/hses.png" width="350" height="100" />
</center>
where the middle vertical map is well-defined because of the existence of the other two maps. Furthermore, this middle map is an isomorphism by the snake lemma (or short 5 lemma or regular 5 lemma) or by an explicit diagram chase.
</div>
<p>We can actually say something stronger in the case of double duals.</p>
<div class="proposition">
$G\simeq\hat{\hat G}$ where the isomorphism is now canonical.
</div>
<div class="proof4">
Consider the map $\eps:G\to\hat{\hat G}$ given by $\eps(x)(\chi)=\chi(x)$. We claim this is an isomorphism. It is clearly a homomorphism. Since we know $G\simeq\hat{\hat G}$ non-canonically, they have the same (finite) order so it suffices to show that $\eps$ is injective. That is to say, if $x\in G$ is $\neq1$, then there exists a character $\chi$ of $G$ such that $\chi(x)\neq1$. To see this, consider $H$, the cyclic subgroup generated by $x$, and note that $H$ has a character $\chi$ such that $\chi(x)\neq1$. Since this character extends to one on $G$, we win.
</div>
<p>Finally, we arrive at our last result of this mini-section.</p>
<div class="theorem">
Let $n=\abs G$, and let $\chi\in\hat G$. Then,
$$\sum_{x\in G}\chi(x)=\twocases n{\chi=1}0.$$
</div>
<div class="proof4">
This obviously holds if $\chi=1$, so suppose $\chi\neq1$. Fix some $y\in G$ such that $\chi(y)\neq1$. Then,
$$\chi(y)\sum_{x\in G}\chi(x)=\sum_{x\in G}\chi(xy)=\sum_{x\in G}\chi(x),$$
so
$$(\chi(y)-1)\sum_{x\in G}\chi(x)=0.$$
Since $\chi(y)\neq1$, we obtain our desired result.
</div>
<div class="corollary">
Fix $x\in G$. Then,
$$\sum_{\chi\in\hat G}\chi(x)=\twocases n{x=1}0.$$
</div>
<p>The corollary is just applying the theorem to $\hat G$.</p>
<h2 id="back-to-dirichlet">Back to Dirichlet</h2>
<p>Ok, back to Dirichlet. Remember that we want to relate $\mbf1_a:\Z\to\C$, the characteristic function of being $\equiv a\pmod N$, to characters of the group $\units{(\zmod N)}$. This will be an application of the last theorem from our dive into character theory. First, to make things notationally easier, define $U(N)=\units{(\zmod N)}$ and $X(N):=\wh{U(N)}$.</p>
<div class="lemma">
For all $n\in\Z$, we have
$$\mbf1_a(n)=\sum_{\chi\in X(N)}\frac{\chi(a)^{-1}}{\phi(N)}\chi(n).$$
</div>
<div class="proof4">
Remember that these characters extend to completely multiplicative functions on the integers, so the right hand side above is really
$$\frac1{\phi(N)}\parens{\sum_{\chi\in X(N)}\chi(\inv an)}.$$
The term in the parentheses is $0$ if $\inv an\not\equiv1\pmod N$ and is $\phi(N)$ if $\inv an\equiv1\pmod N$. Hence we win.
</div>
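<p>If you like seeing this sort of thing in action, here’s a sketch that builds every character modulo a <em>prime</em> $p$ and checks both the orthogonality theorem and the lemma above. Restricting to prime moduli is my simplification: there $U(p)$ is cyclic, so a character is determined by where it sends a generator. All function names are made up for illustration.</p>

```python
import cmath

def primitive_root(p):
    """Brute-force search for a generator of U(p), for p prime."""
    for g in range(1, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found; is p prime?")

def characters(p):
    """All p - 1 characters mod a prime p, each stored as a list chi[0..p-1] with chi[0] = 0."""
    g = primitive_root(p)
    order = p - 1
    dlog = {pow(g, k, p): k for k in range(order)}  # discrete logarithm table
    chars = []
    for j in range(order):
        chi = [0j] * p
        for residue, k in dlog.items():
            # chi_j sends the generator g to the root of unity exp(2 pi i j / (p - 1)).
            chi[residue] = cmath.exp(2j * cmath.pi * j * k / order)
        chars.append(chi)
    return chars

def indicator_via_characters(a, n, p):
    """(1 / phi(p)) * sum over chi of chi(a)^(-1) chi(n), for a not divisible by p."""
    total = sum(chi[n % p] / chi[a % p] for chi in characters(p))
    return total / (p - 1)

p = 7
print([round(abs(sum(chi)), 9) for chi in characters(p)])  # p - 1 for the trivial character, 0 otherwise
print([round(indicator_via_characters(3, n, p).real, 9) for n in range(14)])  # 1 exactly when n = 3 mod 7
```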
<p>Hence, to understand</p>
<script type="math/tex; mode=display">\sum_{p\equiv a\pmod N}\frac1p=\sum_p\frac{\mbf 1_a(p)}p,</script>
<p>it should suffice to study the Dirichlet $L$-series defined below.</p>
<div class="definition">
Fix a character $\chi\in X(N)$. The <b>Dirichlet $L$-series</b> attached to $\chi$ is
$$L(\chi, s)=\sum_{n\ge1}\frac{\chi(n)}{n^s},$$
where $s\in\C$. Since $\abs{\chi(n)}\le1$, this converges absolutely when $\Re(s)>1$.
</div>
<div class="definition">
A character $\chi\in X(N)$ is called <b>primitive</b> if there does not exist an $N'< N$ and character $\chi'\in X(N')$ such that $\chi(n)=\chi'(\bar n)$ where $\bar n\in\zmod{N'}$ denotes $n$'s reduction modulo $N'$.
</div>
<p>Note that the trivial character $\chi=1$ has $\zeta(s)$ as its $L$-series (when $N=1$).</p>
<p>Fix some (primitive) character $\chi\in X(N)$. It will turn out that these $L$-series each extend to meromorphic <sup id="fnref:6"><a href="#fn:6" class="footnote">6</a></sup> functions on the plane with their own product formulae and functional equations. The proofs of these facts are similar to those in the case of the Zeta function, so we may not always be as careful and trust that arguments could be made more carefully. For example, we observe the following product formula</p>
<script type="math/tex; mode=display">L(\chi,s)=\sum_{n\ge1}\frac{\chi(n)}{n^s}=\sum_{n\ge1}\prod_{p\mid n}\frac{\chi(p)^{v_p(n)}}{p^{v_p(n)s}}=\prod_p\sum_{n\ge0}\frac{\chi(p)^n}{p^{sn}}=\prod_p\parens{\frac1{1-\chi(p)p^{-s}}}</script>
<p>Before moving on, fix $\eps\in\bits$ such that $\chi(-1)=(-1)^{\eps}$. We call this $\eps$ the <b>exponent</b> of $\chi$. We use it to define the $\chi$-Gamma function</p>
<script type="math/tex; mode=display">\Gamma(\chi, s)=\Gamma\parens{\frac{s+\eps}2}=\int_0^\infty e^{-t}t^{(s+\eps)/2}\frac{\d t}t.</script>
<p>Perhaps unsurprisingly, the next thing we do is perform a substitution ($t\mapsto\pi n^2u/N$) to get</p>
<script type="math/tex; mode=display">\parens{\frac N\pi}^{\frac{s+\eps}2}\Gamma(\chi, s)n^{-s}=\int_0^\infty n^{\eps}e^{-\pi n^2u/N}u^{(s+\eps)/2}\frac{\d u}u.</script>
<p>Multiplying by $\chi(n)$ and summing over all $n\ge1$ gives</p>
<script type="math/tex; mode=display">\parens{\frac N\pi}^{\frac{s+\eps}2}\Gamma(\chi, s)L(\chi, s)=\int_0^\infty u^{(s+\eps)/2}\sqbracks{\sum_{n=1}^\infty\chi(n)n^{\eps}e^{-\pi n^2u/N}}\frac{\d u}u.</script>
<p>Motivated by this, we also define a $\chi$-analogue of the theta series</p>
<script type="math/tex; mode=display">\theta(\chi, s):=\sum_{n\in\Z}\chi(n)n^{\eps}e^{-\pi n^2s/N}</script>
<p>At this point, we would like to be able to apply Poisson summation <sup id="fnref:7"><a href="#fn:7" class="footnote">7</a></sup> to get a functional equation for $\theta(\chi, s)$. To do this, first define</p>
<script type="math/tex; mode=display">\theta_a(s):=\sum_{n\equiv a\pmod N}n^{\eps}e^{-\pi n^2s/N}=\sum_{k\in\Z}(Nk+a)^{\eps}e^{-\pi(Nk+a)^2s/N}</script>
<p>Next, we want to calculate the Fourier transform of $f_a(x)=(Nx+a)^{\eps}e^{-\pi(Nx+a)^2s/N}$ ($\Re(s)>0$). For convenience, let $g_0(x)=e^{-\pi x^2s/N}$ and $g_1(x)=xe^{-\pi x^2s/N}$. Then,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\mc F(g_0)(\xi) &= \parens{\frac Ns}^{1/2}e^{-\pi\xi^2N/s}\\
\mc F(g_1)(\xi) &= \frac1{-2\pi i}\parens{\frac{\d}{\d\xi}\mc F(g_0)(\xi)} =-i\xi\parens{\frac Ns}^{3/2}e^{-\pi\xi^2N/s}\\
\mc F(f_a)(\xi) &= \mc F(g_{\eps}(Nx+a))(\xi) = e^{2\pi ia\xi/N}\mc F(g_{\eps}(Nx))(\xi) = \frac{e^{2\pi ia\xi/N}}N\mc F(g_{\eps})\parens{\frac\xi N}\\
&= \parens{\frac{-i\xi}s}^{\eps}(Ns)^{-1/2}e^{2\pi ia\xi/N}e^{-\pi\xi^2/(Ns)}
\end{align*} %]]></script>
<p>With that over, Poisson summation gives <sup id="fnref:8"><a href="#fn:8" class="footnote">8</a></sup></p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\theta_a(s)=\sum_{k\in\Z}f_a(k)=\sum_{k\in\Z}\mc F(f_a)(k)
&=\sum_{k\in\Z}\parens{\frac{-ik}s}^{\eps}(Ns)^{-1/2}e^{2\pi iak/N}e^{-\pi k^2/(Ns)}
\end{align*} %]]></script>
<p>so</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\theta(\chi, s)=\sum_{n\in\Z}\chi(n)n^{\eps}e^{-\pi n^2s/N}=\sum_{a=0}^{N-1}\chi(a)\theta_a(s)
&=\sum_{a=0}^{N-1}\sum_{k\in\Z}\chi(a)\parens{\frac{-ik}s}^{\eps}(Ns)^{-1/2}e^{2\pi iak/N}e^{-\pi k^2/(Ns)}\\
&=\sum_{k\in\Z}\sqbracks{\sum_{a=0}^{N-1}\chi(a)e^{2\pi iak/N}}\parens{\frac{-ik}s}^{\eps}\frac1{(Ns)^{1/2}}e^{-\pi k^2/(Ns)}
\end{align*}. %]]></script>
<p>The sum over $a$ above is a special kind of sum called a <b>Gauss sum</b>, which are sums of the form</p>
<script type="math/tex; mode=display">\tau(\chi, k)=\sum_{a=0}^{N-1}\chi(a)e^{2\pi iak/N}.</script>
<div class="proposition">
Gauss sums enjoy the following list of properties (below, $\chi$ assumed primitive).
<ol>
<li> $\tau(\chi, k)=\conj\chi(k)\tau(\chi, 1)$ (the bar denotes complex conjugation). </li>
<li> $\abs{\tau(\chi, 1)}=\sqrt N$. </li>
<li> $\tau(\conj\chi, 1)=\chi(-1)\conj{\tau(\chi, 1)}=(-1)^{\eps}\conj{\tau(\chi, 1)}$. </li>
</ol>
</div>
<div class="proof4">
Exercise.
</div>
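<p>While the proofs are left as an exercise, the three properties are easy to test numerically. The sketch below does so for one specific primitive character, namely the character mod $5$ with $\chi(2)=i$ (my choice of example; any nontrivial character modulo a prime would do).</p>

```python
import cmath

# The character mod 5 determined by chi(2) = i, stored as chi[n mod 5].
# Powers of 2 mod 5 are 1, 2, 4, 3, so chi(4) = -1 and chi(3) = -i.
CHI = [0j, 1 + 0j, 1j, -1j, -1 + 0j]

def gauss_sum(chi, k):
    """tau(chi, k) = sum_{a=0}^{N-1} chi(a) exp(2 pi i a k / N), where N = len(chi)."""
    N = len(chi)
    return sum(chi[a] * cmath.exp(2j * cmath.pi * a * k / N) for a in range(N))

tau = gauss_sum(CHI, 1)
chi_bar = [z.conjugate() for z in CHI]

print(abs(tau), 5 ** 0.5)                                    # property 2
print(gauss_sum(CHI, 3), CHI[3].conjugate() * tau)           # property 1 with k = 3
print(gauss_sum(chi_bar, 1), CHI[4] * tau.conjugate())       # property 3, using chi(-1) = chi(4)
```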
<p>Now, let’s continue to simplify our expression for $\theta(\chi, s)$. Using the above (and that $-i=i^{-1}$) we see</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\theta(\chi, s)
&=\sum_{n\in\Z}\chi(n)n^{\eps}e^{-\pi n^2s/N}\\
&=\sum_{k\in\Z}\tau(\chi, k)\parens{\frac{-ik}s}^{\eps}\frac1{(Ns)^{1/2}}e^{-\pi k^2/(Ns)}\\
&=\frac{\tau(\chi, 1)}{(is)^{\eps}{(Ns)^{1/2}}}\sum_{k\in\Z}{\conj\chi(k)k^{\eps}}e^{-\pi k^2/(Ns)}\\
&=\frac{\tau(\chi, 1)}{(is)^{\eps}(Ns)^{1/2}}\theta(\conj\chi, 1/s)
\end{align*}, %]]></script>
<p>where I just did a bunch of algebra, so it’s very possible I made a mistake along the way, and the true expression should look slightly different <sup id="fnref:9"><a href="#fn:9" class="footnote">9</a></sup>. However, assuming this is not the case <sup id="fnref:11"><a href="#fn:11" class="footnote">10</a></sup>, we have a nice functional equation for $\theta(\chi, s)$, so we can use this to give an analytic extension of $L(\chi, s)$ in much the same way as before. If you recall, we had shown previously that</p>
<script type="math/tex; mode=display">\parens{\frac N\pi}^{\frac{s+\eps}2}\Gamma(\chi, s)L(\chi, s)=\int_0^\infty u^{(s+\eps)/2}\sqbracks{\sum_{n=1}^\infty\chi(n)n^{\eps}e^{-\pi n^2u/N}}\frac{\d u}u.</script>
<p>Hence, we will find a functional equation for the <b>$\chi$-xi function </b> <sup id="fnref:10"><a href="#fn:10" class="footnote">11</a></sup></p>
<script type="math/tex; mode=display">\xi(\chi, s)=\parens{\frac N\pi}^{\frac{s+\eps}2}\Gamma(\chi, s)L(\chi, s).</script>
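<p>Given the worry about algebra mistakes, a numerical check of the functional equation for $\theta(\chi, s)$ seems worthwhile. Here’s a sketch for one odd primitive character (the character mod $5$ with $\chi(2)=i$, so $\eps=1$) at a few real values of $s$. The truncation at $\abs n\le60$ and all names are my own choices.</p>

```python
import cmath
import math

N = 5
CHI = [0j, 1 + 0j, 1j, -1j, -1 + 0j]  # chi(2) = i, so chi(-1) = chi(4) = -1
EPS = 1                               # the exponent of chi: chi(-1) = (-1)^EPS

def theta(chi, s, cutoff=60):
    """Truncated theta(chi, s) = sum over n in Z of chi(n) n^EPS exp(-pi n^2 s / N), real s > 0."""
    return sum(chi[n % N] * n ** EPS * math.exp(-math.pi * n * n * s / N)
               for n in range(-cutoff, cutoff + 1))

def gauss_sum(chi):
    """tau(chi, 1) = sum_{a=0}^{N-1} chi(a) exp(2 pi i a / N)."""
    return sum(chi[a] * cmath.exp(2j * math.pi * a / N) for a in range(N))

def theta_transformed(chi, s):
    """Right-hand side: tau(chi, 1) / ((i s)^EPS (N s)^(1/2)) * theta(conj(chi), 1/s)."""
    chi_bar = [z.conjugate() for z in chi]
    factor = gauss_sum(chi) / ((1j * s) ** EPS * math.sqrt(N * s))
    return factor * theta(chi_bar, 1.0 / s)

for s in (0.5, 1.0, 2.0):
    print(s, theta(CHI, s), theta_transformed(CHI, s))  # the two values should agree
```

<p>At least for this character, the two sides agree to within floating-point error, which is decent evidence that the signs and powers of $i$ above are right.</p>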
<div class="theorem">
$$\xi(\chi, s) = W(\chi)\xi(\conj\chi, 1-s),$$
where $W(\chi)=\frac{\tau(\chi, 1)}{i^{\eps}\sqrt N}$.
</div>
<div class="proof4">
First note that $\xi(\chi, s)=\frac12\int_0^\infty u^{(s+\eps)/2}\sqbracks{\theta(\chi, u)-1}\frac{\d u}u$, and let $\psi(\chi, u)=\frac12(\theta(\chi, u)-1)$. Then,
$$\begin{align*}
\psi(\chi, u)
=\frac12\parens{\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}\theta(\conj\chi, 1/u)-1}
&=\frac12\parens{\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}\sqbracks{2\psi(\conj\chi, 1/u)+1}-1}\\
&=\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}\psi(\conj\chi,1/u)+\frac12\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}-\frac12.
\end{align*}$$
This lets us calculate
$$\begin{align*}
\xi(\chi, s)
&=\int_0^\infty u^{(s+\eps)/2}\psi(\chi, u)\frac{\d u}u\\
&=\int_0^1u^{(s+\eps)/2}\psi(\chi, u)\frac{\d u}u+\int_1^\infty u^{(s+\eps)/2}\psi(\chi, u)\frac{\d u}u\\
&=\int_0^1u^{(s+\eps)/2}\sqbracks{\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}\psi(\conj\chi,1/u)+\frac12\frac{\tau(\chi, 1)}{(iu)^{\eps}(Nu)^{1/2}}-\frac12}\frac{\d u}u+\int_1^\infty u^{(s+\eps)/2}\psi(\chi, u)\frac{\d u}u
\end{align*}$$
Like last time, we now want to calculate the various parts of the left integral.
$$\begin{align*}
\frac{\tau(\chi,1)}{i^{\eps}\sqrt N}\int_0^1u^{(s-\eps-1)/2}\psi(\conj\chi, 1/u)\frac{\d u}u
&&&=\frac{\tau(\chi,1)}{i^{\eps}\sqrt N}\int_1^\infty\frac{\psi(\conj\chi, u)}{u^{(s-\eps-1)/2}}\frac{\d u}u\\
\frac12\frac{\tau(\chi,1)}{i^{\eps}\sqrt N}\int_0^1u^{(s-\eps-1)/2}\frac{\d u}u
&=\frac{\tau(\chi,1)}{i^{\eps}\sqrt N}\sqbracks{\frac{u^{(s-\eps-1)/2}}{s-\eps-1}}_0^1 &&= \frac{\tau(\chi,1)}{i^{\eps}\sqrt N}\frac1{s-\eps-1}\\
-\frac12\int_0^1u^{(s+\eps)/2}\frac{\d u}u
&=-\sqbracks{\frac{u^{(s+\eps)/2}}{s+\eps}}_0^1 &&= \frac{-1}{s+\eps}
.
\end{align*}$$
Thus, letting $W(\chi)={\tau(\chi,1)}/\parens{i^{\eps}\sqrt N}$, we have
$$\xi(\chi, s)=\frac{W(\chi)}{s-\eps-1}-\frac1{s+\eps}+\int_1^\infty\sqbracks{W(\chi)u^{(1+\eps-s)/2}\psi(\conj\chi, u)+u^{(s+\eps)/2}\psi(\chi, u)}\frac{\d u}u.$$
Now, the integral above defines an entire function since $\psi(\chi, u)$ ($\chi$ fixed) decays rapidly, so $\xi(\chi, s)$ is meromorphic with two simple poles at $s+\eps=0$ and $s-\eps=1$. Note that
$$W(\conj\chi)=\tau(\conj\chi,1)/(i^{\eps}\sqrt N)=(-1)^{\eps}\conj{\tau(\chi,1)}/(i^{\eps}\sqrt N)=\conj{\tau(\chi, 1)}/((-i)^{\eps}\sqrt N)=\conj{W(\chi)},$$
so
$$\xi(\conj\chi, 1-s)=-\frac{\conj{W(\chi)}}{s+\eps}+\frac1{s-\eps-1}+\int_1^{\infty}\sqbracks{\conj{W(\chi)}u^{(s+\eps)/2}\psi(\chi,u)+u^{(1+\eps-s)/2}\psi(\conj\chi, u)}\frac{\d u}u.$$
This formula looks pretty familiar, and indeed (after remarking that $\abs{W(\chi)}=1$) we see that
$$\xi(\chi, s)=W(\chi)\xi(\conj\chi, 1-s).$$
</div>
<div class="corollary">
$L(\chi, s)$ has a meromorphic continuation to the entire plane with at most one pole.
</div>
<div class="proof4">
Just use
$$L(\chi, s)=\parens{\frac\pi N}^{(s+\eps)/2}\frac{\xi(\chi, s)}{\Gamma(\chi, s)}.$$
Note that $1/\Gamma(\chi, s)$ has a simple zero when $s+\eps$ is a non-positive even integer (and has no other poles/zeros), so the pole at $s+\eps=0$ of $\xi(\chi, s)$ gets cancelled out, meaning $L(\chi, s)$ has at most one pole (which, if it exists, is simple and occurs at $s-\eps=1$).
</div>
<div class="corollary">
The same as the last corollary except without the implicit assumption that $\chi$ is primitive. To prove this, just notice that any character factors through a primitive one and then relate their $L$-functions.
</div>
<p>Now that we’ve gotten this far, let’s return to the question of primes in arithmetic progressions. In order to prove Dirichlet’s theorem, we’ll need to make use of one non-trivial result that I will not prove in this post <sup id="fnref:12"><a href="#fn:12" class="footnote">12</a></sup>.</p>
<div class="theorem">
For every non-trivial character $\chi\in X(N)$, one has $L(\chi,s)$ is holomorphic at $s=1$ (i.e. there's not a pole there), and $L(\chi, 1)\neq0$.
</div>
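<p>For a concrete instance of this theorem, take the nontrivial character mod $4$. Its $L$-series at $s=1$ is the alternating series $1-\frac13+\frac15-\cdots$, which is known (Leibniz) to equal $\pi/4\neq0$. The sketch below just sums the series; the number of terms is arbitrary.</p>

```python
import math

def chi4(n):
    """The nontrivial character mod 4: 1 on n = 1, -1 on n = 3, and 0 on even n."""
    return (0, 1, 0, -1)[n % 4]

def l_partial(s, terms):
    """Partial sum of L(chi4, s) = sum_{n >= 1} chi4(n) / n^s."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))

print(l_partial(1.0, 100000), math.pi / 4)
```

<p>Convergence at $s=1$ is only conditional, so the partial sums approach $\pi/4$ slowly, but they do stay bounded away from both $0$ and $\infty$.</p>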
<p>Now, let</p>
<script type="math/tex; mode=display">P_a:=\sum_{p\equiv a\pmod N}\frac1p,</script>
<p>and recall that</p>
<script type="math/tex; mode=display">\mbf1_a(n) = \sum_{\chi\in X(N)}\frac{\chi(a)^{-1}}{\phi(N)}\chi(n).</script>
<p>Combined together, this gives</p>
<script type="math/tex; mode=display">P_a=\sum_p\frac{\mbf1_a(p)}p=\sum_{\chi\in X(N)}\frac{\chi(a)^{-1}}{\phi(N)}\sum_p\frac{\chi(p)}p.</script>
<p>Now, note that (say, $s>1$)</p>
<script type="math/tex; mode=display">\log L(\chi, s)=\log\prod_p\parens{\frac1{1-\chi(p)p^{-s}}}=\sum_p\frac{\chi(p)}{p^s}+\sum_p\sum_{n\ge2}\frac{\chi(p)^n}{np^{ns}}=\sum_p\frac{\chi(p)}{p^s}+O(1)</script>
<p>where I skipped a few steps because the argument is the same as when we showed $\sum_p\inv p=\infty$. Now, here’s the kicker: taking the limit as $s\to1^+$, we get</p>
<script type="math/tex; mode=display">P_a=\sum_{p\equiv a\pmod N}\frac1p=\sum_{\chi\in X(N)}\frac{\chi(a)^{-1}}{\phi(N)}\log L(\chi, 1) + O(1) = \infty,</script>
<p>where the last equality comes from the fact that $\log L(\chi,1)=\infty$ iff $\chi$ is the trivial character! Thus, we’ve proven the following.</p>
<div class="theorem" name="Dirichlet's Theorem on Primes in Arithmetic Progressions">
Fix some $a,N$ with $\gcd(a,N)=1$. Then, there are infinitely many primes $p$ such that $p\equiv a\pmod N$. Equivalently, the arithmetic progression
$$\{\dots, a-2N, a-N, a, a+N, a+2N, \dots\}$$
contains infinitely many primes.
</div>
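<p>You can watch Dirichlet’s theorem happen by counting primes in each residue class. This is a small sketch with helper names of my own choosing; it says nothing about <em>why</em> the classes fill up, only that they visibly do.</p>

```python
from collections import Counter
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (assumes limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def primes_by_residue(N, limit):
    """Number of primes p <= limit in each residue class a mod N with gcd(a, N) = 1."""
    counts = Counter(p % N for p in primes_up_to(limit))
    return {a: counts[a] for a in range(N) if gcd(a, N) == 1}

print(primes_by_residue(4, 100))
print(primes_by_residue(10, 100000))
```

<p>For $N=10$ the four classes $1,3,7,9$ end up with nearly equal counts, which hints at the stronger (and harder) equidistribution statement.</p>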
<h1 id="dedekind">Dedekind</h1>
<p>Dedekind’s name doesn’t appear in the title because I wasn’t originally going to talk about him, but he has a role in this story too. Unlike the previous two sections, this one will require some knowledge of basic algebraic number theory, and will not prove an infinitude result about primes <sup id="fnref:13"><a href="#fn:13" class="footnote">13</a></sup>.</p>
<p>We start by recalling some definitions/facts. A <b>number field</b> is a finite extension $K/\Q$. The integral closure of $\Z$ in $K$ is denoted $\ints K$, and is called $K$’s <b>ring of integers</b>. $\ints K$ is always a Dedekind domain (but not always a UFD), so any nonzero ideal in $\ints K$ factors into a unique product of prime ideals. Given an ideal $I\subseteq\ints K$, its <b>norm</b> is $N(I)=\abs{\ints K/I}$, the size of its residue ring.</p>
<p>Now, as it turns out, Dedekind’s favorite letter is the same as Riemann’s.</p>
<div class="definition">
Given a number field $K/\Q$, the <b>Dedekind $\zeta$-function</b> is
$$\zeta_K(s)=\sum_{I\subseteq\ints K}\frac1{N(I)^s},$$
where the sum is taken over all nonzero ideals of $\ints K$. Note that $\zeta_{\Q}(s)=\zeta(s)$, the ordinary Riemann $\zeta$-function.
</div>
<div class="remark">
$\zeta_K(s)$ defines a holomorphic function in the half-plane $\Re(s)>1$. Perhaps unsurprisingly at this point, it is possible to show that $\zeta_K$ extends to a meromorphic function on the entire complex plane with a simple pole at $s=1$. This is harder to show than the analogous result for $\zeta(s),L(\chi, s)$ (although the idea is the same), so showing it is beyond the scope of this post.
</div>
<p>One can use $\zeta_K$ to prove my earlier claim about the holomorphicity and non-vanishing of $L$-functions attached to nontrivial characters at $s=1$. This follows as a corollary of (a stronger version of) the following.<sup id="fnref:17"><a href="#fn:17" class="footnote">14</a></sup></p>
<div class="theorem">
Let $K=\Q(\zeta_N)$ where $\zeta_N$ is a primitive $N$th root of unity. Then,
$$\zeta_K(s)\left/\prod_{\chi\in X(N)}L(\chi, s)\right.$$
is holomorphic.
</div>
<p>An analogous result holds for arbitrary abelian extensions of $\Q$, and combining this with the knowledge that $L(1, s),\zeta_K(s)$ both have a simple pole at $s=1$ lets you conclude what you want. We’ll see a simple case of this.</p>
<p>First note that unique factorization of ideals gives a product formula</p>
<script type="math/tex; mode=display">\zeta_K(s)=\prod_\mfp\frac1{1-N(\mfp)^{-s}}.</script>
<p>where $\mfp$ ranges over all (nonzero) prime ideals of $\ints K$. Now, fix a quadratic number field $K=\Q(\sqrt d)$ ($d\not\in\bits$ squarefree). Recall the splitting behavior of primes in $K$. Let ($k\in\Z$, and $p$ an odd prime)</p>
<script type="math/tex; mode=display">% <![CDATA[
\legendre kp=\begin{cases}
1&k\equiv x^2\pmod p\text{ for some }x\\
{-1}&k\not\equiv x^2\pmod p\text{ for all }x\\
0&k\equiv0\pmod p
\end{cases} %]]></script>
<p>be the <b>Legendre symbol</b>, and let $D=\disc(K/\Q)$ (i.e. $d$ if $d\equiv1\pmod4$ and $4d$ otherwise). Then, for an odd prime $p$, we have</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{cases}
\text{$p$ is inert in $\ints K$}&\text{if }\legendre Dp=-1\\
\text{$p$ splits in $\ints K$}&\text{if }\legendre Dp=1\\
\text{$p$ is ramified in $\ints K$}&\text{if }\legendre Dp=0
\end{cases} %]]></script>
<p>Because of this, we choose to extend the Legendre symbol via</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{cases}
\legendre D2=-1&\text{if $2$ is inert in $\ints K$}\\
\legendre D2=1&\text{if $2$ splits in $\ints K$}\\
\legendre D2=0&\text{if $2$ is ramified in $\ints K$}
\end{cases} %]]></script>
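<p>If you want to play with the splitting rule for odd primes, here is a minimal Python sketch. It computes the Legendre symbol via Euler's criterion (which is a choice I'm making here, not something used above), and the function names <code>legendre</code>, <code>discriminant</code>, and <code>splitting_type</code> are made up for illustration. It only handles odd primes $p$, so the extended symbol at $2$ is not covered.</p>

```python
def legendre(k: int, p: int) -> int:
    """Legendre symbol (k/p) for an odd prime p, computed with Euler's criterion."""
    r = pow(k % p, (p - 1) // 2, p)  # r is always 0, 1, or p - 1
    return -1 if r == p - 1 else r


def discriminant(d: int) -> int:
    """Discriminant of Q(sqrt(d)), assuming d is squarefree and not 0 or 1."""
    return d if d % 4 == 1 else 4 * d


def splitting_type(d: int, p: int) -> str:
    """Splitting behavior of an odd prime p in the ring of integers of Q(sqrt(d))."""
    return {1: "split", -1: "inert", 0: "ramified"}[legendre(discriminant(d), p)]
```

<p>For instance, with $d=-1$ (so $K=\Q(i)$ and $D=-4$) this reports that $5$ splits and $3$ is inert, matching $5=(2+i)(2-i)$ in $\Z[i]$.</p>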
<p>We want to relate this splitting behavior to the Euler product for $\zeta_K(s)$. Consider an odd prime $p\in\Z$. If $p$ is inert, then it will contribute (since $p$ remains prime, with norm $p^2$)</p>
<script type="math/tex; mode=display">\frac1{1-p^{-2s}}=\frac1{1-p^{-s}}\frac1{1+p^{-s}}</script>
<p>to $\zeta_K(s)$. If $p$ splits, then it will contribute (since there are two primes above $p$, each with norm $p$)</p>
<script type="math/tex; mode=display">\frac1{(1-p^{-s})^2}=\frac1{1-p^{-s}}\frac1{1-p^{-s}}</script>
<p>to $\zeta_K(s)$. Finally, if $p$ is ramified, then it will contribute (since there’s one prime over $p$ with norm $p$)</p>
<script type="math/tex; mode=display">\frac1{1-p^{-s}}=\frac1{1-p^{-s}}\frac1{1-0p^{-s}}</script>
<p>to $\zeta_K(s)$. Thus, we see that</p>
<script type="math/tex; mode=display">\zeta_K(s)=\prod_p\frac1{1-p^{-s}}\frac1{1-\legendre Dpp^{-s}},</script>
<p>where the product is taken over rational primes $p\in\Z$. Now, granting that one could show that $\chi(n)=\legendre Dn$ is multiplicative and factors through a map $U(D)\to\bracks{\pm1}$ <sup id="fnref:14"><a href="#fn:14" class="footnote">15</a></sup>, we would have $\zeta_K(s)=\zeta(s)L(\chi,s)$. <sup id="fnref:15"><a href="#fn:15" class="footnote">16</a></sup> Furthermore, taking $D=(-1)^{(q-1)/2}q$ ($q$ an odd prime) would mean there’s only one nontrivial quadratic Dirichlet character $\bmod D$ (i.e. one nontrivial homomorphism $U(D)\to\{\pm1\}$), which is $\chi(n)=\legendre nq$. This shows that</p>
<script type="math/tex; mode=display">\legendre pq=\legendre{(-1)^{(q-1)/2}q}p=\legendre{-1}p^{(q-1)/2}\legendre qp=(-1)^{\frac{q-1}2\frac{p-1}2}\legendre qp,</script>
<p>which is the law of quadratic reciprocity. <sup id="fnref:16"><a href="#fn:16" class="footnote">17</a></sup></p>
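<p>None of this outline is rigorous, but both conclusions are easy to sanity-check numerically. The sketch below does two things. First, for $K=\Q(i)$ it compares the number of ideals of norm $n$ in $\Z[i]$ with $\sum_{d\mid n}\chi(d)$, which is what $\zeta_K(s)=\zeta(s)L(\chi,s)$ says coefficient by coefficient; here I'm assuming the standard facts that $\Z[i]$ is a PID with $4$ units, so ideals of norm $n$ correspond to solutions of $a^2+b^2=n$ up to a factor of $4$. Second, it checks quadratic reciprocity for a pair of odd primes. All function names are hypothetical.</p>

```python
def legendre(k, p):
    """Legendre symbol (k/p) for an odd prime p, via Euler's criterion."""
    r = pow(k % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def chi_minus4(n):
    """The extended symbol (-4/n): the nontrivial character mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def ideals_of_norm(n):
    """Ideals of Z[i] of norm n, counted as solutions of a^2 + b^2 = n up to the 4 units."""
    bound = int(n ** 0.5) + 1
    solutions = sum(
        1
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if a * a + b * b == n
    )
    return solutions // 4


def coefficient_of_zeta_times_L(n):
    """n-th Dirichlet coefficient of zeta(s) * L(chi, s), i.e. the sum of chi(d) over d | n."""
    return sum(chi_minus4(d) for d in range(1, n + 1) if n % d == 0)


def reciprocity_holds(p, q):
    """Check (p/q) = (-1)^{(p-1)/2 * (q-1)/2} (q/p) for distinct odd primes p, q."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) == sign * legendre(q, p)
```

<p>Looping <code>ideals_of_norm(n) == coefficient_of_zeta_times_L(n)</code> over small $n$, and <code>reciprocity_holds(p, q)</code> over small odd primes, is a cheap way to convince yourself the outline is at least pointing at true statements.</p>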
<div class="footnotes">
<ol>
<li id="fn:1">
<p>The first time I saw this theorem, I thought it was the kind of dry, technical result that almost never shows up in the wild; I was wrong. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>You may object that expanding out the RHS lets you pick a term with infinitely many prime factors, but this is a non-issue because those’ll all multiply out to 0, so we good. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>I think I’ve also seen this called the completed zeta function, but don’t quote me on that <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>It’s little known that this is something Dirichlet investigated to get his spirits up after failing to achieve his one, true goal: wiping out half of all life in the universe <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>Really less a “dive” than a “dip our toes in the water” <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>actually, holomorphic when the character is nontrivial <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
<li id="fn:7">
<p>See the end of my “Fourier Analysis” post if you don’t know what this is <a href="#fnref:7" class="reversefootnote">↩</a></p>
</li>
<li id="fn:8">
<p>In hindsight, it would have been better to define f(x) = (Nx)^{\eps}e^{-\pi(Nx)^2s/N}, and then apply Poisson summation to get \sum_{k\in\Z}f(k+a) = \sum_{k\in\Z}F(f)(k)e^{2\pi ika} <br /> PS: I really wish latex worked in these footnotes <a href="#fnref:8" class="reversefootnote">↩</a></p>
</li>
<li id="fn:9">
<p>If you see a mistake somewhere, let me know <a href="#fnref:9" class="reversefootnote">↩</a></p>
</li>
<li id="fn:11">
<p>At the very least, letting $N=1$ and $\chi=1$ (so $\eps=0$) recovers the functional equation for the classic theta function, as it should <a href="#fnref:11" class="reversefootnote">↩</a></p>
</li>
<li id="fn:10">
<p>completed L function? <a href="#fnref:10" class="reversefootnote">↩</a></p>
</li>
<li id="fn:12">
<p>But I will say more about it in the next section <a href="#fnref:12" class="reversefootnote">↩</a></p>
</li>
<li id="fn:13">
<p>But don’t worry, we will prove something involving primes. <a href="#fnref:13" class="reversefootnote">↩</a></p>
</li>
<li id="fn:17">
<p>Alternatively, it may actually be easier to calculate the residue of L(\chi, s) at s=1, and show that this residue is 0. I haven’t tried this, but I imagine (could be wrong) that it boils down to some orthogonality relation a la the last theorem of the shallow dive showing that this residue is 0 iff chi is nontrivial. You would still need the theorem below for non-vanishing though. <a href="#fnref:17" class="reversefootnote">↩</a></p>
</li>
<li id="fn:14">
<p>Which, honestly, might be very hard to do. I haven’t tried. <a href="#fnref:14" class="reversefootnote">↩</a></p>
</li>
<li id="fn:15">
<p>We’ve incidentally shown that \zeta_K has a meromorphic continuation in the quadratic number field case <a href="#fnref:15" class="reversefootnote">↩</a></p>
</li>
<li id="fn:16">
<p>I must admit, this whole section turned out more dubious than I intended. I wanted to present a (clean) proof of quadratic reciprocity, but it’s unclear to me how much machinery/grunt work one would need to turn this outline into a rigorous, non-circular proof <a href="#fnref:16" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>In this post, I want to focus on the Riemann $\zeta$ function, and (one type of) its generalizations: the Dirichlet $L$-functions. We’ll prove some nice properties of these things (e.g. their meromorphic continuations, product formulas, and functional equations), and use them to prove some infinitude results involving primes (spoiler: we’ll show e.g. that there are infinitely many primes!).“Fourier Analysis”2019-01-13T00:01:00+00:002019-01-13T00:01:00+00:00https://nivent.github.io/blog/fourier<p>I may start writing posts much more frequently than usual. There is a lot of mathematics that I want to learn this quarter, and somehow it seems that writing these posts is one of the best ways I know to help me absorb everything <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>. In this post, I want to do a quick introduction to Fourier analysis on the circle $S^1$ and on the real line $\R$. Because of my (lack of a) background in analysis, I may be a little handwavy every now and then <sup id="fnref:3"><a href="#fn:3" class="footnote">2</a></sup>, but the main ideas will be there. In addition, as I’m <del>stealing this material from Stein</del> typing this all up, I might try to include some footnotes with questions/comments I have about how the ideas here generalize or relate to other parts of mathematics. Finally, this post will lead into <a href="../riemann-dirichlet">another one on the Riemann zeta function</a> <sup id="fnref:2"><a href="#fn:2" class="footnote">3</a></sup>, so look forward to that.</p>
<h1 id="fourier-aanalysis-on-s1">Fourier Analysis on $S^1$</h1>
<p>Our goal is to be able to say that we can represent a function $f:S^1\to\C$ (which we view as a $1$-periodic function $f:\R\to\C$ <sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup>) as a Fourier series</p>
<script type="math/tex; mode=display">f(x)=\sum_{n=-\infty}^\infty a_ne^{2\pi inx}</script>
<p>with coefficients given by</p>
<script type="math/tex; mode=display">a_n=\int_0^1f(x)e^{-2\pi inx}\dx.</script>
<p>Now, there are some really ugly functions $f:S^1\to\C$, so we obviously can’t expect this to hold for all of them. Hence, we will impose the restriction that all our functions are Riemann integrable when considered as functions $[0,1]\to\C$ (i.e. their real and imaginary parts are both Riemann integrable in the following sense).</p>
<div class="definition">
We say a function $f:[0,1]\to\R$ is <b>Riemann integrable</b> if it is bounded and for every $\eps>0$, there exists a subdivision $P=\bracks{0=x_0< x_1<\cdots< x_n=1}$ of $[0,1]$ so that $\mc U(f, P)-\mc L(f, P)<\eps$ where
$$\mc U(f, P)=\sum_{i=1}^n\sqbracks{\sup_{x\in[x_{i-1},x_i]}f(x)}\parens{x_i-x_{i-1}}\text{ and }\mc L(f, P)=\sum_{i=1}^n\sqbracks{\inf_{x\in[x_{i-1},x_i]}f(x)}\parens{x_i-x_{i-1}}$$
are the upper and lower, respectively, sums of $f$ for this subdivision. Whenever $f$ is Riemann integrable, the <b>Riemann integral</b> of $f$ is
$$\int_0^1f(x)\dx:=\inf_P\mc U(f,P)=\sup_P\mc L(f,P)$$
</div>
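<p>To make the definition concrete, here is a small Python sketch computing upper and lower sums on the uniform subdivision of $[0,1]$ into $n$ pieces. To keep it honest, it assumes $f$ is increasing, so that the sup and inf on each subinterval are attained at the right and left endpoints; for general $f$ you would need actual suprema and infima. The name <code>upper_lower_sums</code> is made up.</p>

```python
def upper_lower_sums(f, n):
    """Upper and lower sums of an increasing f on the uniform subdivision of [0, 1] into n pieces.

    Assumes f is increasing, so sup and inf over [x_{i-1}, x_i] are f(x_i) and f(x_{i-1}).
    """
    xs = [i / n for i in range(n + 1)]
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return upper, lower
```

<p>For increasing $f$ the difference telescopes to $\mc U(f,P)-\mc L(f,P)=(f(1)-f(0))/n$, which is less than any given $\eps$ once $n$ is large. With $f(x)=x^2$, the two sums squeeze $\int_0^1x^2\dx=\frac13$.</p>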
<div class="remark">
All continuous functions $[0,1]\to\C$ are Riemann integrable, but a Riemann integrable function does not have to be continuous; however, if $f$ is Riemann integrable, then its set of discontinuities has measure $0$.
</div>
<p>Denote the $\C$-algebra <sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup> of Riemann integrable (complex-valued) functions on $S^1$ by $\mc R(S^1)$. Given some $f\in\mc R(S^1)$, its <b>$n$th Fourier coefficient</b> is <sup id="fnref:8"><a href="#fn:8" class="footnote">6</a></sup></p>
<script type="math/tex; mode=display">\hat f(n)=a(f)_ n:=\int_0^1f(x)e^{-2\pi inx}\dx=\int_0^1f(x)\cos(2\pi nx)\dx-i\int_0^1f(x)\sin(2\pi nx)\dx.</script>
<p>If we feel so inclined, we can view this construction as a function <sup id="fnref:6"><a href="#fn:6" class="footnote">7</a></sup></p>
<script type="math/tex; mode=display">\mapdesc{\wh{}}{\mc R(S^1)}{\C^{\Z}}{f}{\hat f},</script>
<p>where $\C^{\Z}$ is the space of all functions $\Z\to\C$. It’s worth noting that while we can define Fourier coefficients for all Riemann integrable functions, it is not the case that the associated Fourier series always converges to $f$. This is immediate once you remember that you can change a function at a single point <sup id="fnref:7"><a href="#fn:7" class="footnote">8</a></sup> without changing its integral, so to obtain a nice theory, we’ll need to be more restrictive. To help us decide how restrictive, here’s a nice theorem.</p>
<div class="theorem">
Suppose that $f:S^1\to\C$ (i.e. $f:[0,1]\to\C$ where $f(0)=f(1)$) is Riemann integrable with $a(f)_n=0$ for all $n\in\Z$. Then, $f(x_0)=0$ if $f$ is continuous at $x_0$.
</div>
<div class="proof4">
We prove this in the case that $f$ is real-valued. By shifting $f$ and negating if necessary, we may assume that $x_0=\frac12$ and $f(x_0)>0$. Since $f$ is continuous at $x_0$, we can choose $0<\delta\le\frac14$ so that $\abs{f(x)-f(x_0)}< f(x_0)/2\implies f(x)>f(x_0)/2$ whenever $\abs{x-\frac12}<\delta$. Let<br />
$$p(x)=\eps+\cos\parens{2\pi x-\pi}$$
where $\eps>0$ is small enough that $\abs{p(x)}< 1-\eps/2$ whenever $\delta\le\abs{x-\frac12}\le\frac12$ (e.g. $\eps<\frac23\parens{1-\cos(2\pi\delta)}$). Note that $p(x)\ge\eps>0$ whenever $\abs{x-\frac12}<\delta$, since $\delta\le\frac14$ makes the cosine non-negative there. Next, fix a positive $\eta<\delta$ s.t. $p(x)\ge1+\eps/2$ for $\abs{x-\frac12}<\eta$ (exists by continuity since $p(1/2)=1+\eps>1+\eps/2$), and let $p_k(x)=p(x)^k$. Finally, fix $B$ so that $\abs{f(x)}\le B$ for all $x$. Each $p_k$ is a trigonometric polynomial, so $\hat f(n)=0$ for all $n$ implies that<br />
$$\int_0^1f(x)p_k(x)\dx=0\text{ for all $k$}.$$
At the same time, our various chosen parameters give us the following integral estimates
$$\begin{align*}
\abs{\int_{\delta\le\abs{x-\frac12}}f(x)p_k(x)\dx} &\le B(1-\eps/2)^k\\
\int_{\eta\le\abs{x-\frac12}<\delta}f(x)p_k(x)\dx &\ge0\\
\int_{\abs{x-\frac12}<\eta}f(x)p_k(x)\dx &\ge \eta f(x_0)\parens{1+\eps/2}^k
\end{align*}.$$
As $k\to\infty$, the top integral approaches $0$, the middle one remains non-negative, and the bottom one approaches $\infty$. Summing them, we get
$$\int_0^1f(x)p_k(x)\dx\to\infty\text{ as }k\to\infty$$
which is a contradiction. When $f$ is not real-valued, let $u(x)=\Re f(x)$ and $v(x)=\Im f(x)$. Then,
$$u(x)=\frac{f(x)+\conj f(x)}2\text{ and }v(x)=\frac{f(x)-\conj f(x)}{2i}.$$
Furthermore, $a(\conj f)_n=\conj{a(f)_{-n}}$. Taken together, this means $a(u)_n=\frac12\parens{a(f)_n+\conj{a(f)_{-n}}}=0$ (and similarly, $a(v)_n=0$). Since $u,v$ are real-valued and continuous at $x_0$, the previous case gives $u(x_0)=v(x_0)=0$, so $f(x_0)=0$.
</div>
<div class="corollary">
If $f$ is continuous on the circle and $\hat f(n)=0$ for all $n\in\Z$, then $f=0$.
</div>
<div class="corollary">
Suppose that $f$ is a continuous function on the circle whose <b>Fourier series</b>
$$\sum_{n=-\infty}^\infty\hat f(n)e^{2\pi inx}$$
is absolutely convergent, i.e. $\sum_{n\in\Z}\abs{\hat f(n)}<\infty$. Then, the Fourier series converges uniformly to $f$, i.e.
$$\lim_{N\to\infty}\sum_{n=-N}^N\hat f(n)e^{2\pi inx}=f(x)\text{ uniformly in }x$$
</div>
<p>The proof idea here is that the condition on the coefficients guarantees that the Fourier series converges (absolutely and uniformly) to some continuous function $g(x)$ with the same coefficients as $f$; hence, $f(x)=g(x)$ by the first corollary.</p>
<div class="definition">
Given two functions $f,g$, we say $f(x)=O(g(x))$, read "$f(x)$ is big-O of $g(x)$," as $x\to a$ if there exists some constant $C$ such that $\abs{f(x)}\le C\abs{g(x)}$ in some neighborhood of $a$. In particular, if $f(x)=O(g(x))$ as $x\to\infty$, then there are constants $C,n$ such that $\abs{f(x)}\le C\abs{g(x)}$ for all $x\ge n$.
</div>
<div class="proposition">
If $f\in C^k(S^1)$ (i.e. $f$ is $k$-times-differentiable with continuous $k$th derivative), then $\hat f(n)=O(1/\abs n^k)$ as $\abs n\to\infty$.
</div>
<div class="proof4">
Just use integration by parts. For $n\neq0$, we have
$$\begin{align*}
\hat f(n)=\int_0^1f(x)e^{-2\pi inx}\dx
&=\sqbracks{\frac{-f(x)e^{-2\pi inx}}{2\pi i n}}_0^1+\frac1{2\pi i n}\int_0^1f'(x)e^{-2\pi inx}\dx\\
&=\frac1{2\pi i n}\int_0^1f'(x)e^{-2\pi inx}\dx\\
&=\cdots\\
&=\frac1{(2\pi in)^k}\int_0^1f^{(k)}(x)e^{-2\pi inx}\dx
\end{align*}$$
where the bracketed quantities vanish since $f^{(j)}(0)=f^{(j)}(1)$ for all $0\le j<k$ (and $e^{-2\pi in\cdot0}=e^{-2\pi in\cdot1}=1$). Fixing $B\in\R_{>0}$ such that $\abs{f^{(k)}(x)}\le B$ for all $x$, this means that
$$\abs{\hat f(n)}\le\frac B{(2\pi\abs n)^k}=O(\abs n^{-k})$$
</div>
<div class="corollary">
If $f\in C^k(S^1)$ for $k\ge2$, then the Fourier series of $f$ converges absolutely and uniformly to $f$.
</div>
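<p>Here's a rough numerical illustration of the decay estimate and the uniform convergence, as a Python sketch. The test function $f(x)=x^2(1-x)^2$ (extended $1$-periodically) is my own choice: its periodic extension is $C^2$ with $\sup\abs{f''}=2$, and integrating by parts as above gives the closed form $\hat f(n)=-\frac3{2\pi^4n^4}$ for $n\neq0$ and $\hat f(0)=\frac1{30}$. The coefficients are approximated with a midpoint rule, and all function names are hypothetical.</p>

```python
import cmath
import math


def f(x):
    """x^2 (1 - x)^2 on [0, 1), extended 1-periodically (this extension is C^2)."""
    x %= 1.0
    return x * x * (1 - x) * (1 - x)


def fourier_coefficient(g, n, samples=2000):
    """Midpoint-rule approximation of the n-th Fourier coefficient of a 1-periodic g."""
    total = 0j
    for k in range(samples):
        x = (k + 0.5) / samples
        total += g(x) * cmath.exp(-2j * math.pi * n * x)
    return total / samples


def partial_sum(coeffs, x):
    """Evaluate sum of c_n e^{2 pi i n x}, where coeffs maps n to c_n."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())
```

<p>With these, one can check that $\abs{\hat f(n)}\le\frac2{(2\pi n)^2}$ (the proposition's bound with $k=2$, $B=2$), and that the partial sum over $\abs n\le10$ is already uniformly very close to $f$.</p>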
<p>There’s more that can be said here, but my main goal is to get to Poisson summation, and I think we’ve developed all the theory on $S^1$ that we need for that, so let’s move on$\dots$ after a few remarks.</p>
<p>The first thing we’ll do is update our description of $\wh{}$ as a map on function spaces. Letting $\mc S(\Z)$ denote the space of functions $f:\Z\to\C$ such that $\sum_{n\in\Z}\abs{f(n)}<\infty$, we can view our work here as showing that the function</p>
<script type="math/tex; mode=display">\mapdesc{\wh{}}{C^2(S^1)}{\mc S(\Z)}{f}{\wh f}</script>
<p>is injective.</p>
<p>The second thing we’ll do is give a little intuition for the formula for Fourier coefficients, i.e. why take</p>
<script type="math/tex; mode=display">a_n=\int_0^1f(x)e^{-2\pi inx}\dx?</script>
<p>The idea is that the functions $g_n(x)=e^{2\pi inx}$ as $n$ varies over $\Z$ are pairwise orthogonal (and have norm $1$) with respect to the following inner product on $\mc R(S^1)$:</p>
<script type="math/tex; mode=display">\angled fg=\int_0^1f(x)\conj{g(x)}\dx.</script>
<p>This means that if we can represent some function $f\in\mc R(S^1)$ as $f(x)=\sum_{d\in\Z}c_de^{2\pi idx}$, then we must have</p>
<script type="math/tex; mode=display">\int_0^1f(x)e^{-2\pi inx}\dx=\angled f{g_n}=\angled{\sum_{d\in\Z}c_dg_d}{g_n}=\sum_{d\in\Z}c_d\angled{g_d}{g_n}=c_n\angled{g_n}{g_n}=c_n.</script>
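<p>The orthonormality claim is easy to check numerically. The following Python sketch approximates $\angled{g_m}{g_n}$ with a midpoint rule; the function name and the sample count are arbitrary choices of mine.</p>

```python
import cmath
import math


def inner_product(m, n, samples=1000):
    """Midpoint-rule approximation of <g_m, g_n>, the integral of g_m(x) conj(g_n(x)) over [0, 1]."""
    total = 0j
    for k in range(samples):
        x = (k + 0.5) / samples
        g_m = cmath.exp(2j * math.pi * m * x)
        g_n = cmath.exp(2j * math.pi * n * x)
        total += g_m * g_n.conjugate()
    return total / samples
```

<p>Up to rounding error, this returns $1$ when $m=n$ and $0$ otherwise (as long as $\abs{m-n}$ is smaller than the number of samples), which is exactly what makes the coefficient extraction above work.</p>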
<h1 id="fourier-analysis-on-r">Fourier Analysis on $\R$</h1>
<p>This time around, we’ll start off with a nice space of functions.</p>
<div class="definition">
The <b>Schwartz space</b> on $\R$ is the set of all smooth (i.e. infinitely differentiable) functions $f$ on $\R$ that are rapidly decreasing in the sense that
$$\sup_{x\in\R}\abs x^k\abs{f^{(\l)}(x)}<\infty\text{ for every }k,\l\ge0.$$
We denote this space by $\mc S(\R)$. Note that it is a $\C$-vector space.
</div>
<div class="remark">
It's clear from the definition that $f(x)\in\mc S(\R)\implies f'(x)\in\mc S(\R)$ and $xf(x)\in\mc S(\R)$. Hence, $\mc S(\R)$ is closed under differentiation and polynomial multiplication. Put another way, $\mc S(\R)$ is a $\C[x]$-module that is closed under differentiation. Put yet another way, $\mc S(\R)$ is a $\C[x,D]$-module where $D$ acts via the differentiation operator.
</div>
<div class="example">
$f(x)=e^{-x^2}\in\mc S(\R)$. This is because $P(x)e^{-x^2}\to0$ as $\abs x\to\infty$ for any polynomial $P$ (in particular, for $P(x)=x^k$), and an easy induction argument shows that every derivative of $f$ is of the form $P(x)e^{-x^2}$. In fact, $f_a(x)=e^{-ax^2}\in\mc S(\R)$ for every $a>0$.
</div>
<p>One (though certainly not the only) nice property of Schwartz functions is that they decay fast enough to have a finite integral over all of $\R$. That is, if $f\in\mc S(\R)$, then</p>
<script type="math/tex; mode=display">\lim_{N\to\infty}\int_{-N}^Nf(x)\dx</script>
<p>exists and is finite. To see this, let $I_N=\int_{-N}^Nf(x)\dx$, so we only need to show that the sequence $\{I_N\}$ is Cauchy.</p>
<div class="lemma">
If $f\in\mc S(\R)$, then there exists some $N>0$ s.t. $\abs x\ge N\implies\abs{f(x)}\le1/x^2$.
</div>
<div class="proof4">
Suppose not, so there exist $x\in\R$ with $\abs x$ arbitrarily large and $\abs{f(x)}>1/x^2$. This means we can find a sequence $\{a_n\}$ of real numbers such that $\lim\abs{a_n}=\infty$ and $\abs{a_n}^3\abs{f(a_n)}>\abs{a_n}$ for all $n$. However, this contradicts
$$\sup_{x\in\R}\abs x^3\abs{f(x)}<\infty,$$
so we win.
</div>
<p>Given that lemma, fix $N$ large enough that $\abs x\ge N\implies\abs{f(x)}\le1/x^2$, and note that for $m>n\ge N$ we have</p>
<script type="math/tex; mode=display">\abs{I_m-I_n}=\abs{\int_{n\le\abs x\le m}f(x)\dx}\le\int_{n\le\abs x\le m}x^{-2}\dx=\sqbracks{-\frac1x}_{-m}^{-n}+\sqbracks{-\frac1x}_n^m=2\parens{\frac1n-\frac1m}\le\frac2N,</script>
<p>which tends to $0$ as $N\to\infty$, so $\{I_n\}$ is indeed Cauchy, and we can safely define</p>
<script type="math/tex; mode=display">\int_{\R}f(x)\dx=\int_{-\infty}^\infty f(x)\dx=\lim_{N\to\infty}\int_{-N}^Nf(x)\dx.</script>
<div class="definition">
The <b>Fourier transform</b> of a function $f\in\mc S(\R)$ is defined by
$$\hat f(\xi)=\int_{-\infty}^\infty f(x)e^{-2\pi ix\xi}\dx.$$
We will sometimes denote this by $\mc F(f)(\xi)=\hat f(\xi)$.
</div>
<div class="proposition">
The Fourier transform enjoys the following list of properties.
<ol>
<li> $\mc F(f(x+h))(\xi)=\hat f(\xi)e^{2\pi ih\xi}$ when $h\in\R$. </li>
<li> $\mc F(f(x)e^{-2\pi ixh})(\xi)=\hat f(\xi+h)$ when $h\in\R$. </li>
<li> $\mc F(f(\delta x))(\xi)=\inv\delta\hat f(\inv\delta\xi)$ when $\delta>0$. </li>
<li> $\mc F(f')(\xi) = 2\pi i\xi\hat f(\xi)$. </li>
<li> $\mc F(-2\pi ixf(x))(\xi) = \frac{\d}{\d\xi}\hat f(\xi)$. </li>
</ol>
So the Fourier transform (roughly) turns differentiation into multiplication by $x$, and shifting into multiplication by $e^{hx}$.
</div>
<div class="proof4">
Exercise.
</div>
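<p>If you'd rather check the properties numerically before doing the exercise, here is a Python sketch. It approximates the Fourier transform with a trapezoid rule on $[-8,8]$, which is only reasonable for functions that are negligible outside that window; the window, step count, and the Gaussian test function $e^{-\pi x^2}$ (whose transform is itself) are my choices for illustration, and the function names are made up.</p>

```python
import cmath
import math


def fourier_transform(f, xi, half_width=8.0, steps=4000):
    """Trapezoid-rule approximation of the integral of f(x) e^{-2 pi i x xi} over [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoints get half weight
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def gaussian(x):
    """The test function e^{-pi x^2}."""
    return math.exp(-math.pi * x * x)


def gaussian_prime(x):
    """Derivative of the test function."""
    return -2 * math.pi * x * math.exp(-math.pi * x * x)
```

<p>For example, comparing <code>fourier_transform(gaussian_prime, xi)</code> with <code>2j * math.pi * xi * fourier_transform(gaussian, xi)</code> tests property 4, and transforming <code>lambda x: gaussian(x + h)</code> tests property 1.</p>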
<div class="corollary">
If $f\in\mc S(\R)$, then $\hat f\in\mc S(\R)$.
</div>
<div class="proof4">
Note that $\abs{\hat f(\xi)}\le\int_{\R}\abs{f(x)}\dx<\infty$, so $f\in\mc S(\R)\implies\hat f$ is bounded. Now, for any $\l,k\in\Z_{\ge0}$, we have that
$$\xi^k\parens{\frac{\d}{\d\xi}}^\l\hat f(\xi)$$
is bounded since it is the Fourier transform of
$$\frac1{(2\pi i)^k}\parens{\frac{\d}{\dx}}^k\sqbracks{(-2\pi ix)^\l f(x)}.$$
</div>
<p>This post is more about ideas than details, so let’s just state the good stuff.</p>
<div class="theorem" name="Fourier Inversion Formula">
If $f\in\mc S(\R)$, then
$$f(x)=\int_{-\infty}^\infty\hat f(\xi)e^{2\pi ix\xi}\d\xi.$$
</div>
<div class="proof4">
Omitted. See the book by Stein and Shakarchi.
</div>
<p>Note that, like last time, we can view our work here as showing some function is “nice.” In this instance, we have that</p>
<script type="math/tex; mode=display">\mapdesc{\mc F}{\mc S(\R)}{\mc S(\R)}{f}{\mc F(f)}</script>
<p>is a $\C$-vector space isomorphism.</p>
<h1 id="poisson-summation">Poisson Summation</h1>
<p>To end things, we’ll relate Fourier transforms and Fourier series in a neat way. Fix some function $f\in\mc S(\R)$ on the real line, and imagine you want to convert this into some function on the circle. One thing you could try is defining</p>
<script type="math/tex; mode=display">F_1(x)=\sum_{n=-\infty}^\infty f(x+n),</script>
<p>which is obviously $1$-periodic (the series converges since $f$ decays rapidly). Alternatively, inspired by Fourier Inversion</p>
<script type="math/tex; mode=display">f(x)=\int_{\R}\hat f(\xi)e^{2\pi ix\xi}\d\xi,</script>
<p>you could try creating a periodic version of $f$ by considering some discrete analogue of Fourier inversion:</p>
<script type="math/tex; mode=display">F_2(x)=\sum_{n=-\infty}^{\infty}\wh f(n)e^{2\pi inx}.</script>
<p>As it turns out, these two approaches are equivalent.</p>
<div class="theorem" name="Poisson summation formula">
If $f\in\mc S(\R)$, then
$$\sum_{n=-\infty}^\infty f(x+n)=\sum_{n=-\infty}^\infty \hat f(n)e^{2\pi inx}.$$
In particular, setting $x=0$ gives
$$\sum_{n=-\infty}^\infty f(n)=\sum_{n=-\infty}^\infty\hat f(n).$$
</div>
<div class="proof4">
Since both sides are continuous, it suffices to show they have the same Fourier coefficients. Unsurprisingly, the $m$th Fourier coefficient of the RHS is $\hat f(m)$. On the LHS, we have
$$\begin{align*}
\int_0^1\parens{\sum_{n=-\infty}^\infty f(x+n)}e^{-2\pi imx}\dx
&=\sum_{n=-\infty}^\infty\int_0^1f(x+n)e^{-2\pi imx}\dx\\
&=\sum_{n=-\infty}^\infty\int_n^{n+1}f(x)e^{-2\pi imx}\dx\\
&=\int_{-\infty}^\infty f(x)e^{-2\pi imx}\dx\\
&=\hat f(m)
\end{align*}$$
where we were allowed to change the sum and the integral because $f$ is rapidly decreasing.
</div>
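<p>Here is a quick numerical check of Poisson summation as a Python sketch. It takes $f(x)=e^{-\pi sx^2}$ with $s=2$ and uses the closed form $\hat f(\xi)=s^{-1/2}e^{-\pi\xi^2/s}$ for its transform; both sums are truncated to $\abs n\le30$, which is an arbitrary cutoff that is more than enough for a Gaussian. The function names are hypothetical.</p>

```python
import cmath
import math

S = 2.0  # the parameter s in f(x) = e^{-pi s x^2}; any s > 0 works


def f(x):
    return math.exp(-math.pi * S * x * x)


def f_hat(xi):
    """Closed-form Fourier transform of f."""
    return math.exp(-math.pi * xi * xi / S) / math.sqrt(S)


def periodized(x, terms=30):
    """Left-hand side: the sum of f(x + n), truncated to |n| <= terms."""
    return sum(f(x + n) for n in range(-terms, terms + 1))


def fourier_side(x, terms=30):
    """Right-hand side: the sum of f_hat(n) e^{2 pi i n x}, truncated to |n| <= terms."""
    return sum(f_hat(n) * cmath.exp(2j * math.pi * n * x) for n in range(-terms, terms + 1))
```

<p>Evaluating <code>periodized(x)</code> and <code>fourier_side(x)</code> at a few points $x\in[0,1]$ should give the same number up to rounding, with the imaginary part of the right-hand side vanishing.</p>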
<p>As an application of this, consider the <b>theta function</b></p>
<script type="math/tex; mode=display">\vartheta(s)=\sum_{n=-\infty}^\infty e^{-\pi n^2s}</script>
<p>defined for $s>0$ (or for $s\in\C$ with $\Re(s)>0$ if you’re feeling adventurous). Because the function $f(x)=e^{-\pi sx^2}$ is in $\mc S(\R)$ (e.g. when $s>0$), and because $\hat f(\xi)=s^{-1/2}e^{-\pi\xi^2/s}$, we can apply Poisson summation to get</p>
<script type="math/tex; mode=display">\sum_{n=-\infty}^\infty e^{-\pi sn^2} = s^{-1/2}\sum_{n=-\infty}^\infty e^{-\pi n^2/s}.</script>
<p>Written in terms of $\vartheta$, this says that $\vartheta(s)=s^{-1/2}\vartheta(1/s)$. This will be useful when looking at the Riemann zeta function.</p>
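<p>The functional equation is also easy to test numerically. A minimal Python sketch, with the sum truncated to $\abs n\le50$ (an arbitrary cutoff, fine for $s$ not too close to $0$):</p>

```python
import math


def theta(s, terms=50):
    """Truncated theta function: the sum of e^{-pi n^2 s} over |n| <= terms, for real s > 0."""
    return sum(math.exp(-math.pi * n * n * s) for n in range(-terms, terms + 1))
```

<p>Comparing <code>theta(s)</code> against <code>theta(1 / s) / math.sqrt(s)</code> for a few values of $s>0$ should show agreement to machine precision.</p>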
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Of course, when things really get going and I’m regularly doing psets and whatnot, blogging may seem less necessary (and logistically possible) than right now <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>If you’re reading this post, it’s probably better to think of it as motivation for learning about Fourier analysis instead of as an introduction to Fourier analysis <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>And hopefully writing this post and the next will provide me with decent motivation/context for studying Tate’s thesis where he uses Fourier analysis on some number-theoretic groups to prove results about their attached zeta functions and whatnot <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>This is justified because the circle S^1 is just R/Z as a (topological) group, or because S^1 is obtained from [0,1] by joining the endpoints. Use whichever justification you prefer; they’re not that different. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>I think this is furthermore a Banach algebra with inner product (f,g) = \int_0^1(f\bar g)dx but I haven’t checked this <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:8">
<p>f is complex-valued, so the following is not a decomposition of \hat f(n) into a real and imaginary part <a href="#fnref:8" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>Secretly the circle S^1 and the integers Z are somehow dual in a way that can be made precise if you study Fourier analysis sufficiently generally <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
<li id="fn:7">
<p>or any set of measure 0 <a href="#fnref:7" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Covering Spaces2019-01-11T00:00:00+00:002019-01-11T00:00:00+00:00https://nivent.github.io/blog/covering-spaces<p>This post is more for me than it is for you. I wanna make sure I still understand the theory of covering spaces after not thinking about them for a while. It’s possible you’ll get something out of this as well. Throughout this post, every space is assumed path-connected and locally path-connected unless stated otherwise. Furthermore, every map is continuous.</p>
<h1 id="the-basics">The Basics</h1>
<p class="definition">
A <b>covering map</b> is a surjective map $p:\wt X\to X$ such that every $x\in X$ has a <b>fundamental</b> (or <b>elementary</b>) <b>neighborhood</b> $U$, meaning each connected component of $\inv p(U)$ is mapped homeomorphically onto $U$ by $p$. The connected components of $\inv p(U)$ are called <b>lifts</b> of $U$. Given $x\in X$, the set $\inv p(x)$ is called the <b>fiber</b> above $x$ and an individual $\tilde x\in\inv p(x)$ is called a <b>lift</b> of $x$. The pair $(\wt X,p)$ is called a <b>covering space</b> of $X$, and we will often abuse notation by referring to $\wt X$ alone as a covering space. We will sometimes refer to $X$ as the <b>base space</b>.
</p>
<p>Covering spaces are neat because they have simpler topologies (read: “simpler” fundamental groups <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>) than the base space, allowing us to “lift” certain arguments from a base space to (one of) its coverings, and because a space’s coverings have a “Galois correspondence” with its fundamental group.</p>
<p>We will begin by showing the existence of lifts of maps $f:Y\to X$. By this, we mean a map $\tilde f:Y\to\wt X$ such that the following diagram is commutative</p>
<center>
<img src="https://nivent.github.io/images/blog/covering-spaces/lift.png" width="200" height="100" />
</center>
<p>where the dashed line is meant to signify that $\tilde f$ exists given some (sufficiently nice) $f:Y\to X$. This is easiest to see in the case of paths.</p>
<p>For the remainder of this section, fix some base space $X$ and a covering space $(\wt X,p)$.</p>
<p class="lemma" name="unique path lifting">
Given a path $f:I\to X$ and a lift $\tilde x\in\inv p(f(0))$, there exists a unique path $\tilde f:I\to\wt X$ such that $\tilde f(0)=\tilde x$ and $p\tilde f=f$.
</p>
<p class="proof4">
By covering $X$ by fundamental neighborhoods, pulling them back to $I$ via $f$, and appealing to compactness of $I$, we obtain some finite cover $\{U_i\}_{i=1}^n$ of $I$ such that each $f(U_i)$ is contained in some fundamental neighborhood. By possibly refining this cover, this means there's some $m\in\N$ such that $I_j:=[j/m,(j+1)/m]$ gets mapped into a fundamental neighborhood by $f$ for $j=0,\dots,m-1$. We can now lift $f$ piece-by-piece. Let $\tilde U_0\subseteq\tilde X$ be the unique path component of $\inv p\parens{f(I_0)}$ containing $\tilde x$. Since $p:\tilde U_0\to f(I_0)$ is a homeomorphism, we define $\tilde f$ on $I_0$ by $\tilde f(x)=\inv p(f(x))$ where we consider $p$ a function $\tilde U_0\to f(I_0)$. We now proceed inductively. Given $\tilde f$ defined on $I_0\cup\cdots\cup I_{j-1}$, we define it on $I_j$ by letting $\tilde U_j$ be the path component of $\inv p(f(I_j))$ containing $\tilde f(j/m)$, and then lifting $f$ on $I_j$ the only way possible.
</p>
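<p>To see the piece-by-piece lifting in action, here is a Python sketch for the covering $p:\R\to S^1$, $p(t)=e^{2\pi it}$ (my choice of example; it isn't discussed above). A path in $S^1$ is given by finitely many samples, and the sketch assumes consecutive samples are less than half a turn apart, which plays the role of each $I_j$ landing in a fundamental neighborhood. The names <code>covering_map</code> and <code>lift_path</code> are made up.</p>

```python
import cmath
import math


def covering_map(t):
    """The covering p: R -> S^1, p(t) = e^{2 pi i t}."""
    return cmath.exp(2j * math.pi * t)


def lift_path(points, start):
    """Lift samples of a path in the unit circle to R, beginning at `start`.

    Assumes covering_map(start) == points[0] and that consecutive samples are
    less than half a turn apart, so each step has a unique small lift.
    """
    lifted = [start]
    for prev, cur in zip(points, points[1:]):
        # Angle from prev to cur, in (-pi, pi], converted to a displacement in R.
        step = cmath.phase(cur / prev) / (2 * math.pi)
        lifted.append(lifted[-1] + step)
    return lifted
```

<p>Lifting a loop that winds twice around the circle, starting at $0$, produces a path in $\R$ ending at (approximately) $2$: the lift is not a loop, and its endpoint records the winding number.</p>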
<p class="theorem">
Given a covering space $p:\tilde X\to X$, a homotopy $f_t:Y\to X$, and a map $\st{f_0}:Y\to\tilde X$ lifting $f_0$, there exists a unique homotopy $\tilde f_t:Y\to\tilde X$ of $\tilde f_0$ that lifts $f_t$.
</p>
<p class="proof4">
We first construct a lift locally around each point. Let $F:Y\by I\to X,(y,t)\mapsto f_t(y)$. Given some $y\in Y$ and $t\in[0,1]$, we can find a basic neighborhood $B_t=N_t\by(a_t,b_t)$ such that $F(B_t)$ is contained in some fundamental neighborhood. Using compactness of $I$ and varying $t$, we see that finitely many such $B_t$ are sufficient for covering $\{y\}\by I$. Intersecting the $N_t$ and choosing a suitably large $m\in\N$, we can find a single neighborhood $N_y$ of $y$ such that $F(N_y\by[j/m,(j+1)/m])$ is contained in a fundamental neighborhood for each $j=0,\cdots,m-1$. Repeating the inductive procedure from last time, this lets us construct a lift $\st f_t^y:N_y\to\tilde X$ of $f_t\mid_{N_y}$.<br />
Finally, consider some $y,y'\in Y$. Note that the lifts $\st f_t^y,\st f_t^{y'}$ must agree on $N_y\cap N_{y'}$ since for any $y_0\in N_y\cap N_{y'}$, $t\mapsto\st f_t^y\mid_{\{y_0\}\by I}$ and $t\mapsto\st f_t^{y'}\mid_{\{y_0\}\by I}$ are both lifts of the path $t\mapsto f_t(y_0)$ and hence are equal by the above lemma. Thus, our various local lifts stitch together to give a global lift $\tilde f_t:Y\to\tilde X$ which is continuous since it's continuous on each $N_y$ and these cover $Y$. It's unique because it's unique on each "slice" $\{y\}\by I$.
</p>
<p class="remark">
If $f_t:I\to X$ is a homotopy of paths (i.e. it fixes endpoints), then any lift $\st{f_t}:I\to\wt X$ also fixes endpoints because $t\mapsto\st{f_t}(0)$ lifts a constant path, as does $t\mapsto\st{f_t}(1)$ (and $\inv p(x)$ is discrete for all $x\in X$).
</p>
<p>This has some immediate applications. Note that when we write a map $f:(X,x)\to(Y,y)$ we mean that $f:X\to Y$ is a map with $f(x)=y$ (more generally, given $A\subseteq X$ and $B\subseteq Y$, $f:(X,A)\to(Y,B)$ means $f(A)\subseteq B$).</p>
<p class="proposition">
The map $\push p:\pi_1(\wt X,\st x_0)\to\pi_1(X, x_0)$ induced by the covering $p:(\wt X,\st x_0)\to(X,x_0)$ is injective. Furthermore, its image $\push p\pi_1(\wt X,\st x_0)$ in $\pi_1(X,x_0)$ consists of (homotopy classes of) loops in $X$ based at $x_0$ whose lifts to $\wt X$ starting at $\st x_0$ are loops.
</p>
<p class="proof4">
Suppose that $\st f\in\pi_1(\wt X,\st x_0)$ has a nullhomotopic image, so there's some homotopy $h_t:I\to X$ with $h_0=p\tilde f$ and $h_1(t)=x_0$ for all $t$. Since $\tilde f$ clearly lifts $h_0$, we can lift this homotopy to $\st{h_t}:I\to\wt X$ where $\st{h_0}=\st f$. Then, $\st{h_1}$ lifts $h_1$, a constant path, so $\st{h_1}$ must be constant as well. Hence, $\tilde f$ is nullhomotopic, so $\push p$ is injective. The second part of the proposition is obvious.
</p>
<p class="proposition">
The size of the fiber $\inv p(x)$ over any $x\in X$ is equal to the index of $\push p\pi_1(\wt X,\st x_0)$ in $\pi_1(X, x_0)$.
</p>
<p class="proof4">
Let $G=\pi_1(X, x_0)$, $H=\push p\pi_1(\wt X,\st x_0)$, and $G/H=\{Hg:g\in G\}$ be the set of cosets of $H$. We define a function $\phi:G/H\to\inv p(x_0)$ by letting $\phi(Hg)$ be $\tilde g(1)$ where $\tilde g$ is the lift of $g$ starting at $\st x_0$. This is well-defined since elements of $H$ lift to loops, so $hg$ has the same endpoint as $g$ for any $h\in H$. Note that $\phi$ is surjective since $\wt X$ is path-connected, and $\phi$ is injective because $\phi(Hg)=\phi(Hk)\implies \st g\inv{\st k}\in\pi_1(\wt X,\st x_0)$, so $g\inv k\in H$.
</p>
<p class="corollary">
Every fiber of $p:\wt X\to X$ has the same size, and $\wt X$ is a fiber bundle over $X$.
</p>
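<p>As a sanity check on the index formula, here is a worked example (standard, but not taken from the argument above). Take $X=S^1$ with $\pi_1(S^1,1)\simeq\mathbb Z$ and the cover $p_n:S^1\to S^1$, $z\mapsto z^n$, for some $n\ge1$. The induced map on $\pi_1$ is multiplication by $n$, so</p>
<script type="math/tex; mode=display">(p_n)_*\pi_1(S^1,1)=n\mathbb Z\subseteq\mathbb Z,\qquad[\mathbb Z:n\mathbb Z]=n=\#p_n^{-1}(1),</script>
<p>where $p_n^{-1}(1)$ is the set of $n$th roots of unity. For the cover $\mathbb R\to S^1$, $s\mapsto e^{2\pi is}$, the image subgroup is trivial, the index is infinite, and the fiber over $1$ is $\mathbb Z$.</p>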
<h1 id="more-lifts">More Lifts</h1>
<p>We’ve seen how to lift homotopies given a starting point, but what about more general maps? Fix a covering space $p:(\wt X,\st x_0)\to(X,x_0)$.</p>
<p class="proposition">
A map $f:(Y,y_0)\to(X,x_0)$ lifts to a map $\st f:(Y,y_0)\to(\wt X,\st x_0)$ iff $\push f\pi_1(Y,y_0)\subseteq\push p\pi_1(\wt X,\st x_0)$.
</p>
<p class="proof4">
The "only if" direction is obvious since $f=p\tilde f$, so we focus on the "if" direction. Given some $y\in Y$, let $g$ be a path from $y_0$ to $y$, so $\push fg$ is a path from $x_0$ to $f(y)$. This lifts to a path $\wt{\push fg}$ starting at $\st x_0$; call the other endpoint of this lift $\st f(y)$, i.e. $\st f(y)=\wt{\push fg}(1)$. This is well-defined since given another path $h$ from $y_0$ to $y$, we have $g\inv h\in\pi_1(Y, y_0)$ so $\push f(g\inv h)\in\push p\pi_1(\wt X,\st x_0)$, meaning that $(\push fg)\inv{(\push fh)}$ lifts to a loop based at $\tilde x_0$, which is possible iff $\wt{\push fg}(1)=\wt{\push fh}(1)$. Thus, we only need to show that $\st f$ is continuous. For this, let $U\subset X$ be an open neighborhood of $f(y)$ with a lift $\wt U\subseteq\wt X$ containing $\st f(y)$ such that $p:\wt U\to U$ is a homeomorphism. Choose a path-connected open neighborhood $V$ of $y$ with $f(V)\subseteq U$. Then, $\st f(V)\subseteq\wt U$ (since $V$ is path-connected) and $\tilde f\mid_V=\inv{(p\mid_{\wt U})}f$, so $\st f$ is continuous at $y$.
</p>
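<p>For a concrete instance of this criterion (a standard example, stated here as an illustration), take the $n$-fold cover $p_n:(S^1,1)\to(S^1,1)$, $z\mapsto z^n$, and the map $f_k:(S^1,1)\to(S^1,1)$, $z\mapsto z^k$. Identifying $\pi_1(S^1,1)$ with $\mathbb Z$, the criterion reads $k\mathbb Z\subseteq n\mathbb Z$, i.e. $n\mid k$, and in that case the lift is $z\mapsto z^{k/n}$. Similarly, $f_k$ lifts through $\mathbb R\to S^1$, $s\mapsto e^{2\pi is}$, only when $k=0$.</p>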
<p class="cor">
For $n\ge2$ (really, $n\ge1$), let $\pi_n(X, x_0):=[(S^n, s_0),(X, x_0)]$ be the set (really group) of homotopy classes of maps $(S^n, s_0)\to(X, x_0)$ <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>. Then, for $n\ge2$ (this time actually $n\ge2$), $\pi_n(X,x_0)\simeq\pi_n(\wt X,\st x_0)$.
</p>
<p class="proof4">
Using the fact that $S^n$ is simply connected for $n\ge2$ (and the fact that $I$ is simply connected), we can apply the above proposition to lift any map $S^n\to X$ up to a map $S^n\to\wt X$ (giving surjectivity) and lift any homotopy $S^n\by I\to X$ up to a homotopy $S^n\by I\to\wt X$ (giving injectivity).
</p>
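<p>For instance (a standard consequence, included as an illustration): since $\mathbb R$ is contractible and covers $S^1$, this gives $\pi_n(S^1)\simeq\pi_n(\mathbb R)=0$ for all $n\ge2$. Likewise, for $m\ge1$ the double cover $S^m\to\mathbb{RP}^m$ gives $\pi_n(\mathbb{RP}^m)\simeq\pi_n(S^m)$ for $n\ge2$.</p>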
<p class="proposition">
Let $Y$ be connected (but not necessarily path-connected or locally path-connected), and fix a map $f:Y\to X$ with lifts $\st{f_1},\st{f_2}:Y\to\wt X$. If these lifts agree at one point of $Y$, then they agree on all of $Y$.
</p>
<p class="proof4">
Let $S=\{y\in Y:\st{f_1}(y)=\st{f_2}(y)\}$, and fix some $y\in Y$. Let $U$ be a fundamental neighborhood of $f(y)$. Let $\wt U_1,\wt U_2\subseteq\wt X$ be such that $p$ maps both of them homeomorphically onto $U$, $\st f_1(y)\in\wt U_1$, and $\st f_2(y)\in\wt U_2$. By continuity, there is a neighborhood $N$ of $y$ mapped into $\wt U_i$ by $\st{f_i}$, $i=1,2$. If $y\in S$, then $\wt U_1=\wt U_2$ and $N\subseteq S$, since for any $y'\in N$, $\st f_1(y'),\st f_2(y')\in\wt U_1$ both lift $f(y')\in U$ and $p$ is injective on $\wt U_1$; hence, $S$ is open. If $y\not\in S$, then $N\subseteq Y\sm S$ since $\wt U_1\cap\wt U_2=\emptyset$; hence, $Y\sm S$ is open. Since $Y$ is connected, this means that $S=Y$ or $S=\emptyset$, but $S$ is nonempty by assumption, so $S=Y$.
</p>
<h1 id="galois-correspondence">Galois Correspondence</h1>
<p class="definition">
Let $(\wt X_1,p)$ and $(\wt X_2,q)$ both be covers of $X$. A map $\phi:\wt X_1\to\wt X_2$ is called a <b>homomorphism</b> if $q\phi=p$. The group of isomorphisms $\wt{X_1}\to\wt{X_1}$ is denoted $\Aut(\wt{X_1},p)$.
</p>
<p class="remark">
Any $\phi\in\Aut(\wt X,p)$ is a lift of the covering map $p:\wt X\to X$ and so it is determined by its action on any single point.
</p>
<p class="lemma">
Let $(\wt X_1,\wt x_1)$ and $(\wt X_2, \st x_2)$ be covers of $X$ (with covering maps $p,q$ respectively). Then, $(\wt X_1, p)\simeq(\wt X_2, q)$ iff $\push p\pi_1(\wt{X_1},\st x_1)=\push q\pi_1(\wt{X_2},\st x_2)$.
</p>
<p class="proof4">
Note this is literal equality as sets, and not just isomorphism as groups. The $\to$ direction is easy, so we’ll only do the $\leftarrow$ direction. Suppose that $\push p\pi_1(\wt{X_1},\st x_1)=\push q\pi_1(\wt{X_2},\st x_2)$, and fix any $\st x_1'\in\wt{X_1}$. Now we do the only thing we can do in this situation. Pick a path $g$ from $\st x_1$ to $\st x_1'$, push it down to $X$, lift it up to $\wt{X_2}$ starting at $\st x_2$, and call its right endpoint $f(\st x_1')$. We then show this construction is well-defined and gives an isomorphism of covering spaces, which it is (since the push-forwards of loops in $\wt{X_1}$ lift to loops in $\wt{X_2}$) and it does (because you can define the inverse in the same way).
</p>
<p class="definition">
Let $(\wt X, \st x_0)$ be a covering of $(X, x_0)$. We call it a <b>universal covering (space)</b> if $\pi_1(\wt X,\st x_0)=0$.
</p>
<p>The name is justified by the following theorem.</p>
<p class="theorem">
Let $(\wt X,\st x_0)$ be the universal cover of $(X,x_0)$ (with covering map $p$). Then, for any subgroup $H\leq\pi_1(X,x_0)$, there exists a cover $(\wt X_H, \st x_H)$ (with covering map $q$) of $(X,x_0)$ that $(\wt X,\st x_0)$ covers with $\push q\pi_1(\wt X_H, \st x_H)=H$.
</p>
<p class="proof4">
Fix a subgroup $H\le\pi_1(X, x_0)$. Note that $\pi_1(X,x_0)$ acts on $\wt X$ (on the right!) via the map (secretly an isomorphism) $\pi_1(X, x_0)\to\Aut(\wt X,p)\op$ given by sending $f\in\pi_1(X,x_0)$ to the (unique!) automorphism sending $\st x_0$ to $\st f(1)$ where $\st f$ lifts $f$ beginning at $\st x_0$. Let $\wt{X_H}=\wt X/H$, the orbit space of this action restricted to $H$, and define $q:\wt{X_H}\to X$ via $q(\st xH)=p(\st x)$. This is well defined since $H$ preserves the fibers of $p$. <br />
So, we want to show that $q$ is a covering map, that the quotient map $h:\wt X\to\wt{X_H}$ is a covering map, and that $\push q\pi_1(\wt{X_H},\st x_H)=H$ where $\st x_H=h(\wt x_0)$. I was more concerned with the lifting properties of covering spaces when I began this post, so I’ll leave all of this as an exercise.
</p>
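<p>To see the construction in a case where everything is explicit, here is a standard example, with the identification $\pi_1(S^1,1)\simeq\mathbb Z$ assumed. Take $X=S^1$ with universal cover $p:\mathbb R\to S^1$, $s\mapsto e^{2\pi is}$, on which $k\in\mathbb Z$ acts by $s\mapsto s+k$. For $H=n\mathbb Z$ with $n\ge1$, the orbit space is $\mathbb R/n\mathbb Z$, which is identified with $S^1$ via $s\mapsto e^{2\pi is/n}$. Under this identification, $q$ becomes $z\mapsto z^n$ and the quotient map $h$ becomes $s\mapsto e^{2\pi is/n}$, both of which are covering maps, and the image of $\pi_1$ under $q$ is $n\mathbb Z=H$.</p>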
<p class="corollary">
Let $(\wt X, \st x_0)$ be a universal cover of $(X, x_0)$, and let $(X', x_0')$ be a cover of $(X, x_0)$. Then, $(\wt X, \st x_0)$ also covers $(X', x_0')$.
</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>read: their fundamental groups are subgroups of the base group. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>this obviously does not depend on the choice of $s_0$ up to isomorphism in the relevant category (i.e. pointed sets with basepoint the homotopy class of maps into $x_0$ or groups) <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>This post is more for me than it is for you. I wanna make sure I still understand the theory of covering spaces after not thinking about them for a while. It’s possible you’ll get something out of this as well. Throughout this post, every space is assumed path-connected and locally path-connected unless stated otherwise. Furthermore, every map is continuous.