# An introduction to group actions

Reproduced below is a set of lecture notes I wrote for an undergraduate course on abstract algebra.

The below notes assume that the reader is assumed to be familiar with basic group theory, up to the first isomorphism theorem. The reader is also expected to have a nodding acquaintance with the language of generators and relations.

## 1. Groups of Transformations and Permutations

\(GL(2,\mathbb{R})\), the set of invertible 2-by-2 matrices with real entries, can be made into a group by considering the matrix multiplication operation as the binary operation on \(GL(2,\mathbb{R})\). Matrix multiplication is associative; the identity matrix \(I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\) serves as the group identity; every invertible matrix is, well, invertible. In this manner, \(GL(2,\mathbb{R})\) is a group of matrices.

Recall, however, that every matrix defines a linear
transformation. In the case of \(GL(2,\mathbb{R})\), every \(M \in
GL(2,\mathbb{R})\) defines a linear transformation
\(L_M:\mathbb{R}^2 \to \mathbb{R}^2\) given by the formula
\(L_M(v) = M \cdot v\), where \(\cdot\) represents the
matrix-vector multiplication. In other words, every element of the group
\(GL(2,\mathbb{R})\) transforms 2-vectors to some other 2-vectors.
From this angle, \(GL(2,\mathbb{R})\) is a group of *transformations
on \(\mathbb{R}^2\)*.

Similarly, then, we can define the group
\(GL(n,\mathbb{F})\) of invertible \(n\)-by-\(n\) matrices with
entries in a field \(\mathbb{F}\) and consider it a group of
transformations on \(\mathbb{F}^n\), the space of \(n\)-vectors
with entries in \(\mathbb{F}\). Generalizing even further, we define
the *general linear group* *on a vector space \(V\) over a field
\(\mathbb{F}\)*, denoted by \(GL(V)\), to be the group of all
invertible linear transformations \(T:V \to V\) with function
composition as the group operation. Each element of \(GL(V)\) takes
vectors in \(V\) and transforms them into other vectors in \(V\). In
other words, \(GL(V)\) is a group of transformations on \(V\).

But
what is a "group of transformations", really? Every group we have
discussed above has one thing in common: each group \(G\) comes with
some set \(X\) on which each element of \(G\) can be thought of as a
function \(f:X \to X\). The most general group of this kind is
the *symmetric group on a nonempty set \(X\)*, which is the collection
\(S_X\) of all bijections \(f:X \to X\). We require the functions
to be bijections, so that \(S_X\) with respect to the function
composition operation is a *bona fide* group. Note that the general
linear group \(GL(V)\) is a subgroup of \(S_V\), the symmetric
group on the vector space \(V\).

Being a bijection, an arbitrary
element \(f\) of the symmetric group \(S_X\) transforms each
element of \(X\) in a way that *preserves* \(X\). Indeed, if we let
\(f\) act on the elements of \(X\), then we end up with \(X\),
relabeled. In this sense, the elements of \(S_X\) are
called *permutations on *\(X\), and subgroups of
\(S_X\) *permutation groups on \(X\)*.

## 2. Cayley's Theorem; Basic Definitions and Examples

Let us now consider an abstract group \(G\). Unlike \(GL(V)\) or
\(S_X\), the group \(G\) does not come equipped with an obvious set
on which the elements of \(G\) can act as permutations. What we do
have is a binary operation on \(G\), which takes *two* elements of
\(G\) and outputs one element of \(G\). This is to say that, by
fixing one of the two input parameters, the group operation can be
thought of as a *function* on \(G\). We formalize this observation as
follows:

Theorem 2.1(Cayley). Every group is isomorphic to a permutation group on \(G\).

**Proof.** Given a \(g \in G\), we define the *left regular
representation* \(L_g:G \to G\) by setting \(L_g(x) = gx\) for
each \(x \in G\). We note that \(L_g\) is a bijection, as
\(L_g^{-1}\) is the inverse of \(L_g\).

Let us now define
a *left-multiplication map* \(L:G \to S_G\) by declaring \(L(g) =
L_g\) for each \(g \in G\). We fix \(g,h \in G\) and observe
that

\[\begin{align*} L(gh)(x) &= L_{gh}(x) = ghx \\ &= g(hx) = L_g(L_h(x)) \\ &= (L_g \circ L_h)(x) = (L(g) \circ L(h))(x) \end{align*}\]

for all \(x \in G\). From this, it follows that \(L(gh) = L(g)L(h)\). Since \(g\) and \(h\) were arbitrary, we conclude that \(L\) is a group homomorphism.

It remains to show that \(L\) is injective, whence \(G\) is isomorphic to \(\operatorname{im} L\), a subgroup of \(S_G\). To this end, we fix \(g,h \in G\) and suppose that \(L(g) = L(h)\). This, in particular, implies that

\[1_G = L_g(g^{-1}) = L(g)(g^{-1}) = L(h)(g^{-1}) = L_h(g^{-1}) = hg^{-1}.\]

Since the inverse element of a fixed group element is unique, we conclude that \(g = h\). This completes the proof of Cayley's theorem. \(\square\)

We glean from Cayley's theorem an important idea: we can make an abstract group \(G\) act on a set \(X\) by identifying \(G\) with a permutation group on \(X\).

Definition 2.2.Let \(G\) be a group. Apermutation representation of \(G\)is a group homomorphism \(\varphi:G \to S_X\) into the symmetric group \(S_X\) on some set \(X\). Given a nonempty set \(X\), we say that\(G\) acts on \(X\)if there exists a permutation representation \(\varphi:G \to S_X\). In this case, \(X\) is referred to as a\(G\)-set. If, in addition, \(\varphi:G \to S_X\) is injective, then we say that\(G\) acts faithfully on \(X\).

Given a permutation representation \(\varphi:G \to S_X\), we often write

\[g \cdot x \, \, \, \, \, \, \, \, \, \, \, \mbox{ or } \, \, \, \, \, \, \, \, \, \, \, \, gx\]

to denote \(\varphi(g)(x)\). This is the *left action notation*,
which we shall use throughout this post. Also prevalent in the
literature is the *right action notation*, which employs

\[x \cdot g \, \, \, \, \, \, \, \, \, \, \, \mbox{ or } \, \, \, \, \, \, \, \, \, \, \, \, xg\]

to denote \(\varphi(g)(x)\). We will not use the right action notation in this post.

Here is a simple exercise to get you used to the left action notation.

Exercise 2.3.If \(G\) acts on \(X\), then the following holds:Conversely, if (1) and (2) hold, then the mapping \(\varphi:G \to S_X\) given by the formula \(\varphi(g)(x) = g \cdot x\) is a permutation representation.

- \(1_G \cdot x = x\) for all \(x \in X\).
- \(g \cdot (h \cdot x) = (gh) \cdot x\) for all \(g,h \in G\) and \(x \in X\).

We shall therefore use the left action notation and the permutation
representation notation interchangeably. In construcitn a group action,
we often define a map and check that it is a permutation representation,
or define a dot operation and check that properties (1) and (2) hold. If
a group action arises from restricting an already-extant functions
\(f\) on a smaller domain \(D\), then we must also check that the
range of \(f\vert_D\) is a subset of \(D\): see **Example 2.10**.

Let us now study several examples of group actions. There is no need to peruse all of these on the first read: take a look at a few, move on to Section 3, come back whenever certain examples are cited.

**Example 2.4.** Let \(X\) be a nonempty set. The symmetric group
\(S_X\) acts faithfully on \(X\) via the identity representation
\(\operatorname{id}:S_X \to S_X\) that sends each \(\sigma \in
\S_X\) to itself. If \(G\) is a subgroup of \(S_X\), then the
canonical injection map \(\iota_G:G \to S_X\) that sends each \(g
\in G\) to itself is an injective permutation representation of
\(G\) on \(X\). Therefore, \(G\) acts faithfully on \(X\).

Now, if \(Y\) is a set that contains \(X\) as a subset, then we can define an injective group homomorphism \(\varphi:S_X \to S_Y\) by setting

\[\varphi(\sigma)(x) = \begin{cases} \sigma(y) & \mbox{ if } y \in X; \\ y & \mbox{ if } y \in Y \smallsetminus X. \end{cases}\]

In other words, we identify \(S_X\) with the subgroup of permutations on \(Y\) that permutes \(X\) and fixes \(Y \smallsetminus X\). It now follows that \(S_X\) acts faithfully on \(Y\). Moreover, \(\varphi \circ \iota_G:G \to S_Y\) is an injective group homomorphism, whence \(G\) acts faithfully on \(Y\).

**Example 2.5.** In general, if \(G\) acts on \(X\), then every
subgroup \(H\) of \(G\) acts on \(X\). To see this, we take a
permutation representaiton \(\varphi:G \to X\) and precompose it
with the canonical injection map \(\iota_H:H \to G\), obtaining
another permutation representation \(\varphi \circ \iota_H:H \to
S_X\). Note that if the action of \(G\) on \(X\) is faithful, then
so is induced action of \(H\) on \(X\).

Even more generally, if \(G\) acts on \(X\), and if \(\varphi:H \to G\) is a group homomorphism from another group \(H\) into \(G\), then \(H\) acts on \(X\). \(\square\)

**Example 2.6.** The general linear group \(GL(V)\) is a subgroup of
the symmetric group \(V\), and so \(GL(V)\) acts faithfully on
\(V\). \(\square\)

**Example 2.7** (Linear groups). We introduce a
few important subgroups of \(GL(n,\mathbb{F})\). Since
\(GL(n,\mathbb{F})\) is a subgroup of the symmetric group
\(S_{\mathbb{F}^n}\), **Example 2.4** implies that all such
subgroups act faithfully on \(\mathbb{F}^n\). The *special linear
group* is the subgroup \(SL(n,\mathbb{F})\) consisting of matrices of
determinant 1. The *orthogonal group* is the subgroup
\(O(n,\mathbb{F})\) consisting of matrices \(A \in
GL(n,\mathbb{F})\) such that \(AA^t = A^t A = I_n\), where
\(t\) denotes the matrix transpose
operation and \(I_n\) the
\(n\)-by-\(n\) identity matrix. The *special orthogonal group* is
the subgroup \(SO(n,\mathbb{F}) = O(n,\mathbb{F}) \cap
SL(n,\mathbb{F})\).

If \(\mathbb{F} = \mathbb{R}\), then we typically write \(O(n)\) and \(SO(n)\) to denote \(O(n,\mathbb{R})\) and \(SO(n,\mathbb{R})\), respectively. Both \(O(n)\) and \(SO(n)\) act faithfully on \(\mathbb{R}^n\).

If
\(\mathbb{F} = \mathbb{C}\), then we define the *unitary group*
\(U(n)\) to be the group of matrices \(A \in GL(n,\mathbb{C})\)
such that \(AA^* = A^*A = I_n\), where \(*\) denotes the
conjugate transpose
operation. The *special unitary group* is the subgroup \(SU(n) = U(n)
\cap SL(n,\mathbb{C})\). Both \(U(n)\) and \(SU(n)\) act
faithfully on \(\mathbb{C}^n\). \(\square\)

**Example 2.8**
(Isometry groups, orthogonal groups, and symmetry groups). Let
\(\|x\|\) denote the usual Euclidean
norm
of \(x \in \mathbb{R}^n\). An *isometry on \(\mathbb{R}^n\)* is
a function \(f:\mathbb{R}^n \to \mathbb{R}^n\) such that
\(\|f(x) - f(y)\| = \|x-y\|\) for all \(x,y \in
\mathbb{R}^n\). We note that isometries are always injective. Indeed,
if \(f(x) = f(y)\) for some \(x,y \in \mathbb{R}^n\), then \(0 =
\|f(x) - f(y)\| = \|x-y\|\), whence \(x=y\).

The key fact is as
follows; see, for example, Theorem 6.2.3 in M. Artin's
*Algebra* for a proof.

Theorem 2.8.1.If \(f:\mathbb{R}^n \to \mathbb{R}^n\) is an isometry, then there exists an \(M \in O(n)\) such that \(f(x) = f(0) + Mx\) for all \(x \in \mathbb{R}^n\).

Since the mapping \(x \mapsto Mx\) is a bijection, we see that every
isometry on \(\mathbb{R}^n\) is an element of
\(S_{\mathbb{R}^n}\), the symmetric group on \(\mathbb{R}^n\).
It follows from **Example 2.4** that any group of isometries on
\(\mathbb{R}^n\) acts faithfully on \(\mathbb{R}^n\). Note also
that **Theorem** **2.8.1** implies that \(O(n)\) is, in fact, the
group of *all* isometries \(f\) on \(\mathbb{R}^n\) such that
\(f(0) = 0\).

Let us now fix a subset \(E\) of \(\mathbb{R}^n\).
An isometry \(f:\mathbb{R}^n \to \mathbb{R}^n\) is said to be
a *symmetry of \(E\)* if \(f(E) = E\). Since \(f|_E:E \to E\) is
a restriction of an injective function, it must be injective. The
surjectivity is guaranteed by defintion, and so we conclude that a
symmetry of \(E\) is a permutation on \(E\). Therefore, any group of
symmetries of \(E\) is a subgroup of \(S_E\), whence by **Example
2.4** it acts faithfully on \(E\). In particular, the symmetry group
of \(E\), which we define below, acts faithfully on \(E\):

Definition 2.8.2.Let \(E \subseteq \mathbb{R}^n\). Thesymmetry group of \(E\)is the group \(G\) of all symmetries of \(E\).

For example, \(O(n)\) is the symmetry group of the
*\((n-1)\)-sphere*

\(\mathbb{S}^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\}\).

To see this, we first fix \(M \in O(n)\). \(M\) is invertible, and so the mapping \(x \mapsto Mx\) is injective on \(\mathbb{S}^{n-1}\). Moreover, for each \(y \in \mathbb{S}^{n-1}\), we see that

\[ \begin{align*} \|M^{-1}y\|^2 &= \langle M^{-1}y,M^{-1}y \rangle = \langle M^ty, M^t y \rangle \\ &=\langle MM^t y, y \rangle = \langle I_ny,y \rangle = \|y\|^2 = 1, \end{align*} \]

whence \(x=M^{-1}y\) is an element of \(\mathbb{S}^{n-1}\) such that \(Mx = y\). It follows that the mapping \(x \mapsto Mx\) is a symmetry of \(\mathbb{S}^{n-1}\). We conclude that the symmetry group of \(\mathbb{S}^{n-1}\) includes \(O(n)\).

To show the
reverse inclusion, we fix a symmetry \(f\) of \(\mathbb{S}^{n-1}\)
and show that \(f(x) = Mx\) for some \(M \in O(n)\). It suffices
to show that \(f(0) = 0\), for then the desired result follows from
**Theorem 2.8.1**.

To this end, we suppose for a contradiction that \(f(0) \neq 0\) and let \(x = \|f(0)\|^{-1} M^{-1} f(0)\). Since

\[\|x\| = \|f(0)\|^{-1}\|M^{-1} f(0)\| = \|f(0)\|^{-1} \|f(0)\| = 1,\]

we see that \(x \in \mathbb{S}^{n-1}\). Now, **Theorem 2.8.1**
implies that

\[ \begin{align*} \|f(x)\| &= \|f(0) + Mx\| = \|f(0) + \|f(0)\|^{-1} f(0)\| \\ &= (1 + \|f(0)\|^{-1}) \|f(0)\| = \|f(0)\| + 1 > 0, \end{align*} \]

which contradicts the assumption that \(f\) is a symmetry of \(\mathbb{S}^{n-1}\). It now follows that \(O(n)\) is the symmetry group of \(\mathbb{S}^{n-1}\). \(\square\)

**Example 2.9** (Dihedral groups).
Consider \(O(2)\), the orthogonal group of
degree 2. We have seen in **Example 2.8** that \(O(2)\) is the
symmetry group of the unit circle \(\mathbb{S}^1\). The typical
elements of \(O(2)\) are the *rotation matrices*

\[R_\theta = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}\]

and the *reflection matrices*

\[T_\theta = \begin{pmatrix} \cos \theta & \sin \theta \\ \sin \theta & - \cos \theta \end{pmatrix}.\]

Note that \(T_\theta = R_\theta T_0\). Since

\[\displaystyle T_0 \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ - y \end{pmatrix},\]

\(T_0\) represents reflection with respect to the \(x\) axis. Consequently, \(T_\theta\) can be thought of as reflection with respect to an appropriate axis of reflection determined by the rotation matrix \(R_\theta\).

An important fact about rotation matrices and
reflection matrices in \(O(2)\) is as follows; see, for example,
Section 2.1 of Grove & Benson, *Finite Reflection
Groups*
for a proof.

Theorem 2.9.1.\(O(2)\) is generated by \(\{R_\theta : \theta \in \mathbb{R}\} \cup \{T_0\}\).

We now define the *dihedral group of degree \(n\),* denoted by
\(D_n\)*,* to be the subgroup of \(O(2)\) generated by \(\rho =
R_{2\pi/n}\) and \(\tau = T_0\). Every element of \(D_n\) is a
symmetry of the regular \(n\)-gon \(G^n\) whose vertices are
precisely the elements of the following set:

\[\left\{ \left( \cos \frac{2k \pi}{n} , \sin \frac{2 k \pi}{n} \right) \in \mathbb{S}^1 : k \in \{0,1,\ldots,n-1\} \right\}.\]

By **Theorem 2.9.1**, *every* symmetry of \(G^n\) is a finite product
of rotation matrices and \(\tau\). Since the symmetry group of
\(G^n\) cannot contain rotation matrices other than
\(\rho,\ldots,\rho^{n-1}\), we see that every symmetry of
\(G^n\) must be a finite product of \(\rho\) and \(\tau\). The
relation \(\rho\tau = \tau\rho^{-1} = \tau \rho^{n-1}\) now
implies that all such products can be written in the form \(\rho^i
\tau^j\) for some \(i,j \in \mathbb{N}\). All such isometries are
already included in the dihedral group, and so we conclude that
\(D_n\) is the symmetry group of \(G^n\). In particular, it
follows from **Example 2.8** that \(D_n\) acts faithfully on
\(G^n\). \(\square\)

**Example 2.10** (Actions of dihedral groups
on other sets). Recall \(D_n\), the dihedral group of degree \(n\),
from **Example 2.9.** We think of \(D_n\) as a subgroup of
\(O(2)\), so that each element of \(D_n\) is a 2-by-2 matrix. For
all \(k,n \in \mathbb{N}\), we define \(v_k^n = ( \cos \frac{2
k \pi}{n}, \sin \frac{2 k \pi}{n} )\). The set

\[V^n = \{v_0^n,v_1^n,\ldots,v_{n-1}^n\}\]

is a subset of \(\mathbb{S}^1 = \{x \in \mathbb{R}^2 : \|x\| =
1\}\), the unit circle in \(\mathbb{R}^2\). Let us define a
\(D_n\)-action on \(V^n\) by setting \(M \cdot v = Mv\) for
each \(M \in D_n\) and every \(v \in V^n\). To check that this
is indeed a group action, we must check that the mapping \(v \mapsto
Mv\) is a bijection on \(V^n\). This can be checked explicitly,
using rotation matrices and reflection matrices introduced in **Example
2.9**. \(D_n\) can now be thought of as a subgroup of
\(S_{V^n}\), whence **Example 2.4** implies that \(D_n\) acts
faithfully on \(V^n\).

Let us now assume that \(n\) is even, and consider the following collection of two-element sets:

\[P^n = \left\{\{v_k^n,v_{k+n/2}^n\} : k \in \{0,1,\ldots,n/2-1\}\right\}.\]

The \(D_n\)-action on \(P^n\) given by setting \(M \cdot \{v,w\} = \{Mv,Mw\}\) is likewise a group action. Note, however, that rotation by \(\pi\) fixes all elements of \(X\). It follows that this \(D_n\)-action is not faithful. \(\square\)

**Example 2.11** (Regular actions). Recall from the proof
of Cayley's theorem that
the left-multiplication map \(L_g:G \to G\) with respect to \(g
\in G\) is the map \(L_g(h) = gh\). The mapping \(L:G \to
S_G\) that sends each \(g \in G\) to the corresponding
left-multiplication map \(L_g\) is a permutation representation,
called the *left regular representation*.

Analogously, we define the
*right-multiplication map* \(R_g:G \to G\) with respect to \(g \in
G\) to be the map \(R_g(h) = hg^{-1}\). The map \(R:G \to S_G\)
that sends each \(g \in G\) to the corresponding right-multiplication
map \(R_g\) is a permutation representation, called the *right
regular representation*.

We note that \(G\) acts faithfully on itself in both cases. \(\square\)

**Example 2.12** (Regular action on
cosets). Let \(G\) be a group and \(H\) be a subgroup of \(G\). We
define \(G/H = \{xH : x \in G\}\), the collection of all left
cosets. We define a \(G\)-action on \(G/H\) by setting \(g \cdot
xH = (gx)H\). This is indeed a group action, as \(1_G \cdot xH =
xH\) and

\[(g_1g_2) \cdot xH = (g_1g_2x)H = g_1(g_2H) = g_1 \cdot (g_2 \cdot H)\]

for all \(g_1,g_2,x \in G\). \(\square\)

**Example 2.13**
(Conjugation on elements). Combining the left-multiplication map
\(L_g\) and the right-multiplication maps \(R_g\) from **Example
2.10**, we obtain a very important permutation representation. We define
the *conjugation representation of a group \(G\)* to be the map
\(\varphi:G \to S_G\) defined by the formula \(\varphi(g)(x) =
gxg^{-1}\). Since \(gxg^{-1} = (L_g \circ R_g)(x)\), we see at
once that each \(\varphi(g)\) is a permutation on \(G\). Observe
that \(1_G \cdot x = 1_Gx = x\), and that

\[\begin{align*} (g_1g_2) \cdot x &= (g_1g_2)x(g_1g_2)^{-1} = g_1(g_2xg_2^{-1})g_1^{-1} \\ &= g_1(g_2 \cdot x)g_1^{-1} = g_1 \cdot (g_2 \cdot x)\end{align*}\]

It follows that \(\varphi\) is a permutation representation. The
resulting action is referred to as the *conjugation action*. In light of
this, we call \(gxg^{-1}\) the *conjugation* of \(x\) by \(g\).

The conjugation action is not necessarily faithful. For example, if \(G\) is abelian, then \(gxg^{-1} = x\) regardless of the choice of \(g \in G\). \(\square\)

**Example 2.14** (Conjugation on
subgroups). Let \(G\) be a group, and let \(\mathcal{S}(G)\) be the
collection of all subgroups of \(G\). We claim that the *conjugate
subgroup* \(gHg^{-1} = \{ghg^{-1} : h \in H\}\) of an arbitrary
subgroup \(H\) of \(G\) is also a subgroup of \(G\), regardless of
our choice of \(g \in G\). To see this, we fix \(g \in G\). Since
\(1_G = g1_Gg^{-1}\), the identity element \(1_G\) is in
\(gHg^{-1}\). If \(gh_1g^{-1},gh_2g^{-1} \in gHg^{-1}\), then

\[(gh_1g^{-1})(gh_2g^{-1}) = gh_1h_2g^{-1} \in gHg^{-1},\]

as \(h_1h_2 \in H\). Therefore, \(gHg^{-1}\) is closed under group operation. Finally, if \(ghg^{-1} \in gHg^{-1}\), then

\[(ghg^{-1})^{-1} = gh^{-1}g^{-1} \in gHg^{-1},\]

as \(h^{-1} \in H\). We conclude that \(gHg^{-1} \leq G\). Let us now define a mapping \(\phi:G \to S_{\mathcal{S}(G)}\) by setting \(\phi(g)(H) = gHg^{-1}\) for each \(g \in G\) and every \(H \in \mathcal{S}(G)\). Observe that \(\phi(g)\) is a permutation on \(\mathcal{S}(G)\), as \(\phi(g^{-1})\) is the inverse of \(\phi(g)\). Moreover,

\[\begin{align*} (g_1g_2) \cdot H &= (g_1g_2)H(g_1g_2)^{-1} = g_1(g_2Hg_2^{-1})g_1^{-1} \\ &= g_1(g_2 \cdot H)g_1^{-1} = g_1 \cdot (g_2 \cdot H)\end{align*}\]

and so \(\varphi\) is a permutation representation. Similarly as in
**Example 2.13**, the action induced by \(\phi\) is not necessarily
faithful. For example, if \(G\), then \(gHg^{-1} = H\) for all \(g
\in G\) and \(H \in \mathcal{S}(G)\). \(\square\)

## 3. Counting Principles

Let us now turn to the abstract theory of group actions; some
applications of the theory presented in this section can be found in
**Section 4**. The two fundamental objects associated with a group
action are as follows:

Definition 3.1.Let \(G\) act on \(X\), and fix \(x \in X\). The\(G\)-orbitof \(x\) is $$\mathcal{O}(x) = \{gx : g \in G\}.$$ Thestabilizer, or theisotropy subgroup, of \(x\) is $$G_x = \{g \in G : gx = x\}.$$

We remark that \(\mathcal{O}(x)\) is a subset of \(X\), and that
\(G_x\) is a subgroup of \(G\). **Example 3.2.** Pick a group
\(G\) and let \(G\) act on itself by conjugation (see **Example
2.13**). The orbit of \(x \in G\) is the *conjugacy class of \(x\)*

\[Conj(x) = \{gxg^{-1} : g \in G\},\]

and the stabilizer of \(x\) is the *centralizer*

\[C_G(x) = \{g \in G : gxg^{-1} = x\}.\]

**Example 3.3.** Pick a group \(G\) and let \(G\) act on the
collection of all subgroups \(\mathcal{S}(G)\) by conjugation (see
**Example 2.14**). The orbit of \(H \in \mathcal{S}(G)\) is the
collection of all conjugates

\[\{gHg^{-1} : g \in G\},\]

and the stabilizer of \(H\) is the *normalizer*

\[N_G(H) = \{g \in G : gHg^{-1} = H\}.\]

Exercise 3.4.Write down orbits and stabilizers for actions introduced inExample 2.9,Example 2.10,Example 2.11, andExample 2.12.

How are the orbit and the stabilizer of a point \(x\) related? Observe that the coset \(h G_x\) can be rewriten as follows, by making the substitution \(z = hg\):

\[\begin{align*} h G_x &= \{hg : g \in G \mbox{ such that } gx = x\} \\ &= \{z : z \in G \mbox{ such that } (h^{-1}z) x = x\} \\ &= \{z : z \in G \mbox{ such that } z x = h x\} \\ &= \{g : g \in G \mbox{ such that } gx = hx\}. \end{align*}\]

Therefore, there is a one-to-one correspondence between the cosets \(h G_x\) and the points of the form \(hx\). Now, the collection of all points of the form \(hx\) is precisely the stabilizer \(\mathcal{O}(x)\). We therefore conjecture that \(|\mathcal{O}(x)|\) equals the number of cosets \(hG_x\). This turns out to be true:

Theorem 3.5(Orbit-stabilizer theorem). Let a group \(G\) act on a set \(X\). For each fixed \(x \in X\), we have the formula $$\vert\mathcal{O}(x)\vert = [G:G_x].$$

**Proof.** Let \(G/G_x\) denote the left coset space of \(G_x\).
We define a function \(f:\mathcal{O}(x) \to G/G_x\) by setting
\(f(gx) = gG_x\). Let us first check that \(f\) is well-defined.
Indeed, if \(gx = hx\), then \(x = g^{-1}hx\), and so \(g^{-1}h
\in G_x\). Therefore, \(g^{-1}hG_x = G_x\), whence \(gG_x =
hG_x\). It follows that \(f\) is well-defined.

We claim that \(f\) is bijective. If \(gG_x = hG_x\), then \(G_x = g^{-1}hG_x\), and so \(g^{-1}h \in G_x\). This implies that \(x = g^{-1}hx\), whence \(gx = hx\). It follows that \(f\) is injective. Moreover, \(f(gx) = gG_x\) for all \(g \in G\), and so \(f\) is surjective. We conclude that \(|\mathcal{O}(x)| = [G:G_x]\). \(\square\)

The orbit-stabilizer theorem is a *counting principle*,
in the sense that many combinatorial results about finite groups can be
derived as simple corollaries. Below, we establish a few.

**Corollary 3.6**. Let a finite group \(G\) act on \(X\). For each \(x \in X\), we have the identity \(\vert\mathcal{O}(x)\vert = \frac{\vert G \vert}{\vert G_x \vert}\). In particular, \(\vert \mathcal{O}(x) \vert\) divides \(\vert G \vert\).
{: .notice}

**Proof**. We apply Lagrange's theorem and the orbit-stabilizer theorem to
obtain the following:

\[|G| = |G_x|[G:G_x] = |G_x||\mathcal{O}(x)|.\]

Diving through by \(|G_x|\), we obtain the desired result. \(\square\)

Corollary 3.7. Let \(G\) be a finite group. For each \(x \in G\), the number of conjugates of \(x\) equals \([G:C_G(x)]\).

**Proof.** Apply the orbit-stabilizer theorem to **Example 3.2**.
\(\square\)

Corollary 3.8. Let \(G\) be a finite group. For each \(H \leq G\), the number of conjugate subgroups of \(H\) equals \([G:N_G(H)]\).

**Proof.** Apply the orbit-stabilizer theorem to **Example 3.3**.
\(\square\)

As alluded to above, each of the three corollaries give rise to useful combinatorial results in group theory. Let us first establish a combinatorial restriction on the size of \(X\). For this, we need the following result:

(Orbit decomposition theorem). Let a group \(G\) act on a set \(X\). The collection \(\{\mathcal{O}(x) : x \in X\}\) forms a partition of \(X\).Theorem 3.9

**Proof**. Suppose that \(\mathcal{O}(x) \cap \mathcal{O}(y) \neq
\varnothing\) and fix \(z \in \mathcal{O}(x) \cap
\mathcal{O}(y)\). Find \(g_0,h_0 \in G\) such that \(g_0 \cdot
x = z = h_0 \cdot y\). It now follows that \(g \cdot x =
(gg_0^{-1} h_0) \cdot y\) for all \(g \in G\), whence
\(\mathcal{O}(x) \subseteq \mathcal{O}(y)\). Similarly, we see that
\(h \cdot y = (hh_0^{-1} g_0) \cdot x\) for all \(h \in H\),
and so \(\mathcal{O}(y) \subseteq \mathcal{O}(x)\). We conclude
that \(\mathcal{O}(x) = \mathcal{O}(y)\). \(\square\)

The following combinatorial principle now is an immediate corollary of what we have established thus far.

Corollary 3.10. Let a finite group \(G\) act on a finite set \(X\). Let \(\Gamma\) be a subset of \(X\) consisting of one representative from each set in the partition formed by the orbits (seeTheorem 3.9). Then $$\vert X \vert = \sum_{x \in \Gamma} \vert \mathcal{O}(x) \vert =\sum_{x \in \Gamma} [G:G_x] = \vert G \vert \sum_{x \in \Gamma}\frac{1}{\vert G_x \vert}.$$

**Proof**. The first equality is an immediate consequence of **Theorem
3.9**. The second equality follows from the orbit-stabilizer theorem
(**Theorem 3.5**). The third equality follows from **Corollary 3.6**.
\(\square\) When a group \(G\) acts on itself, then **Corollary
3.10** becomes a statement about the cardinality of the group itself. Let
us carefully translate **Corollary 3.10** in the context of conjugation
action. For this, we need the following result:

Corollary 3.11.The conjugacy classes of a group \(G\) form a partition of \(G\).

**Proof.** From **Example 3.2**, we know that the orbits of the
conjugation action of \(G\) on itself are precisely the conjugacy
classes. The orbit decomposition theorem (**Theorem 3.9**) now furnishes
the desired result. \(\square\)

We are now ready to adapt **Corollary
3.9** to describe groups acting on themselves by conjugation.

Theorem 3.12(Conjugacy class equation). Let \(G\) be a finite group. Let \(\Gamma\) be a subset of \(G\) consisting of one representative from each set in the partition formed by the conjugacy classes (seeCorollary 3.11). Then $$\begin{align*} \vert G \vert &= \sum_{x \in \Gamma} [G : C_{G}(x)] \\ &= \vert Z(G) \vert + \sum_{x \in \Gamma \smallsetminus Z(G)} [G:C_{G}(x)], \end{align*}$$ where $$Z(G) = \{x \in G : gxg^{-1} = x \mbox{ for all } g \in G\}$$ is thecenterof \(G\).

**Proof.** The first equality follows from **Corollary 3.10** and
**Corollary 3.7**. The second equality follows from the fact that the
conjugacy class of each element in the center is a singleton set. \(\square\)

The class equation, combined with **Corollary 3.8**,
allows us to establish powerful converses to Lagrange's theorem, which
we establish in the next section. We conclude by establishing a
combinatorial restriction on the size of \(\Gamma\).

Theorem 3.12(Burnside's lemma). Let a finite group \(G\) act on a finite set \(X\). Let \(\Gamma\) be a subset of \(X\) consisting of one representative from each set in the partition formed by the orbits (seeTheorem 3.9). Then $$\vert \Gamma \vert = \frac{1}{\vert G \vert} \sum_{g \in G} X^g,$$ where \(X^g\) denotes the number of elements of \(X\) fixed by the action of \(g\).

**Proof.** We first observe that

\[|\Gamma| = \sum_{x \in X}\frac{1}{|\mathcal{O}(x)|},\]

as there are \(|\mathcal{O}(x)|\)-many orbits in the equivalence
class of \(\mathcal{O}(x)\) furnished by the orbit decomposition
theorem (**Theorem 3.9**). For each \(g \in G\) and every \(x \in
X\), we define

\[F(g,x) = \begin{cases} 1 & \mbox{ if } g \cdot x = x; \\ 0 & \mbox{ if } g \cdot x \neq x. \end{cases}\]

We now invoke the orbit-stabilizer theorem (**Corollary 3.6**) to deduce
that

\[ \begin{align*} |\Gamma| &= \displaystyle \frac{1}{|G|} \sum_{x\in X} |G_x| = \displaystyle \frac{1}{|G|} \sum_{x \in X} \sum_{g \in G} F(g,x) \\ &= \displaystyle \frac{1}{|G|} \sum_{g\in G} \sum_{x \in X} F(g,x) = \displaystyle \frac{1}{|G|}\sum_{g \in G} X^g, \end{align*} \]

as was to be shown. \(\square\)

## 4. Sylow's theorems

While Lagrange's theorem imposes restrictions on the orders of subgroups and elements of a group, it does not guarantee the existence of a subgroup or an element of a certain order.

**Example 4.1.** Consider the
alternating group \(A_4\), which is of order 12. We show that
\(A_4\) has no subgroup of order 6. To this end, we shall make use of
the following preliminary result:

Lemma 4.2.Every subgroup of a group \(G\) of index 2 is normal in \(G\).

We assume for now that the lemma holds. Suppose for a contradiction that \(H\) is a subgroup of \(A_4\) of order 6, which must be a normal subgroup of \(A_4\) by the lemma. The conjugacy classes of \(A_4\) are as follows:

\[ \begin{align*} &\{\operatorname{id}\}, \{(12)(34), (13)(24),(14)(23)\}, \\ &\{(123),(124),(134),(234)\}, \\ &\{(132),(142),(143),(243)\}. \end{align*} \]

Since \(H\) is normal, \(H\) must be a union of conjugacy classes. Since \(\{e\}\) must be a subset of \(H\), this implies that we ought to be able to write \(|H|=6\) as a sum of 1 and the cardinalities of conjugacy classes. This is impossible, and so we conclude that \(A_4\) has no subgroup of order 6.

It now suffices to establish the lemma. Let \(G\) be a group and let \(H\) be a subgroup of index 2. Note that the two left cosets of \(H\) are \(H\) itself and \(g_0H\) for an arbitrary \(g_0 \in G \smallsetminus H\). Fix \(h \in H\) and \(g \in G\). If \(g \in H\), then \(ghg^{-1} \in H\). We therefore suppose that \(g \in G \smallsetminus H\). If \(g(hg^{-1}) \in gH\), then \(hg^{-1}) \in H\), and so \(g^{-1} = h^{-1}(hg^{-1})\). This, in particular, implies that \(g \in H\), whence \(gH \subseteq H\), a contradiction. It follows that \(ghg^{-1} \in H\), as was to be shown.

See Brennan and MacHale, "Variations on a Theme: \(A_4\) Definitely Has no Subgroup of Order Six!" for alternate proofs. \(\square\)

When, then, could we establish a converse to Lagrange's theorem? The solution is intimately tied to the study of \(p\)-groups, which we now define.

Definition 4.3.Let \(p\) be a prime number. A group \(G\) is said to be a\(p\)-groupif each element of \(G\) is of order \(p^n\) for some integer \(n\) that depends on the element.

The problem of finding a nontrivial \(p\)-subgroup of a group \(G\) is equivalent to finding an element of \(G\) of order \(p\). Indeed, if \(g \in G\) is an element of order \(p\), then the cyclic subgroup \(\langle g \rangle\) is a nontrivial \(p\)-subgroup of \(G\). Conversely, if \(H\) is a nontrivial \(p\)-subgroup of \(H\), then we can find a non-identity element \(g \in H\) of order \(p^n\). In this case, \(g^{p^{n-1}}\) is an element of order \(p\).

Now, in order for a finite group \(G\) to have a \(p\)-subgroup, the order of \(G\) must be divisible by \(p\). The following result shows that the converse holds.

Theorem 4.4(Cauchy, 1845). Every finite group \(G\) whose order is divisible by a prime number \(p\) contains an element of order \(p\).

**Proof.** We first prove the theorem for abelian groups. We shall
proceed by induction on \(m = |G| / p\). If \(m = 1\), then \(G\)
is a group of prime order, hence cyclic; it follows that \(G\)
contains an element of order \(p\).

Fix \(m > 1\) and suppose that every abelian group of order \(pn\) for each \(1 \leq n < m\) contains an element of order \(p\). Let \(g\) be a non-identity element of \(G\). If \(|g|\) is divisible by \(p\), then some power of \(g\) is of order \(p\). If not, then \(G / \langle g \rangle\) is an abelian group of order \(pn\) for some \(1 \leq n < m\), whence it contains an element \(h \langle g \rangle\) of order \(p\). Now, the canonical projection \(\pi:G \to G/\angle x \angle\) is a homomorphism, and so the order of \(h\) must be divisible by \(p\), order of \(\pi(h) = h\langle x \rangle\). It follows that some power of \(h\) must be \(p\). By induction, every Aelian group whose order is divisible by \(p\) contains an element of order \(p\).

We
now lift the assumption that \(G\) is abelian. Once again, we proceed
by induction on \(m = |G| / p\). The \(m=1\) case is established
above. Fix \(m > 1\) and suppose that every group of order \(pn\)
for some \(1 \leq n > m\) contains an element of order \(p\).
Let \(g \in G\). If \(g \notin Z(G)\), then \([G:C_G(g)]\),
the number of conjugates of \(g\), is larger than \(1\), and so
\(|C_G(g)| < |G|\). By induction, \(C_G(g)\) must contain an
element of order \(p\), whence so must \(G\). We may therefore
assume that \(p \nmid |C_G(g)|\) whenever \(g \notin Z(G)\).
Lagrange's theorem now implies that \(p \mid [G:C_G(g)]\) whenever
\(g \notin Z(G)\). We now invoke the conjugacy class equation
(**Theorem 3.12**):

\[|G| = |Z(G)| + \sum_{g \in \Gamma \smallsetminus Z(G)} [G : C_G(g)].\]

Since \(|G|\) and all \([G:C_G(g)]\) are divisible by \(g\), it follows that \(Z(G)\) is divisible by \(p\). But \(Z(G)\) is abelian, and so \(Z(G)\) must contain an element of order \(p\). It follows that \(G\) contains an element of order \(p\). \(\square\)

Cauchy's theorem provides a convenient equivalence characterization of finite \(p\)-groups:

Corollary 4.5.A finite group \(G\) is a \(p\)-group if and only if the order of \(G\) is a power of \(p\).

**Proof.** If \(|G| = p^n\), then Lagrange's theorem implies that
\(G\) is a \(p\)-group. If \(|G|\) is divisible by some other
prime \(q\), then Cauchy's theorem implies that \(G\) contains an
element of order \(q\), and so \(G\) is not a \(p\)-group.
\(\square\)

The main result of this section is a substantial strengthening of Cauchy's theorem, due to Sylow.

Definition 4.6.ASylow \(p\)-subgroupof a group \(G\) is a maximal \(p\)-subgroup. In other words, if \(P\) is a Sylow \(p\)-subgroup of \(G\), then no other \(p\)-subgroup of \(G\) properly contains \(P\).

Sylow's theorems concern the existence and various other properties of Sylow \(p\)-subgroups of finite groups whose orders are divisible by \(p\).

Theorem 4.7(Sylow, 1872). Let \(G\) be a finite group of order \(p^nm\), where \(p\) is a prime number and \(m\) is an integer relatively prime to \(p\). The following holds:

First Sylow theorem.\(G\) contains a Sylow \(p\)-subgroup of order \(p^n\).Second Sylow theorem.All Sylow \(p\)-subgroups are conjugates of one another. In other words, if \(P\) and \(Q\) are Sylow \(p\)-subgroups of \(G\), then there exists a \(g \in G\) such that \(gPg^{-1} = Q\). This, in particular, implies that all Sylow \(p\)-subgroups are isomorphic to one anotherThird Sylow theorem.Let \(n_p\) denote the number of Sylow \(p\)-subgroups of \(G\). We have the following combinatorial restrictions on \(n_p\):

- \(n_p\) divides \(\vert G \vert\);
- \(n_p \equiv 1 \mbox{ mod } p\).

Before we prove the Sylow theorems, we make a few observations. Firstly,
the second Sylow theorem, in conjunction with the third Sylow theorem
and **Corollary 4.5**, implies that \(G\) has a normal subgroup of
order \(p^n\) if and only if \(n_p = 1\). The third Sylow theorem
provides two combinatorial restrictions on \(n_p\), which can be
exploited in various clever ways to determine \(n_p\). Secondly, the
second Sylow theorem, in conjunction with the first Sylow theorem,
implies that every Sylow \(p\)-subgroup of \(G\) is of order
\(p^n\). In other words, maximality in subgroup inclusion
automatically implies maximality in order. Thirdly, the maximality
observation, combined with the second Sylow theorem, implies the
following structure theorem:

Corollary 4.8(Frattini's argument). Let \(G\) be a finite group and \(K\) a normal subgroup of \(G\). If \(P\) is a Sylow \(p\)-subgroup of \(K\) for some prime \(p\), then $$G = KN_G(P).$$

**Proof of Corollary.** Fix \(g \in G\). Since \(K\) is normal in
\(G\), we see that \(K = gKg^{-1}\). Now, \(gPg^{-1} \leq
gKg^{-1} = K\), and so \(gPg^{-1}\) is a Sylow \(p\)-subgroup of
\(K\) by maximality. The second Sylow theorem therefore furnishes \(k
\in K\) such that \(kPk^{-1} = gPg^{-1}\), so that \(P =
(k^{-1}g)P(k^{-1})g)^{-1}\). It follows that \(k^{-1}g \in
N_G(P)\). Since \(g \in G\) was arbitrary, the factorization \(g =
k(k^{-1}g)\) establishes the desired identity. \(\square\)

Let us
now turn to the proof of the Sylow theorems, which are taken from
Rotman, *An Introduction to the Theory of
Groups*. As
in the statement of the theorem, we assume that \(|G| = p^n m\). We
shall make use of the following lemma:

Lemma 4.9.Let \(P\) be a Sylow \(p\)-subgroup of a finite group \(G\). The following holds:

- \(\vert N_G(P)/P\vert\) is relatively prime to \(p\);
- If \(g \in G\) is of order \(p^k\) for some \(k\) and \(gPg^{-1} = P\), then \(g \in P\).

**Proof of lemma.** (1) If \(|N_G(P)/P|\) is divisible by \(p\),
then Cauchy's theorem (**Theorem 4.4**) furnishes an element \(gP\) of
order \(p\). Therefore, \(H = \langle gP \rangle\) is a subgroup
of \(N_G(P)/P\) of order \(p\).

Let \(\pi:N_G(P) \to N_G(P)/P\) be the canonical projection map and set \(H^* = \pi^{-1}(H)\). If \(g,h \in H^*\), then \(\pi(gh^{-1}) = \pi(g)\pi(h)^{-1} \in H\), and so \(gh^{-1} \in H^*\). It follows that \(H^* \leq N_G(P)\). Moreover, the first isomorphism theorem, in conjunction with Lagrange's theorem, implies that

\[|H^*| = \left|\ker \pi|_{H^*}\right| \left| \operatorname{im} \pi|_{H^*} \right| = |P| |H|,\]

whence by **Corollary 4.5** \(H^*\) is a \(p\)-subgroup of \(G\)
containing \(P\). This contradicts the maximality of \(P\), and so
we conclude that \(|N_G(P)/P|\) fails to be divisible by \(p\).

(2) Note that \(g' = g^{p^{k-1}}\) is of order \(p\). Since \(N_G(P) \leq G\), we see that \(g' \in N_G(P)\), and so \(g'P \in N_G(P)/P\). If \(g' \notin P\), then \(|g'| = p\), and so \(|gP| = p'\). By Lagrange's theorem, \(|N_G(P)/P|\) must be divisible by \(p\), and this contradicts (1). \(\square\)

**Proof of the Second and Third Sylow theorems.** Let \(X =
\{P_1,\ldots,P_k\}\) denote the collection of all conjugates of
\(P\) in \(G\), where \(P_1 = P\). Taking a cue from **Example
2.14**, we define the conjugation action of \(G\) on \(X\) as
follows: \(\varphi(g) = (gPg^{-1}\) for each \(g \in G\). It is
routine to check that \(\varphi\) is a group homomorphism from
\(G\) to \(S_X\): a simple modification of the argument in
**Example 2.14** suffices.

Let \(Q\) be an arbitrary Sylow
\(p\)-subgroup of \(G\). \(Q\) is a subgroup of \(G\), and so
\(Q\) acts on \(X\) by conjugation (**Example 2.5**). By the
orbit-stabilizer theorem (**Corollary 3.6**), the size of each orbit
divides \(|Q|\), whence by **Corollary 4.5** each
\(|\mathcal{O}(P_i)|\) is a power of \(p\).

If
\(|\mathcal{O}(P_i)| = 1\) for some \(1 \leq i \leq k\), then
\(gP_ig^{-1} = P_i\) for all \(g \in Q\). In this case, we must
have \(Q \leq P_i\) by **Lemma 4.9(2)**. Since \(P_i\) is a
\(p\)-group by **Corollary 4.5**, the maximality of \(Q\) implies
that \(P_i \leq Q\), and so \(Q = P_i\). Now, if we take \(Q\)
to be \(P\), then we must have \(P \neq P_i\) for all \(2 \leq i
\leq k\), and so \(|\mathcal{O}(P_i)| = p^{t_i}\) for some
\(t_i \geq 1\) whenever \(2 \leq i \leq k\). By the orbit
decomposition theorem (**Theorem 3.9**), \(X\) is the disjoint union
of distinct orbits, whence \(|X|\) equals 1 plus a sum of powers of
\(p\). In other words, \(|X| \equiv 1 \mbox{ mod } p\).

Now, we assume for a contradiction that \(Q\) is a Sylow \(p\)-subgroup of \(G\) not included in \(X\). We have shown above that \(P_i = Q\) whenever \(|\mathcal{O}(P_i)| = 1\). Therefore, we must have \(|\mathcal{O}(P_i)| = p^{t_i}\) for some \(t_i \geq 1\) for all \(1 \leq i \leq k\). This, in particular, implies that \(|X| \equiv 0 \mbox{ mod } p\), contradicting the 1-mod-p relation. Therefore, all Sylow \(p\)-subgroups are conjugates of one another. This, in particular, implies that \(|X| = n_p\), whence \(n_p \equiv 1 \mbox{ mod } p\).

We now invoke the orbit-stabilizer theorem
to the conjugation action of \(G\) on the set of all subgroups of
\(G\) to deduce that \([G:N_G(P)] = |X| = n_p\) (**Corollary
3.8**). Lagrange's theorem implies that \(n_p\) divides \(|G|\).
This completes the proof of the second and third Sylow theorems.
\(\square\)

**Proof of the first Sylow theorem**. Cauchy's theorem
(**Theorem 4.4**) implies that \(G\) has a subgroup of order \(p\),
which is a \(p\)-subgroup of \(G\). This, in particular, shows that
\(G\) has a Sylow \(p\)-subgroup \(P\). Indeed, if \(Q\) is a
non-Sylow \(p\)-subgroup of \(G\), then, by definition, there exists
a \(p\)-subgroup \(Q'\) of \(G\) that properly contains \(Q\).
Since \(|G|\) is finite, this process must terminate in finitely many
steps, and we obtain a Sylow \(p\)-subgroup \(P\).

We claim that
\([G:P]\) is relatively prime to \(p\). Since **Corollary 4.5**
implies that \(|P|\) must be a power of \(p\), the claim, in
conjunction with Lagrange's theorem, implies that \([G:P] = m\) and
\(|P| = p^n\), as was to be shown. To prove the claim, we shall make
use of the following lemma:

Lemma 4.9.If \(C\) is a finite group and \(A \leq B \leq C\), then \([C:A] = [C:B][B:A]\).

**Proof of lemma.** This is an immediate consequence of the fact that
cosets of a subgroup partition a group into equal-sized equivalence
classes. \(\square\) We let \(C = G\), \(B = N_G(P)\), and \(A
= P\) and invoke the lemma to show that both \([G:N_G(P)]\) and
\([N_G(P):P]\) are relatively prime to \(p\). We apply the
orbit-stabilizer theorem to the conjugation action of \(G\) on the set
of all subgroups of \(G\) to deduce that \([G:N_G(P)] = n_p\)
(**Corollary 3.8**). The **third Sylow theorem** implies that that
\(n_p \equiv 1 \mbox{ mod } p\), whence \([G:N_G(P)]\) is
relatively prime to \(p\).

As for \([N_G(P):P]\), we first note
that \(P \triangleleft N_G(P)\), so that \([N_G(P):P] =
|N_G(P)/P|.\) It follows from **Lemma 4.9(1)** that \(|N_G(P)/P|\)
is relatively prime to \(p\). This concludes the proof of the first
Sylow theorem. \(\square\)

The Sylow theorems play an important role in the classification of finite groups. In what follows, we study methods of constructing larger groups out of a collection of smaller groups.

## 5. Direct Products

Recall that the *external direct product* of two groups \(G\) and
\(H\) is the cartesian product \(G \times H\) equipped with
coordinatewise group operation: \((g_1,h_1)(g_2,h_2) =
(g_1g_2,h_1h_2)\) for all \(g_1,g_2 \in G\) and \(h_1,h_2
\in H\). The external direct product of \(G\) and \(H\) is
arguably the simplest and the most natural way to construct a new group
out of \(G\) and \(H\). Therefore, it is of interest to determine
whether a group can be represented as a direct product of its
subgroups.

**Example 5.1.** The integers mod 6 group \(\mathbb{Z}/6\mathbb{Z} =
\{0,1,2,3,4,5\}\) is isomorphic to \(\{0,3\} \times
\{0,2,4\}\). Indeed, the map

\[\varphi:\{0,3\} \times \{0,2,4\} \to \{0,1,2,3,4,5\}\]

given by the formula \(\varphi(n,m) = n+m\) is a group isomorphism. \(\square\)

**Example 5.2.** The Klein-four group \(V = \langle a,b \mid a^2 =
b^2 = (ab)^2 = 1 \rangle\) is isomorphic to \(\{1,a\} \times
\{1,b\}\). Indeed, the map

\[\varphi:\{1,a\} \to \{1,b\} \to \{1,a,b,ab\}\]

given by the formula \(\varphi(x,y) = xy\) is a group isomorphism. \(\square\)

**Example 5.3.** The multiplicative group of complex numbers
\(\mathbb{C}^\times = \{z \in \mathbb{C} : z \neq 0\}\),
equipped with the standard multiplication operation, is isomorphic to
the direct product of the circle group \(\mathbb{T} = \{z \in
\mathbb{C} : |z| = 1\}\) and the group of positive real numbers
\(\mathbb{R}^+ = \{r \in \mathbb{R} : r > 0\}\), both
equipped with the standard multiplication operation. Indeed, every
nonzero complex number \(z\) can be written uniquely as \(|z| e^{i
\mathrm{Arg} z}\), and so the map

\[\varphi: \mathbb{R}^+ \times \mathbb{T} \to \mathbb{C}^\times\]

given by the formula \(\varphi(r,t) = rt\) is a group isomorphism. \(\square\)

The question, then, is as follows: given two subgroups \(N\) and \(K\) of a group \(G\), when is \(G\) isomorphic to \(N \times K\)? The above examples suggest that we should be able to create an arbitrary element of \(G\) by taking the product of an element of \(N\) and an element of \(K\). Moreover, in all of the examples above, \(N\) and \(K\) are effectively disjoint, in the sense that \(N \cap K = \{1_G\}\). These observations lead us to the following definition:

Definition 5.4.Two subgroups \(N\) and \(K\) of \(G\) arepermutable complements—or simplycomplementsif there is no danger of confusion—in \(G\) if \(N \cap K = \{1_G\}\) and \(NK\) equals \(G\).

Is this sufficient? Not so, as the next example shows.

**Example 5.5.** Consider the dihedral group

\[D_3 = \langle \sigma, \tau \mid \sigma^3 = \tau^2 = 1, \sigma \tau = \tau \sigma^2 \rangle.\]

Let \(S = \langle \sigma \rangle\) and \(T = \langle \tau \rangle\), so that \(S \cap T = \{\operatorname{id}\}\). Since every element of \(D_3\) is of the form \(\sigma^m \tau^n\), we see that \(D_3 = ST\). Nevertheless, \(D_3\) cannot be isomorphic to \(S \times T\): \(D_3\) is nonabelian, but \(S \times T\) is abelian. \(\square\)

What additional conditions do we need?

Proposition 5.6.Let \(N\) and \(K\) be subgroups of a group \(G\). If \(N\) and \(K\) are permutable complements in \(G\), \(N \triangleleft G\), and \(K \triangleleft G\), then \(G \cong N \times K\).

**Proof.** Define a mapping \(\varphi:N \times K \to G\) by setting
\(\varphi(n,k) = nk\). Since \(NK = G\), the mapping is surjective.
If \(\varphi(n_1,k_1) = \varphi(n_2,k_2)\), then
\(n_1n_2^{-1} = k_1^{-1}k_2\), so that

\(n_1n_2^{-1} ,k_1^{-}1k_2 \in N \cap K\). Since the intersection is trivial, we see that \(n_1 = n_2\) and \(k_1 = k_2\). It follows that \(\varphi\) is injective.

It remains to show that \(\varphi\) is a group homomorphism. To this end, we show that \(nk = kn\) whenever \(n \in N\) and \(k \in K\). Indeed, the normality of \(N\) in \(G\) implies that \(nkn^{-1}k^{-1} \in N\), and the normality of \(K\) in \(G\) implies that \(nkn^{-1}k^{-1} \in K\). Since the intersection is trivial, we see that \(nkn^{-1}k^{-1} = 1_G\), and so \(nk = kn\).

We now observe that

\[\begin{align*} \varphi( n_1n_2,k_1k_2) &= n_1n_2k_1k_2 \\ &= n_1k_1 n_2k_2 \\ &= \varphi(n_1,k_1) \varphi(n_2,k_2) \end{align*}\]

for all \(n_1,n_2 \in N\) and \(k_1,k_2 \in K\), whence \(\varphi\) is a homomorphism. \(\square\)

The converse also holds.

Proposition 5.7.If \(G \cong H_1 \times H_2\), then \(G\) has normal subgroups \(N\) and \(K\) such that \(N \cong H_1\), \(K \cong K_1\), and \(N\) and \(K\) are permutable complements in \(G\).

**Proof.** Let \(N = H_1 \times \{1_{H_2}\}\) and \(K =
\{1_{H_1}\} \times H_2\). We see at once that \(N\) and \(K\)
are permutable complements in \(G\). Moreover, it follows from the
definition of the direct product that \(nk = kn\) whenever \(n \in
N\) and \(k \in K\).

Fix \(n \in N\). For each \(g \in G\), we can find \(n’ \in N\) and \(k’ \in K\) such that \(g = n’k’\). Therefore,

\[\begin{align*} gng^{-1} &= (n’k’) n (n’k’)^{-1} \\ &= k'(k’)^{-1} n’ n (n’)^{-1} \\ &= n’n (n’)^{-1} \in N \end{align*}\]

by the commutativity of elements of \(N\) and elements of \(K\). Since \(g\) and \(n\) were arbitrary, we conclude that \(N \triangleleft G\). Similarly, we can show that \(K \triangleleft G\). \(\square\)

We can summarize the above two propositions as the equivalence of
*external* direct product and *internal* direct product.

Definition 5.8.A group \(G\) is said to be theinternal direct productof subgroups \(N\) and \(K\) if \(N\) and \(K\) are permutable complements in \(G\), \(N \triangleleft G\), and \(K \triangleleft G\).

As an application, we show that \(GL(n,\mathbb{R})\) can be written as a direct product of two subgroups.

**Example 5.9.** If \(n\) is odd, then the general linear group
\(GL(n,\mathbb{R})\) is isomorphic to the direct product of the group
of scalar matrices \(S = \{\lambda I_n : \lambda \neq 0\}\) and
the special linear group \(SL(n,\mathbb{R})\). Given \(M \in
GL(n,\mathbb{R})\), we let \(\lambda = (\det M)^{1/n}\), so that
\(N = (\lambda^{-1} I) M\) is of determinant 1. We can then write
\(M\) as the product \( (\lambda I_n) N\) of a scalar matrix and a
matrix of determinant 1. Since \(M\) was arbitrary, we see that
\(GL(n,\mathbb{R}) = S \, SL(n,\mathbb{R})\).

Now, \(n\) is odd, and so a scalar matrix \(\lambda I_n\) is of determinant 1 if and only if \(\lambda = 1\). Therefore, \(S \cap SL(n,\mathbb{R}) = \{I_n\}\), and it follows that \(S\) and \(SL(n,\mathbb{R})\) are permutable complements in \(GL(n,\mathbb{R})\).

Since scalar matrices commute with all matrices of compatible size, \(S \triangleleft GL(n,\mathbb{R})\). Moreover, if \(M \in GL(n,\mathbb{R})\) and \(N \in SL(n,\mathbb{R})\), then

\[\det(MNM^{-1}) = \det(M)\det(N)\det(M)^{-1} = 1,\]

and so \(MNM^{-1} \in SL(n,\mathbb{R})\). It follows that
\(SL(n,\mathbb{R}) \triangleleft GL(n,\mathbb{R})\). We conclude
from **Proposition 1.6** that \(GL(n,\mathbb{R}) \cong S \times
SL(n,\mathbb{R})\). \(\square\)

An important classification theorem regarding direct products is the following:

Theorem 5.10(Fundamental theorem on finitely generated abelian groups). If \(G\) is a finitely generated abelian group, then we can find prime numbers \(p_1,\ldots,p_k\) and positive integers \(n,n_1,\ldots,n_k\) such that $$G \cong \mathbb{Z}^n \times \mathbb{Z}/p_1^{n_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p_k^{n_k}\mathbb{Z};$$ here \(n = 0\) if and only if \(G\) is finite.

## 6. Semidirect Products

Classification results with direct products, even something as strong as
**Theorem 5.10**, does not take into account the decomposition of
\(D_3\) into the “internal product”
\(\{\operatorname{id},\sigma,\sigma^2\}\{\operatorname{id},\tau\}\)
that we discussed in **Example 5.5**. It is therefore reasonable to
generalize the notion of internal direct products introduced in
**Definition 5.8** for the sake of completeness.

Definition 6.1.Let \(G\) be a group, and fix subgroups \(N\) and \(K\) of \(G\). The group \(G\) is said to be aninternalsemidirect productof \(N\) by \(K\), denoted by \(G = N \rtimes K\), if \(N \triangleleft G\) and \(N\) and \(K\) are permutable complements of \(G\).

The notation \(\rtimes\) emphasizes the fact that the left component,
\(N\), is the normal subgroup. Since **Definition 2.1** is a
straightforward generalization of **Definition 1.8**, we see that every
internal direct product is an internal semidirect product. This, in
particular, implies that internal semidirect products are not
necessarily isomorphic.

**Example 6.2.** Consider the dihedral group \(D_3\) and the integers
mod 6 group \(\mathbb{Z}/6\mathbb{Z}\). By **Example 5.1** and
**Example 1.8**, each group is an internal semidirect product of a
subgroup isomorphic to \(\mathbb{Z}/3\mathbb{Z}\) by a subgroup
isomorphic to \(\mathbb{Z}/2\mathbb{Z}\). Nevertheless, two groups
fail to be isomorphic: \(D_3\) is nonabelian, whereas
\(\mathbb{Z}/6\mathbb{Z}\) is abelian. \(\square\)

Before we consider more examples of internal semidirect products, we
make the following observation: the two “parts” of a semidirect product
do not necessarily commute. In the proof of **Proposition 5.6**, we have
shown that if \(G\) is an internal direct product of \(N\) and
\(K\), then \(nk =kn\) whenever \(n \in N\) and \(k \in K\).
Nevertheless, we know that \(\sigma \tau = \tau \sigma^2\) in
\(D_3\), and so the same principle fails to hold for semidirect
products. What we do have, however, is the following:

Proposition 6.3.Let \(G\) be a group, \(N\) a normal subgroup of \(G\), and \(K\) a subgroup of \(G\). Then \(NK\) is a subgroup of \(G\) and \(NK = KN\).

**Proof.** Given arbitrary \(n_1k_1, n_2k_2 \in NK\), we observe
that

\[ \begin{align*} (n_1k_1)(n_2k_2)^{-1} &= n_1k_1 k_2^{-1} n_2 \\ &= n_1 (k_1k_2^{-1}) n_2 (k_1k_2^{-1})^{-1} (k_1k_2^{-1}) \\ &= n_1 n_3 (k_1k_2^{-1}) \in NK \end{align*} \]

for some \(n_3 \in N\) by the normality of \(N\) in \(G\). It follows that \(NK \leq G\). Similarly, \(KN \leq G\). Now, both \(NK\) and \(KN\) coincide with the subgroup \(\langle N \cup K \rangle\) of \(G\) generated by \(N \cup K\), and so \(NK = KN\). \(\square\)

Let us now consider two examples of internal semidirect products, and one counterexample.

**Example 6.4.** Fix \(n \geq 2\). The symmetric group \(S_n\) is
an internal semidirect product of \(A_n\) by \(T = \{1,(12)\}
\cong \mathbb{Z}/2\mathbb{Z}\). Indeed, \(A_n \triangleleft
S_n\) because conjugation preserves parity. Moreover, every
transposition is in \(A_n T\), as \((a b) = (a b) (1 2) (12)\).
Note also that \(A_n T\) contains all odd permutations. Since every
permutation is a product of transpositions, it follows from the subgroup
property established in **Proposition 2.3** that \(S_n = A_n T\).
Finally, every permutation in \(A_n\) is even, but \((12)\) is an
odd permutation, and so \(A_n\) and \(T\) are permutable
complements. It now follows that \(S_n\) is an internal semidirect
product of \(A_n\) by \(T\). \(\square\)

**Example 6.5.** Generalizing **Example 5.5.** and **Example 6.2.**, we
consider the dihedral group

\[D_n = \langle \sigma, \tau \mid \sigma^n = \tau^2 = 1, \sigma \tau = \tau \sigma^{n-1} \rangle.\]

Let \(S = \langle \sigma \rangle\) and \(T = \langle \tau \rangle\), so that \(S \cap T = \{\operatorname{id}\}\) and \(ST = D_n\). Observe now that every element of \(D_n\) is of the form \(\sigma^k \tau^l\). Given \(\sigma^k \tau^l \in D_n\) and \(\sigma^m \in S\), we see that

\[ \begin{align*} (\sigma^k \tau^l) \sigma^m (\sigma^k \tau^l)^{-1} &= \sigma^k \tau^l \sigma^m \tau^{-l} \sigma^{-k} \\ &= \sigma^k \tau^l \tau^{-l} \sigma^{-m} \sigma^{-k} \\ &= \sigma^{k-m-k} \in S. \end{align*}\]

It follows that \(S \triangleleft D_n\), whence \(D_n\) is an internal semidirect product of \(S\) by \(T\). \(\square\)

**Example 6.6.** Consider the unit quaternion group \(Q =
\{1,-1,i,-i,j,-j,k,-k\}\).
Since every nontrivial subgroup of \(Q\) must contain
\(\{-1,1\}\), it follows that no two nontrivial subgroups of \(Q\)
are permutable complements in \(Q\). Therefore, \(Q\) is not an
internal semidirect product of its nontrivial subgroups. \(\square\)

We now turn to the problem of constructing an *external* semidirect
product of two groups, without considering them as subgroups of an
ambient group. Note that if \(G = N \rtimes K\), then the conjugation
action \(\varphi:K \to \operatorname{Aut}(N)\) of \(K\) on
\(N\), given by the formula \(\varphi(k)(n) = knk^{-1}\), is a
well-defined group homomorphism by the normality of \(N\) in \(G\).
In this case, we see that the generic group product in \(G = NK\)
takes the following form:

\[(6.7) \, \, \, \, \begin{align*} (n_{1}k_{1})(n_{2}k_{2}) &= n_{1}(k_{1}n_{2}k_{1}^{-1})k_{1}k_{2} \\ &= \left(n_{1}\varphi_{k_{1}}(n_{2})\right) (k_{1}k_{2}). \end{align*}\]

Taking a cue from this computation, we make the following definition:

Definition 6.8.Let \(N\) and \(K\) be groups. Given a homomorphism \(\varphi:K \to \operatorname{Aut}(N)\), we define theexternal semidirect product of \(N\) by \(K\) with respect to \(\varphi\)to be the cartesian product \(N \times K\) equipped with the group operation $$(n_1,k_1)(n_2,k_2) = (n_1\varphi_{k_1}(n_2), k_1k_2)$$ for all \(n_1,n_2 \in N\) and \(k_1,k_2 \in K\). We denote the resulting group by \(N \rtimes_\varphi K\).

Exercise 6.9.Show that every external semidirect product (N \rtimes_\varphi K) is a group.

Mimicking **Proposition 5.6**, we show that an internal semidirect
product is isomorphic to the corresponding external semidirect product.

Proposition 6.10.Let \(N\) and \(K\) be subgroups of a group \(G\). If \(N\) and \(K\) are permutable complements in \(G\), \(N \triangleleft G\), and \(K \leq G\), then \(G \cong N \rtimes_\varphi K\), where \(\varphi:K \to \operatorname{Aut}(N)\) is the conjugation map \(\varphi_k(n) = knk^{-1}\).

**Proof.** We define a map \(\Gamma:N \rtimes_\varphi K \to G\)
by setting \(\Gamma(n,k) = nk\). Observe that

\[ \begin{align*} \Gamma(n_1,k_1)\Gamma(n_2,k_2) &= n_1k_1n_2k_2 \\ &= n_1 \varphi_{k_1}(n_2) k_1k_2 \\ &= \Gamma( (n_1,k_1)(n_2,k_2) ) \end{align*} \]

by what we have shown in (6.7), and so \(\Gamma\) is a group homomorphism. If \(\Gamma(n,k) = 1_G\), then \(n = k^{-1}\), whence \(n,k^{-1} \in N \cap K\). Since the intersection is trivial, we see that \(n = k^{-1} = 1_G\), whence the kernel of \(\Gamma\) is trivial. Finally, \(\Gamma\) is surjective, as \(NK = G\). \(\square\)

Conversely, an analogue of **Proposition 5.7** holds as well.

Proposition 6.11.If \(G \cong H_1 \rtimes_\varphi H_2\), then \(G\) has a normal subgroup \(N\) and a subgroup \(K\) such that \(N \cong H_1\), \(K \cong K_1\), and \(N\) and \(K\) are permutable complements in \(G\).

**Proof.** Let \(N = H_1 \times \{1_{H_2}\}\) and \(K =
\{1_{H_1}\} \times H_2\), so that \(N\) and \(K\) are
permutable complements in \(G\). To show that \(N \triangleleft
G\), we fix \((n,1) \in N\) and \((n’,k’) \in G\). We first
compute \((n’,k’)^{-1}\). To this end, we let \((a,b) =
(n’,k’)^{-1}\) and observe that

\[1_G = (n’,k’)(a,b) = (n’ \varphi_{k’}(a) , k’ b).\]

Therefore, \(b = (k’)^{-1}\). On the other hand,

\[1_G = (a,b)(n’,k’) = (a \varphi_{b}(n’), b k’),\]

so that \(a = \varphi_b((n’)^{-1}) = \varphi_{(k’)^{-1}}((n’)^{-1})\). Therefore

\[(n’,k’)^{-1} = \left(\varphi(\varphi_{(k’)^{-1}}((n’)^{-1})), (k’)^{-1} \right) = (n^{-1},k^{-1}).\]

We now observe that

\[\begin{align*} (n’,k’)(n,1)(n’,k’)^{-1} &= \left( n’\varphi_{k’}(n),k’ \right) \left( (n’)^{-1}, (k’)^{-1} \right) \\ &= \left( n’\varphi_{k’}(n) \varphi_{k’}( (n’)^{-1}), k’ (k’)^{-1} \right) \\ &= \left( n’ \varphi_{k’}( n(n’)^{-1}) , 1 \right) \in N. \end{align*}\]

It follows that \(N \triangleleft G\), as was to be shown. \(\square\)

Therefore, the knowledge of automorphism groups is paramount in
determining whether a group can be realized as a semidirect product.
Here is a simple example; see Section 7.2 of Rotman, *An Introduction
to the Theory of
Groups* for
a more comprehensive survey.

**Example 6.12.** Let \(G\) and \(H\) be finite groups, and let
\(\varphi:G \to \operatorname{Aut}(H)\) be a group homomorphism.
Lagrange’s theorem, in conjunction with the first isomorphism theorem,
implies that \(|G| = |\ker \varphi| |\operatorname{im} \varphi|\),
and so \(|\operatorname{im} \varphi|\) divides \(|G|\). Moreover,
Lagrange’s theorem implies that \(|\operatorname{im} \varphi|\)
divides \(|\operatorname{Aut}(H)\). Therefore, if \(|G|\) and
\(|\operatorname{Aut}(H)\) are relatively prime, then \(\varphi\)
must be the trivial homomorphism, and so \(G \rtimes_\varphi H\)
must be the direct product \(G \times H\).

Now, we observe that \(\operatorname{Aut}(\mathbb{Z}/p\mathbb{Z}) \cong \mathbb{Z}/(p-1)\mathbb{Z}\) whenever \(p\) is prime. Indeed, Lagrange’s theorem implies that every non-identity element of \(\mathbb{Z}/p\mathbb{Z}\) is a generator. Now, every map from a cyclic group to a cyclic group that maps a generator to a generator can be extended to a group homomorphism. Since there are \(p-1\) generators, there are \(p-1\) such homomorphisms from \(\mathbb{Z}/p\mathbb{Z}\) into itself. It is easy to show that all such maps are automorphisms, and that no other map is an automorphism. The isomorphism statement now follows.

Therefore, there is no non-direct semidirect product \(\mathbb{Z}/3 \mathbb{Z} \rtimes \mathbb{Z}/5\mathbb{Z}\), as 5 and \(3-1 = 2\) are relatively prime. On the other hand, there is precisely one non-direct semidirect product \(\mathbb{Z}/3\mathbb{Z} \rtimes \mathbb{Z}/4\mathbb{Z}\), as there are precisely two ways of mapping \(\mathbb{Z}/4\mathbb{Z}\) homomorphically into \(\operatorname{Aut}(\mathbb{Z}/4\mathbb{Z})\): trivially or surjectively. \(\square\)

## 7. Classification of Groups of Small Order

As an application of the product constructions discussed so far, we establish several classifaction results for finite groups. To this end, we shall make use of the Sylow theorems, which we state here again for ease of reference:

Theorem 7.1(Sylow, 1872). Let \(G\) be a finite group of order \(p^nm\), where \(p\) is a prime number and \(m\) is an integer relatively prime to \(p\). The following holds:

First Sylow theorem.\(G\) contains a Sylow \(p\)-subgroup of order \(p^n\).Second Sylow theorem.All Sylow \(p\)-subgroups are conjugates of one another. In other words, if \(P\) and \(Q\) are Sylow \(p\)-subgroups of \(G\), then there exists a \(g \in G\) such that \(gPg^{-1} = Q\). This, in particular, implies that all Sylow \(p\)-subgroups are isomorphic to one anotherThird Sylow theorem.Let \(n_p\) denote the number of Sylow \(p\)-subgroups of \(G\). We have the following combinatorial restrictions on \(n_p\):

- \(n_p\) divides \(\vert G \vert\);
- \(n_p \equiv 1 \mbox{ mod } p\).

We also make use of Lagrange’s theorem and its corollaries repeatedly; let us record a few here:

Theorem 7.2(Lagrange). Let \(G\) be a finite group, \(H\) a subgroup of \(G\), and \(g\) an element of \(G\). The following holds:

- \(\vert g \vert\) divides \(\vert G \vert\);
- \(\vert G \vert = \vert H \vert [G:H]\);
- If \(H\) is normal in \(G\), then \(\vert G \vert = \vert H \vert \vert G/H \vert \);
- If \(\varphi:G \to G’\) is a homomorphism, then \(\vert G \vert = \vert \ker \varphi \vert \, \vert \operatorname{im} \varphi \vert.\)

Finally, we also rely on the fundamental theorem on finitely generated
abelian groups (**Theorem 6.10**).

The first result is a trivial corollary of Lagrange’s theorem

Proposition 7.2.Every finite group of prime order is cyclic.

**Proof.** Let \(G\) be a group of order \(p\). By Lagrange’s
theorem, every non-identity element of \(G\) is of order \(p\). It
follows that \(G\) is cyclic. \(\square\)

The second result makes use of the fundamental theorem on finitely generated abelian groups.

Theorem 7.4.Let \(p\) be a prime number. Every group of order \(p^2\) is isomorphic to either \(\mathbb{Z}/p^2\mathbb{Z}\) or \(\mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z}\).

**Proof.** We shall make use of two lemmas:

Lemma 7.5.If \(G\) is a \(p\)-group, then \(\vert Z(G) \vert > 1\).

**Proof of lemma.** Consider the conjugacy class equation (**Theorem 3.11** in my blog post on group actions):

\[ \displaystyle |G| = |Z(G)| + \sum_{g \in \Gamma \smallsetminus Z(G)} [G:C_G(g)]. \]

Since \(C_G(g) \leq G\) for all \(g \in G\), Lagrange’s theorem implies that each \([G:G_C(g)]\) is divisible by \(p\). Since \(|G|\) is divisible by \(p\), it follows that \(|Z(G)|\) is divisible by \(p\). \(\square\)

Lemma 7.6.If \(G/Z(G)\) is cyclic, then \(G\) is abelian.

**Proof of lemma.** Let \(\pi:G \to G/Z(G)\) be the canonical
projection map, and find \(g \in G\) such that \(\pi(g) = gZ(G)\)
is a generator of \(G/Z(G)\). Fix \(x,y \in G\) and find \(m,n
\in \mathbb{N}\) such that \(x \in \pi(g^m)\) and \(y \in
\pi(g^n)\). We can find \(x’,y’ \in Z(G)\) such that \(x =
g^mx’\) and \(y = g^n y’\). We now observe that

\[\begin{align*} xy &= g^mx’ g^n y’ \\ &= y’ g^mg^n x’ \\ &= y’ g^n g^m x’ \\ &= y’ g^n x’g^m \\ &= yx. \end{align*}\]

Since \(x\) and \(y\) were arbitrary, we conclude that \(G\) is abelian. \(\square\)

We now let \(G\) be a group of order \(p^2\). Suppose for a
contradiction that \(G\) is nonabelian. By Lagrange’s theorem,
\(|Z(G)|\) is either 1 or \(p\). **Lemma 3.5** implies that
\(|Z(G)| = p\). Since \(Z(G)\) is normal in \(G\), we apply
Lagrange’s theorem to conclude that \(|G/Z(G)| = p\). It follows from
**Proposition 3.2** that \(G/Z(G)\) is cyclic. Lemma 3.6 now implies
that \(G\) is abelian, which is evidently absurd. It follows that
\(G\) is abelian, whence the fundamental theorem on finitely generated
abelian groups (**Theorem 1.10**) implies the desired classification
result. \(\square\)

The next result is another one of general nature, covering the
two-factor cases that **Theorem 3.4** leaves out.

Theorem 7.7.Let \(p\) and \(q\) be prime numbers such that \(p > q\). Every group of order \(pq\) is isomorphic to either the cyclic group \(\mathbb{Z}/pq\mathbb{Z}\) or a group given by the presentation $$\langle a,b \mid a^q = b^p = 1, aba^{-1} = b^m \rangle,$$ where \(m^q \equiv 1 \mbox{ mod } p\) and \(m \not\equiv 1 \mbox{ and } p\). In particular, if \(q \nmid p-1\), then the second case cannot occur.

**Proof.** We shall make use of three lemmas:

Lemma 7.8.Let \(G\) be a finite group containing a subgroup \(H\) of index \(r\), where \(r\) is the smallest prime divisor of \(\vert G \vert\). Then \(H \triangleleft G\).

**Proof of lemma**. Let \(G/H\) denote the left coset space \(\{gH :
g \in G\}\), so that \(|G/H| = r\). Let \(\varphi:G \to
S_{G/H}\) be the action of \(G\) on \(G/H\) by left multiplication
(see Example 2.12).
By the first isomorphism theorem, the quotient group \(G/\ker
\varphi\) is isomorphic to a subgroup of \(S_{G/H}\). Since
\(|S_{G/H}| = r!\), Lagrange’s theorem implies that \(|G/\ker
\varphi|\) divides \(r!\).

Now, Lagrange’s theorem also implies that \(\vert G/\ker \varphi \vert\) divides \(\vert G \vert\). Since \(r\) is the smallest prime that divides \(\vert G \vert\), we must have \(\vert G/ker \varphi \vert = 1 \mbox{ or } r\). The action \(\varphi\) is nontrivial, and so we must have \(\vert G/\ker \varphi| = r \vert\).

For each \(g \in \ker \varphi\), we see that \(gH =\pi(g)(H) = H\), and so \(g \in H\). Therefore, \(\ker \varphi \leq H\). Lagrange’s theorem implies that

\[r = [G:\ker \varphi] = [G: H][H:\ker \varphi] = r [H:\ker \varphi] .\]

It follows that \([H:\ker \varphi] = 1\), and so \(H = \ker \varphi\). Since \(\ker \varphi\) is normal in \(G\), the proof is complete. \(\square\)

Lemma 7.9.Let \(G\) be a finite group of order \(p^m q^n\). If all of the Sylow subgroups of \(G\) are normal, then \(G\) isomorphic to the direct product of its Sylow subgroups.

**Proof of lemma.** The **third Sylow theorem**, in conjunction with the
**second Sylow theorem**, implies that \(G\) has precisely one Sylow
\(p\)-subgroup \(A\) and one Sylow \(q\)-subgroup \(B\). By
Lagrange’s theorem, \(A \cap B = \{1_G\}\). Moreover, \(|AB| =
p^m q^n\), an dso \(AB = G\). Therefore, \(A\) and \(B\) are
permutable complements in \(G\), both of which are normal in \(G\)
by hypothesis. It follows from **Proposition 1.6** that \(G \cong A
\times B\). \(\square\)

Lemma 7.10(Chinese remainder theorem for groups). If \(k\) and \(n\) are relatively prime positive integers, then \(\mathbb{Z}/kn\mathbb{Z} \cong \mathbb{Z}/k\mathbb{Z} \times \mathbb{Z} / n \mathbb{Z}\).

**Proof of lemma.** Consider \(\mathbb{Z}/k\mathbb{Z} \times
\mathbb{Z}/n\mathbb{Z}\), which is of order \(kn\). Let \(x =
(1,0)\) and \(y = (0,1)\), so that \(|x| = k\) and \(|y| = n\).
Since \(\mathbb{Z}/k\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}\) is
abelian, we see that \((xy)^i = x^i y^i\) for all \(i \in
\mathbb{Z}\). As \(k\) and \(n\) are relatively prime, \(|xy| =
kn\), and the product group is cyclic. \(\square\)

We now let \(G\) be a group of order \(pq\). The **first Sylow
theorem** implies that \(G\) contains a Sylow \(p\)-subgroup
\(P\). Since |P| = p, **Proposition 7.2** implies that \(P\) is
cyclic; let \(b\) be an element of \(P\) of order \(p\). **Lemma
7.8** implies that \(P \triangleleft G\).

The **first Sylow theorem** also implies that \(G\) contains a Sylow
\(q\)-subgroup \(Q\). Once again, we can invoke **Proposition 3.2**
to find an element \(a\) of \(Q\) of order \(q\). If \(n_q =
1\), then the **second Sylow theorem** implies that \(Q \triangleleft
G\), whence \(G \cong P \times Q\) by **Lemma 7.9**. **Proposition
7.2** implies that \(P \cong \mathbb{Z}/p\mathbb{Z}\) and \(Q
\cong \mathbb{Z}/q\mathbb{Z}\), and it follows from **Lemma 7.10**
that \(G \cong \mathbb{Z}/pq \mathbb{Z}\).

We therefore assume that \(n_q \neq 1\). The **third Sylow theorem**
furnishes a positive integer \(k\) such that \(n_q = kq + 1\). The
**third Sylow theorem** also implies that \(n_q\) divides \(pq\),
whence we must have \(n_q = p\). This, in particular, implies that
\(kq = p-1\), and so \(q \mid p-1\).

Now, \(P \triangleleft G\), and so \(aba^{-1} = b^m\) for some \(m\). If \(m \equiv 1 \mbox{ mod } p\), then \(ab = ba\), and \(G\) is abelian. Let us therefore assume that \(m \not\equiv 1 \mbox{ and } p\).

We claim that \(a^l b a^{-l} = b^{m^l}\) for all \(l \in \mathbb{N}\). This, in particular, implies that \(b = a^q b a^{-q} = b^{m^q}\), whence \(m^q \equiv 1 \mbox{ mod } p\), as was to be shown. It therefore suffices to establish the claim.

We proceed by induction. The \(l = 1\) case has already been established. We fix \(l_0\) and assume that the \(l = l_0 – 1\) case holds. Then

\[\begin{align*} a^{l_0} b a^{-l_0} &= a (a^{l_0 – 1} b a^{-(l_0-1)}) a^{-1} = a b^{m^{l_0-1}} a^{-1} \\ &= (aba^{-1})^{m^{l_0-1}} = (b^m)^{m^{l_0-1}} = b^{m^{l_0}},\end{align*} \]

as was to be shown. \(\square\)

**Remark 7.11.** By generalizing the notion of internal direct products
to more than two subgroups, we can establish the following extension of
**Lemma 7.9**: If \(G\) is a finite group such that all of its Sylow
subgroups are normal, then \(G\) is isomorphic to the direct product
of all of its Sylow subgroups.

Corollary 7.12.If \(p\) is prime, then every group of order \(2p\) is either cyclic or isomorphic to the dihedral group \(D_p\).

**Proof.** If the group is non-cyclic, then it is isomorphic to a group
given by the presentation

\[\langle a,b \mid a^2 = b^p = 1, aba^{-1} = b^m \rangle,\]

where \(m^2 \equiv 1 \mbox{ mod } p\) and \(m \not\equiv 1 \mbox{ and } p\). Among \(1 \leq m \leq p\), we see that \(m = p-1\) is the only choice that satisfies both restrictions. It now suffices to note that the above presentation with \(m = p – 1\) yields the dihedral group \(D_p\). \(\square\)

The only groups of order at most \(15\) that are not covered by the results established thus far are groups of order 8 and groups of order 12. We now tackle them "by hand".

Theorem 7.13.Every group of order 8 is isomorphic to \(\mathbb{Z}/8\mathbb{Z}\), \(\mathbb{Z}/2 \mathbb{Z} \times \mathbb{Z} / 4 \mathbb{Z}\), \(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}\), the unit quaternion group \(Q\), or the dihedral group \(D_4\).

**Proof.** Let \(G\) be a group of order 8. By Lagrange’s theorem,
every element of \(G\) is of order 1, 2, 4, or 8. If \(G\) contains
an element of order 8, then \(G \cong \mathbb{Z}/8\mathbb{Z}\).

Let us suppose that \(G\) has no element of order 8. If \(G\) has no element of order 4, then every non-identity element of \(G\) is of order 2. Whenever \(x,y \in G\), we have the identity

\[xyxy = xy(yx)^{-1} = xyyx = x^2 = 1_G,\]

whence \(G\) is abelian. Fix three non-identity elements \(a,b,c\) of \(G\) and define a mapping \(\varphi:(\mathbb{Z}/2\mathbb{Z})^3 \to G\) by setting \(\varphi(k,m,n) = a^kb^mc^n.\) It is routine to check that \(\varphi\) is a group isomorphism.

Let us now suppose that \(G\) has an element of order \(4\). Assume for now that \(G\) is abelian. Fix an element \(a\) of order 4, and pick an element \(b\) of order 2 that is not a power of \(a\). Define a mapping \(\varphi:\mathbb{Z}/4\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \to G\) by setting \(\varphi(m,n) = a^mb^n\). Once again, it is easy to check that \(\varphi\) is a group isomorphism.

Finally, we suppose that \(G\) is a nonabelian group of order 8 that
contains an element \(x\) of order 4. Since \([G:\langle x
\rangle] = 2\), we see that \(\langle x \rangle \triangleleft
G\): see **Lemma 4.2**. In
particular, \(G/\langle x \rangle \cong \mathbb{Z}/2\mathbb{Z}\).
Therefore, every \(g \in G\) such that \(g \notin \langle x
\rangle\) satisfies the property that \(g^2 \in \langle x
\rangle\).

We now fix such a \(g\). If \(g^2 = x\) or \(g^2 = x^3\), then \(g\) is of order 8, which is absurd. We must therefore have \(g^2 = 1_G\) or \(g^2 = x^2\).

Moreover, \(\langle x \rangle\) is normal in \(G\), and so \(gxg^{-1} = a^n\) for some \(n\). \(n\) cannot be 0, as \(gx \neq g\). \(n\) cannot be 1, as \(G\) is nonabelian. \(n\) cannot be 2, as as \(x\) and \(gxg^{-1}\) must have the same order. We conclude that \(g x = x^3g\).

Therefore, \(G\) is isomorphic to either

\[\langle g,x \mid g^2 = x^4 = 1_G, gx = x^3g \rangle\]

or

\[\langle g,x \mid x^4 = 1, g^2 = x^2, gx = g^3g \rangle.\]

The first case corresponds to \(D_4\); the second case corresponds to \(Q\). \(\square\)

Theorem 7.15.Every group of order 12 is isomorphic to \(\mathbb{Z}/12\mathbb{Z}\), \(\mathbb{Z}/6 \mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}\), \(A_4\), \(D_6\), or \(\mathbb{Z}/3\mathbb{Z} \rtimes_\varphi \mathbb{Z}/4\mathbb{Z}\), where \(\varphi:\mathbb{Z}/4\mathbb{Z} \to \operatorname{Aut}(\mathbb{Z}/3\mathbb{Z})\) is the unique nontrivial group homomorphism (seeExample 2.12).

**Proof.** We suppose that \(G\) is a nonabelian group of order
12 and show that \(G\) is isomorphic to \(A_4\), \(D_6\), or
\(\mathbb{Z}/3\mathbb{Z} \rtimes_\varphi
\mathbb{Z}/4\mathbb{Z}\). To this end, we assume without loss of
generality that \(G \not\cong A_4\).

The **first Sylow theorem** furnishes a Sylow 3-subgroup \(P\), which
is of order 3. We show that \(P \triangleleft G\). We can think of
each element of \(G\) as a permutation on the set of all left cosets
of \(P\); since \([G:P] = 4\), this furnishes a homomorphism
\(\varphi:G \to S_4\) such that \(\ker \varphi \leq P\).
\(|P| = 3\), and so Lagrange’s theorem implies that \(|\ker
\varphi|\) is either 1 or 3. If \(\ker \varphi\) is trivial, then
\(G\) is isomorphic to a subgroup of \(S_4\) of order 12. Since
\(A_4\) is the only subgroup of \(S_4\) of order 12, we see that
\(G \cong A_4\), which is absurd. Therefore, \(|\ker \varphi| =
3\), and so \(\ker \varphi = P\). It follows that \(P
\triangleleft G\).

The **first Sylow theorem** also furnishes a Sylow 2-subgroup \(Q\),
which is of order 4. By Lagrange’s theorem \(|P \cap Q| = 1\), and so
\(P \cap Q = \{1_G\}\). Moreover, \(|PQ| = 12\), and so \(PQ =
G\). It now follows from **Proposition 2.10** that \(G \cong P
\rtimes_\varphi Q\). Since \(|Q| = 4\) and
\(\operatorname{Aut}(P) \cong \mathbb{Z}/2\mathbb{Z}\), there are
two possible homomorphisms \(\varphi:Q \to \operatorname{Aut}(P)\):
the trivial homomorphism, and the surjective homomorphism (see **Example
2.12**). The first case corresponds to the direct product \(P \times
Q\). Since \(P\) and \(Q\) are both abelian, this implies that
\(G\) is abelian, which is absurd.

The second case corresponds to the semidirect product \(P \rtimes_\varphi Q\), where \(\varphi:Q \to \operatorname{Aut}(P)\) is the unique surjective homomorphism. If \(Q \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}\), then \(P \rtimes Q \cong D_6\). If \(Q \cong \mathbb{Z}/4\mathbb{Z}\), then \(P \rtimes_\varphi Q \cong \mathbb{Z}/3\mathbb{Z} \rtimes_\varphi \mathbb{Z}/4\mathbb{Z}\). \(\square\)

## 8. Exact Sequences and the Extension Problem

We now approach the study of semidirect product from a different angle. We first note that \(N \rtimes K\) is obtained by piecing together \(N\) and \((N \rtimes K)/N\).

Proposition 8.1.\((N \rtimes_\varphi K)/(N \times \{1_K\}) \cong K\).

**Proof.** It suffices to observe that

\((N \rtimes_\varphi K)/(N \times \{1_K\}) \cong 1 \times \{1_K\} \cong K\). \(\square\)

In light of this, we make the following definition:

Definition 8.2.Let \(N\) and \(K\) be groups. We say that \(G\) is anextensionof \(N\) by \(K\) if \(G\) has a normal subgroup \(N’\) such that \(N’ \cong N\) and \(G/N’ \cong K\).

Observe that \(G\) is an extension of \(N\) by \(K\) if and only if there exist an injective homomorphism \(i:N \to G\) and a surjective homomorphism \(p:G \to K\) such that \(\operatorname{im} i = \ker p\). Indeed, if such maps exist, then \(N’ \operatorname{im} i \cong N\) and

\[K = \operatorname{im} p \cong G / \ker p = G / \operatorname{im} i = G/N’.\]

Conversely, if we have isomorphisms \(i:N \to N’\) and \(\bar{p}:G/N’ \to K\), then we can define a mapping \(p:G \to K\) by setting \(p(g) = \bar{p}(gN’)\) and check that \(\operatorname{ker} p = N’ = \operatorname{im} i\).

We isolate the crucial condition:

Definition 8.3.Let \(A \xrightarrow{f} B \xrightarrow{g} C\) be a sequence of group homomorphisms. The sequence is said to beexactin case \(\operatorname{im} f = \ker g\). A longer sequence $$G_1 \xrightarrow{f_1} G_2 \xrightarrow{f_2} \cdots \xrightarrow{f_{n-1}} G_n$$ is said to be exact if each joint \(G_{k-1} \xrightarrow{f_{k-1}} G_k \xrightarrow{f_k} G_{k+1}\) is exact.

Definition 8.4.Ashort exact sequence of groupsis an exact sequence of groups of the form $$1 \to A \xrightarrow{f} B \xrightarrow{g} C \to 1,$$ where \(1\) denotes the trivial group, \(1 \to A\) denotes the group homomorphism that maps the identity to the identity, and \(C \to 1\) denotes the trivial homomorphism.

We make a few observations. Since the image of \(1 \to A\) is trivial, the exactness condition on \(1 \to A \xrightarrow{f} B\) means that \(\ker f = \{1_B\}\), or that \(f\) is injective. The kernel of \(C \to 1\) is \(C\), and so the exactness condition on \(B \xrightarrow{g} C \to 1\) implies that \(\operatorname{im} g = C\), or that \(g\) is surjective. Finally, the exactness condition on \(A \xrightarrow{f} B \xrightarrow{g} C\) states merely that \(\operatorname{im} f = \operatorname{g}\). We thus see that short exact sequences are precisely the embodiments of extensions of groups:

Proposition 8.5.\(G\) is an extension of \(N\) by \(K\) if and only if there exists a short exact sequence \(1 \to N \to G \to K \to 1\).

The *extension problem*, then, is to determine all groups \(G\) that
makes the sequence \(1 \to N \to G \to K \to 1\) exact for fixed
\(N\) and \(K\). The extension problem lies at the heart of the
classification of finite groups. To see why, we need a few notions from
the theory of normal series:

Definition 8.6.Anormal seriesof a group \(G\) is a sequence of subgroups $$G = G_0 \geq G_1 \geq \cdots \geq G_n = 1$$ such that \(G_{k+1} \triangleleft G_k\) for all \(0 \leq k \leq k-1\). The\(k\)th factor groupof the above normal series is the quotient group \(G_k/G_{k+1}\). Thelengthof the above normal series is the number of nontrivial factor groups. The above normal series is said to be acomposition seriesin case each \(G_{k+1}\) is either a maximal proper normal subgroup of \(G_k\) or \(G_{k+1} = G_k\).

The Jordan–Hölder theorem states that each group has a unique composition series in a precise sense. Now, we take a composition series

\[G = G_0 \geq G_1 \geq \cdots \geq G_n = 1\]

of \(G\) and let \(F_k = G_k/G_{k-1}\) be the corresponding factor groups. Note that, at each stage, \(G_k\) is an extension of \(G_{k-1}\) by \(F_k\). Moreover, we observe that each factor group is either simple or trivial. Therefore, an arbitrary finite group \(G\) can be obtained by carrying out the “extending by a finite simple group” process finitely many times—and, by the Jordan–Hölder theorem, this procedure is uniquely determined by \(G\).

In other words, we can complete the classification of finite groups by (1) classifying all finite simple groups and (2) solving the extension problem. (1) has been completed, after 60 years of hard work by many group theorists. (2) is still unsolved, as there is no known general theory of classifying all extensions of a given group: see the Wikipedia page on group extensions.

The unresolved state of the group extension problem indicates that there are, in general, many group extensions that fail to be semidirect products. The following exercise sheds light on this phenomenon:

Exercise 8.7.We say that a short exact sequence of groups $$1 \to N \xrightarrow{f} G \xrightarrow{g} K \to 1$$ isleft splitif there exists a group homomorphism \(G \xrightarrow{f’} N\) such that \(f’ \circ f = \operatorname{id}_N\). Similarly, the short exact sequence is said to beright splitif there exists a group homomorphism \(K \xrightarrow{g’} G\) such that \(g \circ g’ = \operatorname{id}_K\). Show that a group extension \(G\) of \(N\) by \(K\) is a semidirect product of \(N\) by \(K\) if and only if the corresponding short exact sequence is right split. Show also that a group extension \(G\) of \(N\) by \(K\) is a direct product of \(N\) and \(K\) if and only if the corresponding short exact sequence is left split. Conclude that left split implies right split.

See Chapter 7 of Rotman, *An Introduction to the Theory of
Groups* for
a more detailed survey of the group extension problem. The extension
problem for abelian groups (and, in general, abelian
categories) is also
heavily studied. The proper context for this type of problem is
*homological algebra*: see, for
example, Weibel, *An Introduction to Homological
Algebra*.