\documentclass[11pt]{amsart}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{amssymb}
%\textwidth 118mm
\textheight 197mm
\usepackage{fancyheadings}
\pagestyle{fancy}
\lhead[]{\thepage}
\rhead[\thepage]{}
\chead{J.\@ L.\@ Walsh: {\it Normal Orthogonal Functions.}}
\renewcommand{\headrulewidth}{0pt}
\lfoot[]{}
\cfoot[]{}
\rfoot[]{}
\newtheorem{theorem}{\sc \hspace{5mm}Theorem}
\newtheorem{lemma}{\sc \hspace{5mm}Lemma}
\renewcommand{\thetheorem}{\Roman{theorem}}
\renewcommand{\thelemma}{}
\renewcommand{\thefootnote}{\fnsymbol{footnote}}
\newcommand{\Wa}[3]{$\varphi^{#1}_{#2}(#3)$}
\newcommand{\W}[2]{$\varphi^{#1}_{#2}$}
\newcommand{\Xa}[3]{$\chi^{#1}_{#2}(#3)$}
\newcommand{\X}[2]{$\chi^{#1}_{#2}$}
\setcounter{page}{5}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{A CLOSED SET OF NORMAL ORTHOGONAL FUNCTIONS\footnotemark[1]}
\footnotetext[1]{Presented to the
American Mathematical Society, Feb.\ 25, 1922.}
\author{By J. L. Walsh}
\date{}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{\bf Introduction.}
A set of normal orthogonal functions $\{\chi\}$ for the interval $0 \leqq x
\leqq 1$ has been constructed by Haar\footnote[2]{{\it Mathematische Annalen},
Vol.\ 69 (1910), pp.\ 331--371; especially pp.\ 361--371.}, each function
taking merely one constant value in each of a finite number of sub-intervals
into which the entire interval $(0,1)$ is divided. Haar's set is, however,
merely one of an infinity of sets which can be constructed of functions of this
same character. It is the object of the present paper to study a certain new
closed set of functions $\{\varphi\}$ normal and orthogonal on the interval
$(0, 1)$; each function $\varphi$ has this same property of being constant over
each of a finite number of sub-intervals into which the interval $(0,1)$ is
divided. In fact each function $\varphi$ takes only the values $+1$ and $-1$,
except at a finite number of points of discontinuity, where it takes the value
zero.
The chief interest of the set $\varphi$ lies in its similarity to the usual (e.g.,
sine, cosine, Sturm-Liouville, Legendre) set of orthogonal functions, while the
chief interest of the set \X{}{} lies in its {\it dissimilarity} to these
ordinary sets. The set $\varphi$ shares with the familiar sets the following
properties, none of which is possessed by the set \X{}{}: the $n$th function
has $n-1$ zeroes (or better, sign-changes) interior to the interval considered,
each function is either odd or even with respect to the mid-point of the
interval, no function vanishes identically on any sub-interval of the original
interval, and the entire set is uniformly bounded.
Each function \X{}{} can be expressed as a linear combination of a finite number
of functions $\varphi$, so the paper illustrates the changes in properties which
may arise from a simple orthogonal transformation of a set of functions.
In \S~1 we define the set \X{}{} and give some of its principal properties. In
\S~2 we define the set $\varphi$ and compare it with the set \X{}{}. In \S~3
and \S~4 we develop some of the properties of the set $\varphi$, and prove in
particular that every continuous function of bounded variation can be expanded
in terms of the $\varphi$'s and that every continuous function can be so
developed in the sense not of convergence of the series but of summability by
the first Ces\`aro mean. In \S~5 it is proved that there exists a continuous
function which cannot be expanded in a convergent series of the functions
$\varphi$. In \S~6 there is studied the nature of the approach of the
approximating functions to the sum function at a point of discontinuity, and in
\S~7 there is considered the uniqueness of the development of a function.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf Haar's Set \X{}{}.}
\label{sec:HaarSet}
Consider the following set of functions:
\[ \begin{array}{rl}
f_0(x) \equiv 1, & 0 \leqq x \leqq 1, \\
~ \\
f_1^{(1)}(x) \equiv \left\{ \begin{array}{ll}
1, & 0 \leqq x < \frac{1}{2}, \\
% ~ \\
0, & \frac{1}{2} < x \leqq 1,
\end{array}
\right.
&
f_1^{(2)}(x) \equiv \left\{ \begin{array}{ll}
1, & \frac{1}{2} < x \leqq 1, \\
% ~ \\
0, & 0 \leqq x < \frac{1}{2},
\end{array}
\right.
\end{array}
\]
\begin{center}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\end{center}
\[
\begin{array}{lr}
f_k^{(i)}(x) \equiv \left\{ \begin{array}{ll}
1, & \displaystyle \frac{i-1}{2^k} < x < \frac{i}{2^k}, \\
~ \\
0, & \displaystyle 0 \leqq x < \frac{i-1}{2^k},\ \mbox{or}\ \frac{i}{2^k} < x \leqq 1,
\end{array}
\right.
&
\begin{array}{r}
i = 1, 2, 3, \cdots, 2^k, \\
% ~ \\
k = 1, 2, 3, \cdots, \infty;
\end{array} \\
% ~
\end{array}
\]
these functions may be defined at a point of discontinuity to have the average
of the limits approached on the two sides of the discontinuity.
If we have at our disposal all the functions $f_k^{(i)}$, it is clear that we
can approximate to any continuous function in the interval $0 \leqq x \leqq 1$
as closely as desired and hence that we can expand any continuous function in a
uniformly convergent series of functions $f_k^{(i)}$. For a continuous function
$F(x)$ is uniformly continuous in the interval $(0,1)$, and thus uniformly in
that entire interval can be approximated as closely as desired by a linear
combination of the functions $f_k^{(i)}$ where $k$ is chosen sufficiently large
but fixed. The approximation can be made better and better and thus will lead
to a uniformly convergent series of functions $f_k^{(i)}$.
Haar's set \X{}{} may be found by normalizing and orthogonalizing the set
$f_k^{(i)}$, those functions to be ordered with increasing $k$, and for each $k$
with increasing $i$. The set \X{}{} consists of the following
functions:\footnote[1]{L.\@ c., p.\@ 361.}
\[
\begin{array}{ccc}
\chi_0(x) \equiv 1, & 0 \leqq x \leqq 1, & \chi_1(x) \equiv \left\{
\begin{array}{rl}
1, & 0 \leqq x < \frac{1}{2}, \\
% ~ \\
-1, & \frac{1}{2} < x \leqq 1,
\end{array}\right.
\end{array}
\]
\[
\begin{array}{rclrcll}
\chi_2^{(1)}(x) & = & \sqrt{2}, & \chi_2^{(2)} & = & 0, & 0 \leqq x < \frac{1}{4}, \\
% ~ \\
~ & = & -\sqrt{2}, & ~ & = & 0, & \frac{1}{4} < x < \frac{1}{2}, \\
% ~ \\
~ & = & 0, & ~ & = & \sqrt{2}, & \frac{1}{2} < x < \frac{3}{4}, \\
% ~ \\
~ & = & 0, & ~ & = & -\sqrt{2}, & \frac{3}{4} < x \leqq 1,
\end{array}
\]
\begin{center}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\end{center}
\[
\begin{array}{rcllrcl}
\chi_n^{(k)} & = & \sqrt{2^{n-1}}, & \displaystyle \frac{k-1}{2^{n-1}} < x < \frac{2k-1}{2^n}, & k & = & 1, 2, 3, \cdots, 2^{n-1}, \\
~ \\
~ & = & -\sqrt{2^{n-1}}, & \displaystyle \frac{2k-1}{2^n} < x < \frac{k}{2^{n-1}}, & n & = & 1, 2, 3, \cdots, \infty, \\
~ \\
~ & = & 0, & \displaystyle 0 < x < \frac{k-1}{2^{n-1}}\ \mbox{or}\ \frac{k}{2^{n-1}} < x < 1.
\end{array}
\]
The same convention as to the value of \X{(k)}{n} at a point of
discontinuity is made as for the $f_n^{(k)}$, and \Xa{(k)}{n}{0} and
\Xa{(k)}{n}{1} are defined as the limits of \X{(k)}{n} as $x$ approaches
0 and 1.
For any particular value of $N$, all the functions $f_n^{(k)}, n < N$, can be
expressed linearly in terms of the functions \X{(k)}{n}, $n < N$, and
conversely.
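By way of illustration, for $N=3$ one finds by comparing values on each quarter of the interval that
\[
\begin{array}{rcl}
f_1^{(1)} & = & \frac{1}{2}(\chi_0 + \chi_1), \\
f_1^{(2)} & = & \frac{1}{2}(\chi_0 - \chi_1), \\
f_2^{(1)} & = & \frac{1}{4}(\chi_0 + \chi_1) + \frac{1}{4}\sqrt{2}\,\chi_2^{(1)};
\end{array}
\]
each coefficient is the integral over $(0,1)$ of the product of the function $f$ by the corresponding function \X{}{}. These particular relations are worked out here merely as a check and are not quoted from Haar's paper.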
Let $F(x)$ be any function integrable and with an integrable square in the
interval $(0,1)$; its formal development in terms of the functions \X{}{} is
\begin{eqnarray}
F(x) & \sim & \chi_0(x) \int_0^1 F(y)\chi_0(y) \mathit{dy} + \chi_1(x) \int_0^1 F(y)\chi_1(y)\mathit{dy} + \cdots \nonumber \\
~ & ~ & ~ \\
& & \mbox{} + \chi_n^{(k)}(x) \int_0^1 F(y)\chi_n^{(k)}(y)\mathit{dy} + \cdots . \nonumber
\end{eqnarray}
This series (1) is formed with coefficients determined formally as for the
Fourier expansions, and it is well known that $S_m(x)$, the sum of the first $m$
terms of this series, is that linear combination $F_m(x)$ of the first $m$ of
the functions \X{}{} which renders a minimum the integral
\[
\int_0^1 (F(x) - F_m(x))^2\mathit{dx}.
\]
That is, $S_m(x)$ is in the sense of least squares the best approximation to
$F(x)$ which can be formed from a linear combination of the first $m$ functions
\X{}{}; it is likewise true that $S_m(x)$ is the best approximation to $F(x)$
which can be formed from a linear combination of those functions $f_n^{(k)}$ that
are dependent on the first $m$ functions \X{}{}.
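The minimum property may be verified directly. In the following sketch the symbols $\psi_1, \psi_2, \cdots, \psi_m$ are introduced, for this illustration only, to denote the first $m$ functions \X{}{} in their given order; we write $a_j = \int_0^1 F(y)\psi_j(y)\mathit{dy}$ and let $c_1, c_2, \cdots, c_m$ be arbitrary constants. Since the $\psi_j$ are normal and orthogonal, we have
\[
\int_0^1 \Bigl( F(x) - \sum_{j=1}^m c_j\psi_j(x) \Bigr)^2 \mathit{dx}
 = \int_0^1 F(x)^2 \mathit{dx} - \sum_{j=1}^m a_j^2 + \sum_{j=1}^m (c_j - a_j)^2 .
\]
The right-hand member is least when $c_j = a_j$ for every $j$, that is, when the linear combination is $S_m(x)$.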
Let $F(x)$ be continuous in the closed interval $(0,1)$. If $\epsilon$ is any
positive number, there exists a corresponding number $n$ such that
\[
|F(x') - F(x'')| < \epsilon\ \ \ \ \mbox{whenever}\ \ \ \ |x' - x''| < \frac{1}{2^n}.
\]
We interpret $S_{2^n}(x)$ as a linear combination of the functions $f_n^{(k)}$.
The multiplier of the function $f_n^{(k)}$ which appears in $S_{2^n}(x)$ is
chosen so as to furnish the best approximation in the interval
${\displaystyle \left(\frac{k-1}{2^n},\frac{k}{2^n}\right)}$ to the function $F(x)$, so it is
evident that $S_{2^n}(x)$ approximates to $F(x)$ uniformly in the entire
interval $(0,1)$ with an approximation better than $\epsilon$. The function
$S_{2^{n+1}}(x)$ cannot differ from $F(x)$ by more than $\epsilon$ at any point
of the interval $(0,1)$, and so for all functions $S_{2^{n+l}}(x)$. Thus we
have
\begin{theorem}
If $F(x)$ is continuous in the interval $(0,1)$, series $(1)$ converges
uniformly to the value $F(x)$ if the terms are grouped so that each group
contains all the $2^{n-1}$ terms of a set \X{(k)}{n},
$k=1,2,3,\cdots,2^{n-1}$.
\end{theorem}
Haar proves that the series actually converges uniformly to $F(x)$ without the
grouping of terms\footnote[1]{L.\@ c., p.\@ 368.}, and establishes many other results
for expansions in terms of the set \X{}{}; to some of these results we shall
return later.
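The estimate used in the proof of Theorem I can be written out as follows; this is a sketch in which the symbol $c_k$ is introduced merely for illustration. The constant which best approximates $F(x)$ in the sense of least squares on the interval ${\displaystyle \left(\frac{k-1}{2^n},\frac{k}{2^n}\right)}$ is the mean value
\[
c_k = 2^n \int_{(k-1)/2^n}^{k/2^n} F(y)\mathit{dy},
\]
and this is the multiplier of $f_n^{(k)}$ in $S_{2^n}(x)$. For any $x$ interior to that interval we then have
\[
|F(x) - c_k| = \left| 2^n \int_{(k-1)/2^n}^{k/2^n} [F(x) - F(y)]\mathit{dy} \right|
 \leqq 2^n \int_{(k-1)/2^n}^{k/2^n} |F(x) - F(y)|\mathit{dy} < \epsilon,
\]
since $|x-y| < \frac{1}{2^n}$ throughout the interval of integration.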
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf The Set $\varphi$.}
\label{sec:SetPhi}
The set $\varphi$, which it is the main purpose of this paper to study, consists
of the following functions:
\[\begin{array}{ccc}
\varphi_0(x) \equiv 1, & 0 \leqq x \leqq 1, & \varphi_1(x) \equiv \left\{
\begin{array}{rl}
1, & 0 \leqq x < \frac{1}{2}, \\
% ~ \\
-1, & \frac{1}{2} < x \leqq 1,
\end{array}\right.
\end{array}\]
\[\begin{array}{rcl}
\varphi_2^{(1)}(x) & \equiv & \left\{ \begin{array}{rl}
1, & 0 \leqq x < \frac{1}{4}, \frac{3}{4} < x \leqq 1,\\
% ~ \\
-1, & \frac{1}{4} < x < \frac{3}{4},
\end{array}
\right. \\
~ & & \\
\varphi_2^{(2)}(x) & \equiv & \left\{ \begin{array}{rl}
1, & 0 \leqq x < \frac{1}{4}, \frac{1}{2} < x < \frac{3}{4}, \\
% ~ \\
-1, & \frac{1}{4} < x < \frac{1}{2}, \frac{3}{4} < x \leqq 1,
\end{array}
\right.
\end{array}\]
\begin{center}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\end{center}
\begin{eqnarray}\begin{array}{rcl}
\varphi_{n+1}^{(2k-1)}(x) & \equiv & \left\{ \begin{array}{l}
\varphi_n^{(k)}(2x), 0 \leqq x < \frac{1}{2}, \\
% ~ \\
(-1)^{k+1}\varphi_n^{(k)}(2x-1), \frac{1}{2} < x \leqq 1,
\end{array}
\right. \\
% ~ & & \\
\varphi_{n+1}^{(2k)}(x) & \equiv & \left\{ \begin{array}{l}
\varphi_n^{(k)}(2x), 0 \leqq x < \frac{1}{2}, \\
% ~ \\
(-1)^{k}\varphi_n^{(k)}(2x-1), \frac{1}{2} < x \leqq 1,
\end{array}
\right.\end{array}
\end{eqnarray}
\[
k = 1, 2, 3, \cdots, 2^{n-1}, \hspace{5mm} n = 2, 3, 4, \cdots, \infty.
\]
In general, the function \W{(1)}{n}, $n > 0$, is to be used, with the
horizontal scale reduced to one half and the vertical scale unchanged, to form
the functions \W{(1)}{n+1} and \W{(2)}{n+1} in each of the
halves $(0,\frac{1}{2})$, $(\frac{1}{2}, 1)$ of the original interval;
the function \Wa{(1)}{n+1}{x} is to be even and the function \W{(2)}{n+1} odd
with respect to the point $x=\frac{1}{2}$. Similarly, the function \W{(k)}{n}
is to be used to form the functions \W{(2k-1)}{n+1} and \W{(2k)}{n+1}, the
former of which is even and the latter odd with respect to the point
$x=\frac{1}{2}$.
All the functions \W{(k)}{n} are to be taken positive in the interval
${\displaystyle \left(0,\frac{1}{2^n}\right)}$.
The function \W{(k)}{n} is to be defined at points of discontinuity as were the
functions $f$ and \X{}{}, and at $x=0$ to have the value 1, and at $x=1$ to have
the value $(-1)^{k+1}$.\footnote[1]{If it is desired to develop periodic functions
by means of the set $\varphi$ [or the similar sets $f$ and \X{}{}]
simultaneously in all intervals $\cdots, (-2,-1),(-1,0),(0,1),(1,2),\cdots$, it
will be wise to change these definitions at $x=0$ and $x=1$ so that always the
value of \Wa{(k)}{n}{x} is the arithmetic mean of the limits approached at these
points to the right and to the left.}
The function \W{(k)}{n} is odd or even with respect to the point $x=\frac{1}{2}$
according as $k$ is even or odd.
The functions \W{}{0}, \W{}{1}, \W{(1)}{2}, \W{(2)}{2} have $0, 1, 2, 3$ zeroes (i.e.,
sign-changes) respectively interior to the interval $(0,1)$. The function
\Wa{(2k-1)}{n+1}{x} has twice as many zeroes as the function \W{(k)}{n}; and
\Wa{(2k)}{n+1}{x} has one more zero, namely at $x=\frac{1}{2}$, than has
\Wa{(2k-1)}{n+1}{x}.
Thus the function \W{(k)}{n} has $2^{n-1}+k-1$ zeroes.
This formula holds for $n=2$ and follows for the general case by induction.
Hence each function \W{(k)}{n} has one more zero than the preceding; the zeroes
of these functions increase in number precisely as do the zeroes of the
classical sets of functions---sine, cosine, Sturm-Liouville, Legendre, etc.
We shall at times find it convenient to use the notation \W{}{0}, \W{}{1},
\W{}{2}, $\cdots$ for the functions \W{(k)}{n}; the subscript denotes the
number of zeroes.
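As an illustration of the construction, formulas (2) with $n=2$ give the four functions \W{(k)}{3}. The following table, computed here merely as a check, shows the sign of each function on the sub-interval ${\displaystyle \left(\frac{j-1}{8},\frac{j}{8}\right)}$, $j = 1, 2, \cdots, 8$, together with the number of sign-changes:
\[
\begin{array}{c|cccccccc|c}
j & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & \mbox{sign-changes} \\ \hline
\varphi_3^{(1)} & + & - & - & + & + & - & - & + & 4 \\
\varphi_3^{(2)} & + & - & - & + & - & + & + & - & 5 \\
\varphi_3^{(3)} & + & - & + & - & - & + & - & + & 6 \\
\varphi_3^{(4)} & + & - & + & - & + & - & + & -  & 7
\end{array}
\]
The last column agrees with the number $2^{n-1}+k-1$ for $n=3$, $k=1,2,3,4$.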
The orthogonality of the system \W{}{} is easily established. Any two functions
\W{(k)}{n} are orthogonal if $n<3$, as may be found by actually testing the
various pairs of functions.
Let us assume this fact to hold for $n=1,2,3, \cdots, N-1$; we shall prove that
it holds for $n=N$.
By the method of construction of the functions \W{}{}, each of the integrals
\[\begin{array}{rlr}
\displaystyle \int_0^{\frac{1}{2}} \varphi^{(k)}_N(x)\varphi^{(i)}_m(x) \mathit{dx}, &
\displaystyle \int_{\frac{1}{2}}^1 \varphi^{(k)}_N(x)\varphi^{(i)}_m(x) \mathit{dx}, &
m \leqq N,
\end{array}\]
is the same except possibly for sign as an integral
\[
\displaystyle \int_0^1 \varphi^{(j)}_{N-1}(y)\varphi^{(l)}_{m-1}(y) \mathit{dy}
\]
after the change of variable $y=2x$ or $y=2x-1$.
Each of these two integrals [in fact, they are the same integral] whose variable
is $y$ has the value zero, so we have the orthogonality of \Wa{(k)}{N}{x} and
\Wa{(i)}{m}{x}:
\[
\displaystyle \int_0^1 \varphi^{(k)}_{N}(x)\varphi^{(i)}_{m}(x) \mathit{dx} = 0.
\]
This proof breaks down if the two functions \Wa{(j)}{N-1}{y}, \Wa{(l)}{m-1}{y}
are the same, but in that case either \Wa{(k)}{N}{x} and \Wa{(i)}{m}{x} are the
same and we do not wish to prove their orthogonality, or one of the functions
\Wa{(k)}{N}{x}, \Wa{(i)}{m}{x} is odd and the other even, so the two are
orthogonal.
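The two steps of this proof may be illustrated in the simplest cases; the particular functions chosen are merely examples. For $n=2$ the direct test on the four quarters of the interval gives
\[
\int_0^1 \varphi_2^{(1)}(x)\varphi_2^{(2)}(x)\mathit{dx}
 = \frac{1}{4}\,[(1)(1) + (-1)(-1) + (-1)(1) + (1)(-1)] = 0.
\]
For $N=3$, formulas (2) and the change of variable $y=2x$ or $y=2x-1$ give
\[
\begin{array}{rcl}
\displaystyle \int_0^{\frac{1}{2}} \varphi_3^{(1)}(x)\varphi_3^{(3)}(x)\mathit{dx}
 & = & \displaystyle \frac{1}{2}\int_0^1 \varphi_2^{(1)}(y)\varphi_2^{(2)}(y)\mathit{dy} = 0, \\
 & & \\
\displaystyle \int_{\frac{1}{2}}^1 \varphi_3^{(1)}(x)\varphi_3^{(3)}(x)\mathit{dx}
 & = & \displaystyle -\frac{1}{2}\int_0^1 \varphi_2^{(1)}(y)\varphi_2^{(2)}(y)\mathit{dy} = 0,
\end{array}
\]
so that \W{(1)}{3} and \W{(3)}{3} are orthogonal on $(0,1)$.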
Each of the functions \Wa{(k)}{n}{x} is normal, for we have
\[
| \varphi^{(k)}_n(x)| \equiv 1
\]
except at a finite number of points.
Each of the functions \X{}{0}, \X{}{1}, \X{(1)}{2}, \X{(2)}{2}, $\cdots$,
\X{(2^n)}{n+1} can be expressed linearly in terms of the functions \W{}{0},
\W{}{1}, \W{(1)}{2}, \W{(2)}{2}, $\cdots$, \W{(2^n)}{n+1}.
Thus for $n=1$ we have
\[
\begin{array}{lccr}
\chi_0 = \varphi_0, &
\chi_1 = \varphi_1, &
\chi^{(1)}_2 = \frac{1}{2}\sqrt{2}(\varphi^{(1)}_2 + \varphi^{(2)}_2), &
\chi^{(2)}_2 = \frac{1}{2}\sqrt{2}(-\varphi^{(1)}_2 + \varphi^{(2)}_2).
\end{array}\]
It is true generally that except for a constant normalizing factor $\sqrt{2}$,
the function \X{(k)}{n+1}, $k \leqq 2^{n-1}$, is the same linear combination of
the functions $\frac{1}{2}[\varphi^{(2k-1)}_{n+1} + \varphi^{(2k)}_{n+1}]$ as is
\X{(k)}{n} of the functions \W{(k)}{n}, and the function
\X{(k)}{n+1}, $k>2^{n-1}$, is the same linear combination of the functions
$\frac{1}{2}(-1)^{k+1}[\varphi^{(2k-1)}_{n+1} - \varphi^{(2k)}_{n+1}]$ as is
\X{(k-2^{n-1})}{n} of the
functions \W{(k)}{n}.
It is similarly true that all the functions $\varphi_0, \varphi_1, \cdots,
\varphi_{n+1}^{(2^n)}$ can be expressed linearly in terms of the functions
$\chi_0, \chi_1, \cdots, \chi_{n+1}^{(2^n)}$. Thus we have for $n=1$,
\[\begin{array}{lccr}
\varphi_0 = \chi_0, & \varphi_1 = \chi_1, &
\varphi_2^{(1)} = \frac{1}{2}\sqrt{2}(\chi_2^{(1)} - \chi_2^{(2)}), &
\varphi_2^{(2)} = \frac{1}{2}\sqrt{2}(\chi_2^{(1)} + \chi_2^{(2)}).
\end{array}\]
The general fact appears by induction from the very definition of the functions
\W{}{}.
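The relations between \X{(1)}{2}, \X{(2)}{2} and \W{(1)}{2}, \W{(2)}{2} may be put in matrix form, which exhibits the orthogonal character of the transformation; the matrix notation is introduced here merely for illustration:
\[
\left( \begin{array}{c} \chi_2^{(1)} \\ \chi_2^{(2)} \end{array} \right)
 = \frac{1}{\sqrt{2}}
\left( \begin{array}{rr} 1 & 1 \\ -1 & 1 \end{array} \right)
\left( \begin{array}{c} \varphi_2^{(1)} \\ \varphi_2^{(2)} \end{array} \right),
\hspace{8mm}
\left( \begin{array}{c} \varphi_2^{(1)} \\ \varphi_2^{(2)} \end{array} \right)
 = \frac{1}{\sqrt{2}}
\left( \begin{array}{rr} 1 & -1 \\ 1 & 1 \end{array} \right)
\left( \begin{array}{c} \chi_2^{(1)} \\ \chi_2^{(2)} \end{array} \right).
\]
The second matrix is the transpose of the first, and their product is the identity, as must be the case for an orthogonal transformation.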
The set \X{}{} is known to be closed\footnote[1]{That is, there exists no non-null
Lebesgue-integrable function on the interval $(0,1)$ which is orthogonal to all
functions of the set; l.\@ c., p.\@ 362.}; it follows from the expression of the
\X{}{} in terms of the \W{}{} that the set \W{}{} is also closed.
The definition of the functions \W{(k)}{n} enables us to give a formula for
\Wa{(k)}{n}{x}. Let us set, in binary notation,
\[\begin{array}{lcr}
\hspace{29mm} & \displaystyle x = \frac{a_1}{2^1} + \frac{a_2}{2^2} + \frac{a_3}{2^3} + \cdots, &
\hspace{28mm} a_i = 0\ \mbox{or}\ 1.
\end{array}\]
If $x$ is a binary irrational or if in the binary expansion of $x$ there exists
$a_i \neq 0, i > n$, the following formulas hold for \W{(k)}{n}:
\begin{eqnarray}
& \begin{array}{rclrcl}
\varphi_0 & = & 1, & \varphi_1 & = & (-1)^{a_1}, \\
\varphi^{(1)}_2 & = & (-1)^{a_1 + a_2}, & \varphi^{(2)}_2 & = & (-1)^{a_2}, \\
\varphi^{(1)}_3 & = & (-1)^{a_2 + a_3}, & \varphi^{(2)}_3 & = & (-1)^{a_1 + a_2 + a_3}, \\
\varphi^{(3)}_3 & = & (-1)^{a_1 + a_3}, & \varphi^{(4)}_3 & = & (-1)^{a_3}, \\
\varphi^{(1)}_4 & = & (-1)^{a_3 + a_4}, & \varphi^{(2)}_4 & = & (-1)^{a_1 + a_3 + a_4}, \\
\varphi^{(3)}_4 & = & (-1)^{a_1 + a_2 + a_3 + a_4}, & \varphi^{(4)}_4 & = & (-1)^{a_2 + a_3 + a_4}, \\
\varphi^{(5)}_4 & = & (-1)^{a_2 + a_4}, & \varphi^{(6)}_4 & = & (-1)^{a_1 + a_2 + a_4}, \\
\varphi^{(7)}_4 & = & (-1)^{a_1 + a_4}, & \varphi^{(8)}_4 & = & (-1)^{a_4}, \\
\end{array} & \\
& .\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}., & \nonumber \\
& .\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.. & \nonumber
\end{eqnarray}
The general law appears from these relations; always we have
\begin{eqnarray}
\varphi^{(1)}_n & = & (-1)^{a_{n-1} + a_n}, \\
\varphi^{(k)}_n & = & \varphi_{k-1}\varphi^{(1)}_n. \nonumber
\end{eqnarray}
A general expression for \Wa{(k)}{n}{x} when $x$ is a binary rational can
readily be computed from formulas (3), for we have expressions for the values
of \W{(k)}{n} for neighboring larger and smaller values of the argument than
$x$.
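As a numerical illustration, chosen here merely as an example, take $x = \frac{1}{3}$, which is a binary irrational with
\[
\frac{1}{3} = \frac{0}{2^1} + \frac{1}{2^2} + \frac{0}{2^3} + \frac{1}{2^4} + \cdots,
\hspace{8mm} a_1 = 0,\ a_2 = 1,\ a_3 = 0,\ a_4 = 1, \cdots .
\]
Formulas (3) then give
\[
\begin{array}{rclrcl}
\varphi_1 & = & (-1)^{0} = 1, & \varphi^{(1)}_2 & = & (-1)^{0+1} = -1, \\
\varphi^{(2)}_2 & = & (-1)^{1} = -1, & \varphi^{(1)}_3 & = & (-1)^{1+0} = -1, \\
\varphi^{(3)}_3 & = & (-1)^{0+0} = 1, & \varphi^{(4)}_3 & = & (-1)^{0} = 1,
\end{array}
\]
in agreement with the definitions of \W{}{1}, \W{(1)}{2}, \W{(2)}{2}, since $\frac{1}{4} < \frac{1}{3} < \frac{1}{2}$. The second of formulas (4) is likewise verified: $\varphi_3^{(3)} = \varphi_2\,\varphi_3^{(1)} = (-1)(-1) = 1$, where $\varphi_2$ denotes $\varphi_2^{(1)}$, the function having two zeroes.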
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf Expansions in Terms of the Set $\{\varphi\}$.}
\label{sec:ExpansionofSetPhi}
The following theorem results from Theorem I by virtue of the remark that all
functions \W{(k)}{n} can be expressed in terms of the functions \X{(i)}{n} and
conversely, and from the least squares interpretation of a partial sum of a
series of orthogonal functions:
%--------------------------------------
\begin{theorem}
If $F(x)$ is continuous in the interval $(0,1)$, the series
\begin{eqnarray}
F(x) & \sim & \varphi_0(x) \int_0^1 F(y)\varphi_0(y) \mathit{dy} + \varphi_1(x) \int_0^1 F(y)\varphi_1(y)\mathit{dy} \nonumber \\
~ & ~ & ~ \\
& & \mbox{} + \cdots + \varphi^{(j)}_i(x) \int_0^1 F(y)\varphi_i^{(j)}(y)\mathit{dy} + \cdots , \nonumber
\end{eqnarray}
converges uniformly to the value $F(x)$ if the terms are grouped so that each group contains all the
$2^{n-1}$ terms of a set \W{(k)}{n}, $k = 1,2,3, \cdots, 2^{n-1}$.
\end{theorem}
%---------------------------------
Series (5) after the grouping of terms is precisely the same as series (1) after the grouping of terms.
Theorem II can be extended to include even discontinuous functions $F(x)$; we suppose $F(x)$
to be integrable in the sense of Lebesgue. Let us introduce the notation
\[
F(a+0)=\lim_{\epsilon=0} F(a+\epsilon), ~~~ F(a-0)=\lim_{\epsilon=0}F(a-\epsilon), ~~ \epsilon > 0,
\]
and suppose that these limits exist for a particular point $x=a$. We introduce the functions
\begin{equation} \begin{array}{rl}
F_1(x) = \left\{ \begin{array}{ll}
F(x), & x < a, \\
% ~ \\
F(a-0), & x \geqq a,
\end{array}
\right.
&
F_2(x) = \left\{ \begin{array}{ll}
F(a+0), & x \leqq a, \\
% ~ \\
F(x), & x > a,
\end{array}
\right.
\end{array}
\end{equation}
The least squares interpretation of the partial sums $S_{2^n}(x)$ of the series (1) or (5) as expressed
in terms of the $f^{(j)}_i$ gives the result that if \mbox{$h_1 < F(x) < h_2$} in any interval, then also
$h_1 < S_{2^n}(x) < h_2$ in any completely interior interval if $n$ is sufficiently large. It follows that
$F_1(x)$ is closely approximated at $x=a$ by its partial sum $S_{2^n}$ if $n$ is sufficiently large, and
that this approximation is uniform in any interval about the point $x=a$ in which $F_1(x)$ is continuous.
A similar result holds for $F_2(x)$.
The function $F_1(x)+F_2(x)$ differs from the original function $F(x)$ merely by the function
\[ \begin{array}{rl}
G(x) = \left\{ \begin{array}{ll}
F(a+0), & x < a, \\
% ~ \\
F(a-0), & x> a.
\end{array}
\right.
\end{array}
\]
The representation of such functions by sequences of the kind we are considering will be studied in more
detail later (\S~\ref{sec:ApproxFunc}), but it is fairly obvious that such a function is represented uniformly
except in the neighborhood of the point $a$. If $F(x)$ is continuous at and in the neighborhood of $a$, or
if $a$ is dyadically rational, the approximation to $G(x)$ is uniform at the point $a$ as well. Thus we have
%%--------------------------------------------------------------------------------------
\begin{theorem}
If $F(x)$ is any integrable function and if ${\displaystyle \lim_{x=a} F(x)}$ exists for a point $a$, then when the terms of the
series (5) are grouped as described in Theorem II, the series so obtained converges for $x=a$ to the value
${\displaystyle \lim_{x=a}F(x)}$. If $F(x)$ is continuous at and in the neighborhood of $a$, then this convergence is
uniform in a neighborhood of $a$.
If $F(x)$ is any integrable function and if the limits $F(a-0)$ and $F(a+0)$ exist for a dyadically rational point
$x = a$, then the series with the terms grouped converges for $x=a$ to the value $\frac{1}{2}[F(a+0)+F(a-0)]$;
this convergence is uniform in the neighborhood of the point $x=a$ if $F(x)$ is continuous on two intervals
extending from $a$, one in each direction.
\end{theorem}
%%--------------------------------------------------------------------------------------
It is now time to study the convergence of series (5) when the terms are not grouped as in Theorems II and III. We shall establish
%%--------------------------------------------------------------------------------------
\begin{theorem}
Let the function $F(x)$ be of limited variation in the interval $0 \leqq x \leqq 1$. Then the series (5) converges
to the value $F(x)$ at every point at which $F(a+0)=F(a-0)$ and at every point at which $x=a$ is dyadically
rational. This convergence is uniform in the neighborhood of $x=a$ in each of these cases if $F(x)$ is continuous
in two intervals extending from $a$, one in each direction.
\end{theorem}
%%--------------------------------------------------------------------------------------
Since $F(x)$ is of limited variation, $F(a+0)$ and $F(a-0)$ exist at every point $a$. Theorem IV tacitly assumes
$F(x)$ to be defined at every point of discontinuity $a$ so that $F(a)=\frac{1}{2}[F(a+0)+F(a-0)]$.
Any such function $F(x)$ can be considered as the difference of two monotonically increasing functions, so
the theorem will be proved if it is proved merely for a monotonically increasing function. We shall assume
that $F(x)$ is such a function, and positive. We are to evaluate the limit of
\[ \begin{array}{c} \displaystyle
\int_0^1 F(y)K_n^{(k)}(x,y)\mathit{dy}, \\
~ \\
K_n^{(k)}(x,y)=\varphi_0(x)\varphi_0(y) + \varphi_1(x)\varphi_1(y) + \cdots + \varphi_n^{(k)}(x)\varphi_n^{(k)}(y).
\end{array}\]
We have already evaluated this limit for the sequence $k=2^{n-1}$, so it remains merely to prove that
\begin{eqnarray}
\lim_{n=\infty}\int_0^1 F(y)Q_n^{(k)}(x,y)\mathit{dy} = 0,
\end{eqnarray}
\[ \begin{array}{c} \displaystyle
~ \\
Q_n^{(k)}(x,y)=\varphi_n^{(1)}(x)\varphi_n^{(1)}(y) + \varphi_n^{(2)}(x)\varphi_n^{(2)}(y) + \cdots + \varphi_n^{(k)}(x)\varphi_n^{(k)}(y),
\end{array} \]
whatever may be the value of $k$.
We shall consider the function $F(x)$ merely at a point $x=a$ of continuity; that is, we study essentially
the new functions $F_1$ and $F_2$ defined by equations (6). In the sequel we suppose $a$ to be dyadically irrational; the necessary modifications for $a$ rational can be made by the reader.
The following formulas are easily found by the definition of the $Q_n^{(k)}$; both $x$ and $y$ are supposed dyadically irrational:
%%
\[\begin{array}{rcl}
Q_2^{(1)}(x,y) & = & \pm 1, \\
Q_2^{(2)}(x,y) & = & \left\{ \begin{array}{l}
0 ~\mbox{if}~ x < \frac{1}{2}, y > \frac{1}{2} ~\mbox{or if}~ x > \frac{1}{2}, y < \frac{1}{2}, \\
\pm 2 ~\mbox{if}~ x < \frac{1}{2}, y < \frac{1}{2} ~\mbox{or if}~ x > \frac{1}{2}, y > \frac{1}{2},
\end{array} \right. \\
.\hspace{6mm}.\hspace{6mm}. & . & .\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}. \\
Q_n^{(1)}(x,y) & = & \pm 1, \\
Q_n^{(2)}(x,y) & = & \left\{ \begin{array}{l}
0 ~\mbox{if}~ x < \frac{1}{2}, y > \frac{1}{2} ~\mbox{or if}~ x > \frac{1}{2}, y < \frac{1}{2}, \\
2Q_{n-1}^{(1)}(2x,2y) ~\mbox{if}~ x<\frac{1}{2}, y<\frac{1}{2}, \\
2Q_{n-1}^{(1)}(2x-1,2y-1) ~\mbox{if}~ x > \frac{1}{2}, y > \frac{1}{2},
\end{array} \right. \\
.\hspace{6mm}.\hspace{6mm}. & . & .\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}.\hspace{6mm}. \\
Q_n^{(2k)}(x,y) & = & \left\{ \begin{array}{l}
0 ~\mbox{if}~ x < \frac{1}{2}, y > \frac{1}{2} ~\mbox{or if}~ x > \frac{1}{2}, y < \frac{1}{2}, \\
2Q_{n-1}^{(k)}(2x,2y) ~\mbox{if}~ x<\frac{1}{2}, y<\frac{1}{2}, \\
2Q_{n-1}^{(k)}(2x-1,2y-1) ~\mbox{if}~ x > \frac{1}{2}, y > \frac{1}{2},
\end{array} \right. \\
Q_n^{(2k+1)}(x,y) & = & \left\{ \begin{array}{l}
\pm 1 ~\mbox{if}~ x < \frac{1}{2}, y > \frac{1}{2} ~\mbox{or}~ x > \frac{1}{2}, y < \frac{1}{2}, \\
{\displaystyle \frac{Q_n^{(2k)} + Q_n^{(2k+2)}}{2}} ~\mbox{if}~ x<\frac{1}{2}, y<\frac{1}{2} ~\mbox{or if}~ x > \frac{1}{2}, y > \frac{1}{2}.
\end{array} \right.
\end{array}\]
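The simplest of these formulas may be checked from the relations (3). In this sketch the digits $b_i$ of the binary expansion of $y$ are a notation introduced only for the purpose of illustration, the $a_i$ being the digits of $x$ as before:
\[
\begin{array}{rcl}
Q_2^{(2)}(x,y) & = & \varphi_2^{(1)}(x)\varphi_2^{(1)}(y) + \varphi_2^{(2)}(x)\varphi_2^{(2)}(y) \\
 & = & (-1)^{a_1+a_2+b_1+b_2} + (-1)^{a_2+b_2} \\
 & = & (-1)^{a_2+b_2}\,[\,1 + (-1)^{a_1+b_1}\,].
\end{array}
\]
The bracket vanishes if $a_1 \neq b_1$, that is, if $x$ and $y$ lie in different halves of the interval, and has the value 2 if $a_1 = b_1$, so that $Q_2^{(2)} = \pm 2$ when $x$ and $y$ lie in the same half.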
The integral in (7) for $x=a$ is to be divided into three parts. Consider an interval bounded by two points of the form ${\displaystyle x=\frac{\rho}{2^\nu}}$, ${\displaystyle x=\frac{\rho+1}{2^\nu}}$, where $\rho$ and $\nu$ are integers and such that
%%
\[
\frac{\rho}{2^\nu} < a < \frac{\rho+1}{2^\nu}.
\]
%%
Then we have
%%
\begin{equation}
\begin{array}{l}
\displaystyle \int_0^1 F_1(y)Q^{(k)}_n(a,y)\mathit{dy} ~=~ \int_0^{\rho/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} \\
\displaystyle \hspace{10mm} + \int_{\rho/2^\nu}^{(\rho+1)/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} + \int_{(\rho+1)/ 2^\nu}^1 F_1(y) Q^{(k)}_n(a,y)\mathit{dy}.
\end{array}
\end{equation}
%%
These integrals on the right need separate consideration.
Let us set
%%
\[
\displaystyle \hspace{15mm} \frac{\rho}{2^\nu} = \frac{\mu_1}{2^1} + \frac{\mu_2}{2^2} + \frac{\mu_3}{2^3} + \cdots + \frac{\mu_\nu}{2^\nu},
\hspace{16mm} \mu_i = 0 ~\mbox{or}~ 1.
\]
%%
The first integral in the right-hand member of (8) can be written
%%
\begin{equation}
\int_0^{\mu_1/2^1} + \int_{\mu_1/2^1}^{(\mu_1/2^1)+(\mu_2/2^2)} + \cdots + \int_{(\rho/2^\nu)-(\mu_\nu/2^\nu)}^{\rho/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy}.
\end{equation}
%%
Each of the integrals is readily treated. Thus, on the interval $0 \leqq y \leqq \frac{\mu_1}{2^1}$, $Q^{(k)}_n(a,y)$ takes only the values $\pm 1$ or 0, is 0 if $k$ is even and has the value $\pm \varphi^{(k)}_n(y)$ if $k$ is odd. It is of course true that
%%
\begin{equation}
\lim_{n = \infty} \int_0^1 \Phi(y)\varphi^{(k)}_n(y) \mathit{dy} = 0
\end{equation}
%%
no matter what may be the function $\Phi(y)$ integrable in the sense of Lebesgue and with an integrable square\footnote[1]{This well-known fact follows from the convergence of the series
%%
\[
\Sigma (a^{(k)}_n)^2,
\]
%%
proved from the inequality
%%
\[
\int_0^1 (\Phi(x) - a_0\varphi_0 - a_1\varphi_1 - a^{(1)}_2\varphi^{(1)}_2 - \cdots - a^{(k)}_n\varphi^{(k)}_n)^2\mathit{dx} \geqq 0,
\]
%%
where $a^{(k)}_n = \int_0^1 \Phi(y)\varphi^{(k)}_n(y)\mathit{dy}$.}. Hence we have
%%
\[
\lim_{n = \infty} \int_0^{\mu_1/2^1} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} = 0.
\]
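The passage from (10) to the relation just written may be made explicit; the following is a sketch of the reasoning. If the square in the inequality of the foot-note is expanded and the normal orthogonal character of the functions \W{}{} is used, there results
\[
\int_0^1 \Phi(x)^2\mathit{dx} - a_0^2 - a_1^2 - (a^{(1)}_2)^2 - \cdots - (a^{(k)}_n)^2 \geqq 0,
\]
so that the series of squares converges and its general term $a^{(k)}_n$ approaches zero, which is (10). If now $\Phi(y)$ is taken equal to $F_1(y)$ for $0 \leqq y \leqq \frac{\mu_1}{2^1}$ and equal to zero elsewhere (an auxiliary function introduced here only for this illustration), then $\Phi$ is bounded and hence has an integrable square, and
\[
\int_0^{\mu_1/2^1} F_1(y)\varphi^{(k)}_n(y)\mathit{dy} = \int_0^1 \Phi(y)\varphi^{(k)}_n(y)\mathit{dy},
\]
which approaches zero as $n$ becomes infinite; on this interval $Q^{(k)}_n(a,y)$ differs from $\varphi^{(k)}_n(y)$ only by a factor 0 or $\pm 1$ which does not depend on $y$.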
On the interval $\frac{\mu_1}{2^1} \leqq y \leqq \frac{\mu_1}{2^1} + \frac{\mu_2}{2^2}$, the function $Q^{(k)}_n(a,y)$ takes only the values 0, $\pm 1$, $\pm 2$, and except for one of these numbers as constant factor, has the value $\varphi^{(k)}_n(y)$. It is thus true that
%%
\[
\lim_{n = \infty} \int_{\mu_1/2^1}^{(\mu_1/2^1)+(\mu_2/2^2)} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} = 0.
\]
From the corresponding result for each of the integrals in (9) and a similar treatment of the last integral in the right-hand member of (8), we have
%%
\begin{equation} \begin{array}{l} \displaystyle
\lim_{n=\infty} \int_0^{\rho/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} = 0, \\
~ \\
\displaystyle \lim_{n=\infty} \int_{(\rho+1)/2^\nu}^{1} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} = 0 .
\end{array} \end{equation}
We shall obtain an upper limit for the second integral in (8) by the second law of the mean. We notice that
%%
\[
\left | \int_\xi^{(\rho+1)/2^\nu} Q^{(k)}_n(a,y)\mathit{dy} \right | \leqq \frac{1}{2},
\]
%%
whatever may be the value of $\xi$. In fact, this relation is immediate if $n$ is small and it follows for the larger values of $n$ by virtue of the method of construction of the $Q^{(k)}_n$. Moreover, if $n \geqq \nu$ and if $\xi = \frac{\rho}{2^\nu}$, this integral has the value zero. We therefore have from the second law of the mean, $n \geqq \nu$,
%%
\[ \begin{array}{l}
\displaystyle \int_{\rho/2^\nu}^{(\rho+1)/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} = F_1 \left ( \frac{\rho}{2^\nu} \right ) \int_{\rho/2^\nu}^{\xi} Q^{(k)}_n(a,y)\mathit{dy} \\
\displaystyle ~~~~~~~~~~~~~ + F_1 \left ( \frac{\rho + 1}{2^\nu} \right ) \int_{\xi}^{(\rho+1)/2^\nu} Q^{(k)}_n(a,y)\mathit{dy} \\
\displaystyle ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ = \left [ F_1(a) - F_1 \left ( \frac{\rho}{2^\nu} \right ) \right ] \int_{\xi}^{(\rho+1)/2^\nu} Q^{(k)}_n(a,y)\mathit{dy}.
\end{array} \]
%%
By a proper choice of the point $\frac{\rho}{2^\nu}$ we can make the factor of this last integral as small as desired; the entire expression will be as small as desired for sufficiently large $n$. The relations (11) are independent of the choice of $\frac{\rho}{2^\nu}$, so (7) is completely proved for the function $F_1$. A similar proof applies to $F_2$, so (7) can be considered as completely proved for the original function $F(x)$.
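The last equality in the application of the second law of the mean rests on two remarks, which may be set down as follows. Since the integral of $Q^{(k)}_n(a,y)$ over the whole interval ${\displaystyle \left(\frac{\rho}{2^\nu},\frac{\rho+1}{2^\nu}\right)}$ is zero for $n \geqq \nu$, we have
\[
\int_{\rho/2^\nu}^{\xi} Q^{(k)}_n(a,y)\mathit{dy} = - \int_{\xi}^{(\rho+1)/2^\nu} Q^{(k)}_n(a,y)\mathit{dy},
\]
and since $\frac{\rho+1}{2^\nu} > a$, equations (6) give $F_1\left(\frac{\rho+1}{2^\nu}\right) = F(a-0) = F_1(a)$. Combining these with the inequality already noticed for the integral from $\xi$ to $\frac{\rho+1}{2^\nu}$, and remembering that $F_1$ is monotonically increasing, we obtain the estimate
\[
\left| \int_{\rho/2^\nu}^{(\rho+1)/2^\nu} F_1(y)Q^{(k)}_n(a,y)\mathit{dy} \right|
 \leqq \frac{1}{2}\left[ F_1(a) - F_1\left(\frac{\rho}{2^\nu}\right) \right], \hspace{8mm} n \geqq \nu.
\]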
The uniform convergence of (5) as stated in Theorem IV follows from the uniform continuity of $F(x)$ and will be readily established by the reader.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf Further Expansion Properties of the Set $\varphi$.}
\label{sec:FurtherExpansion}
The least squares interpretation already given for the partial sums and the expression of the $\varphi$'s in terms of the $f$'s show that if the terms of (5) are grouped as in Theorems II and III, the question of convergence or divergence of the series at a point depends merely on that point and the nature of the function $F(x)$ in the neighborhood of that point. This same fact for series (5) when the terms are not grouped follows from (8) and (10) if $F(x)$ is integrable and with an integrable square. We shall further extend this result and prove:
%%--------------------------------------------------------------------------------------
\begin{theorem}
If $F(x)$ is any integrable function, then the convergence or divergence of the series (5) at a point depends merely on that point and the behaviour of the function in the neighborhood of that point. If in particular $F(x)$ is of limited variation in the neighborhood of a point $x=a$, and if $a$ is dyadically rational or if $F(a-0) = F(a+0)$, then series (5) converges for $x=a$ to the value $\frac{1}{2}[F(a-0)+F(a+0)]$. If $F(x)$ is not only of limited variation but is also continuous in two neighborhoods one on each side of $a$, and if $a$ is dyadically rational or if $F(a-0)=F(a+0)$, the convergence of (5) is uniform in the neighborhood of the point $a$.
\end{theorem}
%%--------------------------------------------------------------------------------------
Theorem V follows immediately from the reasoning already given and from (10) proved without restriction on $\Phi$; we state the theorem for any bounded normal orthogonal set of functions $\psi_n$:
%%--------------------------------------------------------------------------------------
\begin{theorem}
If $\{\psi_n(x)\}$ is a uniformly bounded set of normal orthogonal functions on the interval $(0,1)$, and if $\Phi(x)$ is any integrable function, then
%%
\begin{equation}
\lim_{n=\infty} \int_0^1 \Phi(x)\psi_n(x)\mathit{dx} = 0.
\end{equation}
\end{theorem}
%%--------------------------------------------------------------------------------------
Denote by $E$ the point set which contains all points of the interval for which $|\Phi(x)|>N$; we choose $N$ so large that
\[
\int_E |\Phi(x)|\mathit{dx} < \epsilon,
\]
where $\epsilon$ is arbitrary. Denote by $E_1$ the point set complementary to $E$; then we have
\[
\int_0^1 \Phi(x)\psi_n(x)\mathit{dx} = \int_E \Phi(x)\psi_n(x)\mathit{dx} + \int_{E_1} \Phi(x)\psi_n(x)\mathit{dx}.
\]
It follows from the proof of (10) already indicated that the second integral on the right approaches zero as $n$ becomes infinite. The first integral is in absolute value less than $M\epsilon$ whatever may be the value of $n$, where $M$ is the uniform bound of the $\psi_n$. It therefore follows that these two integrals can be made as small as desired, first by choosing $\epsilon$ sufficiently small and then by choosing $n$ sufficiently large\footnote[1]{Theorem VI is proved by essentially this method for the set $\psi_n(x) = \sqrt{2}\sin n\pi x$ by Lebesgue, {\it Annales scientifiques de l'\'{e}cole normale sup\'{e}rieure}, ser.\@ 3, Vol.\@ XX, 1903. See also Hobson, {\it Functions of a Real Variable} (1907), p.\@ 675, and Lebesgue, {\it Annales de la Facult\'{e} des Sciences de Toulouse}, ser.\@ 3, Vol.\@ I (1909), pp.\@ 25--117, especially p.\@~52.}.
It is interesting to note that Theorem VI breaks down if we omit the hypothesis that the set $\psi_n$ is uniformly bounded. In fact Theorem VI does not hold for Haar's set $\chi$. Thus consider the function
%%
\[ \begin{array}{lcr} \displaystyle
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ & \Phi(x) = (x - \frac{1}{2})^{-\nu}, & ~~~~~~~~~~~~~~~~~~~~~~~~~~~ \nu < 1.
\end{array} \]
%%
We have
%%
\[ \begin{array}{l}
\displaystyle \int_0^1 \Phi(x)\chi_n^{(2^{n-2}+1)}(x)\mathit{dx} = \sqrt{2^{n-1}} \int_{1/2}^{1/2+1/2^n} (x-{\textstyle \frac{1}{2}})^{-\nu}\mathit{dx} \\
\displaystyle ~~~~~~~~~~~~~~~ - \sqrt{2^{n-1}} \int_{1/2+1/2^n}^{1/2+1/2^{n-1}} (x-{\textstyle \frac{1}{2}})^{-\nu}\mathit{dx} = \frac{(2^{n-1})^{\nu-(1/2)}}{1-\nu}[2^{\nu}-1].
\end{array} \]
%%
Whenever $\nu \geqq \frac{1}{2}$, it is clear that (12) cannot hold, and if $\nu > \frac{1}{2}$, there is a sub-sequence of the sequence in (12) which actually becomes infinite.
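The evaluation of the integral above can be checked directly (a routine verification supplied here, with $h=2^{-n}$ introduced only as shorthand). Substituting $t = x-\frac{1}{2}$ and using $\int_0^c t^{-\nu}\mathit{dt} = c^{1-\nu}/(1-\nu)$ for $\nu<1$, we find
\[
\int_0^{h} t^{-\nu}\mathit{dt} - \int_{h}^{2h} t^{-\nu}\mathit{dt} = \frac{2h^{1-\nu}-(2h)^{1-\nu}}{1-\nu} = \frac{(2h)^{1-\nu}}{1-\nu}\,[2^{\nu}-1],
\]
and multiplication by $\sqrt{2^{n-1}} = (2h)^{-1/2}$ gives $(2^{n-1})^{\nu-(1/2)}[2^{\nu}-1]/(1-\nu)$. For $\nu=\frac{1}{2}$ this is the constant $2(\sqrt{2}-1)$ for every $n$, which does not approach zero.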
We turn now from the study of the convergence of such a series expansion as (5) to the study of the summability of such expansions, and are to prove
%%
\begin{theorem}
If $F(x)$ is continuous in the closed interval $(0,1)$, the series (5) is summable uniformly in the entire interval to the sum $F(x)$.
If $F(x)$ is integrable in the interval $(0,1)$, and if $F(a-0)$ and $F(a+0)$ exist, and if either $F(a-0)=F(a+0)$ or $a$ is dyadically rational, then the series (5) is summable for $x=a$ to the value $\frac{1}{2}[F(a-0)+F(a+0)]$. If $F(x)$ is continuous in the neighborhood of the point $x=a$, or if $a$ is dyadically rational and $F(x)$ continuous in the neighborhood of $a$ except for a finite jump at $a$, the summability is uniform throughout a neighborhood of that point.
\end{theorem}
In this theorem and below, the term {\it summability} indicates summability by the first Ces\`{a}ro mean.
We shall find it convenient to have for reference the following
%%
\begin{lemma}
Suppose the series
\begin{equation}
\begin{array}{l}
(b_1 + b_2 + \cdots + b_{n_1}) + (b_{n_1+1} + b_{n_1+2} + \cdots + b_{n_2}) + \cdots \\
\hspace{48mm} + (b_{n_k+1} + b_{n_k+2} + \cdots + b_{n_{k+1}}) + \cdots
\end{array}
\end{equation}
%%
converges to the sum $B$ and that the sequence
%%
\begin{equation}
\begin{array}{l}
\displaystyle b_1, \frac{2b_1 + b_2}{2}, \frac{3b_1 + 2b_2 + b_3}{3}, \cdots \\
\displaystyle \hspace{41mm} \frac{(n_1 - 1)b_1 + (n_1 - 2)b_2 + \cdots + b_{n_1-1}}{n_1 - 1}, \\
\displaystyle \frac{(n_1-1)b_1 + \cdots + b_{n_1-1}}{n_1}, \\
\displaystyle \hspace{29mm} \frac{(n_1-1)b_1 + (n_1-2)b_2 + \cdots + b_{n_1-1} + b_{n_1+1}}{n_1+1}, \\
\displaystyle \frac{(n_1-1)b_1 + \cdots + b_{n_1-1} + 2b_{n_1+1} + b_{n_1+2}}{n_1+2}, \cdots \\
\displaystyle (n_1-1)b_1 + \cdots + b_{n_1-1} + (n_2-n_1-1)b_{n_1+1} \\
\displaystyle \frac{\hspace{48mm}+ (n_2-n_1-2)b_{n_1+2}+\cdots+b_{n_2-1}}{n_2-1}, \\
\displaystyle \cdots, \\
\end{array}
\end{equation}
%%
converges to zero. Then the series
\begin{equation}
b_1 + b_2 + b_3 + \cdots
\end{equation}
is summable to the sum $B$.
\end{lemma}
This lemma involves merely a transformation of the formulas involving the limit notions. Insert zeroes in series (13) so that the parentheses are respectively the $n_1$-th, $n_2$-th, $n_3$-th terms of the new series; this new series converges to the sum $B$ and hence is summable to the sum $B$. The term-by-term difference of the new series and (15) is the series
%%
\begin{equation}
\begin{array}{l}
\displaystyle b_1 + b_2 + \cdots + b_{n_1-1} - (b_1+b_2+\cdots+b_{n_1-1})+b_{n_1+1}+b_{n_1+2} \\
\displaystyle \hspace{22mm} +\cdots+ b_{n_2-1}-(b_{n_1+1}+b_{n_1+2}+\cdots+b_{n_2-1})+\cdots,
\end{array}
\end{equation}
%%
which is to be shown to be summable to the sum zero. The sequence corresponding to the summation of (16) is precisely (14).
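As an illustration (a special case chosen here, not part of the original argument), take $n_k=2k$. The series (16) is then
\[
b_1 - b_1 + b_3 - b_3 + b_5 - b_5 + \cdots,
\]
with partial sums $b_1, 0, b_3, 0, b_5, 0, \cdots$, and its first Ces\`{a}ro means are
\[
b_1, \quad \frac{b_1}{2}, \quad \frac{b_1+b_3}{3}, \quad \frac{b_1+b_3}{4}, \quad \frac{b_1+b_3+b_5}{5}, \quad \cdots,
\]
which are the terms of (14) for this choice of the $n_k$.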
A sufficient condition for the convergence to zero of (14) is that we have, independently of $m$,
%%
\begin{equation}
\lim_{k=\infty}\frac{mb_{n_k+1}+(m-1)b_{n_k+2}+\cdots+b_{n_k+m}}{m}=0, ~~~~~~m \leqq n_{k+1}-n_k,
\end{equation}
%%
for from a geometric point of view each term of the sequence (14) is the center of gravity of a number of terms such as occur in (17), each term weighted according to the number of $b_i$ that appears in it. An $(\epsilon,\delta)$-proof can be supplied with no difficulty.
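A sketch of such a proof may run as follows; the convention $n_0=0$ and the symbols $L_j=n_{j+1}-n_j-1$ and $A_j(m)$, the latter denoting the quotient in (17) with $k$ replaced by $j$, are introduced here only for this purpose. Every index $N$ can be written $N=n_k+m$ with $0 \leqq m \leqq L_k$, and the $N$-th term of (14) is
\[
\frac{1}{N}\Bigl[\,\sum_{j=0}^{k-1} L_j A_j(L_j) + mA_k(m)\Bigr],
\]
where a product is read as zero if $L_j=0$ or $m=0$. Given $\epsilon>0$, choose $J$ so that $|A_j(m)|<\epsilon$ for every $j \geqq J$ and every admissible $m$. Since $L_0+L_1+\cdots+L_{k-1}+m \leqq N$, the $N$-th term of (14) is in absolute value at most
\[
\frac{1}{N}\sum_{j=0}^{J-1} L_j\,|A_j(L_j)| + \epsilon,
\]
and the first part approaches zero as $N$ becomes infinite, because its numerator does not depend on $N$.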
For the case of Theorem VII let us assume $F(x)$ integrable and that $F(a-0)$ and $F(a+0)$ exist. The series (15) is to be identified with the series (5), and (13) with (5) after the terms are grouped as in Theorem III. The sum that appears in (17) is, then, for $x=a$,
%%
\begin{equation}
\begin{array}{l}
\displaystyle \frac{1}{m}\int_0^1 [m\varphi^{(1)}_n(a)\varphi^{(1)}_n(y)+(m-1)\varphi^{(2)}_n(a)\varphi^{(2)}_n(y) + \cdots \\
\displaystyle \hspace{50mm} + \varphi^{(m)}_n(a)\varphi^{(m)}_n(y)]F(y)\mathit{dy}, ~~~m \leqq 2^{n-1}.
\end{array}
\end{equation}
%%
We shall prove that (18) formed for the function $F_1(y)$ defined in (6) and for $a$ dyadically irrational has the limit zero as $n$ becomes infinite.
Let us notice that
%%
\begin{equation}
\begin{array}{l}
\displaystyle \frac{1}{m}\int_0^1 | m\varphi^{(1)}_n(a)\varphi^{(1)}_n(y)+(m-1)\varphi^{(2)}_n(a)\varphi^{(2)}_n(y)+\cdots \\
\displaystyle \hspace{70mm} + \varphi^{(m)}_n(a)\varphi^{(m)}_n(y) | \mathit{dy} = 1.
\end{array}
\end{equation}
%%
This follows directly from (3) and (4). The value of the integral in (19) is unchanged if we replace $a$ by any dyadic irrational $b$. Choose $0 < b < 2^{-n}$, so that all the functions $\varphi_0,\varphi_1,\varphi_2,\cdots,\varphi_{m-1}$ are positive for $x=b$. Then the integrand in (19) can be reduced merely to $m\varphi_0(y)$, so (19) is proved.
Let us consider the integral (18) formed for the function $F_1(y)$ to be divided as in (8), where as before
\[
\frac{\rho}{2^\nu} < a < \frac{\rho+1}{2^\nu},
\]
and let us denote by (20), (21), (22), (23) respectively the entire integral and its three parts. Then (22) can be made as small as desired simply by proper choice of the point $\displaystyle \frac{\rho}{2^\nu}$, for in the interval $\displaystyle \left( \frac{\rho}{2^\nu},\frac{\rho+1}{2^\nu} \right)$ we can make $|F_1(y)-F_1(a)|$ uniformly small, we have established (19), and we have also
%%
\begin{equation*}
\begin{array}{l}
\displaystyle \int_{\rho/2^\nu}^{(\rho+1)/2^\nu} [m\varphi^{(1)}_n(a)\varphi^{(1)}_n(y)+(m-1)\varphi^{(2)}_n(a)\varphi^{(2)}_n(y) \\
\displaystyle \hspace{65mm} +\cdots+ \varphi^{(m)}_n(a)\varphi^{(m)}_n(y)]F_1(a) \mathit{dy} = 0
\end{array}
\end{equation*}
%%
if merely $n>\nu$.
The integral (21) is the average of $m$ integrals of the type that appear in (8):
\[
~~~~~~~~~~~~~~~~~~~~~~~~~\int_0^{\rho/2^\nu}F_1(y)Q^{(k)}_n(a,y)\mathit{dy},~~~~~~~~~~~~~~~~k=1,2,\cdots,m.
\]
Thus the entire integral (21) approaches zero as $n$ becomes infinite. Treatment in a similar way of the integral (23) proves that (20) approaches zero. It is likewise true that (18) formed for the function $F_2(y)$ also approaches zero as $n$ becomes infinite. This completes the proof of the second sentence in Theorem VII for a dyadic irrational; we omit the proof for a dyadic rational. The uniformity of the continuity of $F(x)$ gives us readily the remaining parts of Theorem VII.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf Not Every Continuous Function Can Be Expanded in Terms of the $\varphi$.}
\label{sec:NotEveryFunc}
The summability of the expansions of continuous functions in terms of the functions $\varphi$ is another point of resemblance of those functions to the Fourier sine and cosine functions. Still another point of resemblance which we shall now establish is that there exists a continuous function whose expansion in terms of the $\varphi$'s does not converge at every point of the interval.
Our proof rests on a beautiful theorem due to Haar\footnote[1]{L.\@ c., p.\@ 335. This condition holds for any set of normal orthogonal functions and is necessary as well as sufficient, if a slight restriction is added.}, by virtue of which the existence of such a continuous function will be shown if we prove merely that
\setcounter{equation}{23}
\begin{equation}
\int_0^1 | K^{(k)}_n(a,y) | \mathit{dy}
\end{equation}
is not bounded uniformly for all $n$ and $k$. The point $a$ is a point of divergence of the expansion of the continuous function and for our particular case may be chosen any point of the interval $(0,1)$. We shall study (24) in detail merely for $a$ dyadically irrational; the integral (24) is independent of the point $a$ chosen if $a$ is dyadically irrational.
The integral (24) is bounded uniformly for all the values $n$ if $k=2^{n-1}$, so it will be sufficient to consider the integral
\[
c^{(k)}_n = \int_0^1 | Q^{(k)}_n(a,y) | \mathit{dy}.
\]
The following table shows the value of $c^{(k)}_n$ for small values of $n$ and for each value of $k$:
\
\begin{tabular}{p{11mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}p{3.5mm}}
$n=2$& \multicolumn{8}{c}{1} & \multicolumn{8}{c}{1} \\
$n=3$& \multicolumn{4}{c}{1} & \multicolumn{4}{c}{1} & \multicolumn{4}{c}{$1\frac{1}{2}$} & \multicolumn{4}{c}{1} \\
$n=4$& \multicolumn{2}{c}{1} & \multicolumn{2}{c}{1} & \multicolumn{2}{c}{$1\frac{1}{2}$} & \multicolumn{2}{c}{1} & \multicolumn{2}{c}{$1\frac{3}{4}$} & \multicolumn{2}{c}{$1\frac{1}{2}$} & \multicolumn{2}{c}{$1\frac{3}{4}$} & \multicolumn{2}{c}{1} \\
$n=5$& 1, & 1, & $1\frac{1}{2}$, & 1, & $1\frac{3}{4}$, & $1\frac{1}{2}$, & $1\frac{3}{4}$, & $1$, & $1\frac{7}{8}$, & $1\frac{3}{4}$, & $2\frac{1}{8}$, & $1\frac{1}{2}$, & $2\frac{1}{8}$, & $1\frac{3}{4}$, & $1\frac{7}{8}$, & $1$, \\
~ & . & . & . & . & . & . & . & . & . & . & . & . & . & . & . & .
\end{tabular}
~\newline
We have the general formulas
\begin{eqnarray}
c^{(1)}_n & = & c^{(2^{n-1})}_n = 1, \nonumber \\
c^{(k)}_n & = & c^{(2k)}_{n+1}, \nonumber \\
c^{(2k+1)}_{n+1} & = & \mbox{$\frac{1}{2}$}[c^{(k)}_n + c^{(k+1)}_n] + \mbox{$\frac{1}{2}$}, \nonumber
\end{eqnarray}
so the $c^{(k)}_n$ are not uniformly bounded.
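One way to see this from the recursion is the following short argument, supplied here; it uses only the second and third formulas and the row $n=2$ of the table. Let $s_n$ denote the largest sum $c^{(k)}_n+c^{(k+1)}_n$ of two adjacent entries in the $n$-th row. If $c^{(k)}_n+c^{(k+1)}_n=s_n$, then the entries $c^{(2k)}_{n+1}$, $c^{(2k+1)}_{n+1}$, $c^{(2k+2)}_{n+1}$ of the next row are $c^{(k)}_n$, $\frac{1}{2}s_n+\frac{1}{2}$, $c^{(k+1)}_n$, so that
\[
\left[ c^{(2k)}_{n+1}+c^{(2k+1)}_{n+1} \right] + \left[ c^{(2k+1)}_{n+1}+c^{(2k+2)}_{n+1} \right] = 2s_n+1,
\]
and one of the two brackets is at least $s_n+\frac{1}{2}$. Hence $s_{n+1} \geqq s_n+\frac{1}{2}$, and since $s_2=2$, the largest entry of the $n$-th row is at least
\[
\mbox{$\frac{1}{2}$}s_n \geqq 1+\frac{n-2}{4}.
\]
For $n=5$ this bound is $1\frac{3}{4}$, in agreement with the table, where the largest entry is $2\frac{1}{8}$.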
\begin{theorem}
If a point $a$ is arbitrarily chosen, there will exist a continuous function whose $\varphi$-development does not converge at $a$.
\end{theorem}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf The Approximation to a Function at a Discontinuity.}
\label{sec:ApproxFunc}
We have considered in \S~\ref{sec:ExpansionofSetPhi} and \S~\ref{sec:FurtherExpansion} with a fair degree of completeness the nature of the approach to $F(x)$ of the formal development of an arbitrary function $F(x)$ in the neighborhood of a point of continuity of $F(x)$. We shall now consider the approach to $F(x)$ of this formal development in the neighborhood of a point of discontinuity of $F(x)$. We study this problem merely for a function which is constant except for a single discontinuity, a finite jump, but this leads directly to similar results for any function $F(x)$ at an isolated discontinuity which is a finite jump, if $F(x)$ is of such a nature that the expansion of $F(x)$ would converge uniformly in the neighborhood of the point of discontinuity were that discontinuity removed by the addition of a function constant except for a finite jump.
Let us consider the function
\[
f(x) = \left\{ \begin{array}{ll}
1, & 0 \leqq x < a, \\
0, & a < x \leqq 1.
\end{array} \right .
\]
If $a$ is dyadically rational, $f(x)$ can be expressed as a finite sum of functions $\varphi$\footnote[1]{A discontinuity at $x=0$ or $x=1$ is slightly different [compare the first footnote of \S~\ref{sec:SetPhi}]. Under the present definition of the $\varphi$'s it acts like an artificial discontinuity in the interior of the interval and has no effect on the sequence representing the function.}, and thus is represented uniformly, if we make the definition $f(a)=\frac{1}{2}[f(a-0)+f(a+0)]$; this follows from the evident possibility of expanding $f(x)$ in terms of the functions $f_0, f_1, f^{(1)}_2, \cdots$.
If the point $a$ is dyadically irrational, $f(x)$ {\it cannot be expanded in terms of the $\varphi$}. The formal development of $f(x)$ converges in fact for every value of $x$ other than $a$ and diverges for $x=a$\footnote[2]{This was pointed out for the set $\chi$ by Faber, {\it Jahresbericht der deutschen Mathematiker-Vereinigung}, Vol.\@ 19 (1910), pp.\@ 104--112.}. The convergence for $x \not= a$ follows, indeed, from Theorem IV. We proceed to demonstrate the divergence.
Use the dyadic notation
%%
\[\begin{array}{rr}
\displaystyle ~~~~~~~~~~~~~~~~~~~~a=\frac{a_1}{2^1} + \frac{a_2}{2^2} + \frac{a_3}{2^3} + \cdots, & ~~~~~~~~~~~~~~~~~~~~~a_n = 0 ~\mbox{or}~ 1.
\end{array}\]
%%
The partial sum
%%
\begin{equation*}
\begin{array}{l}
\displaystyle S^{(k)}_n(x) = \varphi_0(x) \int_0^1 f(y)\varphi_0(y)\mathit{dy} + \varphi_1(x)\int_0^1 f(y)\varphi_1(y)\mathit{dy} \\
\displaystyle \hspace{70mm} +\cdots+ \varphi^{(k)}_n(x) \int_0^1 f(y)\varphi^{(k)}_n(y)\mathit{dy}
\end{array}
\end{equation*}
%%
is in the sense of least squares the best approximation to $f(x)$ that can be formed from the functions $\varphi_0, \varphi_1, \cdots, \varphi^{(k)}_n$. It is therefore true that when $k=2^{n-1}$, on every subinterval $\displaystyle \left( \frac{r}{2^n},\frac{r+1}{2^n}\right)$ on which $f(x)$ is constant, $S^{(k)}_n(x)$ is also constant and equal to $f(x)$. On that subinterval $\displaystyle \left( \frac{m}{2^n},\frac{m+1}{2^n}\right)$ which contains the point $a$, $S^{(k)}_n(x)$ has the value
%%
\begin{equation}
2^n a-m=\frac{a_{n+1}}{2^1} + \frac{a_{n+2}}{2^2}+\frac{a_{n+3}}{2^3}+\cdots,
\end{equation}
which lies between zero and unity. Thus $S^{(k)}_n(x) [n>1]$ is a function with two points of discontinuity and which takes on three distinct values at its totality of points of continuity.
The infinite series corresponding to the sequence (25) is
%%
\begin{equation}
\begin{array}{l}
\displaystyle \left( \frac{a_2}{2^1}+\frac{a_3}{2^2}+\frac{a_4}{2^3}+\cdots\right)+\left(\frac{a_3}{2^2}+\frac{a_4}{2^3}+\cdots-\frac{a_2}{2}\right) \\
~ \\
\displaystyle \hspace{38mm} +\left( \frac{a_4}{2^2}+\frac{a_5}{2^3}+\frac{a_6}{2^4}+\cdots-\frac{a_3}{2}\right) \\
~ \\
\displaystyle \hspace{50mm} +\left( \frac{a_5}{2^2}+\frac{a_6}{2^3}+\frac{a_7}{2^4}+\cdots-\frac{a_4}{2}\right)+\cdots.
\end{array}
\end{equation}
%%
Not all numbers $a_n$ after a certain point can be zero and not all of them can be unity, so the general term of the series (26) cannot approach zero and the sequence (25) cannot converge.
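In detail (a step filled in here), the $n$-th term of (26) is, for $n \geqq 2$,
\[
-\frac{a_n}{2}+\frac{a_{n+1}}{2^2}+\frac{a_{n+2}}{2^3}+\cdots.
\]
Since $a$ is dyadically irrational, there are infinitely many values of $n$ for which $a_n=1$ and $a_{n+1}=0$, and for each of them this term is at most
\[
-\frac{1}{2}+\frac{1}{2^3}+\frac{1}{2^4}+\cdots = -\frac{1}{4}.
\]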
It is likewise true that the sequence (25) is not always summable and if summable may not be summable to the value $\frac{1}{2}$. Thus if we choose
\[
a=\frac{1}{2}+\frac{1}{2^2}+\frac{0}{2^3}+\frac{1}{2^4}+\frac{1}{2^5}+\frac{0}{2^6}+\frac{1}{2^7}+\cdots,
\]
the sequence (25) is summable to the sum $\frac{2}{3}$. Likewise the sequence $S^{(k)}_n(x)$ for $x=a$ and where we consider all values of $n$ and $k$, is summable to the value $\frac{2}{3}$.
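The first of these statements can be verified directly (a check supplied here). The digits of $a$ have period three, so
\[
a = \left( \frac{1}{2}+\frac{1}{2^2} \right) \left( 1+\frac{1}{2^3}+\frac{1}{2^6}+\cdots \right) = \frac{3}{4}\cdot\frac{8}{7} = \frac{6}{7},
\]
and the terms of (25) for $n=1,2,3,4,\cdots$ are $\frac{5}{7},\frac{3}{7},\frac{6}{7},\frac{5}{7},\cdots$, repeating with period three. Their mean over one period is
\[
\frac{1}{3}\left( \frac{5}{7}+\frac{3}{7}+\frac{6}{7} \right) = \frac{2}{3}.
\]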
The general behaviour of $S^{(k)}_n(x)$ for $f(x)$ where we do not make the restriction $k=2^{n-1}$ is quite easily found from the behaviour for $k=2^{n-1}$ and the relation
\[
\varphi^{(i)}_n(a) \int_0^1 f(y)\varphi^{(i)}_n(y)\mathit{dy} = \varphi^{(k)}_n(a)\int_0^1f(y)\varphi^{(k)}_n(y)\mathit{dy},
\]
which holds for all values of $i$, $k$, and $n$.
In fact there occurs a phenomenon quite analogous to Gibbs's phenomenon for Fourier's series. For the set $\varphi$, the approximating functions are uniformly bounded. The peaks of the approximating function $S^{(k)}_n$ disappear entirely for $k=2^{n-1}$ but reappear (usually altered in height) for larger values of $n$.
It is clear that the facts concerning the approximating curves for $f(x)$ hold without essential modification for a function of limited variation at a simple finite discontinuity, and that the facts for the summation of the approximating sequence hold without essential modification for a function continuous except at a simple finite discontinuity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{\bf The Uniqueness of Expansions.}
\label{Uniqueness}
We now study the possibility of a series of the form
\begin{equation}
a_0\varphi_0(x) + a_1\varphi_1(x)+ \cdots +a_n\varphi_n(x)+\cdots
\end{equation}
which converges on $0 \leqq x \leqq 1$ to the sum zero, with the possible exception of a certain number of points $x$. Faber has pointed out\footnote[1]{L.\@ c., p.\@ 111.} that there exists a series of the functions $\chi^{(k)}_n(x)$ which converges to zero except at one single point, and the convergence is uniform except in the neighborhood of that point.
We state for reference the easily proved
%%
\begin{lemma}
If the series (27) converges for even one dyadically irrational value of $x$, then $\displaystyle \lim_{n=\infty}a_n=0$.
\end{lemma}
This lemma results immediately from the fact that $\varphi^{(k)}_n(x) = \pm 1$ if $x$ is dyadically irrational\footnote[1]{This lemma is closely connected with a general theorem due to Osgood, {\it Transactions of the American Mathematical Society}, Vol.\@ 10 (1909), pp.\@ 337--346.
See also Plancherel, {\it Mathematische Annalen}, Vol.\@ 68 (1909--1910), pp.\@ 270--278.}.
We shall now use this lemma to establish
%%
\begin{theorem}
If the series (27) converges to the sum zero uniformly except in the neighborhood of a single value of $x$, then $a_n=0$ for every $n$.
\end{theorem}
We phrase the argument to apply when this exceptional value $x_1$ is dyadically irrational. If $x_1>\frac{1}{2}$, we have for $0 \leqq x \leqq \frac{1}{2}$,
\[\begin{array}{rcl}
\displaystyle a_0\varphi_0(x) + a_1\varphi_1(x)+ \cdots +a_n\varphi_n(x)+\cdots & = & 0, \\
\displaystyle (a_0+a_1)\varphi_0(y) + (a_2+a_3)\varphi_1(y)+ (a_4+a_5)\varphi_2(y)+\cdots & = & 0,
\end{array}\]
for every value of $y=2x$. Then we have from the uniformity of the convergence,
\begin{equation}
a_0 + a_1 = 0, \hspace{10mm} a_2 + a_3 = 0, \hspace{10mm} a_4 + a_5 = 0, \hspace{10mm} \cdots.
\end{equation}
If $x_1 < \frac{3}{4}$, we have for $\frac{3}{4} \leqq x \leqq 1$,
\[
a_0\varphi_0(x) + a_1\varphi_1(x)+ \cdots +a_n\varphi_n(x)+\cdots = 0,
\]
or for $0 \leqq y \leqq 1$, $y = 4x-3$,
%%
\begin{equation*}
\begin{array}{l}
\displaystyle (a_0-a_1+a_2-a_3)\varphi_0(y)+(a_4-a_5+a_6-a_7)\varphi_1(y) \\
\displaystyle \hspace{43mm} +\cdots+(a_{4n}-a_{4n+1}+a_{4n+2}-a_{4n+3})\varphi_n(y) +\cdots=0.
\end{array}
\end{equation*}
%%
From the uniformity of the convergence we have
%%
\[\begin{array}{cc}
\displaystyle a_0-a_1+a_2-a_3 = 0,\\
\displaystyle a_4-a_5+a_6-a_7 = 0,\\
.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm},
\end{array}\]
or from (28)
\[\begin{array}{cc}
\displaystyle a_0=-a_1=-a_2=a_3,\\
\displaystyle a_4=-a_5=-a_6=a_7,\\
.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm}.\hspace{5mm},
\end{array}\]
If $x_1<\frac{5}{8}$, we have for $\frac{5}{8} \leqq x \leqq \frac{3}{4}$,
\[
a_0\varphi_0(x) + a_1\varphi_1(x)+ \cdots = 0,
\]
or for $0 \leqq y \leqq 1$, $y = 8x-5$,
%%
\begin{equation*}
\begin{array}{l}
\displaystyle (a_0-a_1-a_2+a_3-a_4+a_5+a_6-a_7)\varphi_0(y) \\
\displaystyle \hspace{23mm} +(a_8-a_9-a_{10}+a_{11}-a_{12}+a_{13}+a_{14}-a_{15})\varphi_1(y) +\cdots=0.
\end{array}
\end{equation*}
Then each of these coefficients must vanish, and hence
\[
a_0=-a_1=-a_2=a_3=a_4=-a_5=-a_6=a_7.
\]
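To spell out this step (a verification supplied here): the relations $a_0=-a_1=-a_2=a_3$ and $a_4=-a_5=-a_6=a_7$ already obtained give
\[
a_0-a_1-a_2+a_3-a_4+a_5+a_6-a_7 = 4a_0-4a_4,
\]
so the vanishing of the first coefficient yields $a_4=a_0$.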
Continuation in this way together with the Lemma shows that every $a_n$ must vanish. This reasoning is typical and does not essentially depend on our numerical assumptions about $x_1$. Then Theorem IX is proved.
The reasoning is precisely similar if instead of the hypothesis of Theorem IX we admit the possibility of a finite number of points in the neighborhood of each of which the convergence is not assumed uniform:
\begin{theorem}
If the series
\[
a_0\varphi_0(x) + a_1\varphi_1(x)+ \cdots +a_n\varphi_n(x)+\cdots
\]
converges to the sum zero uniformly, $0 \leqq x \leqq 1$, except in the neighborhood of a finite number of points, then $a_0=a_1=a_2=\cdots=a_n=\cdots=0$.
\end{theorem}
\parbox{1.5in}{\centering\small\sc Harvard University,\newline
~~~~~~~May, 1922.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newpage
This paper has been copied from the original article published in the American Journal of Mathematics, 1923, volume 45, pages 5--24.
It has been prepared by Neil Johnson, using TexShop for OSX, in plain \LaTeX\ with additional maths symbols from the {\tt amssymb} package.
Any remaining errors are most likely mine, although I did spot (and in a couple of places correct) minor errors in the original.
\
\parbox{2in}{\centering\small\sc Neil Johnson,\newline
Cambridge, December, 2003.}
\end{document}