\magnification 1200

\hsize = 6truein
\vsize = 9truein
\normalbaselines
\overfullrule=0pt

\font\smcp=cmcsc8 
\headline={\ifnum\pageno>1 
   {\smcp the electronic journal of combinatorics {\bf 5} (1998), \#R5 
    \hfill\folio} \fi} 
\nopagenumbers

\newcount\seccount
\def\section#1\par{\vskip 30pt
    \advance\seccount by 1
    \lemmacount=0 \thmcount=0
    \centerline{\bf\the\seccount. #1}
    \vskip12pt\noindent}
\def\labelsection#1{\edef#1{\the\seccount}}
\def\nonumsection#1\par{\vskip 30pt%
    \centerline{\bf #1}
    \vskip12pt\noindent}

\def\itemskip{\smallskip}
\def\betweenskip{\medskip}

\def\Item{\itemskip\item}
\def\Bitem{\Item{$\bullet$}}
\def\Itemitem{\itemskip\itemitem}
\def\Bitemitem{\Itemitem{$\bullet$}}

\def\define#1{\Bitem{\bf #1}:}
%\def\define#1{\Bitem}

\newcount\lemmacount  \newcount\thmcount  \newcount\corcount

\def\thmfont{\it}
\def\theorem #1{\advance\thmcount by 1\corcount=0
    \edef#1{\the\seccount.\the\thmcount}
    \betweenskip\noindent
    {\bf Theorem~\the\seccount.\the\thmcount}.\thmfont~~}
\def\corollary #1{\advance\corcount by 1
    \edef#1{\the\seccount.\the\thmcount.\the\corcount}
    \betweenskip\noindent
    {\bf Corollary~\the\seccount.\the\thmcount.\the\corcount}.\thmfont~~}
\def\lemma #1{\advance\lemmacount by 1
    \edef#1{\the\seccount.\the\lemmacount}
    \betweenskip\noindent
    {\bf Lemma~\the\seccount.\the\lemmacount}.\thmfont~~}
\def\endthing{\rm}
\def\Endthing{\rm\betweenskip}

\def\proof{\betweenskip\noindent
    {\bf Proof}: }
\def\Proof #1{\betweenskip\noindent
    {\bf Proof} (of #1): }
\def\qed{~~\lower.3ex\vbox{\hrule\hbox{\vrule
     \vbox to1.8ex{\vfill\hbox to1.3ex{\null\hfill}}\vrule}\hrule}}

\newcount\eqcount \eqcount=0
\def\equno#1{\global\advance\eqcount by 1
    \xdef#1{(\the\eqcount)}
    (\the\eqcount)}
\def\eqlabel{\eqno\equno}

\def\lCeil{\bigl\lceil}
\def\rCeil{\bigr\rceil}
\def\comp[#1:#2]{c_{#1:#2}}

% Comparator net drawing macros
\def\Hor{\vrule width\HalfWidth height2.2pt depth-1.8pt}
\def\HOR{\vrule width\HalfWidth height2.8pt depth-1.2pt}
\def\Nul{\vrule width\HalfWidth height2.0pt depth-2.0pt}
\def\Left{\Hor\vrule\Nul}  \def\LEFT{\HOR\vrule\Nul}
\def\Rite{\Nul\vrule\Hor}  \def\RITE{\Nul\vrule\HOR}
\def\Thru{\Hor\vrule\Hor}  \def\THRU{\HOR\vrule\HOR}
\def\None{\Nul\vrule\Nul}
\def\smallstrut{\vrule width0pt height4pt}
\def\ArbNet#1#2{{\def\L{\Left}\def\R{\Rite}\def\T{\Thru}\def\N{\None}%
   \vbox{\offinterlineskip\def\HalfWidth{#1}%
      \tabskip=0pt\halign{\hfill\smallstrut##\hfill&&\hfill##\hfill\cr#2}}}}
\def\NNet#1{\ArbNet{6pt}{#1}}
\def\WNet#1{\ArbNet{12pt}{#1}}

% ``Tree'' drawing macros
\def\Lend{\hfill\vrule height4.4pt width.4pt\vrule height4.4pt depth-4pt
width 4pt}
\def\Rend{\vrule height4.4pt depth-4pt width 4pt\vrule height4.4pt
width.4pt\hfill}
\def\Bar{\vrule height4.4pt depth-4ptwidth8pt}
\def\VBar{\vrule height4.4pt depth-4ptwidth3.8pt%
          \vrule height8pt depth-4pt width.4pt%
          \vrule height4.4pt depth-4ptwidth3.8pt}
\def\Tree#1{\vcenter{\tabskip=0pt\halign{&\hfil##\hfil\cr#1}}}

\font\tenBB=msbm10
\font\sevenBB=msbm7
\font\fiveBB=msbm5
\def\Jap#1{\mathchoice{\hbox{\tenBB  #1}}{\hbox{\tenBB  #1}}%
                     {\hbox{\sevenBB #1}}{\hbox{\fiveBB #1}}}
\def\complex{\Jap C}
\def\reals{\Jap R}
\def\integers{\Jap Z}

\def\set#1{\underline{#1}}
\def\Sqrt#1{\sqrt{\mathstrut\smash{#1}}}
\def\M{{\cal M}}

\def\Aignerref {1}
\def\AjtKSref  {2}
\def\CanWref   {3}
\def\deBref    {4}
\def\DowPRSref {5}
\def\HongSref  {6}
\def\KamBWref  {7}
\def\Knuthref  {8}
\def\MilPTref  {9}


\null\vfill

\centerline{Periodic Sorting Using}
\centerline{Minimum Delay, Recursively Constructed Merging Networks}

\vskip12pt

\centerline{Edward A. Bender}
\centerline{Center for Communications Research}
\centerline{4320 Westerra Court}
\centerline{San Diego, CA 92121, USA}
\centerline{\tt ed@ccrwest.org}

\medskip

\centerline{S. Gill Williamson}
\centerline{Department of Computer Science and Engineering}
\centerline{University of California, San Diego}
\centerline{La Jolla, CA 92093-0114, USA}
\centerline{\tt gwilliamson@ucsd.edu}


\medskip

\centerline{Submitted: December 6, 1996}
\centerline{Submitted in revised form August 25, 1997}
\centerline{Accepted: December 9, 1997}

\bigskip

\nonumsection Abstract

Let $\alpha$ and $\beta$ be a partition of $\{1,\ldots,n\}$
into two blocks.
A merging network is a network of comparators which allows as input
arbitrary real numbers and has the property that, whenever the input
sequence $x_1,x_2,\ldots,x_n$ is such that the subsequence in the
positions $\alpha$ and the subsequence in the positions $\beta$ are
each sorted, the output sequence will be sorted. 
We study the class of ``recursively constructed'' merging networks and
characterize those with delay $\lceil\log_2 n\rceil$ (the best possible
delay for all merging networks).
When $n$ is a power of 2, we show that at least $3^{n/2-1}$ of these
nets are log-periodic sorters; that is, they sort any input sequence
after $\log_2n$ passes through the net. 
(Two of these have appeared previously in the literature.)

\vfill\vfill\vfill

\noindent
1991 AMS Classification Number.
\quad
Primary: 68P10

\eject


\section Introduction

This paper is divided into two main parts in two ways.
First, Sections~1--5 contain the concepts and results and
Sections~6--10 contain the proofs.
Second, one part of the paper deals with merging and the other with
sorting.
The concepts mentioned in the present section will be made precise in
the next section.

In software terms, a merging network is a program with no branching,
looping, or arithmetic other than the replacement of a pair of values
$(x,y)$ with $c(x,y)=\bigl(\min(x,y),\max(x,y)\bigr)$, called a
comparator.
In hardware terms, a merging network is a branch-free and feedback-free
circuit whose only logic units are comparators.
Given two ``interleaved'' sorted sequences, a merging net sorts the entire
sequence.
A sorting net is like a merging net except that the input is an
arbitrary sequence and the output is sorted.
Since comparators may operate in parallel when there is no overlap of
inputs, a considerable amount of parallelism is possible.
If a comparator takes one time unit, the delay of a net is its running
time when the most efficient parallelism is used.

The problem of designing $n$-input merging and sorting nets having
minimum delay or a minimum number of comparators has been studied by
many authors.
Knuth~[\Knuthref,~Sec.\thinspace5.3.4] discussed the history and results
concerning sorting and merging nets up to 1973.
Aigner~[\Aignerref,~Thm.\thinspace3.3] showed that the best merging
nets have delay $\lceil\log_2 n \rceil$ provided neither of the
sequences being merged is empty.
It has recently been shown by Miltersen, Paterson, and
Tarui~[\MilPTref] that a network for merging an $m$-long sequence and
an $n$-long one requires ${1\over2}(m+n)\log_2 m + O(n)$ comparators
provided $n\ge m$ and $m\to\infty$.

A simple information-theoretic argument shows that at least
$\log_2(n!)$ comparators are needed to sort $n$ items.
By Stirling's formula, it follows that the number of comparators is at
least $n\log n+O(n)$.
Since at most $\lfloor n/2\rfloor$ comparators can be executed
simultaneously, the delay of such a sorting net must be at least 
$\log n+O(1)$. 
Until Ajtai, Komlos, and Szemeredi~[\AjtKSref] showed that there are
networks for sorting $n$ items having delay on order of $\log n$ and
using on order of $n\log n$ comparators, researchers were unable to
approach such bounds.
Since all known families of nets of this type are quite complicated
and have very large factors multiplying both $\log n$ and $n\log n$,
it is natural to ask for simpler networks.
Some families have been found with delay times $(\log n+O(1))^2$.
Of particular interest are two found by Dowd, Perl, Rudolph, and
Saks~[\DowPRSref] and Canfield and Williamson~[\CanWref] that consist
of $\lceil\log_2 n\rceil$ repetitions of a merging network.
Such a net is called an $\lceil\log_2 n\rceil$-pass periodic sorter.
Kammeyer, Belew, and Williamson~[\KamBWref] conjectured two additional
such families based on empirical studies.
A pictorial representation of these nets for $n=16$ is given at the
end of the next section at the end.

In the present paper, we study a natural class of $n$-input merging
nets which we call recursive, focusing on those with minimum delay.
We characterize the structure of these nets and show that they achieve
the best possible delay, namely $\lceil\log_2 n\rceil$.
When $n$ is a power of two and the two sorted input
sequences have length $n/2$, we show that
\Item{(a)}
the least number of comparators needed is $(n/2)\log_2(n/2)+1$,
which achieves the asymptotic best possible bound of Miltersen et al, and
\Item{(b)}
at least $3^{n/2-1}$ of the nets sort after $\log_2 n$ passes,
thereby including the known and conjectured results of the previous
paragraph.


\section Definitions

Unfortunately, a variety of concepts are required.
Those needed for stating our main results are collected in this
section.
\define{regular expression notation}
Let $S$ be either a set of sequences or a single sequence.
In the latter case, we identify $S$ with $\{S\}$.
If $T$ is defined similarly, then $ST$ is the set of concatenations of
pairs of sequences, one from $S$ and one from $T$.
In particular, $S^k$ is the set of sequences formed by concatenating $k$
elements of $S$ with repetition allowed.
Also, $S^+$ is the union of $S^k$ over all $k>0$, and $S^*$ is $S^+$
with the empty sequence adjoined.
When it will not lead to confusion we sometimes abuse notation by
letting a set of sequences stand for some element of the set.
\define{(adjacent) comparator}
A {\it comparator\/} is a function $c:\reals^2\to\reals^2$ with
$$
c(x,y) = \Bigl(\min(x,y),\max(x,y)\Bigr).
$$
If $u\in\reals^n$ and $1\le i<j\le n$, then
$v=\comp[i:j](u)\in\reals^n$ is given by $v_k=u_k$ if $k\ne i,j$ and
$(v_i,v_j)=c(u_i,u_j)$.
We also call $\comp[i:j]$ a comparator.
If $j=i+1$, it is an {\it adjacent comparator}.
\define{network}
A {\it network of comparators\/}, or simply a {\it net}, is a
sequence of comparators $\comp[i_1:j_1]$, $\comp[i_2:j_2]$,
$\cdots,$ $\comp[i_k:j_k]$.
A function $f:\reals^n\to\reals^n$ given by
$$
f(s) = \comp[i_k:j_k]\bigl(\cdots\comp[i_1:j_1](s)\cdots\bigr)
$$
is associated with the net.
In other words, the function $f$ is a composition of the comparators
$\comp[i:j]$.
We call $s$ the {\it input\/} and $f(s)$ the {\it output\/} of the
net.
\itemskip\noindent
Here is a pictorial representation of the net
$\comp[1:3],\comp[2:4],\comp[1:2],\comp[2:3],\comp[1:4]$ for
$\reals^4$.
The inputs are shown being ``fed in'' at the top and comparators are
represented by horizontal bars.
Outputs emerge at the bottom.
$$
\vcenter{\NNet{
$s_1$&$s_2$&$s_3$&$s_4$\cr\cr
\N&\N&\N&\N\cr
\R&\T&\L&\N\cr
\N&\R&\T&\L\cr
\R&\L&\N&\N\cr
\N&\R&\L&\N\cr
\R&\T&\T&\L\cr
\N&\N&\N&\N\cr}}
\eqlabel{\simpleNet}
$$
The inputs and outputs of a net may be indexed by a set $\sigma$ other
than $\{1,2,\ldots,n\}$.
To keep track of the set, we refer to a net for $\sigma$.
\define{layer}
A sequence of comparators may be executed in parallel if and only if
it contains no repeated subscripts.
We call such a collection of comparators a {\it layer}.
In \simpleNet, $\comp[1:3]$ and $\comp[2:4]$ can be executed in
parallel, but $\comp[1:2]$ and $\comp[2:3]$ cannot.
\define{delay}
The {\it delay\/} of net is the minimum number of layers needed to
represent the net.
The delay of \simpleNet\ is 3 and the 3 layers are
$\{\comp[1:3],\comp[2:4]\}$, $\{\comp[1:2]\}$, and
$\{\comp[2:3],\comp[1:4]\}$.
\define{(trivial) partition $\vdash$}
If $\sigma$ is a set of integers, $|\sigma|$  denotes its size and
$\sigma\vdash\{\alpha,\beta\}$ is a {\it partition\/} of $\sigma$ into
two, possibly empty, subsets.
If either $\alpha$ or $\beta$ is empty, the partition is 
{\it trivial}.
\define{induced partitions}
If \hbox{$\sigma\vdash\{\alpha,\beta\}$} and
\hbox{$\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$}, let
$\alpha^{(i)}=\alpha\cap\sigma^{(i)}$ and
$\beta^{(i)}=\beta\cap\sigma^{(i)}$.
The {\it induced partitions\/} are
\hbox{$\alpha\vdash\{\alpha^{(1)},\alpha^{(2)}\}$},
\hbox{$\beta\vdash\{\beta^{(1)},\beta^{(2)}\}$},
\hbox{$\sigma^{(1)}\vdash\{\alpha^{(1)},\beta^{(1)}\}$}, and
\hbox{$\sigma^{(2)}\vdash\{\alpha^{(2)},\beta^{(2)}\}$}.
\define{(bi)alternating}
Suppose $\delta\vdash\{\epsilon,\zeta\}$ is such that, when the
elements of $\delta$ are  listed in order, $\epsilon$ contains every
other element of $\delta$ (and hence so does $\zeta$).
We say that the partition $\delta\vdash\{\epsilon,\zeta\}$ is
{\it alternating}.
We call $\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$ {\it bialternating\/}
(with respect to $\alpha$ and $\beta$) if the induced partitions of
$\alpha$ and $\beta$ are each alternating.
\itemskip\noindent
For example listing the elements of $\sigma$ in order and using
subscripts to denote elements of $\alpha^{(i)}$ and $\beta^{(i)}$,
$$
\sigma = \alpha^{(1)}_1,\beta^ {(2)}_1,
\beta^ {(1)}_1,\beta^ {(2)}_2,
\alpha^{(2)}_1,\alpha^{(1)}_2,
\alpha^{(2)}_2,\beta^ {(1)}_2
\eqlabel\bialteq
$$
is bialternating, but
$$
\sigma = \alpha^{(1)}_1,\alpha^{(2)}_1,
\beta^ {(1)}_1,\alpha^{(2)}_2,
\beta ^{(2)}_1,\alpha^{(1)}_2,
\beta ^{(2)}_2,\beta^ {(1)}_2
\eqlabel\notbialteq
$$
is not.
If $\sigma=\{1,\ldots,8\}$, the induced partitions of $\alpha$ and
$\beta$ are
$\alpha\vdash\bigl\{\{1,6\},\;\{5,7\}\bigr\}$ and
$ \beta\vdash\bigl\{\{2,4\},\;\{3,8\}\bigr\}$ for \bialteq, and
$\alpha\vdash\bigl\{\{1,6\},\;\{2,4\}\bigr\}$ and
$ \beta\vdash\bigl\{\{3,8\},\;\{5,7\}\bigr\}$ for \notbialteq.
\define{merger}
A {\it merging net}, or {\it merger},  for
$\sigma\vdash\{\alpha,\beta\}$ is a net such that, if the subsequence
of the input indexed by $\alpha$ is sorted and likewise for $\beta$,
then the output is sorted.
\define{recursive merger; correction subnet}
A merging net for
\hbox{$\sigma\vdash\{\alpha,\beta\}$}  is 
{\it recursive\/} if either
$\{\alpha,\beta\}$ is trivial (and so no comparators are needed) or
there is a partition
$\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$ such that the net
consists of recursive merging nets for $\sigma^{(1)}$
and $\sigma^{(2)}$ followed by an arbitrary comparator net, and the
partition satisfies
\Itemitem{(a)}
$\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}\ne \{\alpha,\beta\}$ 
is nontrivial;
\Itemitem{(b)}
if $|\alpha|\ge2$ and $|\beta|\ge2$, then the induced partitions
$\alpha\vdash\{\alpha^{(1)},\alpha^{(2)}\}$ and
$\beta\vdash\{\beta^{(1)},\beta^{(2)}\}$ are both nontrivial.
\Item{}
The arbitrary comparator net following $\sigma^{(1)}$ and
$\sigma^{(2)}$ is called the {\it correction subnet}. 
\itemskip\noindent
Condition (a) can be restated as:
At least one of the induced partitions\break
$\alpha\vdash\{\alpha^{(1)},\alpha^{(2)}\}$ and
$\beta\vdash\{\beta^{(1)},\beta^{(2)}\}$ is nontrivial, which is
weaker than~(b).
Condition~(a) prevents the trivial cases in which all the work
is done before the correction subnet
($\{\sigma^{(1)},\sigma^{(2)}\}$ trivial) or in the correction subnet
($\{\sigma^{(1)},\sigma^{(2)}\}=\{\alpha,\beta\}$) whereas (b)
requires that, when possible, the work must also be divided between
the merging nets for $\sigma^{(1)}$ and $\sigma^{(2)}$ as well.
\define{periodic sorter}
A net is called a $k$-{\it pass periodic} (or {\it sequential\/})
{\it sorter\/} if passing a sequence through $k$ copies of the net
always produces sorted output.
For example, the $2n$-input, delay-2 net
$\vcenter{\NNet{\R&\L&\R&\L&        &\L&\R&\L\cr
               \N&\N&\N&\N&$\cdots$&\N&\N&\N\cr
               \N&\R&\L&\R&        &\R&\L&\N\cr}}$
is an $(n/2)$-pass periodic sorter, called the {\it odd-even transposition
sort\/} [\Knuthref,~p.\thinspace241].

\itemskip\noindent
Here is the Dowd, Perl, Rudolph, and Saks 4-pass periodic sorter for $n=16$:
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\L\cr
\N&\R&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\L&\N\cr
\N&\N&\R&\T&\T&\T&\T&\T&\T&\T&\T&\T&\T&\L&\N&\N\cr
\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\R&\T&\T&\L&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\R&\L&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\T&\T&\T&\T&\L&\R&\T&\T&\T&\T&\T&\T&\L\cr
\N&\R&\T&\T&\T&\T&\L&\N&\N&\R&\T&\T&\T&\T&\L&\N\cr
\N&\N&\R&\T&\T&\L&\N&\N&\N&\N&\R&\T&\T&\L&\N&\N\cr
\N&\N&\N&\R&\L&\N&\N&\N&\N&\N&\N&\R&\L&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L\cr
\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
}}
$$
Here is the Canfield and Williamson 4-pass periodic sorter for $n=16$:
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
}}
$$


\section Theorems About Recursive Merging Networks

Our main theorem on the structure of recursive merging nets is:

\theorem\mainThm
Suppose that $\sigma=\{1,\ldots,n\}\vdash\{\alpha,\beta\}$ with
$|\alpha|\ge2$ and $|\beta|\ge2$.
The following are true:
\Item{(a)}
A minimum delay, recursive merging net for $\sigma$ has delay
$d=\lceil\log_2 n\rceil$.
\Item{(b)}
If there is a minimum delay recursive merging net associated with the
partition $\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$, then this
partition is bialternating and
$$
\lCeil\,\max\bigl(\log_2|\sigma^{(1)}|,\,\log_2|\sigma^{(2)}|\bigr)\rCeil <
\lCeil\,\log_2|\sigma|\,\rCeil.
\eqlabel\spliteqn
$$
\Item{(c)}
The following construction gives precisely the minimum delay recursive
merging nets associated with a bialternating partition
$\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$.
\Itemitem{(i)}
Suppose the minimum elements of $\alpha$ and $\beta$ both belong to
$\sigma^{(1)}$ or both belong to $\sigma^{(2)}$.
Then the correction subnet consists of $\comp[2k:2k+1]$ for 
$1\le k<|\sigma|/2$ and, optionally when $|\sigma|$ is even, 
$\comp[1:|\sigma|]$.
\Itemitem{(ii)}
Suppose one of the minimum elements of $\alpha$ and $\beta$ belongs to
$\sigma^{(1)}$ and the other to $\sigma^{(2)}$,.
Then the correction subnet consists of $\comp[2k-1:2k]$ for $1\le
k\le|\sigma|/2$.
\Itemitem{(iii)}
The recursive merging nets for $\sigma^{(1)}$ and $\sigma^{(2)}$ each
have delay at most $d-1$.
\Item{}
Note that by (i) and (ii), the correction subnet has delay 1.
\Endthing

\noindent
The condition in \spliteqn\ rarely fails.
Bialternating guarantees that $|\alpha^{(1)}|$ and $|\alpha^{(2)}|$
differ by at most one and likewise for $|\beta^{(1)}|$ and
$|\beta^{(2)}|$.
Hence $|\sigma^{(1)}|$ and $|\sigma^{(2)}|$ differ by at most 2, which
can lead to failure of \spliteqn\ when $n$ is nearly a power of 2.
Switching $\beta^{(1)}$ and $\beta^{(2)}$ if necessary, we can insure that
$|\sigma^{(1)}|$ and $|\sigma^{(2)}|$ differ by at most 1 and,
consequently, that \spliteqn\ holds.

Theorem~\mainThm\ enables us to say quite a bit about the structure of
recursive merging networks with minimum delay:
As long as $|\alpha|\ge2$ and $|\beta|\ge2$ there are only two ways to
partition the sequence $\sigma$ for recursion and the correction
subnet is practically determined.
In this case, if
$$
\lCeil\log_2|\sigma^{(i)}|\rCeil = \lCeil\log_2|\sigma|\rCeil-1,
\eqlabel\logequality
$$
then the net for $\sigma^{(i)}$ is also minimum delay.
{From} Theorem~\mainThm(b), there are only two ways in which
\logequality\ can fail to hold for $i=1,2$:
\Item{(a)}
$|\sigma|=2^k+1$ for some $k$, in which case the values of
$|\sigma^{(i)}|$ are $2^{k-1}+1$ and $2^{k-1}$.
\Item{(b)}
$|\sigma|=2^k+2$ for some $k$, in which case the values of
$|\sigma^{(i)}|$ can be $2^{k-1}+2$ and $2^{k-1}$.
\itemskip\noindent
The following corollary avoids these problems by limiting $|\sigma|$ to
powers of~2.


\corollary\powertwoCor
Suppose $|\sigma|=n=2^k$.
Fix a partition $\sigma\vdash\{\alpha,\beta\}$ with\break
$|\alpha|=|\beta|=n/2$.
Given  a sequence $s_1,s_2,\ldots$, call  a maximal subsequence whose
successive indices differ by $t$ a {\it distance-$t$ subsequence}.
The following are true.
\Item{(a)}
The minimum delay of a recursive merging net for $\sigma$ is
$k=\log_2n$.
\Item{(b)}
Every minimum delay recursive merging net for $\sigma$ has the
following structure:
\Itemitem{$-$}
The top $j$ layers together form $2^{k-j}$ disjoint minimum delay
recursive merging nets, each of which has $2^{j-1}$ inputs from
$\alpha$ and $2^{j-1}$ inputs from $\beta$.
Furthermore, the $2^{j-1}$ elements from $\alpha$ (resp.\ $\beta$) are
a distance-$2^{k-j}$ subsequence of the sequence $\alpha$
(resp.\ $\beta$).
\Itemitem{$-$}
The first layer consists of $n/2$ comparators, each comparing an
$\alpha$ and a $\beta$.
\Itemitem{$-$}
For $j>1$, the $j$th layer of the net consists of $2^{k-j}$ parallel
correction subnets as described in Theorem~\mainThm(c).
\Item{(c)}
The number of such nets is $3^{n/2-1}$.
If we disallow the optional comparators (the $\comp[1:|\sigma|]$ of
Theorem~\mainThm(i)), then the number of nets is $2^{n/2-1}$.
\Item{(d)}
The minimum number of comparators in a minimum delay recursive
merging net for $\sigma$ is ${(k-1)n\over2} + 1$ and there is exactly
one such net.
\Endthing

Two aspects of the corollary may be misleading.
First, (d) does not claim that all ${(k-1)n\over 2}+1$ comparators are
essential.
All comparators are essential {\it when they are first introduced\/}
in correction subnets as part of the the recursive construction.
Later comparators may make some earlier comparators redundant.
We believe that this does not happen, but have been unable to prove it.

Second, it appears to be a simple matter to count all delay-$k$
recursive merging nets with $|\alpha|=|\beta|=2^{k-1}$, but
this depends on what is meant by ``all'' nets.
If a merging net is defined to include both the partition and the
comparators, then enumeration is trivial:
There are ${1\over2}{n\choose n/2}$ ways to partition $\sigma$ into
two equally long subseqences.
(The factor of one-half occurs because we do not distinguish between
$\alpha$ and $\beta$.)
If a merging net is defined to be {\it only\/} the arrangement of
comparators, then we are unable to count them.
Problems arise because the same net can be obtained for more than one
partition of $\sigma$.
These partitions can be created by selectively switching
$\alpha^{(2)}$ and $\beta^{(2)}$ at various levels in the recursion.
Such a switch changes $\{\alpha,\beta\}$ but it does not change the
net if
\Item{(a)}
the result is still bialternating and
\Item{(b)}
at all levels of the recursive construction, the use of (i) or (ii) in
Theorem~\mainThm\ is unchanged.
\itemskip\noindent
To illustrate the problem (a) causes, note our switching changes
\bialteq\ to \notbialteq.
We clarify (b) by considering nets for $\{1,2,3,4\}$ with
$|\alpha|=|\beta|=2$.
Then $\{\alpha,\beta\}$ is one of $\bigl\{\{1,2\},\{3,4\}\bigr\}$,
$\bigl\{\{1,3\},\{2,4\}\bigr\}$, and $\bigl\{\{1,4\},\{2,3\}\bigr\}$.
We may assume $1\in\alpha$.
Here are the nine minimum delay recursive nets.
$$
\vbox{\halign{\strut#&&\quad#\quad\cr
\WNet{%
$\alpha^{(1)}$&$\alpha^{(2)}$&$\beta^{(1)}$&$\beta^{(2)}$\cr\cr
\N&\N&\N&\N\cr
\R&\T&\L&\N\cr
\N&\R&\T&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\alpha^{(2)}$&$\beta^{(1)}$&$\beta^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\T&\L&\N\cr
\N&\R&\T&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\R&\T&\T&\L\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\alpha^{(2)}$&$\beta^{(2)}$&$\beta^{(1)}$\cr
\N&\N&\N&\N\cr
\R&\T&\T&\L\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr}
\cr\cr
\WNet{%
$\alpha^{(1)}$&$\beta^{(1)}$&$\alpha^{(2)}$&$\beta^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\beta^{(1)}$&$\alpha ^{(2)}$&$\beta^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\R&\T&\T&\L\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\beta^{(2)}$&$\alpha ^{(2)}$&$\beta^{(1)}$\cr
\N&\N&\N&\N\cr
\R&\T&\T&\L\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr}
\cr\cr
\WNet{%
$\alpha^{(1)}$&$\beta^{(1)}$&$\beta^{(2)}$&$\alpha^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\beta^{(1)}$&$\beta^{(2)}$&$\alpha^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\N&\R&\L&\N\cr
\R&\T&\T&\L\cr
\N&\N&\N&\N\cr}
&
\WNet{%
$\alpha^{(1)}$&$\beta^{(2)}$&$\beta^{(1)}$&$\alpha^{(2)}$\cr
\N&\N&\N&\N\cr
\R&\T&\L&\N\cr
\N&\R&\T&\L\cr
\N&\N&\N&\N\cr
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr}
\cr}}
$$
The rows in this array correspond to partitions $\{\alpha,\beta\}$ of
$\sigma=\{1,2,3,4\}$, the first two columns correspond to
cases in which Theorem~\mainThm(i) applies, and 
the last column to the cases in which Theorem~\mainThm(ii) applies.
The input position labels indicate the set to which the position
belongs.
Refer to a net in this array as $N_{i,j}$ in the usual matrix fashion.
Interchanging $\alpha^{(2)}$ and $\beta^{(2)}$ creates the following
interchanges of identical nets $N_{1,3}\leftrightarrow N_{2,3}$,
$N_{2,1}\leftrightarrow N_{3,1}$, and 
$N_{2,2}\leftrightarrow N_{3,2}$, and condition~(b) holds.
In contrast, the interchange $\alpha^{(2)}\leftrightarrow\beta^{(2)}$
interchanges the partition associated with $N_{3,3}$ and the
partition associated with $N_{1,1}$ and $N_{1,2}$, condition~(b)
fails, and so the correction subnets change.

\betweenskip

In Corollary \powertwoCor, the only unused positions in a layer are
those associated with the nonuse of the optional comparators in
Theorem~\mainThm(i).
In the next section, we present examples which show that connecting
some of these positions can destroy a net's merging capability.
The following result shows that some connections do not.

\theorem\extrasmergeThm
Let $|\alpha|=|\beta|$ be a power of two and consider the recursive
merging nets described in Corollary~\powertwoCor. 
For the $j$th layer, let $\lambda_j$ be the set of left ends of
missing optional comparators and let $\rho_j$ be the missing right
ends.
Adding comparators of the form $\comp[l:r]$ with $l\in\lambda_j$ and
$r\in\rho_j$ does not destroy the merging capability of the net.
In fact, such comparators have no effect because their inputs are
always already in order.
\Endthing

\noindent
Suppose $|\sigma|=n=2^k$ and $\sigma\vdash\{\alpha,\beta\}$.
At the $j$th layer up from the output, there are $2^{j-1}$ interleaved
merging nets.
If $t$ of them could have optional comparators, $|\lambda|=|\rho|=t$
and so the number of ways to construct a set of $i$ comparators
$\comp[l:r]$ is ${t\choose i}^2 i!$.
It follows that the theorem leads to
$$
\prod_{j=1}^{k-1}\biggl(\,\sum_{t=0}^{2^{j-1}}{2^{j-1}\choose t}
\sum_{i=0}^t{t\choose i}^2 i!\biggr)
$$
sorting nets for each partition $\sigma\vdash\{\alpha,\beta\}$.
Messy asymptotic estimates show that this behaves like $(n/2)^{n/2}$,
which is considerably larger than the $3^{n/2-1}$ of
Corollary~\powertwoCor(c).



\section Theorems About Periodic Sorting Networks

We now turn our attention to periodic sorting nets built from 
recursive merging nets.
For simplicity, we limit our attention to $|\alpha|=|\beta|=n/2$ a
power of~2, say $n=2^k$.
Corollary \powertwoCor\ describes how all such recursive merging nets
are constructed.
In this case, two types of periodic $k$-pass sorters are known.
We show that they are members of a larger family.

Since a network is a periodic sorter after a sufficiently large number
of passes if and only if it contains all adjacent
comparators~[\DowPRSref,~Thm.\thinspace1], two natural questions are:
\Item{(a)}
Which recursive merging nets contain all adjacent comparators?
\Item{(b)}
Of those nets in (a), which are $k$-pass periodic sorters?
\itemskip\noindent
The first question has a relatively simple answer, but the other
appears more difficult.
The fact that (a) is not sufficient will be illustrated by an example.
We do not have a theorem similar to Theorem~\extrasmergeThm\ for
periodic sorting nets.
However, Theorem~\extrasmergeThm\ can be used to add comparators to
sorting nets which are based on a series of merges, as is the case for
the Batcher sort~[\Knuthref,~Sec.\thinspace5.3.4].

\theorem\sortThm
If $|\sigma|=n=2^k$ and $\sigma\vdash\{\alpha,\beta\}$ is alternating,
then every recursive merging net for $\sigma$ is a $k$-pass periodic
sorter.
\Endthing
\noindent
By Corollary \powertwoCor(c), there are $3^{n/2-1}$ such nets.
We conjecture the nets in Theorem~\extrasmergeThm\ are $k$-pass
periodic sorters whenever $\sigma\vdash\{\alpha,\beta\}$ is
alternating.
This has been verified up to $|\sigma|=16$ by computer.

The construction of a recursive merging net can be described by a
simple tree constructed as follows:
\Item{$-$}
At each vertex, a bialternating partition of the lines is constructed
by placing the leftmost $\alpha$ line in $\alpha^{(1)}$ and the
leftmost $\beta$ line in $\beta^{(i)}$, according to the label $i$ at the
vertex.
\Item{$-$}
The left son corresponds to $\sigma^{(1)}$ and the right to
$\sigma^{(2)}$.
\itemskip
\noindent
The periodic sorting nets given by Dowd, Perl, Rudolph, and
Saks~[\DowPRSref] correspond to trees having all vertices labeled~2. 
The nets given by Canfield and Williamson~[\CanWref] correspond to
trees having all vertices labeled~1 and having none of the optional
comparators of Theorem~\mainThm(i).
In both cases, $\sigma\vdash\{\alpha,\beta\}$ is alternating, and so
these nets are included in Theorem~\sortThm.
Kammeyer, Belew, and Williamson~[\KamBWref] discovered the two nets
for $n=16$ in which the root and its sons have one value and the
remaining vertices have the other value.
Based on these, they conjectured a general pattern which is included
in the families in Corollary~\powertwoCor.

\theorem\alladjThm
Call a partition $\tau\vdash\{\tau^{(1)},\tau^{(2)}\}$
\Item{$-$}
type 1 if listing $\tau$ in order gives an element of
$\bigl(t^{(1)}t^{(1)}t^{(2)}t^{(2)}\bigr)^*$,
or
\Item{$-$}
type 2 if listing $\tau$ in order gives an element of
$\bigl(t^{(1)}t^{(2)}t^{(2)}t^{(1)}\bigr)^*$,
\itemskip\noindent
where $t^{(i)}$ stands for any element of $\tau^{(i)}$.
Other partitions have no type.
Suppose $|\alpha|=|\beta|$, a power of~2.
We will ``mark'' certain vertices of the tree described above and only
consider the type of a marked vertex.
The partition $\sigma\vdash\{\sigma^{(1)},\sigma^{2)}\}$ of the input
lines $\sigma$ associated with a vertex determines its type, if any.
Thus, each vertex will be labeled 1 or 2,
and perhaps be marked and typed.
The tree for a recursive merger has all adjacent comparators if and
only if:
\Item{(a)}
Every marked vertex has type equal to its label.
\Item{(b)}
The root is marked.
\Item{(c)}
For every marked non-leaf vertex:
\Itemitem{(i)}
if its left son has the same label, it is also marked;
\Itemitem{(ii)}
if its right son is labeled 1, it is also marked.
\Endthing


The following merging net has all adjacent comparators but is not a
4-pass sorter.
The lines are labeled according to whether they are in $\alpha$ or
$\beta$.
An input sequence that causes failure and the corresponding output
sequence are given.
The tree associated with the net's construction is shown on the right.
The root, its right son, and the two rightmost leaves are marked.
% file nosort.aA 
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
1&1&1&0&1&0&0&0&0&0&0&0&0&0&0&0&~sort\cr\cr
$\alpha$&\AB&\AB&\AB&\AB&\AB&\AB&\AB&$\beta$\cr\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\T&\L&\R&\L&\R&\T&\T&\T&\T&\L&\R&\L&\N\cr
\N&\R&\L&\R&\T&\T&\T&\T&\L&\R&\L&\R&\T&\T&\T&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N&\N&\N\cr
\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\N\cr
\N&\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr\cr
0&0&0&0&0&0&0&0&0&0&0&1&0&1&1&1&~sort\cr
}}
\qquad\Tree{
     &     &     &    2\cr
     &\Lend&\Bar &\VBar&\Bar &\Rend\cr
     &    1&     &     &     &    1\cr
\Lend&\VBar&\Rend&     &\Lend&\VBar&\Rend\cr
    1&     &    1&     &    1&     &    1\cr}
$$



\section Adding Comparators Can Ruin a Net

\labelsection\ruinSection
It is natural to suppose that adding comparators to a sorting or
merging net will not destroy the ability of the net to sort or merge.
It was shown by de Bruijn~[\deBref] that, when a sorting net contains
only adjacent comparators, adding extra comparators does not destroy
the ability of the net to sort.
On the other hand, this is not true when nonadjacent comparators are
allowed
The net
$$
\vcenter{\NNet{
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\R&\L&\N\cr
\N&\N&\N&\N\cr
\R&\T&\T&\L\cr
\N&\R&\L&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr}}
\hbox{~~sorts, but the net~~}
\vcenter{\NNet{
\N&\N&\N&\N\cr
\R&\L&\R&\L\cr
\N&\R&\L&\N\cr
\RITE&\LEFT&\N&\N\cr
\R&\T&\T&\L\cr
\N&\R&\L&\N\cr
\R&\L&\R&\L\cr
\N&\N&\N&\N\cr}}
$$
which has $\comp[1:2]$ added, fails to sort the sequence $1,1,0,0$.
This simple counterexample is not recursively constructed and the
added comparator increases the delay.
There are some 16-input recursive merging nets that are
4-pass periodic sorters such that the addition of a comparator
destroys either the merging, the 4-pass periodic sorting, or both.
Examples of this were found by a computer, which was also used to
verify sorting and merging capabilities.
In the following three examples, the added comparator is indicated in
bold face.
In all cases, Theorems~\mainThm\ and~\sortThm\ insure that the
original nets merge and are 4-pass periodic sorters.
The following net is {\it not\/} a 4-pass periodic sorter, but is a
merger.
% file merge.nosort
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
1&1&0&1&1&1&0&0&1&1&0&0&0&0&0&0&~sort\cr\cr
\AB&\AB&\AB&\AB&\AB&\AB&\AB&\AB\cr\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\RITE&\THRU&\THRU&\THRU&\THRU&\THRU&\LEFT&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr\cr
0&0&0&0&0&0&0&0&1&0&1&1&1&1&1&1&~sort\cr
}}
\qquad
\Tree{
     &     &     &    1\cr
     &\Lend&\Bar &\VBar&\Bar &\Rend\cr
     &    1&     &     &     &    1\cr
\Lend&\VBar&\Rend&     &\Lend&\VBar&\Rend\cr
    1&     &    1&     &    1&     &    1\cr}
$$
The following net is a 4-pass periodic sorter, but is {\it not\/} a
merger.
% file sort.nomerge
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
1&0&1&0&1&0&1&1&1&1&1&1&1&1&1&1&~merge\cr\cr
\AB&\AB&\AB&\AB&\AB&\AB&\AB&\AB\cr\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N\cr
\N&\N&\R&\T&\T&\T&\T&\L&\N&\N&\R&\T&\T&\T&\T&\L\cr
\N&\N&\N&\R&\T&\T&\L&\N&\N&\N&\N&\R&\T&\T&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\RITE&\THRU&\LEFT&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\R&\T&\T&\L&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N&\N\cr
\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L&\N&\N&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr\cr
0&0&1&0&1&1&1&1&1&1&1&1&1&1&1&1&~merge\cr
}}
\qquad
\Tree{
     &     &     &    1\cr
     &\Lend&\Bar &\VBar&\Bar &\Rend\cr
     &    1&     &     &     &    2\cr
\Lend&\VBar&\Rend&     &\Lend&\VBar&\Rend\cr
    1&     &    1&     &    1&     &    1\cr}
$$
The following net is neither a 4-pass periodic sorter nor a merger.
The upper inputs and outputs are for the merge and the lower are for
the sort.
% file nomerge.nosort
$$
\vcenter{\def\AB{$\alpha$&$\beta$}\NNet{
1&0&1&0&1&1&1&1&1&1&1&1&1&1&1&1&~merge\cr\cr
1&1&1&1&1&0&0&1&0&0&0&1&1&0&0&0&~sort\cr\cr
\AB&\AB&\AB&\AB&\AB&\AB&\AB&\AB\cr\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\T&\T&\T&\T&\L&\R&\T&\T&\T&\T&\T&\T&\L\cr
\N&\R&\L&\R&\L&\R&\L&\N&\N&\R&\L&\R&\L&\R&\L&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\RITE&\LEFT&\N&\N&\N&\N&\N&\R&\L&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\R&\T&\T&\T&\T&\T&\T&\L&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L\cr
\N&\N&\R&\T&\T&\L&\R&\T&\T&\L&\R&\T&\T&\L&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr
\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L&\R&\L\cr
\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N&\N\cr\cr
0&1&0&1&1&1&1&1&1&1&1&1&1&1&1&1&~merge\cr\cr
0&0&0&0&0&0&0&1&0&1&1&1&1&1&1&1&~sort\cr
}}
\qquad
\Tree{
     &     &     &    2\cr
     &\Lend&\Bar &\VBar&\Bar &\Rend\cr
     &    2&     &     &     &    1\cr
\Lend&\VBar&\Rend&     &\Lend&\VBar&\Rend\cr
    1&     &    1&     &    1&     &    1\cr}
$$


\section Proof of Theorems \mainThm\ and \extrasmergeThm\  and
Corollary \powertwoCor

Let $\sigma\vdash\{\alpha,\beta\}$ be nontrivial.
We will prove the following two theorems in Sections 7 and~8.

\theorem\altThm
If $\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$ is bialternating,
$|\alpha|\ge2$, and $|\beta|\ge2$, then Theorem~\mainThm(i) and~(ii)
describe the minimum delay correction subnets.
\endthing

\theorem\nonaltThm
If $\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}$ is not bialternating
and the induced partitions of $\alpha$ and $\beta$ are not trivial,
then the correction subnet has delay greater than one.
\endthing

\Proof{Theorem \mainThm}
Without loss of generality, we may assume that 
$|\sigma^{(1)}| \ge |\sigma^{(2)}|$.
Since being recursive requires that
$\sigma\vdash\{\sigma^{(1)},\sigma^{(2)}\}\ne \{\alpha,\beta\}$ 
be nontrivial, it is easily seen that the correction subnet cannot be empty.
Since Aigner~[\Aignerref~Thm.\thinspace3.3] has shown that the delay of
any $n$-input merging net for two nonempty sequences is at least
$\lceil\log_2n\rceil$, it follows the delay for $\sigma$ is at least
$\lCeil\log_2|\sigma^{(1)}|\rCeil+1$.
Theorem~\mainThm(a) and~(c) now follow easily by using induction on $n$ and
Theorem~\altThm, provided \spliteqn\ holds.
The discussion immediately following the statement of
Theorem~\mainThm\ shows that it is always possible to satisfy \spliteqn.
Theorem~\mainThm(b) now follows easily from $d=\lceil\log_2 n\rceil$
and Theorem~\nonaltThm.\qed


\Proof{Corollary \powertwoCor}
The delay in (a) follows immediately from the theorem.
Whenever $|\alpha|$ (resp.\ $|\beta|$) is even, the bialternating
requirement guarantees that $|\alpha^{(1)}|=|\alpha^{(2)}|=|\alpha|/2$
(resp.\ $|\beta^{(1)}|=|\beta^{(2)}|=|\beta|/2$).
The recursive structure in (b) now follows immediately from the
theorem.

The distinction between $\alpha^{(1)}$ and $\alpha^{(2)}$
is irrelevant, so we may as well suppose that $\alpha^{(1)}$ contains
the first element of $\alpha$.
Given $\alpha^{(1)}$, the distinction between $\beta^{(1)}$ and
$\beta^{(2)}$ is relevant, because Theorem~\mainThm(b) shows that 
this leads to different correction subnets and hence to different
merging nets.
This fact guarantees that the nets counted in this proof are
distinct.

Let $N_n$ be the number of minimum delay recursive merging nets for
$\sigma$.
When $n\ge4$, the theorem tells us that we have minimum delay merging
nets on $\sigma^{(1)}$ and $\sigma^{(2)}$ and that
$|\alpha^{(i)}|=|\beta^{(i)}|=n/4$.
It also tells us that there are three choices for the correction subnet.
Hence $N_n = N_{n/2} \times N_{n/2} \times 3$ for $n\ge4$.
Note that $N_2=1$, since the net is a single comparator.
It is easily verified that $N_n=3^{n/2-1}$ is the solution to this
recursion.
If we disallow $\comp[1:|\sigma|]$, ``3'' is replaced by ``2'' in the
previous argument.

The minimum number of comparators, $C_k$, for the
$|\alpha|=|\beta|=2^{k-1}$ case is easily found.
The theorem tells us that there is a correction subnet with
$2^{k-1}-1$ comparators but not with less.
Hence $C_k = C_{k-1} + C_{k-1} + 2^{k-1} - 1$.
Since $C_1=1$, one easily verifies that $C_k = (k-1)2^{k-1} + 1$.
To achieve this minimum, case~(ii) in Theorem~\mainThm\ can never
occur.
Hence $\sigma\vdash\{\alpha,\beta\}$ determines
$\{\sigma^{(1)},\sigma^{(2)}\}$.\qed

\Proof{Theorem \extrasmergeThm}
It suffices to prove the last statement in the theorem for a single
comparator $\comp[l:r]$ added to the $j$th layer.
Relabeling $\alpha$ and $\beta$ if necessary, we can assume that
$l\in\alpha$.
After the $j$th layer, Corollary~\powertwoCor(b) guarantees that
position $l$ will contain the minimum of some set $\mu'$ of input
positions associated with one of the $2^{k-j}$ recursive merging nets
of Corollary~\powertwoCor(b).
Likewise, position $r$ will contain the maximum of some set
$\mu''$ of such a set of input positions.
Since the inputs $\alpha$ are sorted, it suffices to show that some
element of $\alpha'=\mu'\cap\alpha$ is to the left of an element of
$\alpha''=\mu''\cap\alpha$.
(We may suppose $\alpha'\ne\alpha''$.)
This follows from two observations:
\Item{(a)}
Since there are no optional comparators in the first layer,
$|\alpha'|\ge2$ and $|\alpha''|\ge2$.
\Item{(b)}
If $\alpha'$ and $\alpha''$ are the positions in $\alpha$ associated
with two merging nets on the $j$th layer, then $\{\alpha',\alpha''\}$
is an alternating partition.
We now prove this.
\itemskip

Let $k>j$ be the layer at which the inputs in positions $\mu'$ and
$\mu''$ first enter the same correction subnet.
It suffices to look at the first $k$ layers and induct on $k-j$.
For $k-j=1$, the result follows from Theorem~\mainThm(b).

We now suppose the result is true for $k-j=m$ and prove it for
$k-j=m+1$.
In level $j+1$, $\alpha'$ is associated with some correction subnet
$\hat\mu'$.
Let $\hat\alpha'$ be the subset of $\alpha$ associated with that
correction subnet and define $\hat\alpha''$ similarly associated with
a correction subnet $\hat\mu''$.
By induction, $\{\hat\alpha',\hat\alpha''\}$ is alternating.
By Theorem~\mainThm(b),
$\hat\alpha'\vdash\{\alpha',\overline\alpha'\}$ and
$\hat\alpha''\vdash\{\alpha'',\overline\alpha''\}$ are alternating
partitions.
The result follows.\qed



\section Proof of Theorem \altThm

Note that Theorem~\altThm\ is not recursive---it deals only with the
correction subnet after both $\sigma^{(1)}$ and $\sigma^{(2)}$ are
sorted.
Since (i) and (ii) have very similar proofs, we will prove only (i).
We may assume that $1\in\sigma^{(1)}$.

We make use of the 0-1 principle, which states that a net correctly
merges or sorts if and only if it does so when its inputs are
restricted to $\{0,1\}$.
See [\Knuthref,~p.\thinspace224] for a proof.
If $\tau$ is a set, let $\tau_i$ be the $i$th smallest element in
$\tau$ and let $\tau(k)$ be the set of elements of $\tau$ which do not
exceed $k$.
Note that $|\tau(\tau_k)|=k$.
When we say ``$\tau$ contains $m$ zeroes'', we mean ``the number of
zeroes in a specified input that are in positions indexed by $\tau$
equals $m$''.
Note that, for (i) in the theorem,
$$
\vcenter{\tabskip=0pt
\halign{\hfill$\Bigl(#\Bigr)\Leftrightarrow\Bigl($&$#\Bigr)$.\hfill\cr 
\hbox{$|\alpha(k)|$ is odd} & \max(\alpha(k))\in\alpha^{(1)}\cr
\hbox{$|\beta (k)|$ is odd} & \max(\beta (k))\in\beta ^{(1)}\cr
}}
\eqlabel\parityeq
$$

We distinguish four cases based on the parity of the number of zeroes
in $\alpha$ and in $\beta$.

\betweenskip

\noindent
{\bf Both Parities}.\quad
We may assume that $\alpha$ contains $2i$ zeroes and $\beta$ contains
$2j+1$.
It follows that $\sigma^{(1)}$ contains $i+j+1$ zeroes and
$\sigma^{(2)}$ contains $i+j$.
Let $k=2i+2j+1$.
Since $|\alpha(k)|+|\beta(k)|=k$, exactly one of $|\alpha(k)|$ and
$|\beta(k)|$ is even, say $|\alpha(k)|$.
Hence $|\alpha^{(1)}(k)|=|\alpha^{(2)}(k)|$ and
$|\beta^{(1)}(k)|=|\beta^{(2)}(k)|+1$.
Thus $|\sigma^{(1)}(k)|=|\sigma^{(2)}(k)|+1$ and so
$$
\vcenter{\halign{$#$~~&#\hfill\cr
|\sigma^{(1)}(k)|=i+j+1,
& the number of zeroes in $\sigma^{(1)}$;\cr
|\sigma^{(2)}(k)|=i+j,
& the number of zeroes in $\sigma^{(2)}$.\cr}}
$$
It follows that, after sorting $\sigma^{(1)}$ and $\sigma^{(2)}$ we
have an element of $0^k1^*$.

\betweenskip

\noindent
{\bf Both Even}.\quad
Let $\alpha$ contain $2i$ zeroes and $\beta$ contain $2j$.
Let $k=2i+2j$.
Note that $\sigma^{(1)}$ and $\sigma^{(2)}$ each contain $k/2$ zeroes.
If $|\alpha(k)|$ and $|\beta(k)|$ are even, the reasoning is similar
to the previous case.
Otherwise, both $|\alpha(k)|$ and $|\beta(k)|$ are odd.
Thus $|\sigma^{(1)}(k)|$ exceeds $|\sigma^{(2)}(k)|$ by 2; that is,
$|\sigma^{(1)}(k)| = i+j+1$ and $|\sigma^{(2)}(k)| = i+j-1$.
By \parityeq, $k\in\sigma^{(1)}$ and so $k+1\in\sigma^{(2)}$.
It follows that after sorting $\sigma^{(1)}$ and $\sigma^{(2)}$, we
have an element of $0^{k-1}101^*$ and so $\comp[k:k+1]$ is needed.

\betweenskip

\noindent
{\bf Both Odd}.\quad
Let $\alpha$ contain $2i+1$ zeroes and $\beta$ contain $2j-1$.
Let $k=2i+2j$.
Note that $\sigma^{(1)}$ contains $k+1$ zeroes and $\sigma^{(2)}$
contains $k-1$.
If $|\alpha(k)|$ and $|\beta(k)|$ are both odd, the reasoning is
similar to the first case.
Otherwise, both $|\alpha(k)|$ and $|\beta(k)|$ are even.
Hence $|\sigma^{(1)}(k)|=|\sigma^{(2)}(k)|=k/2$ and, by \parityeq,
$k\in\sigma^{(2)}$ and $k+1\in\sigma^{(1)}$
It follows that after sorting $\sigma^{(1)}$ and $\sigma^{(2)}$, we
have an element of $0^{k-1}101^*$ and so $\comp[k:k+1]$ is needed.

\betweenskip

In both of the last two cases, $k$ is even and hence any essential
comparator must be of the form $\comp[2l:2l+1]$.
We now show that all of these are essential.
If $|\alpha(2l)|$ and $|\beta(2l)|$ are even, let $\alpha$ and
$\beta$ each contain an odd number of zeroes for a total of $2l$.
Use the ``both odd'' case to see that $\comp[2l:2l+1]$ is essential.
If $|\alpha(2l)|$ and $|\beta(2l)|$ are odd, let $\alpha$ and
$\beta$ each contain an even number of zeroes for a total of $2l$.
Use the ``both even'' case to see that $\comp[2l:2l+1]$ is
essential.

We remark that for case (ii), comparators are needed in the ``both
parities'' case and are not needed in the other two cases.\qed



\section Proof of Theorem \nonaltThm

As in the proof of Theorem \altThm, we use the 0-1 principle to
limit our attention to 0-1 inputs and note that Theorem~\nonaltThm\ is
not recursive.

To avoid excessive superscripting, we will associate 
$a$~with $\alpha^{(1)}$, $A$~with $\alpha^{(2)}$, 
$b$~with $\beta^{(1)}$, and $B$~with $\beta^{(2)}$.
Furthermore, we usually drop subscripts and punctuation and abuse
equality when referring to sequences so that we are dealing with words
in $\M=\{a,b,A,B\}^*$.
For example, $\sigma=Abab$ denotes the fact that
$\sigma=\{\sigma_1,\sigma_2,\sigma_3,\sigma_4\}$, where
$\sigma_i<\sigma_{i+1}$,  has been partitioned into
$\alpha=\{\sigma_1,\sigma_3\}$ and $\beta=\{\sigma_2,\sigma_4\}$.
Furthermore, $\alpha^{(1)}=\{\sigma_3\}$, $\alpha^{(2)}=\{\sigma_2\}$,
$\beta^{(1)}=\beta$, and $\beta^{(2)}$ is empty.
Note that the $abAB$ notation conveys all the information needed to
study one level of the recursive construction:
The recursion will sort the lower case positions and will sort the
upper case positions.
The correction subnet will merge these two sorted sequences, using the
fact that $\alpha$ (the sequence in $\{a,A\}^*$) and $\beta$ (the
sequence in $\{b,B\}^*$) were originally sorted.
\Bitem
Let $\M_k$ be the set of all $x\in\M$ for which the least possible
delay in the correction subnet is $k$.
\Bitem
The notation
$$
s\to x\to t
\quad\hbox{with $x\in\M$ and  $s,t\in\{0,1\}^*$}
$$
means that, if $s$ is sorted on $\alpha$ and $\beta$, then $t$ is what
we obtain {\it after\/} sorting on $\sigma^{(1)}$ and on
$\sigma^{(2)}$, but {\it before\/} the correction subnet.
Thus we think of $x$ as describing a network that sorts lower case
letter positions and sorts upper case letter positions, and we think
of the arrows as indicating data flowing into and out of the network.
\itemskip

To prove Theorem \nonaltThm, we require four lemmas.

\lemma\LemmaOne
If $x=x_1\cdots x_n\in\M_k$, then the following are in $\M_k$ and
conversely:
\item{(a)}
the word obtained from $x$ by changing all $\alpha$'s to $\beta$'s and
conversely, preserving case;
\item{(b)}
the word obtained from $x$ by changing the case of all letters;
\item{(c)}
the words obtained from $x$ by allowing $a$ and $b$ to commute and
allowing $A$ and $B$ to commute.
\item{(d)}
the word $x_n\cdots x_1$ obtained by reversing $x$;
\endthing

\proof
The operations in (a) and (b) do not affect the initial bimonotone
requirements on the $\alpha$'s and on the $\beta$'s, nor do they
affect the monotone rearrangement of the upper case and lower case
sequences.
Interchanging an adjacent $a$,$b$ pair leads to the same sequence
after a monotone rearrangement on the lower case letters.
Since the same is true for upper case letters, (c) is proved.
The reversal of all sequences together with the interchange of 0 and 1
leaves all statements valid and so (d) is proved.\qed


\lemma\LemmaTwo
If $x,y,z\in\M$ and $xyz\in\M_1$, then $x,y,z\in\M_0\cup\M_1$.
\Endthing

\noindent
We omit the simple proof of this lemma.
The following lemma will be proved in the next section.

\lemma\LemmaThree
A correction subnet is minimal if removing any of the comparators
destroys the merging property of the net.
If $x\in\M_1$, then all comparators in a minimal, delay 1 correction
subnet are of the form $\comp[i:i+1]$; i.e., they are adjacent.
\Endthing

\lemma\LemmaFour
The following sequences are not in $\M_1$:
\item{(a)}
elements of $b\{a,A\}^*Aa^*b$,
\item{(b)}
elements of $A\{b,B\}^*b\{a,A\}^*b$.
\endthing

\proof
For (a) we consider two functions.
The first function is 0 everywhere except at the $Aa^*$; that is,
$$
0\{0,0\}^*11^*0  \to  b\{a,A\}^*Aa^*b  \to  0\{0,0\}^*101^*.
$$
Thus a comparator is needed between the rightmost $A$ and the
following position.
The second function is
$$
1\{0,0\}^*01^*1  \to  b\{a,A\}^*Aa^*b  \to  \{0,0\}^*10^+1^*1,
$$
where $0^+$ corresponds to the rightmost block of $A$'s.
Thus a comparator is needed between the leftmost $A$ in that block and
the preceding position.
This shows that $b\{a,A\}^*Aa^*b\not\in\M_1$.

By (a), we can assume that (b) has the form $A\{b,B\}^*ba^*b$.
By Lemmas~\LemmaOne(c) and~\LemmaTwo, it suffices to show that
$A\{b,B\}^*b^2\not\in\M_1$.
Consider the function which is 1 at $A$ and 0 elsewhere.
Then
$$
10^*0^2  \to  A\{b,B\}^*b^2   \to  0^*10^*0^2,
$$
where the final 1 is in the position of the rightmost capital letter.
This sequence requires a nonadjacent comparator and so is not in
$\M_1$ by Lemma~\LemmaThree.\qed

\betweenskip

The rest of this section is devoted to proving the following
strengthened version of Theorem \nonaltThm.

\theorem{\SnonaltThm}
We say the sequence $x$ meets a set $\cal S$ $k$ times if $k$ elements of
$x$ lie in $\cal S$.
Suppose $x\in\M$ meets each of the sets $\{a,A\}$, $\{b,B\}$, $\{a,b\}$,
and $\{A,B\}$ at least twice (the ``meet conditions'') and is not
bialternating, then $x\not\in\M_1\cup\M_0$.
\Endthing

\noindent
To see that this is stronger than Theorem \nonaltThm, note that
$x=aaBB$ satisfies the meet conditions but the partitions of $\alpha$
and $\beta$ are both trivial.

\Proof{Theorem \SnonaltThm}
We induct on $|x|$.
Let $x=x_1\cdots x_{n+1}$.
By Lemma~\LemmaOne, we may assume that $x_{n+1}=b$.
Let $x'=x_1\cdots x_n$.
The meet condition tells us that 
$$
\hbox{$x'$ contains at least two $\alpha$'s.}
\eqlabel\xprimealpha
$$
By Lemma~\LemmaTwo, and the induction hypothesis, we are done if $x'$
satisfies the meet conditions and is not bialternating.
Thus, it suffices to show that the following three situations are
impossible when $x$ is in $\M_1$, satisfies the meet conditions, and
is not bialternating.
\item{(i)}
$x'$ is bialternating.
\item{(ii)}
$x'$ is not bialternating, contains $b$, and does not satisfy the meet
conditions.
\item{(iii)}
$x'$ is not bialternating, does not contain $b$, and does not satisfy
the meet conditions.

\betweenskip

\noindent{\bf Case (i)}.\quad
Since $x'$ is bialternating, \xprimealpha\ tells us that it contains
both $a$ and $A$.
By Lemma~\LemmaFour(a), $x$ must end in either $bab$ or $bb$.
By Lemma~\LemmaOne(c), we can replace $bab$ with $abb$, which does not
destroy the bialternating nature of $x'$.
It follows from the fact that $x'$ is bialternating and
Lemma~\LemmaFour(b) that $x$ ends with either $Cabb$ or $cBbb$ where
$c$ (resp.~$C$) stands for some lower (resp. upper) case letter.
Set all inputs to 0 except for the last $A$ and the following $a$, if
any.
For $Cabb$, we have
$$
0^*10^*100  \to  \cdots Cabb  \to  0^*1001,
$$
which requires the nonadjacent comparator $\comp[n-2:n]$.
By Lemma~\LemmaThree, $x\not\in\M_1$.
For $cBbb$, we have
$$
\left.\matrix{0^*10^*0^2 \cr 0^*10^*10^*0^2\cr}\right\}
\to  \cdots cBbb  \to
\cases{
  0^*100, &if the rightmost $\alpha$ is $A$;\cr
  0^*101, &if the rightmost $\alpha$ is $a$;\cr}
$$
The former requires the nonadjacent comparator $\comp[n-1:n+1]$ and
so is covered by Lemma~\LemmaThree.
The latter requires $\comp[n-1:n]$. 
In this case, set all inputs except $c$ and the last $bb$ to 0:
$$
0^*1011  \to x  \to  0^*1011,
$$
which requires $\comp[n-2:n-1]$.
Since both $\comp[n-2:n-1]$ and $\comp[n-1:n]$ are required, the
delay is at least two and so $x\not\in\M_1$.
This completes case~(i).

\betweenskip

\noindent{\bf Case (ii)}.\quad
There are three possible cases:
\item{(a)}
$x'$ contains $A$ but not $a$.
\item{(b)}
$x'$ contains $A$ and $a$.
\item{(c)}
$x'$ contains $a$ but not $A$.

Suppose (a) holds.
Since $a$ is not present and $x$ satisfies the meet conditions, it
contains at least two $A$'s.
Set all $A$'s to 1 and all $\beta$'s to 0.
After passing through the net $x$, the rightmost position will be 0
and there will be at least two 1's to the left of it.
Apply Lemma~\LemmaThree.

Suppose (b) holds.
Since $x'$ does not satsify the meet conditions, it cannot contain
$B$.
By Lemma~\LemmaFour(a), it follows that no $A$ can lie between two
$b$'s. 
Starting at the rightmost $A$, we have $x=\cdots Aa^*ba^*b\cdots$.
By Lemma~\LemmaOne(c), we can rearrange $Aa^*ba^*b$ as $Aba^*b$, which
is not in $\M_1$ by Lemma~\LemmaFour(b).

Suppose (c) holds.
By the meet condition for $x$, $x'$ contains $a$ and $B$.
If $x'$ contained more than one $a$, we could apply Lemma~\LemmaOne(a)
and Lemma~\LemmaFour\ to conclude that $x'\not\in\M_1$.
Therefore $x'$ contains at exactly one $a$.
Since $x_n=b$, $x$ meets $\{a,A\}$ only once, a contradiction.

\betweenskip

\noindent{\bf Case (iii)}.\quad
Let $\pi(y)$ stand for some permutation of the letters of the word
$y$.
Since $b\not\in x'$, $x=\pi(a^*A^*B^*)b$.
The meet conditions on $x$ show that this is in fact either
$\pi(aa^+BB^+)b$ or $\pi(a^+A^+B^+)b$.
The former cannot occur since then $x'$ would satisfy the meet
conditions.
Thus it remains to consider $x=\pi(a^+A^+B^+)b$.
Note that $x_1\ne b$.

Consider the word $y=x_{n+1}\cdots x_1$.
By Lemma~\LemmaOne(d) and the previous parts of the proof, especially
the last two sentences in the previous paragraph, we may assume that
$y$ is one of 
$$
\pi(A^+b^+B^+)a,\quad
\pi(a^+b^+B^+)A,\quad
\pi(a^+A^+b^+)B.
$$
Combining this with the conclusion of the previous paragraph, it
follows that we may assume that $x$ is one of
$$
x^{(1)} = a\,\pi(A^+B^+)b,\quad
x^{(2)} = A\,\pi(a^+B^+)b,\quad
x^{(3)} = B\,\pi(a^+A^+)b.
$$
Recall that $x$ is not bialternating.
By Lemma~\LemmaOne, we can assume that $x^{(1)}$ and $x^{(2)}$ contain
at least two $B$'s and that $x^{(3)}$ contains two adjacent $a$'s.
To eliminate $x^{(1)}$, apply Lemma~\LemmaThree\ to
$$
1\pi(1^+0^21^*)1  \to  a\pi(A^+B^2B^*)b  \to  10^21^*.
$$
To eliminate $x^{(2)}$, apply Lemma~\LemmaOne(b), possibly
Lemma~\LemmaOne(d), and lastly Lemma~\LemmaFour\ to
$\pi(a^+B^2B^*)$.
To eliminate $x^{(3)}$, apply Lemmas~\LemmaOne(a)
and~\LemmaFour(b).\qed



\section Proof of Lemma \LemmaThree

Suppose that the correction subnet has delay 1 and contains a
nonadjacent comparator.
We will show that this leads to a contradiction.
Among all inputs from $\{0,1\}^n$ that require a nonadjacent comparator
let $s$ be one containing the maximum number of ones.
Let $t$ be the result before the correction subnet, that is 
$s\to x\to t$.
There are (0,1)-sequences $p$, $q$, and $r$ with $q$ nonempty such
that $t=p1q0r$ is the result of passing $s$ through all of the net
except the correction subnet.
The {\it leftmost\/} nonadjacent comparator that is needed,
$\comp[i:j]$, switches the 0 and 1 in $p1q0r$.
Without loss of generality, we may assume that $i\in\sigma^{(1)}$ and 
$j\in\sigma^{(2)}$.
We now prove a sequence of facts
\Item{(a)}
{\thmfont We have $p=0^*$}.
If this were false, there would be a 1 in $p$ that would require
a nonadjacent comparator.
\Item{(b)}
{\thmfont The last 0, if any, in a position indexed by $\alpha$
(resp.\ $\beta$) is in a position indexed by $\sigma^{(2)}$}.
If this were false, we could change that 0 to a 1 without affecting
$t_i$ or $t_j$.
\Item{(c)}
{\thmfont Any zeroes in $t$ after the $i$th position must be indexed
by $\sigma^{(2)}$}.
Since $t_i=1$ is indexed by $\sigma^{(1)}$ this follows from the fact
that the positions indexed by $\sigma^{(i)}$ are sorted before the
correction subnet.
\Item{(d)}
{\thmfont We have $r=1^*$}.
Changing the last 0 in $s$ to a 1 will not alter $t_i$ and, if $r$
contains a 0, will also not alter $t_j$ by (c).
\Item{(e)}
{\thmfont We have $q=1^+$}.
If we change the last 0 in $s$ to a 1, it follows from (b) and (d)
that $t$ is changed to $p1q1r$.
If $q$ contained a 0, a comparator other than $\comp[i:j]$ would be
needed to move $t_i$. 
This contradicts the delay 1 assumption.
\itemskip\noindent
At this point we have shown that $t=0^*11^+01^*$.
\Item{(f)}
{\thmfont The first 1, if any, in a position indexed by $\alpha$
(resp.\ $\beta$) is in a position indexed by $\sigma^{(2)}$}.
If not, change that 1 to a 0.
Since $p=0^*$, this changes $t$ to $0^*01^+01^*$ and so a comparator
other than $\comp[i:j]$ is needed to move $t_j$, contradicting delay~1.
\Item{(g)}
{\thmfont We may divide the total number of input ones between
positions indexed by $\alpha$ and $\beta$ in any manner without
altering $t$}.
By (b) and (f), increasing the number of ones by 1 in one subsequence
and decreasing in by 1 in the other will not change the number of ones
in $\sigma^{(1)}$ and $\sigma^{(2)}$.
Hence $t$ will be unchanged.
\itemskip

We are now ready to derive a contradiction.
Let $k$ be the total number of zeroes in $s$.
Let $l$ of the first $k$ positions be indexed by $\alpha$.
{From} (g), we can place $l$ zeroes in the positions indexed by $\alpha$
and the remaining in the positions indexed by $\beta$.
Then $s=0^k1^*$ and so it will not be rearranged by comparators.
Hence $t=0^k1^*$, contradicting the fact that $t=0^*11^+00^*$.\qed


\section Proof of Theorems \sortThm\ and \alladjThm

\Proof{Theorem \alladjThm}
Think of marking as meaning that the subnets being merged at that
point must contain adjacent comparators.
Consider the lines associated with a marked vertex.
Since there are no comparators between $\sigma^{(1)}$ and
$\sigma^{(2)}$ above the correction subnet, pairs of lines requiring
comparators must belong to the same block of the partition.
The agreement of label and type is precisely what is needed to
guarantee that a pair of adjacent lines needing a comparator occur in
the same block.
This explains (a) in the theorem.
Since the final correction layer contains only about half of the
needed adjacent comparators, condition~(b) starts the adjacent
comparator requirement.
Condition~(c) insures that the requirement is propagated when the
correction layer of a son does not contain the required adjacent
comparators.\qed



\Proof{Theorem \sortThm}
The method of proof is essentially that same as that in [\CanWref] and
[\DowPRSref].

Recall that we defined a {\it distance-$t$ subsequence\/} of
$s_1,s_2,\ldots$ to be a maximal subsequence whose successive indices
differ by $t$.
We call a sequence {\it $t$-sorted\/} if all its distance-$t$
subsequences are sorted.
We will show the following:
\Item{I}
For $0\le i\le k-2$, if $\alpha$ and $\beta$ are $2^{i+1}$-sorted, then,
after passing through the first $k-i$ layers, they are
$2^i$-sorted.
\Item{II}
If the $\alpha$ and $\beta$ sequences feeding into layer $j$ are
$2^i$-sorted and $j>k-i$, then so are the outputs.
\itemskip
\noindent
We now show how to complete the proof given I and II.
Suppose $\alpha$ and $\beta$ are $2^{i+1}$-sorted.
By I, they are each $2^i$-sorted after passing through the first
$k-i$ layers.
By II, this property is preserved by passing through the remaining
layers.
Hence, if the input $\alpha$ and $\beta$ sequences are
$2^{i+1}$-sorted, the output $\alpha$ and $\beta$ sequences are
$2^i$-sorted.
Since $|\alpha|=|\beta|=2^{k-1}$, both sequences are trivially
$2^{k-1}$-sorted and so, by the previous sentence they are $2^0$-sorted
after $k-1$ passes through the net.
Since a 1-sorted sequence is sorted and the net is a merging net, one
more pass completes the sort.
This proves the theorem subject to verifying I and II.

The following observation will prove useful later.
Let $\hat\alpha$ and $\hat\beta$ be any distance-$t$ subsequences of
$\alpha$ and $\beta$ for any $t$.
Together, they are a subsequence $\hat\sigma$ of $\sigma$.
Since the hypothesis of the theorem states that
$\sigma\vdash\{\alpha,\beta\}$ is alternating, it follows that
$$
\hbox{
$\hat\sigma\vdash\{\hat\alpha,\hat\beta\}$ is alternating.}
\eqlabel\altsubsequence
$$

We now prove I.
{From} Corollary~\powertwoCor(b), the first $k-i-1$ layers consist of
$2^{i+1}$ merging nets whose inputs are distance-$2^{i+1}$ subsequences of
$\alpha$ and $\beta$.
Since $\alpha$ and $\beta$ are assumed to be $2^{i+1}$-sorted, each of
the $2^{i+1}$ merging nets has as input a sorted subsequence of
$\alpha$ and a sorted subsequence of $\beta$, and so the output of
each of these merging nets is sorted.
By Corollary~\powertwoCor(b) again, the $(k-i)$th layer consists of
$2^i$ correction subnets.
Limit attention to one of these correction subnets.
Use $\{c,C\}$ and $\{d,D\}$ to stand for the $\{a,A\}$ and $\{b,B\}$
subsequences in some order, with upper (resp.~lower) cases
corresponding to upper (resp.~lower) cases; i.e., letters correspond
to letters and cases to cases.
Note that:
\Item{(a)}
By the preceding discussion, the inputs to the correction subnet from
$cd$ are sorted as are the inputs from $CD$.
\Item{(b)}
By Corollary~\powertwoCor(b), $c$, $C$, $d$, and $D$ are
distance-$2^{i+1}$ subsequences of $\alpha$ and~$\beta$.
\Item{(c)}
By Corollary~\powertwoCor(b), the $cC$ and $dD$ sequences are
distance-$2^{i}$ subsequences of $\alpha$ and $\beta$.
\itemskip\noindent
Since the meaning of $c$ and $d$ and the assignment of upper and lower
case can each be changed, it follows from \altsubsequence, (b), and
(c) that the only possible choices for the interleaving of the four
subseqences are
$$
{\rm(i)}~c_1,d_1,C_1,D_1,c_2,d_2,C_2,\ldots,D_l
~~{\rm and}~~
{\rm (ii)}~c_1,D_1,C_1,d_1,c_2,D_2,C_2,\ldots,d_l,
$$
where $l=2^{k-i-2}$.
Note that (i) and (ii) of Theorem~\mainThm(c) apply to subsequences
(i) and (ii), respectively.

We treat (i) and leave (ii) to the reader since it is similar.
Let a sequence letter stand for the input to the $(k-i)$th layer and a
primed letter stand for the output.
By (a) above, $c_i\le d_i\le c_{i+1}$ and $C_i\le D_i\le C_{i+1}$.
We have $C_i'=\max(d_i,C_i)$ and $c_{i+1}'=\max(D_i,c_{i+1})$.
With or without the optional comparator, $c_1'\le c_1$.
Hence
$$
\vbox{\halign{\hfill$#\phantom{\Bigl(}$\hfill\cr
c_1'\le d_1\le\max(d_1,C_1)=C_1',\cr
C_i'=\max(d_i,C_i)\le\max(D_i,c_{i+1})=c_{i+1}',\cr
c_i'=\max(D_{i-1},c_i)\le\max(d_i,C_i)= C_i'.\cr}}
$$
Thus the $cC$ sequence is sorted.
Similarly, one obtains
$$
\vbox{\halign{\hfill$#\phantom{\Bigl(}$\hfill\cr
D_l'\ge C_l\ge\min(d_l,C_l)=d_l',\cr
d_i'=\min(d_i,C_i)\le\min(D_i,c_{i+1})=D_i',\cr
D_{i-1}'=\min(D_{i-1},c_i)\le\min(d_i,C_i)= d_i',\cr}}
$$
which shows that the $dD$ sequence is sorted.
Since each of these are distance-$2^i$ subsequences, we have proved I.

We now prove II.
Consider that part of layer $j$ corresponding to a particular
correction subnet.

Assume that the correction subnet is as in Theorem~\mainThm(ii).
The layer's input consists of distance-$2^{k-j}$ subsequences
$\hat\alpha$ and $\hat\beta$ of $\alpha$ and $\beta$.
{From} \altsubsequence, the partition
$\hat\sigma\vdash\{\hat\alpha,\hat\beta\}$ is alternating.
{From} Theorem~\mainThm(c), either all left inputs to the comparators
are $\alpha$'s or all inputs to the comparators are $\beta$'s.
Now look at one of the distance-$2^i$ subsequences of $\alpha$ that is
a subequence of $\hat\alpha$.
Call it $\tilde\alpha$.
Note that
$$
\hbox{$\tilde\alpha$ is  a distance-$2^{i-(k-j)}$ subsequence of
$\hat\alpha$.}
\eqlabel\tildesubsequence
$$
Now look at the $\hat\beta$ subsequence $\tilde\beta$ that shares
comparators with $\tilde\alpha$.
It follows from \altsubsequence\ and \tildesubsequence\ that
$\tilde\beta$ is a distance-$2^{i-(k-j)}$ subsequence of $\hat\beta$
and hence a distance-$2^i$ subsequence of $\beta$.
By assumption, $\tilde\alpha$ and $\tilde\beta$ are sorted, and they
are of the same length.
Since the sequence formed by the minima (resp.\ maxima) of
corresponding positions of sorted sequences is also sorted, we are
done.

Now assume that Theorem~\mainThm(i) applies and use the notation and
argument of the preceding paragraph.
We are done except for dealing with the first and last elements of
$\hat\sigma$, neither of which is the input of an adjacent comparator.
We may assume that the first element of $\hat\sigma$ is in $\alpha$.
In this case, the right inputs of the adjacent comparators are always
$\hat\alpha$'s.
Suppose the first element of $\hat\alpha$ is in $\tilde\alpha$.
Use primes to denote the ouput of the correction layer and lack of
primes to denote its input.
With or without the optional comparator,
$\tilde\alpha_1'\le\tilde\alpha_2$.
Since $\tilde\alpha'_i=\max(\tilde\beta_{i-1},\tilde\alpha_i)$ for $i>1$, it
follows that $\tilde\alpha_1'\le\tilde\alpha_2'$.
Hence $\tilde\alpha'$ is sorted.
A similar argument applies to the distance-$2^i$ $\beta$ subsequence
containing the last element of $\hat\sigma$.
This completes the proof of II and hence of Theorem~\sortThm.\qed




\nonumsection References

\frenchspacing

\item{[\Aignerref]}
M. Aigner, Parallel complexity of sorting problems,
{\it J. Algorithms\/} {\bf 3} (1982) 79--88.
V. Grinberg has given an alternate proof.
See Knuth, {\it The Art of Computer Programming}, Volume~3 errata,
solution to Ex.~46 of Sec.~5.3.4 (text page 641).
See
{\tt http://www-cs-faculty.stanford.edu/$\sim$knuth/taocp.html}.

\item{[\AjtKSref]}
M. Ajtai, J. Komlos, and E. Szemeredi, Sorting in $c\log n$ parallel steps,
{\it Combinatorica\/} {\bf 3} (1983) 1--19.

\item{[\CanWref]}
E. R. Canfield and S. G. Williamson, A sequential sorting network
analogous to the Batcher merge, {\it Linear and Multilinear Algebra\/}
{\bf 29} (1991) 43--51.

\item{[\deBref]}
N. G. de Bruijn, Sorting by means of swapping, {\it Discrete. Math.}
{\bf 9} (1974) 333-339.

\item{[\DowPRSref]}
M. Dowd, Y. Perl, L. Rudolph, and M. Saks, The periodic balanced
sorting network, {\it J. Assoc. Comput. Mach.} {\bf 36} (1989)
738--757.

\item{[\HongSref]}
Z. Hong [H. Zhu] and R. Sedgewick, Notes on merging networks, {\it Proc. 14th 
Annual ACM Symposium on the Theory of Computing}, ACM, New York (1982)
296--302.

\item{[\KamBWref]}
T. E. Kammeyer, R. K. Belew, and S. G. Williamson, Evolving
compare-exchange networks using grammars, {\it Artificial Life\/}
{\bf 2} (1995) 199-237.

\item{[\Knuthref]}
D. E. Knuth, {\it Sorting and Searching}, volume 3 of {\it The Art of
Computer Programming}, Addison-Wesley, Reading, MA (1973).

\item{[\MilPTref]}
P. B. Miltersen, M. Paterson, and J. Tarui, The asymptotic complexity
of merging networks, {\it J. Assoc. Comput. Mach.} {\bf 43} (1996)
147--165.

\bye
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%



