\documentstyle[12pt,sched-header,eepic]{article}
\Scribe{Keren Bendel}
\Lecturer{Yuval Rabani}
\LectureNumber{13}
\LectureDate{22 June 2000}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% General Macros
\newtheorem{theorem}{Theorem}[section]
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{property}[theorem]{Property}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{definition}{Definition}[section]
\newtheorem{remark}{Remark}[section]
\newtheorem{example}{Example}[section]
\def\eod{\vrule height 6pt width 5pt depth 0pt}
\newenvironment{proof}{\noindent {\bf Proof:} \hspace{.4em}}
{\hspace*{\fill}{\eod}}
\newcommand{\ceil}[1]{\lceil #1 \rceil }
\begin{document}
\MakeScribeTop
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{center}
\bf \huge Convex quadratic programming
\end{center}
\section{Introduction}
We study the following scheduling problem: $R|| \sum w_jC_j$. The goal is
to minimize the total weighted completion time of $n$ jobs on $m$
unrelated parallel machines.
%A set $J$ of $n$ jobs
%has to be scheduled on $m$ unrelated parallel processors or machines.
%Together with each job $j$ we are given its positive processing
%requirement which also depends on the machine $i$ job $j$ will be
%processed on and is therefore denoted by $p_{ij}$. Each job $j$ must be
%processed for the respective amount of time without interruption on one of
%the $m$ machines.
Note that for a given job $j$ it may happen that $p_{ij}= \infty$
for some (but not all) machines $i$, in which case job $j$ cannot be
scheduled on those machines.
%Every machine can process at most one job at a time. We
%denote the completion time of job $j$ by $C_j$. The goal is to minimize
%the total weighted completion time: a weight $w_j \geq 0$ is associated
%with each job $j$ and we seek to minimize \(\sum_{j \in J} w_j C_j\).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Throughout the notes we will use the following convention: The value of
an optimum solution to the scheduling problem under consideration is
denoted by $Z^*$. For a relaxation $(R)$ we denote the optimum value of
$(R)$ by $Z^*_R$ and the value of an arbitrary feasible solution $a$ to
$(R)$ by $Z_R(a)$.\\
{\bf Observation:} The single-machine problem $1||\sum w_jC_j$
can be solved optimally by applying Smith's ratio rule \cite{S56}:
schedule the jobs in order of nonincreasing ratios $w_j / p_j$.
Consequently, once an assignment of jobs to machines is fixed, an optimum
schedule for $R||\sum w_jC_j$ is obtained by applying Smith's rule on each
machine; the problem thus reduces to an assignment problem of jobs to
machines. Throughout the notes
we will use the following convention: Whenever we apply Smith's ratio rule
on machine $i$ and $w_k /p_{ik} = w_j / p_{ij}$ for a pair of jobs $j,k$,
the job with the smaller index is scheduled first. \\
To simplify notation, we introduce for each machine $i$ a corresponding
total order $(J,\prec_i)$ on the set of jobs by setting $j\prec_i k$ if
either $w_k / p_{ik} > w_j / p_{ij}$ or $w_k /p_{ik} = w_j / p_{ij}$ and
$j < k$.
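Smith's ratio rule from the observation above can be sketched in a few
lines of Python (an illustrative sketch, not part of the original notes;
the job data and the name \texttt{smith\_schedule} are invented):

```python
# Smith's ratio rule on a single machine: schedule jobs in
# nonincreasing order of w_j / p_j, breaking ties by smaller index.
def smith_schedule(jobs):
    """jobs: list of (w_j, p_j) pairs.
    Returns (processing order, total weighted completion time)."""
    order = sorted(range(len(jobs)),
                   key=lambda j: (-jobs[j][0] / jobs[j][1], j))
    t = total = 0
    for j in order:
        w, p = jobs[j]
        t += p            # completion time C_j of job j
        total += w * t    # accumulate w_j * C_j
    return order, total

# Three jobs with ratios 3, 1, 2: the rule schedules them as 0, 2, 1,
# with total weighted completion time 3*1 + 2*2 + 1*3 = 10.
order, value = smith_schedule([(3, 1), (1, 1), (2, 1)])
```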
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{An Integer quadratic programming formulation}
We introduce for each machine $i = 1,\ldots,m$ and each job $j\in J$ a
binary variable $a_{ij}\in\{0,1\}$ which is 1 if and only if job $j$
is being processed on machine $i$. Minimizing the average weighted
completion time can be expressed using quadratic conditions and leads to
the following integer quadratic program $(IQP)$:
\begin{eqnarray}
\mbox{minimize} & \displaystyle \sum_{j\in J}w_jC_j \hspace*{4.1cm}
&\nonumber \\
\mbox{subject to} & \displaystyle \sum_{i=1}^m a_{ij} = 1
\hspace*{3.8cm}
&\mbox{for all $j$} \label{eq:a}\\
& C_j =\displaystyle \sum_{i=1}^ma_{ij}\cdot(p_{ij}
+ \sum_{k\prec_i j} a_{ik}\cdot p_{ik}) & \mbox{for all $j$}
\label{eq:Cj}\\
& a_{ij} \in \{0,1\} \hspace*{4cm}
&\mbox{for all $i,j$}
\label{eq:liniar}
\end{eqnarray}
Constraints (\ref{eq:a}) ensure that each job is assigned to exactly one
of the
$m$
machines. If a job $j$ is assigned to machine $i$, its completion time
is the sum of its own processing time $p_{ij}$ and the processing times of
the other jobs $k\prec_i j$ that are also scheduled on machine $i$, as
expressed by constraints (\ref{eq:Cj}). Notice that we could remove
constraints (\ref{eq:Cj}) and replace $C_j$ in the objective function by
the corresponding term on the right hand side of (\ref{eq:Cj}).\\
We denote by $(QP)$ the quadratic programming relaxation of $(IQP)$ that is
obtained by replacing the integrality conditions (\ref{eq:liniar}) by $a_{ij} \geq 0$.
A feasible solution $\bar{a}$ to (QP) can be turned into a feasible
solution to (IQP), i.e., a feasible schedule, by randomized rounding. Each
job $j$ is randomly assigned to machine $i$ with probability
$\bar{a}_{ij}$. We impose the condition that the
random choices are performed pairwise independently for the jobs and refer
to this rounding procedure as Algorithm RANDOMIZED ROUNDING.\\
Algorithm RANDOMIZED ROUNDING can easily be derandomized by the
method of conditional probabilities; see \cite{M95}. We refer to the
resulting deterministic algorithm as Algorithm DERANDOMIZED ROUNDING.
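The rounding step itself is straightforward to implement. The following
Python sketch is illustrative and assumes an invented data layout:
$\bar{a}$ given as an $m\times n$ list of per-machine rows.

```python
import random

def randomized_rounding(a_bar, m, n):
    """Assign each job j to machine i with probability a_bar[i][j].
    a_bar is assumed to satisfy sum_i a_bar[i][j] = 1 for every j."""
    assignment = []
    for j in range(n):
        r, cum = random.random(), 0.0
        machine = m - 1           # fallback guards against rounding error
        for i in range(m):
            cum += a_bar[i][j]
            if r < cum:
                machine = i
                break
        assignment.append(machine)
    return assignment

# With an integral a_bar the rounding is deterministic:
# job 0 goes to machine 0, job 1 to machine 1.
result = randomized_rounding([[1.0, 0.0], [0.0, 1.0]], 2, 2)
```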
\begin{theorem}
\label{TH:ZQP}
Given a feasible solution $\bar{a}$ to (QP), the expected value of the
schedule computed by Algorithm {\em RANDOMIZED ROUNDING} is equal
to
$Z_{QP}(\bar{a})$.
\end{theorem}
\begin{proof}
\begin{eqnarray*}
\mbox{E}[C_j] & = & \sum_{i=1}^m \mbox{Pr}[j\mapsto i]\cdot (p_{ij} +
\sum_{k\prec_i j}
\mbox{Pr}[k \mapsto i \,|\, j\mapsto i]\cdot p_{ik})\\
& = & \sum_{i=1}^m \bar{a}_{ij}(p_{ij} + \sum_{k\prec_i j}\bar{a}_{ik}p_{ik})\\
& = & C_j(\bar{a}),
\end{eqnarray*}
where $j' \mapsto i'$ denotes the event that job $j'$ is assigned to
machine $i'$. The second equality uses that, by pairwise independence,
$\mbox{Pr}[k \mapsto i \,|\, j \mapsto i] = \bar{a}_{ik}$. The claim now
follows by linearity of expectation.
\end{proof}
\begin{corollary}
\label{cor:1}
The optimal values of (IQP) and (QP) are equal. Moreover, given an optimum
solution $\bar{a}$ to (QP) one can construct an optimum solution to (IQP)
by assigning each job $j$ to an arbitrary machine $i$ with $\bar{a}_{ij}> 0$.
\end{corollary}
It follows from Corollary~\ref{cor:1} that it is still NP-hard to find an
optimum solution to the quadratic program (QP).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{A convex quadratic programming relaxation}
Plugging constraints (\ref{eq:Cj}) into the objective function, the
quadratic program (QP) can be rewritten as
\begin{eqnarray}
\mbox{minimize} & c^Ta+\frac{1}{2}a^TDa & \label{eq:cTa} \\
\mbox{subject to} & \displaystyle \sum_{i=1}^m a_{ij} = 1
\hspace*{0.6cm}
&\mbox{ for $j\in J$} \\
& a \geq 0 \hspace*{1.4cm} &
\end{eqnarray}
where $a \in R^{mn}$ denotes the vector consisting of all variables
$a_{ij}$, lexicographically ordered with respect to the natural order
$1,2,\ldots ,m$ of the machines and then, for each machine $i$, with the
jobs ordered according to $\prec_i$. The vector $c \in R^{mn}$ is given by
$c_{ij} = w_jp_{ij}$ and $D= (d_{(ij)(hk)})$ is a symmetric $mn \times
mn$-matrix given through
\[d_{(ij)(hk)} = \left\{ \begin{array}{ll}
0 & \mbox{if $i \neq h$ or $j=k$,} \\
w_jp_{ik} & \mbox{if $i=h$ and $k \prec_i j$,} \\
w_kp_{ij} & \mbox{if $i=h$ and $j \prec_i k$.}
\end{array}
\right. \]
Denote $x=Da$; then\\
\(x_{ij} = d_{(ij)}\cdot a = \displaystyle\sum_{h,k}d_{(ij)(hk)}a_{hk}
= \sum_k d_{(ij)(ik)} a_{ik} = \sum_{k \prec_i
j}w_jp_{ik}a_{ik} + \sum_{j\prec_i k}w_kp_{ij}a_{ik}. \)\\
Therefore we obtain\\
\(a^TDa = a^Tx = \displaystyle \sum_{ij}a_{ij}x_{ij} =
\sum_{ij}a_{ij}(\sum_{k \prec_i j}w_jp_{ik}a_{ik} + \sum_{j\prec_i
k}w_kp_{ij}a_{ik}) = 2\sum_{ij}\sum_{k \prec_i j}w_jp_{ik}a_{ij}a_{ik}.\) \\
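This chain of equalities can be sanity-checked numerically by comparing
the full quadratic form with the collapsed double sum on a small random
instance (an illustrative Python sketch; the instance data are arbitrary):

```python
import random

def precedes(i, j, k, w, p):
    """j precedes k on machine i: w_k/p_{ik} > w_j/p_{ij}, ties by index."""
    rj, rk = w[j] / p[i][j], w[k] / p[i][k]
    return rk > rj or (rk == rj and j < k)

random.seed(0)
m, n = 2, 4
w = [random.randint(1, 5) for _ in range(n)]
p = [[random.randint(1, 5) for _ in range(n)] for _ in range(m)]
a = [[random.random() for _ in range(n)] for _ in range(m)]

# Full quadratic form a^T D a with d_{(ij)(hk)} as defined above.
lhs = sum(a[i][j] * a[i][k]
          * (w[j] * p[i][k] if precedes(i, k, j, w, p) else w[k] * p[i][j])
          for i in range(m) for j in range(n) for k in range(n) if j != k)
# Collapsed form: 2 * sum_{i,j} sum_{k prec_i j} w_j p_{ik} a_{ij} a_{ik}.
rhs = 2 * sum(w[j] * p[i][k] * a[i][j] * a[i][k]
              for i in range(m) for j in range(n) for k in range(n)
              if precedes(i, k, j, w, p))
```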
A quadratic program of the form $\min\; c^Tx+x^TDx$ subject to $Ax=b$,
$x \geq 0$, can be solved in polynomial time if the objective function is
convex.\\
A function of the form $c^Tx+x^TDx$ where $D$ is symmetric is convex if
and only if $D$ is positive semidefinite, i.e., $x^TDx \geq 0$ for all $x$.\\
Because of the lexicographic order of the indices, the matrix $D$
decomposes into $m$ diagonal blocks $D_i$, $i=1,\ldots,m$, corresponding
to the $m$ machines:
\begin{equation}
\label{mat:D}
D = \left( \begin{array}{cccc}
D_1 & 0 & \ldots & 0 \\
0 & D_2 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & D_m
\end{array} \right) .
\end{equation}
If we assume that, on machine $i$, the jobs are indexed according to
$\prec_i$ and if we denote $p_{ij}$ simply by $p_j$, then the $i$-th block
$D_i$ has the following form
\begin{equation}
\label{mat:Di}
D_i = \left( \begin{array}{ccccc}
0 & w_2p_1 & w_3p_1 & \ldots & w_np_1 \\
w_2p_1 & 0 & w_3p_2 & \ldots & w_np_2 \\
w_3p_1 & w_3p_2 & 0 & \ldots & w_np_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
w_np_1 & w_np_2 & w_np_3 & \ldots & 0
\end{array} \right) .
\end{equation}
{\bf Example:} Consider an instance consisting of 2 jobs where all weights
and processing times on the $i$-th machine are equal to one. In this case
we get
\begin{equation}
\label{mat:exmp}
D_i = \left( \begin{array}{cc}
0 & 1 \\
1 & 0
\end{array} \right) .
\end{equation}
In particular, $\det D_i = -1$, so $D_i$ (and hence $D$) is not positive
semidefinite. Therefore the objective function is not convex.\\
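This can be confirmed directly (an illustrative Python check): the vector
$x=(1,-1)$ witnesses that $D_i$ is not positive semidefinite.

```python
# D_i = [[0, 1], [1, 0]] from the example: det = -1, eigenvalues +1 and -1.
Di = [[0.0, 1.0], [1.0, 0.0]]
det = Di[0][0] * Di[1][1] - Di[0][1] * Di[1][0]

# x = (1, -1) is a witness: x^T D_i x = -2 < 0, so D_i is not PSD.
x = [1.0, -1.0]
quad = sum(x[r] * Di[r][c] * x[c] for r in range(2) for c in range(2))
```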
For binary vectors $a \in \{0,1\}^{mn}$ we can rewrite the linear term
$c^Ta$ in (\ref{eq:cTa}) as $a^T\mbox{diag}(c)\,a$, where
%diag$(c)$ denotes the diagonal
%matrix whose diagonal entries coincide with the entries of vector $c$.
\[ \mbox{diag}(c) = \left( \begin{array}{cccc}
c_{11} & & 0 \\
& \ddots & \\
0 & & c_{mn}
\end{array} \right) . \]
We try to convexify the objective function of (QP) by adding a term
$2\gamma \cdot\mbox{diag}(c)$, $0<\gamma \leq 1$, to $D$ such that
$D+2\gamma \cdot\mbox{diag}(c)$ is positive semidefinite. Compensating in
the linear term leads to the following modified objective function:
\begin{eqnarray}
\mbox{min} & (1-\gamma)\cdot c^Ta +
\frac {1}{2}a^T(D+2\gamma\cdot \mbox{diag}(c))a . \label{OF:diag}
\end{eqnarray}
Since $c \geq 0$, the value of the linear function $c^Ta$ is greater than
or equal to the value of the quadratic function $a^T\mbox{diag}(c)\,a$ for
arbitrary $a \in [0,1]^{mn}$; equality holds for $a_{ij} \in \{0,1\}$.
Therefore the modified objective function (\ref{OF:diag}) underestimates
(\ref{eq:cTa}). Since we want to keep the gap as small as possible and
since (\ref{OF:diag}) is nonincreasing in $\gamma$ for each fixed vector
$a$, we are looking for the smallest possible value of $\gamma$ such that
$D+2\gamma\cdot \mbox{diag}(c)$ is positive semidefinite. \\
For $\gamma = \frac{1}{2}$ we denote the resulting convex quadratic
program by (CQP).
\begin{lemma}
\label{LM:convex}
\[ \begin{array}{c}
(1-\gamma)\cdot c^Ta + \frac{1}{2}a^T(D+2\gamma\cdot \mbox{diag}(c))a
\end{array} \]
is convex for arbitrary instances of $\mathbf{R||\sum w_jC_j}$ if and only
if $\gamma \geq \frac{1}{2}$.
\end{lemma}
\begin{proof}
In order to show that the positive semidefiniteness of $D+2\gamma\cdot
\mbox{diag}(c)$ for all instances implies $\gamma \geq \frac{1}{2}$, we
consider the example given above. Here the $i$-th block of
$D+2\gamma\cdot \mbox{diag}(c)$ has diagonal entries equal to $2\gamma$
and off-diagonal entries equal to 1, so that this block is positive
semidefinite if and only if $\gamma \geq \frac{1}{2}$.
For the converse, it suffices to consider the case $\gamma = \frac{1}{2}$
and to show that $D+\mbox{diag}(c)$ is always positive semidefinite.
Using the same notation as before, the $i^{th}$ block of
$D+\mbox{diag}(c)$ has the form:
\begin{equation}
\label{mat:A}
A = \left( \begin{array}{ccccc}
w_1p_1 & w_2p_1 & w_3p_1 & \ldots & w_np_1 \\
w_2p_1 & w_2p_2 & w_3p_2 & \ldots & w_np_2 \\
w_3p_1 & w_3p_2 & w_3p_3 & \ldots & w_np_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
w_np_1 & w_np_2 & w_np_3 & \ldots & w_np_n
\end{array} \right) .
\end{equation}
We prove that the matrix $A$ is positive semidefinite by showing that the
determinants of all its principal submatrices are nonnegative. Note that
each principal submatrix corresponds to a subset of jobs $J'\subseteq J$
and has the same form as $A$ for the smaller instance induced by the set
of jobs $J'$. Therefore it suffices to show that the determinant of $A$
itself is nonnegative for all instances.
For $j=1,\ldots ,n$, we multiply the $j^{th}$ row and column of $A$ by
$\frac{1}{p_j} >0$; this does not change the sign of the determinant.
Then, for $j=1,\ldots ,n-1$, we iteratively subtract column $j+1$ from
column $j$. The resulting matrix is upper triangular. Its $j^{th}$
diagonal entry is equal to
$\frac{w_j}{p_j}-\frac{w_{j+1}}{p_{j+1}}\geq 0$, for $j=1,\ldots ,n-1$,
and its $n^{th}$ diagonal entry is $\frac{w_n}{p_n} \geq 0$. The
nonnegativity of the diagonal entries is due to the ordering of the jobs
according to Smith's rule, and so the determinant of the resulting matrix
is nonnegative. \\
Since for $\gamma \geq \frac{1}{2}$ the matrix $D+2\gamma\cdot
\mbox{diag}(c)$ can be written as the sum of the
two positive semidefinite matrices
$D+\mbox{diag}(c)$ and $(2\gamma -1)\cdot \mbox{diag}(c)$, the result
follows.
\end{proof}
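The positive semidefiniteness of $A=D_i+\mbox{diag}(c)$ can also be
probed empirically (an illustrative Python sketch; it assumes the jobs
are already sorted by Smith's rule, and samples random vectors rather
than checking all principal minors):

```python
import random

random.seed(1)
n = 5
# Random (w_j, p_j) pairs, sorted so that w_1/p_1 >= ... >= w_n/p_n
# (Smith's rule ordering is assumed throughout this sketch).
pairs = sorted([(random.randint(1, 9), random.randint(1, 9))
                for _ in range(n)], key=lambda wp: -wp[0] / wp[1])
w = [wp[0] for wp in pairs]
p = [wp[1] for wp in pairs]

# Block of D + diag(c): A_{jk} = w_{max(j,k)} * p_{min(j,k)}.
A = [[w[max(j, k)] * p[min(j, k)] for k in range(n)] for j in range(n)]

# x^T A x >= 0 for many random x is (empirical) evidence that A is PSD.
ok = all(sum(x[j] * A[j][k] * x[k] for j in range(n) for k in range(n))
         >= -1e-9
         for x in ([random.uniform(-1, 1) for _ in range(n)]
                   for _ in range(200)))
```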
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Simple approximation algorithms}
\begin{theorem}
\label{TH:a}
Computing an optimum solution $\bar{a}$ to (CQP) and using
{\em DERANDOMIZED ROUNDING} to construct a feasible schedule is a
2-approximation for the problem $\mathbf{R||\sum w_jC_j}$.
\end{theorem}
\begin{proof}
By Theorem~\ref{TH:ZQP},
\[\mbox{E}[ \displaystyle \sum_j w_jC_j] = Z_{QP}(\bar{a}) =
Z_{CQP}(\bar{a}) +
\frac{1}{2}(c^T\bar{a} - \bar{a}^T\mbox{diag}(c)\bar{a}) \leq 2 \cdot
Z_{CQP}(\bar{a}).\]
The inequality follows from $Z_{CQP}(\bar{a}) \geq \frac{1}{2}c^T\bar{a}$
and $\bar{a}^T\mbox{diag}(c)\bar{a} \geq 0$. Since $\bar{a}$ can be
computed in polynomial time and $Z_{CQP}(\bar{a})=Z_{CQP}^*$ is a lower
bound on $Z^*$, we have found a 2-approximation algorithm.
\end{proof}
\begin{lemma}
\label{LM:optimum}
For instances of $\mathbf{P||\sum w_jC_j}$ the vector $\bar{a}$, defined by
$\bar{a}_{ij} = \frac{1}{m}$ for all $i,j$, is an optimum solution to
(CQP). (This optimum solution is unique if all ratios $w_j/p_j$,
$j=1,\ldots,n$, are different and positive).
\end{lemma}
\begin{proof}
Let $a\neq \bar{a}$ be a feasible solution to (CQP). Since (CQP) is
symmetric with respect to the $m$ identical machines, we get $m-1$
additional solutions of the same value by cyclically permuting the
machines $m-1$ times. The convex combination with coefficients
$\frac{1}{m}$ of $a$ and these new solutions is precisely $\bar{a}$.
Since the objective function of (CQP) is convex, the value of $\bar{a}$
is less than or equal to the value of $a$.
\end{proof}
\begin{theorem}
\label{TH:b}
Assigning each job independently and uniformly at random to one of the $m$
machines is a $(\frac{3}{2} - \frac{1}{2m})$-approximation algorithm for
the problem $\mathbf{P||\sum w_jC_j}$.
\end{theorem}
\begin{proof}
For the case of identical parallel machines, $c_{ij} = w_jp_j$ and
therefore, for every feasible solution $a$ to (CQP),
\[c^Ta = \displaystyle \sum_{ij} c_{ij}a_{ij} = \sum_j w_jp_j \sum_i
a_{ij}= \sum_jw_jp_j \leq Z^*.\]
Moreover, for the uniform solution $\bar{a}$ from
Lemma~\ref{LM:optimum},
\[\bar{a}^T \mbox{diag}(c)\bar{a} = \frac{1}{m}c^T \bar{a},\]
and therefore we get:
\begin{eqnarray*}
\mbox{E}[ \displaystyle \sum_j w_jC_j] & = & Z_{QP}(\bar{a}) =
Z_{CQP}(\bar{a}) + \frac{1}{2}(c^T\bar{a} - \bar{a}^T
\mbox{diag}(c)\bar{a}) \\
& \leq & Z_{CQP}(\bar{a}) + (\frac{1}{2} - \frac{1}{2m})c^T\bar{a} \leq
(\frac{3}{2} - \frac{1}{2m})Z^*.
\end{eqnarray*}
\end{proof}
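For a concrete feel of the bound just proved, one can compare the
expected cost of a uniform random assignment with a brute-forced optimum
on a tiny instance (illustrative Python; the instance is invented, and
the jobs are assumed to be in Smith order):

```python
from itertools import product

def brute_force_opt(w, p, m):
    """Minimum of sum w_j C_j over all assignments; jobs in Smith order."""
    n = len(w)
    def cost(assign):
        total = 0
        for i in range(m):
            t = 0
            for j in range(n):      # Smith order on each machine
                if assign[j] == i:
                    t += p[j]
                    total += w[j] * t
        return total
    return min(cost(a) for a in product(range(m), repeat=n))

def expected_uniform(w, p, m):
    """E[sum w_j C_j] under uniform random assignment:
    E[C_j] = p_j + (1/m) * sum of p_k over jobs k preceding j."""
    return sum(w[j] * (p[j] + sum(p[:j]) / m) for j in range(len(w)))

w, p, m = [3, 2, 1], [1, 1, 1], 2      # ratios 3 > 2 > 1: Smith-ordered
z_star = brute_force_opt(w, p, m)      # optimum is 7
expected = expected_uniform(w, p, m)   # 8.0 <= (3/2 - 1/(2m)) * 7 = 8.75
```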
\begin{corollary}
\label{cor:2}
For instances of $\mathbf{R||\sum w_jC_j}$, the value of an optimal
schedule is within a factor of 2 of the optimum solution to the
relaxation (CQP). This bound is tight even for the case of identical
parallel machines $\mathbf{P||\sum w_jC_j}$.
\end{corollary}
\begin{proof}
The positive result follows from the proof of Theorem~\ref{TH:a}. To
prove the tightness of this result, consider an instance with one job and
$m$ identical parallel machines, where $w_1 = p_{i1} = 1$. The value
$Z^*$ of an optimum schedule is equal to one; by
Lemma~\ref{LM:optimum} we get $Z^*_{CQP} = \frac{m+1}{2m}$.
Thus, as $m$ goes to infinity the ratio $Z^*/Z^*_{CQP}$ converges to 2.
\end{proof}
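The computation in this proof is easy to reproduce (illustrative Python;
with a single job the quadratic term $a^TDa$ vanishes, so the relaxation
value reduces to $\frac12 c^T\bar{a} + \frac12\bar{a}^T\mbox{diag}(c)\bar{a}$):

```python
# One job, m identical machines, w_1 = p_1 = 1. With a_i = 1/m the
# quadratic term a^T D a vanishes (there are no job pairs), leaving
# Z_CQP = 1/2 * c^T a + 1/2 * a^T diag(c) a = (m + 1) / (2m).
def z_cqp_one_job(m):
    a = [1.0 / m] * m
    lin = sum(a)                      # c^T a with c_i = w_1 * p_i1 = 1
    quad = sum(ai * ai for ai in a)   # a^T diag(c) a
    return 0.5 * lin + 0.5 * quad

# Z* = 1, so the ratio Z* / Z_CQP approaches 2 as m grows.
ratios = [1.0 / z_cqp_one_job(m) for m in (1, 10, 1000)]
```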
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Improving the relaxation and approximation}
Unfortunately, we cannot directly carry over the $\frac{3}{2}$-approximation
from Theorem~\ref{TH:b} to the setting of unrelated parallel machines. The
reason is that $c^Ta$ is not necessarily a lower bound on $Z^*$ for every
feasible solution $a$ to (CQP). However, the value of each binary solution
$a$ is bounded from below by $c^Ta$. The idea for an improved
approximation result is to add this lower bound as a constraint to (CQP).
It leads to the following strengthened relaxation (CQP'):
\begin{eqnarray}
\mbox{minimize} & Z_{CQP'} \hspace*{5.25cm} & \nonumber \\
\mbox{subject to} & \displaystyle \sum_{i=1}^m a_{ij} = 1
\hspace*{4.5cm} &\mbox{ for $j\in J$} \nonumber \\
& Z_{CQP'} \geq \frac{1}{2}c^Ta
+\frac{1}{2}a^T(D+\mbox{diag}(c))a & \\
& Z_{CQP'} \geq c^Ta \hspace*{4.1cm} & \\
& a \geq 0 \hspace*{5.35cm} & \nonumber
\end{eqnarray}
Unfortunately it is not clear whether (CQP') can be solved to optimality
in polynomial time, because now we have a convex constraint instead of a
convex objective function, and the optimum solution to (CQP') may be
irrational.
On the other hand, (CQP') is a convex program and can be solved within an
additive error of $\varepsilon$ in polynomial time, for example through
the
ellipsoid algorithm, see \cite{G88}.
\begin{theorem}
Computing a near optimal solution to the relaxation (CQP') and using
Algorithm {\em DERANDOMIZED ROUNDING} to get a feasible schedule is a
$\frac{3}{2}$-approximation algorithm for $\mathbf{R||\sum w_jC_j}$.
\end{theorem}
\begin{proof}
We compute a feasible solution $\bar{a}$ to (CQP') satisfying
$Z_{CQP'}(\bar{a})