Technical Report MSC-2006-17

Title: The Complexity of SIMD Alignment
Authors: Liza Fireman
Supervisors: Erez Petrank
Abstract: Optimizing programs for modern multiprocessor or vector platforms is a major important challenge for compilers today. Various problems in this domain are not yet thoroughly understood. In this work, we focus on one such problem: the \simda problem. In this problem we are given a code that includes a loop with misaligned references. Such a code requires additional realignment operations to allow parallelization on SIMD architectures because of SIMD alignment constraints. The realigning of data is achieved by shifting streams of data after reading them into the SIMD registers and before applying the SIMD operations. Shifts are executed inside the loop and therefore affect the performance significantly. The problem is to automatically reorganize data streams to satisfy the alignment requirements imposed by the hardware with a minimum number of shift executions.

Previously, only heuristics were used to solve this problem, without any guarantees on the number of shifts in the obtained solution. We study two interesting and realistic special cases of the \simda problem and present two novel and efficient algorithms that solve the problem {\em optimally} for these two cases. In the first case, we deal with expressions whose number of different alignments is not restricted, but their graph representation form a tree and any array appears in the expression at most once. For this special case we show a polynomial-time algorithm using dynamic programming. In the second case, the graph representation can be any DAG, but we restrict the expression to contain only two distinct predetermined alignments associated with the input and output operands. We use a {\sc min-cut}/{\sc max-flow} algorithm as a subroutine.

We also discuss the relation between the \simda problem and the \multiw and \multiwn problems. We show how to achieve an approximated solution to the \simda problem based on approximation algorithms to these two known problems.

CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the MSC technical reports of 2006
To the main CS technical reports page

Computer science department, Technion