Technical Report MSC-2019-12

Title: Differentiable Neural Architecture Search with an Arithmetic Complexity Constraint
Author: Yochai Zur
Supervisor: Alexander M. Bronstein
PDF: Currently accessible only within the Technion network
Abstract: Deep learning in general, and convolutional neural networks (CNNs) in particular, have demonstrated spectacular success on a variety of tasks, becoming a de facto standard in machine learning. Deep learning techniques allow the parameters of a network to be trained completely automatically from the data. However, architectural choices such as the number of layers and their specific topology are usually designed by hand; in fact, the most successful CNN architectures in use today were designed by trial and error. Neural Architecture Search (NAS) aims at automating the design of neural network architectures for a given task. As part of the larger automated machine learning (AutoML) trend, it promises to alleviate the scarcity of machine learning experts needed to design custom architectures, and to perform this task better than humans.

While being a very popular tool in the modern machine learning arsenal, CNNs are typically very demanding computationally at inference time. One of the main ways to alleviate this burden is quantization, which relies on low-precision arithmetic representations of the weights and activations, leading to more efficient arithmetic operations. The resulting reduction in computational complexity depends on the exact hardware platform, but is typically significant. One of the big questions in designing a quantized neural network is how to allocate bit widths to different filters so as to achieve an optimal tradeoff between computational complexity and loss of accuracy. Another popular method, so far studied independently, is pruning, which reduces the number of filters in each layer.
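To make the accuracy/complexity tradeoff concrete, the sketch below applies uniform symmetric quantization to a weight tensor at several candidate bit widths and measures the resulting distortion. This is a minimal illustration of the general idea, not the specific quantization scheme used in the thesis; the max-absolute-value scaling is an assumption.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Uniform symmetric quantization of a tensor to `bits` bits.

    A sketch: the largest-magnitude weight is mapped to the top of an
    integer grid with 2**bits levels, and every weight is rounded onto
    that grid, then dequantized back to float.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax               # grid step size
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integer codes
    return q * scale                               # dequantized values

rng = np.random.default_rng(0)
w = rng.standard_normal(256)
# Quantization error shrinks as more bits are allocated, while the
# arithmetic cost of each multiply-accumulate grows with the bit width.
mse = {b: float(np.mean((w - quantize_uniform(w, b)) ** 2)) for b in (2, 4, 8)}
```

Bit allocation then amounts to choosing, per filter, a point on this error-versus-cost curve, which is exactly the search space the NAS formulation below explores.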

The present study is an attempt to formulate optimal bit allocation and pruning as a NAS problem: searching for the configuration of allocated bits that satisfies a computational complexity budget while maximizing the model accuracy. While conventional NAS methods apply evolutionary algorithms or reinforcement learning over a discrete, non-differentiable search space, a differentiable search method has recently been proposed. It is based on a continuous relaxation of the architecture representation, allowing an efficient search over architectures using gradient descent.
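The continuous relaxation can be sketched as follows: each discrete choice (here, a filter's bit width) is replaced by a softmax-weighted mixture over the candidates, making the expected complexity a differentiable function of the architecture parameters. The candidate set {2, 4, 8} and the learning rate are illustrative assumptions, and only the complexity term of the objective is shown; the thesis balances it against accuracy.

```python
import numpy as np

def softmax(a):
    # numerically stable softmax over architecture logits
    e = np.exp(a - a.max())
    return e / e.sum()

# Hypothetical candidate bit widths for a single filter; the relaxation
# mixes them with softmax weights so the discrete choice becomes smooth.
bitwidths = np.array([2.0, 4.0, 8.0])
alpha = np.zeros_like(bitwidths)   # architecture parameters (logits)

# Toy objective: minimize the expected bit width E[b] = softmax(alpha) @ b
# by plain gradient descent on alpha (cost term only, for illustration).
lr = 0.5
for _ in range(100):
    p = softmax(alpha)
    expected_bits = p @ bitwidths
    grad = p * (bitwidths - expected_bits)  # analytic d(E[b]) / d(alpha)
    alpha -= lr * grad
```

With only the cost term, the softmax concentrates on the cheapest candidate; in the full formulation the accuracy loss pulls the weights toward higher precision, and the search settles on a per-filter tradeoff.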

We introduce a differentiable NAS method that finds superior heterogeneous architectures, i.e., CNNs in which each filter can be quantized with a different bit width or each layer can have a different number of filters, and evaluate its performance.

Copyright: The above paper is copyrighted by the Technion, the author(s), or others. Please contact the author(s) for more information.

Remark: Any link to this technical report should point to this page rather than to the URL of the PDF file directly. The latter URL may change without notice.


Computer Science Department, Technion