Natalie Eckel

Research thesis

Empirical Study of Object-Layout Strategies and Optimization Techniques

Supervisor: Dr. Joseph (Yossi) Gil

Table of contents:


    One of the most exciting challenges in the implementation of object oriented programming languages is to efficiently realize together multiple inheritance and dynamic dispatch. Indeed, it was believed that it was impossible to efficiently introduce multiple inheritance into C++, until proved otherwise by Stroustrup. Although there is a large body of research on the time overhead of object oriented programs, there is little work on memory overhead. To support efficient implementation of dynamic dispatch and multiple inheritance, there are compiler-generated fields, stored in object's memory. This work takes an empirical approach to the study of the memory overhead due to compiler generated fields, which turns out to be significant in the presence of multiple inheritance.

    We study the performance, in terms of overhead to object size of three compilation strategies: separate compilation, whole program analysis, and user annotations as done in C++. A variant to each such strategy is the inclusion of pointers to indirect virtual bases in objects. Using a database of several large multiple inheritance hierarchies, spanning 7000 classes, several application domains and different programming languages we find that in all strategies there are certain classes which give rise a large number of compiler generated fields in their object layout.

    Several recently introduced optimization techniques are studied in this work. Our study shows that the existence of virtual inheritance greatly increases the number of compiler generated fields stored in object memory. Some of the techniques, such as devirtualization and inlining, strive to reduce the number of virtual inheritance in class hierarchies. Others, such as bidirectional layout and packing, minimize the number of compiler generated fields by suggesting major architecture changes. The optimization techniques also broken down by their applicability in different compilation strategies: some require whole program information while others don't. We study the efficacy of the optimization techniques and show that an average saving of close to 50% in object memory overhead can be achieved.


Seminar lecture slides

Final thesis paper

This work was presented at the 14th European Conference on Object-Oriented Programming (ECOOP) in June 2000. The final paper, submitted to the Senate of the Technion, is availabled The paper used a great amount of benchmarking data, compiled in the form of tables and figures. Those figures were used in the Encapsulated Postscript (eps) format. Click here to dowload the figures used in this paper (zipped eps).

Software and resources used and developed in this research

Database of class hierarchies

In this research a large database of classs hierarchies was used for the purposes of benchmarking of object-layout optimization algorithms. Our database selection started from the collection of large hierarchies which served Krall, Vitek and Horspool in becnhamrking their algorithms: We enriched this collection with the class hierarchies of JDK1.1 and ISE version 4 distribution of Eiffel. The database contains 25 text files with extension "INH". Each file describes a specific class hierarchy. A hierarchy file contain a line
super <base class> name  <derived class>
for each inheritance link in the hierarchy. There is no information about the nature of inheritance (virtual or non-virtual) available.

Benchmarking program

A program was written in order to analyze the structure of class hierarchies and produce the benchmmarking results for our optimization algorithms. Project information: How to build and run the project:

Benchmarking results

The statistics generated by the program were processed using Microsoft (R) Excel 97 Hebrew Edition. The following zipped XLS files contains benchmarking data and charts used in this work It HIGHTLY RECOMMENDED to dowload and unpack all those files simultaneously, since there are links between those files - the data is stored in one and the charts are in the other.