Eric Schkufza
I am a member of the VMware research group (link). I graduated from Stanford University with a PhD in computer science in 2015 (thesis). My advisor was Professor Alex Aiken (genealogy). I am interested in applying the tools of large-scale data analysis and machine learning to the design of optimizing compilers. A list of my recent publications is below. Click here to download my CV (pdf).

Stratified Synthesis: Automatically Learning the x86-64 Instruction Set (PLDI 2016)
The x86-64 ISA sits at the bottom of the software stack of most desktop and server software. Because of its importance, many software analysis and verification tools depend, either explicitly or implicitly, on correct modeling of the semantics of x86-64 instructions. However, formal semantics for the x86-64 ISA are difficult to obtain and often written manually through great effort. We describe a mechanically synthesized formal semantics for a large fraction of the x86-64 Haswell ISA’s many thousands of instruction variants. The key to our results is stratified synthesis, where we use a set of instructions whose semantics are known to synthesize the semantics of additional instructions whose semantics are unknown. As the set of formally described instructions increases, the synthesis vocabulary expands, making it possible to synthesize the semantics of increasingly complex instructions. We describe an automatically synthesized formal semantics for 1,620 instruction variants of the x86-64 Haswell ISA. We evaluate the learned semantics against manually written semantics (where available) and find that they are formally equivalent with the exception of 58 instructions, where the manually written semantics contain an error. We further find the learned formulas to be largely as precise as manually written ones and of similar size. (pdf) (bib)
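As a rough illustration of the bootstrapping loop this describes, the sketch below (in Python, with a hypothetical `synthesize` procedure standing in for the paper's program search) grows the set of formally described instructions stratum by stratum:

```python
# Hypothetical sketch of a stratified synthesis loop: instructions whose
# semantics are already known form the vocabulary used to search for
# programs that reproduce the behavior of instructions still unknown.

def stratified_synthesis(base_set, unknown, synthesize):
    """base_set: dict mapping instruction -> formal semantics (the seed stratum).
    unknown: set of instructions whose semantics are still needed.
    synthesize(target, vocabulary): tries to find a program over `vocabulary`
    that matches `target`; returns a formula or None."""
    learned = dict(base_set)
    progress = True
    while progress and unknown:
        progress = False
        for instr in list(unknown):
            formula = synthesize(instr, learned)
            if formula is not None:
                # The newly learned instruction joins the vocabulary, making
                # it possible to synthesize more complex instructions later.
                learned[instr] = formula
                unknown.remove(instr)
                progress = True
    return learned, unknown
```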
Conditionally Correct Superoptimization (OOPSLA 2015)
The aggressive optimization of heavily used kernels is an important problem in high-performance computing. However, both general purpose compilers and highly specialized tools such as superoptimizers often do not have sufficient static knowledge of restrictions on program inputs that could be exploited to produce the very best code. For many applications, the best possible code is conditionally correct: the optimized kernel is equal to the code that it replaces only under certain preconditions on the kernel’s inputs. The main technical challenge in producing conditionally correct optimizations is in obtaining non-trivial and useful conditions and proving conditional equivalence formally in the presence of loops. We combine abstract interpretation, decision procedures, and testing to yield a verification strategy that can address both of these problems. This approach yields a superoptimizer for x86 that in our experiments produces binaries that are often multiple times faster than those produced by production compilers. (pdf) (bib)
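A toy instance of conditional correctness, checked here with the Z3 SMT solver (the z3-solver Python package), gives the flavor: replacing a signed division by an arithmetic shift is only valid under the precondition that the input is non-negative. This is an illustration only; the paper's verification strategy derives such preconditions with abstract interpretation and handles loops.

```python
from z3 import BitVec, Not, Solver, sat

x = BitVec('x', 32)
target = x / 2         # original kernel: signed division by two
rewrite = x >> 1       # optimized kernel: arithmetic shift right
precondition = x >= 0  # restriction on inputs under which the rewrite is claimed correct

# Search for an input that satisfies the precondition but distinguishes the
# two kernels; unsat means they are conditionally equivalent.
s = Solver()
s.add(precondition, Not(target == rewrite))
if s.check() == sat:
    print("counterexample:", s.model())
else:
    print("equivalent under the precondition x >= 0")
```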
Stochastic Optimization of Floating-Point Programs with Tunable Precision (PLDI 2014)
The aggressive optimization of floating-point computations is an important problem in high-performance computing. Unfortunately, floating-point instruction sets have complicated semantics that often force compilers to preserve semantics as written. We present a method that treats floating-point optimization as a stochastic search problem. We demonstrate the ability to generate reduced precision implementations of Intel's handwritten C numeric library which are up to 6 times faster than the original code, and achieve end-to-end speedups of over 30% on a direct numeric simulation and a ray tracer by optimizing kernels that can tolerate a loss of precision while still remaining correct. Because these optimizations are mostly not amenable to formal verification using the current state of the art, we present a stochastic search technique for characterizing maximum error. The technique comes with an asymptotic guarantee and provides strong evidence of correctness. (pdf) (bib) (slides)
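The sketch below conveys the idea behind error characterization by sampling: estimate the worst-case deviation of a reduced-precision kernel from a double-precision reference by drawing random inputs. The kernels are made up, and the paper's technique is an MCMC search that comes with an asymptotic guarantee; plain uniform sampling is shown here only for brevity.

```python
import math
import random

def reference(x):
    return math.exp(x)                      # double-precision reference kernel

def reduced_precision(x):
    # Hypothetical optimized kernel: a truncated series that trades accuracy
    # for speed and is acceptable only if its error bound is tolerable.
    return 1.0 + x + x * x / 2.0 + x ** 3 / 6.0

def estimate_max_error(lo, hi, samples=100_000):
    # Keep the largest observed deviation over randomly sampled inputs.
    worst = 0.0
    for _ in range(samples):
        x = random.uniform(lo, hi)
        worst = max(worst, abs(reference(x) - reduced_precision(x)))
    return worst

print(estimate_max_error(-0.1, 0.1))
```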
Data-Driven Equivalence Checking (OOPSLA 2013)
We present a data-driven algorithm for checking the equivalence of two loops: the algorithm infers simulation relations using data from test runs. Once a candidate simulation relation has been obtained, an off-the-shelf SMT solver is used to check whether it actually holds. The algorithm is sound: insufficient data will cause the proof to fail. We demonstrate a prototype implementation of our algorithm, called DDEC, which is the first sound equivalence checker for loops written in x86. (pdf) (bib)
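The data-driven half of the algorithm can be sketched as guess-and-filter: run both loops on test inputs, pair up corresponding states, and discard any candidate relation falsified by an observation. The survivors are only candidates; in DDEC they are then discharged by an SMT solver, which is what makes the result sound. Everything below is illustrative.

```python
# Two loops that happen to be equivalent; each yields its state after every
# iteration so corresponding states can be paired up across the two runs.

def loop_a(n):
    i, s = 0, 0
    while i < n:
        s += i
        i += 1
        yield {'i': i, 's': s}

def loop_b(n):
    j, t = 0, 0
    while j < n:
        t += j
        j += 1
        yield {'j': j, 't': t}

# Candidate simulation relations over paired states.
candidates = [
    ('i == j', lambda a, b: a['i'] == b['j']),
    ('s == t', lambda a, b: a['s'] == b['t']),
    ('s == t + 1', lambda a, b: a['s'] == b['t'] + 1),
]

# Keep only the candidates consistent with every observed pair of states.
for n in (3, 5, 8):
    for a, b in zip(loop_a(n), loop_b(n)):
        candidates = [(name, p) for name, p in candidates if p(a, b)]

print([name for name, _ in candidates])   # relations that survive the data
```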
Stochastic Superoptimization (ASPLOS 2013)
We formulate the loop-free binary superoptimization task as a stochastic search problem. The competing constraints of transformation correctness and performance improvement are encoded as terms in a cost function, and a Markov Chain Monte Carlo sampler is used to rapidly explore the space of all possible programs to find one that is an optimization of a given target program. Although our method sacrifices completeness, the scope of programs we are able to consider, and the resulting quality of the programs that we produce, far exceed those of existing superoptimizers. Beginning from binaries compiled by llvm -O0 for 64-bit x86, our prototype implementation, STOKE, is able to produce programs which either match or outperform the code produced by gcc -O3, icc -O3, and in some cases, expert handwritten assembly. (pdf) (bib) (slides) (errata)
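Schematically, the search looks like the Metropolis loop below, where a cost function mixes a correctness term (disagreement with the target on test inputs) and a performance term; `mutate`, `run`, and `latency` are hypothetical stand-ins for STOKE's transformation proposals, test execution, and cost model.

```python
import math
import random

def cost(program, target, tests, run, latency, w_perf=0.01):
    # Correctness term: number of test inputs on which the candidate and the
    # target disagree; performance term: estimated latency of the candidate.
    wrong = sum(run(program, t) != run(target, t) for t in tests)
    return wrong + w_perf * latency(program)

def mcmc_search(target, tests, mutate, run, latency, steps=100_000, beta=1.0):
    current = target                       # start the walk at the target itself
    current_cost = cost(current, target, tests, run, latency)
    best, best_cost = current, current_cost
    for _ in range(steps):
        proposal = mutate(current)
        proposal_cost = cost(proposal, target, tests, run, latency)
        # Metropolis acceptance: always keep improvements, occasionally keep
        # regressions so the search can escape local minima.
        if proposal_cost <= current_cost or \
           random.random() < math.exp(-beta * (proposal_cost - current_cost)):
            current, current_cost = proposal, proposal_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best
```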

Interactive Furniture Layout Using Interior Design Guidelines (SIGGRAPH 2011)
We present an interactive furniture layout system that assists users by suggesting furniture arrangements that are based on interior design guidelines. Our system incorporates the layout guidelines as terms in a density function and generates layout suggestions by rapidly sampling the density function using a hardware-accelerated Monte Carlo sampler. Our results demonstrate that the suggestion generation functionality measurably increases the quality of furniture arrangements produced by participants with no prior training in interior design. (pdf) (bib)
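As a rough sketch of how guidelines become a density, the toy code below folds two invented guideline terms (pairwise clearance and the distance from a sofa to a television) into a cost and exponentiates its negation; suggestions are then draws from this unnormalized density, which in the real system covers the paper's full guideline set and is sampled with a hardware-accelerated Monte Carlo sampler.

```python
import math

def clearance_term(layout, min_gap=2.0):
    # Penalize pairs of furniture pieces that are closer than min_gap.
    return sum(max(0.0, min_gap - math.dist(a, b))
               for i, a in enumerate(layout) for b in layout[i + 1:])

def viewing_term(sofa, tv, ideal=3.0):
    # Penalize sofas that sit too close to or too far from the television.
    return abs(math.dist(sofa, tv) - ideal)

def density(layout, sofa_idx=0, tv=(5.0, 0.0)):
    cost = clearance_term(layout) + viewing_term(layout[sofa_idx], tv)
    return math.exp(-cost)   # unnormalized density over layouts

print(density([(1.0, 1.0), (4.0, 1.0), (1.0, 4.0)]))
```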
Computer Generated Residential Building Layouts (SIGGRAPH Asia 2010)
We present a method for automated generation of building layouts for computer graphics applications. Our approach is motivated by the layout design process developed in architecture. Given a set of high-level requirements, an architectural program is synthesized using a Bayesian network trained on real-world data. The architectural program is realized in a set of floor plans, obtained through stochastic optimization. The floor plans are used to construct a complete three-dimensional building with internal structure. We demonstrate a variety of computer-generated buildings produced by the presented approach. (pdf) (bib)
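A purely illustrative sketch of the first stage, drawing an architectural program from conditional distributions (the probabilities below are invented; the paper's Bayesian network is trained on real-world residential data):

```python
import random

P_NUM_ROOMS = {4: 0.2, 5: 0.5, 6: 0.3}                       # made-up P(room count)
P_EXTRA_ROOM = {'bedroom': 0.5, 'office': 0.3, 'bath': 0.2}  # made-up P(extra room type)

def sample_program():
    rooms = ['living', 'kitchen', 'bedroom', 'bath']   # rooms every program contains
    n = random.choices(list(P_NUM_ROOMS), weights=list(P_NUM_ROOMS.values()))[0]
    while len(rooms) < n:
        rooms.append(random.choices(list(P_EXTRA_ROOM),
                                    weights=list(P_EXTRA_ROOM.values()))[0])
    return rooms

print(sample_program())   # e.g. ['living', 'kitchen', 'bedroom', 'bath', 'office']
```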

Programming the Memory Hierarchy Revisited: Supporting Irregular Parallelism in Sequoia (PPOPP 2011)
We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and spawn, which spawns a dynamically determined number of parallel children until some termination condition in the parent is met. We show that, together, these constructs allow applications with irregular parallelism to be programmed in a straightforward manner, and that they complement and can be combined with constructs for expressing regular parallelism. We have implemented spawn and call-up in Sequoia, and we present an experimental evaluation on a number of irregular applications. (pdf) (bib)
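A sequential Python mock-up can convey what the two constructs express (the real implementation runs child tasks in parallel across a memory hierarchy in Sequoia): spawn keeps creating children until the parent's termination condition is met, and each child uses call-up to invoke computation on its parent.

```python
class ParentTask:
    def __init__(self, work_items):
        self.pending = list(work_items)
        self.results = []

    def call_up(self, result):
        # A child invokes computation on its parent: here, accumulating results.
        self.results.append(result)

    def done(self):
        return not self.pending          # termination condition lives in the parent

    def spawn(self):
        # Spawn a dynamically determined number of children, one per pending
        # item, until the parent decides that no work remains.
        while not self.done():
            item = self.pending.pop()
            child_task(item, self)

def child_task(item, parent):
    parent.call_up(item * item)          # stand-in for the child's real computation

parent = ParentTask(range(5))
parent.spawn()
print(parent.results)
```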