Synopses & Reviews
The use of static analysis techniques to prove the partial correctness of C code has recently attracted much attention due to the high cost of software errors - particularly with respect to security vulnerabilities. However, research into new analysis techniques is often hampered by the technical difficulties of analysing accesses through pointers, pointer arithmetic, coercion between types, integer wrap-around and other low-level behaviour. Axel Simon provides a concise, yet formal description of a value-range analysis that soundly approximates the semantics of C programs using systems of linear inequalities (polyhedra). The analysis is formally specified down to the bit-level while providing a precise approximation of all low-level aspects of C using polyhedral operations and, as such, it provides a basis for implementing new analyses that are aimed at verifying higher-level program properties precisely. One example of such an analysis is the tracking of the NUL position in C string buffers, which is shown as an extension to the basic analysis and which thereby demonstrates the modularity of the approach. While the book focuses on a sound analysis of C, it will be useful to any researcher and student with an interest in static analysis of real-world programming languages. In fact, many concepts presented here carry over to other languages such as Java or assembler, to other applications such as taint analysis, array and shape analysis and possibly even to other approaches such as run-time verification and test data generation.
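The integer wrap-around mentioned above can be illustrated with a deliberately simplified sketch. It uses plain intervals for an 8-bit unsigned variable rather than the polyhedra the book works with, and the names `range_t` and `add_wrap_u8` are hypothetical, chosen only for this illustration; it is not the book's wrapping algorithm.

```c
/* Hypothetical interval type: all integers v with lo <= v <= hi. */
typedef struct {
    long lo;
    long hi;
} range_t;

/*
 * Approximates the range of (unsigned char)(x + c) for every x in r,
 * assuming 0 <= r.lo <= r.hi <= 255, 0 <= c <= 255 and 8-bit chars.
 * If the shifted range lies within a single multiple of 256, it is
 * translated back exactly. If it straddles a multiple of 256, the
 * result falls back to the full range [0, 255], which is sound
 * (no concrete value is lost) but imprecise.
 */
range_t add_wrap_u8(range_t r, long c)
{
    range_t s = { r.lo + c, r.hi + c };
    long q_lo = s.lo / 256;   /* how often the lower bound wrapped */
    long q_hi = s.hi / 256;   /* how often the upper bound wrapped */

    if (q_lo == q_hi) {
        s.lo -= q_lo * 256;
        s.hi -= q_hi * 256;
    } else {
        s.lo = 0;
        s.hi = 255;
    }
    return s;
}
```

For instance, adding 10 to a variable known to lie in [250, 255] gives [4, 9] after wrapping, whereas adding 10 to [200, 250] straddles 256 and is approximated by [0, 255].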
Review
From the reviews: "This book describes a static analysis that aims to prove the absence of buffer overflows in C programs. ... The book formally describes how program operations are mapped to operations on polyhedra. ... Many concepts presented here carry over to other languages such as Java or assembler. So it will be useful to any researcher and student with an interest in static analysis of real-world programming languages." (Stefan Meyer, Zentralblatt MATH, Vol. 1155, 2009)
Synopsis
Value-Range Analysis of C Programs describes a static analysis for detecting buffer overflows. A buffer overflow in a C program occurs when input whose length exceeds that of a memory buffer is read into that buffer. Overflows usually lead to crashes and may even enable a malicious person to gain control over a computer system. They are recognised as one of the most widespread forms of computer vulnerability. Based on the analysis of a standard mail-forwarding program, necessary refinements of the basic analysis are examined, thereby paving the way for an analysis that is precise enough to prove the absence of buffer overflows in legacy C code.
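As a minimal sketch of the kind of property at stake, the following C fragment copies a string only when the position of its first NUL shows that the copy fits into the destination. The helpers `first_nul` and `copy_checked` are hypothetical names introduced for this illustration and do not come from the book. An unchecked `strcpy(dst, src)` would write past the end of `dst` whenever that position is not smaller than the size of `dst`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: index of the first NUL byte within the first
 * src_size bytes of src, or src_size if there is none.
 */
size_t first_nul(const char *src, size_t src_size)
{
    assert(src != NULL);
    const char *p = memchr(src, '\0', src_size);
    return p ? (size_t)(p - src) : src_size;
}

/*
 * Copies the string in src into dst only if it fits, terminator included.
 * Returns 1 on success and 0 if the copy would overflow dst or if src
 * is not NUL-terminated within src_size bytes.
 */
int copy_checked(char *dst, size_t dst_size, const char *src, size_t src_size)
{
    size_t n = first_nul(src, src_size);

    /* The copy is safe exactly when n < src_size and n < dst_size. */
    if (n == src_size || n >= dst_size)
        return 0;

    memcpy(dst, src, n + 1);   /* n characters plus the NUL */
    return 1;
}
```

The guard is a pair of linear inequalities over the NUL position and the buffer sizes. A run-time check like this enforces them dynamically, while a static analysis of the kind described here aims to prove that such inequalities hold on every execution path.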
Table of Contents
Introduction.- Technical Background.- Value Range Analysis.- Analysing C.- Soundness.- An Abstraction of C.- Combining Value and Content Abstraction.- Combining Pointer and Value-Range Analysis.- Efficiency.- Completeness.- Analysing String Buffers.- Widening with Landmarks.- Further Refinements.- Related Tools.- The Astrée Analyser.- SLAM and ESPX.- CCured.- Other Approaches.- Contributions.-
A Semantics for C.- Core C.- Preliminaries.- The Environments.- Concrete Semantics.- Collecting Semantics.- Related Work.-
Abstracting Soundly.-
Abstract State Space.- An Introductory Example.- Points-To Analysis.- The Points-To Abstract Domain.- Related Work.- Numeric Domains.- The Domain of Convex Polyhedra.- Operations on Polyhedra.- Multiplicity Domain.- Combining the Polyhedral and Multiplicity Domain.- Related Work.-
Taming Casting and Wrapping.- Modelling the Wrapping of Integers.- A Language Featuring Finite Integer Arithmetic.- The Syntax of SubC.- The Semantics of SubC.- Polyhedral Analysis of Finite Integers.- Revisiting the Domain of Convex Polyhedra.- Implicit Wrapping of Polyhedral Variables.- Explicit Wrapping of Polyhedral Variables.- Wrapping Variables with a Finite Range.- Wrapping Variables with Infinite Ranges.- Wrapping Several Variables.- An Algorithm for Explicit Wrapping.- An Abstract Semantics for SubC.- Discussion.- Related Work.-
Overlapping Memory Accesses and Pointers.- Memory as a Set of Fields.- Memory Layout for Core C.- Access Trees.- Related Work.- Mixing Values and Pointers.- Abstraction Relation.-
Abstract Semantics.- Expressions and Simple Assignments.- Assigning Structures.- Casting, &-Operations and Dynamic Memory.- Discussion and Related Work.-
Ensuring Efficiency.-
Planar Polyhedra.- Operations on Inequalities.- Entailment on Single Inequalities.- Operations on Sets of Inequalities.- Entailment Checking.- Removing Redundancies.- Convex Hull.- Linear Programming and Planar Polyhedra.- Widening Planar Polyhedra.-
The TVPI Abstract Domain.- Principles of the TVPI Domain.- Entailment Check.- Convex Hull.- Projection.- Reduced Product Between Bounds and Inequalities.- Incremental Closure.- Approximating General Inequalities.- Linear Programming in the TVPI Domain.- Widening of TVPI Polyhedra.- Related Work.-
The Integral TVPI Domain.- The Merit of Z-Polyhedra.- Improving Precision.- Limiting the Growth of Coefficients.- Harvey's Integral Hull Algorithm.- Calculating Cuts Between Two Inequalities.- Integer Hull in the Reduced Product Domain.- Planar Z-Polyhedra and Closure.- Possible Implementations of a Z-TVPI Domain.- Tightening Bounds Across Projections.- Discussion and Implementation.- Related Work.-
Interfacing Analysis and Numeric Domain.- Separating Interval from Relational Information.- Inferring Relevant Fields and Addresses.- Typed Abstract Variables.- Populating the Field Map.- Applying Widening in Fixpoint Calculations.-
Improving Precision.-
Tracking String Lengths.- Manipulating Implicitly Terminated Strings.- Analysing the String Loop.- Calculating a Fixpoint of the Loop.- Prerequisites for String Buffer Analysis.- Incorporating String Buffer Analysis.- Extending the Abstraction Relation.- Related Work.-
Widening with Landmarks.- An Introduction to Widening/Narrowing.- The Limitations of Narrowing.- Improving Widening and Removing Narrowing.- Revisiting the Analysis of String Buffers.- Applying the Widening/Narrowing Approach.- The Rationale Behind Landmarks.- Creating Landmarks for Widening.- Using Landmarks in Widening.- Acquiring Landmarks.- Using Landmarks at a Widening Point.- Extrapolation Operator for Polyhedra.- Related Work.-
Combining Points-To and Numeric Analysis.- Boolean Flags in the Numeric Domain.- Incorporating Boolean Flags into Points-To Sets.- Practical Implementation.-
Implementation.- Technical Overview of the Analyser.- Calculating Fixpoints.- Scheduling of Code without Loops.- Scheduling in the Presence of Loops.- Related Work.- Limitations of String Buffer Analysis.- Weaknesses of Tracking First NUL Positions.- Handling Symbolic NUL Positions.-
Conclusion and Outlook.- Conclusion.- Outlook.- Replacing the Polyhedral Domain.- Analysing Assembler instead of Core C.- Better Analysis of Dynamically Allocated Memory.- Analysing Floating Point Arithmetic.- Context-Sensitive Analysis.-
Appendix A: Core C Example.-
References.-
Index