Synopses & Reviews
The use of static analysis techniques to prove the partial correctness of C code has recently attracted much attention due to the high cost of software errors - particularly with respect to security vulnerabilities. However, research into new analysis techniques is often hampered by the technical difficulties of analysing accesses through pointers, pointer arithmetic, coercion between types, integer wrap-around and other low-level behaviour. Axel Simon provides a concise, yet formal description of a value-range analysis that soundly approximates the semantics of C programs using systems of linear inequalities (polyhedra). The analysis is formally specified down to the bit-level while providing a precise approximation of all low-level aspects of C using polyhedral operations and, as such, it provides a basis for implementing new analyses that are aimed at verifying higher-level program properties precisely. One example of such an analysis is the tracking of the NUL position in C string buffers, which is shown as an extension to the basic analysis and which thereby demonstrates the modularity of the approach. While the book focuses on a sound analysis of C, it will be useful to any researcher and student with an interest in static analysis of real-world programming languages. In fact, many concepts presented here carry over to other languages such as Java or assembler, to other applications such as taint analysis, array and shape analysis and possibly even to other approaches such as run-time verification and test data generation.
Review
From the reviews: "This book describes a static analysis that aims to prove the absence of buffer overflows in C programs. ... The book formally describes how program operations are mapped to operations on polyhedra. ... Many concepts presented here carry over to other languages such as Java or assembler. So it will be useful to any researcher and student with an interest in static analysis of real-world programming languages." (Stefan Meyer, Zentralblatt MATH, Vol. 1155, 2009)
Synopsis
Value-Range Analysis of C Programs describes a static analysis for detecting buffer overflows. A buffer overflow in a C program occurs when input is read into a memory buffer and the length of the input exceeds that of the buffer. Overflows usually lead to crashes and may even enable a malicious person to gain control over a computer system. They are recognised as one of the most widespread forms of computer vulnerability. Based on the analysis of a standard mail-forwarding program, necessary refinements of the basic analysis are examined, thereby paving the way for an analysis that is precise enough to prove the absence of buffer overflows in legacy C code.
Synopsis
A buffer overflow occurs when input is written into a memory buffer that is not large enough to hold the input. Buffer overflows may allow a malicious person to gain control over a computer system in that a crafted input can trick the defective program into executing code that is encoded in the input itself. They are recognised as one of the most widespread forms of security vulnerability, and many workarounds, including new processor features, have been proposed to contain the threat. This book describes a static analysis that aims to prove the absence of buffer overflows in C programs. The analysis is conservative in the sense that it locates every possible overflow. Furthermore, it is fully automatic in that it requires no user annotations in the input program. The key idea of the analysis is to infer a symbolic state for each program point that describes the possible variable valuations that can arise at that point. The program is correct if the inferred values for array indices and pointer offsets lie within the bounds of the accessed buffer. The symbolic state consists of a finite set of linear inequalities whose feasible points induce a convex polyhedron that represents an approximation to possible variable valuations. The book formally describes how program operations are mapped to operations on polyhedra and details how to limit the analysis to those portions of structures and arrays that are relevant for verification. With respect to operations on string buffers, we demonstrate how to analyse C strings whose length is determined by a nul character within the string.
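The idea of inferring a range for an array index and comparing it with the buffer bounds can be illustrated with a minimal sketch. It is not the book's analysis: it uses plain intervals rather than convex polyhedra, it models only the single loop `for (i = 0; i < n; i++) buf[i] = 0;` for a known constant `n`, and it iterates to a fixpoint without the widening a real analyser would need. The names `Interval`, `index_range_in_loop` and `access_is_safe` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi]; a far simpler domain than polyhedra."""
    lo: int
    hi: int

    def join(self, other: "Interval") -> "Interval":
        # Smallest interval containing both arguments.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def add(self, k: int) -> "Interval":
        # Effect of the assignment i = i + k.
        return Interval(self.lo + k, self.hi + k)

    def meet_upper(self, bound: int) -> Optional["Interval"]:
        # Restrict to values <= bound; None means no value satisfies the guard.
        hi = min(self.hi, bound)
        return Interval(self.lo, hi) if self.lo <= hi else None


def index_range_in_loop(n: int) -> Optional[Interval]:
    """Range of i inside the body of: for (i = 0; i < n; i++) buf[i] = 0;"""
    state = Interval(0, 0)  # i = 0 on loop entry
    while True:
        body = state.meet_upper(n - 1)  # apply the loop guard i < n
        if body is None:
            return None  # the loop body is never executed
        nxt = state.join(body.add(1))  # merge entry state with the effect of i++
        if nxt == state:
            return body  # fixpoint reached
        state = nxt


def access_is_safe(index: Optional[Interval], buffer_len: int) -> bool:
    # Safe if the access is unreachable or every possible index is in bounds.
    return index is None or (0 <= index.lo and index.hi < buffer_len)
```

With an 8-element buffer, `access_is_safe(index_range_in_loop(8), 8)` holds, whereas a loop bound of 9 yields an index range that reaches 8 and the check fails, so the possible overflow is reported.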
Table of Contents
Introduction.- Technical Background.- Value Range Analysis.- Analysing C.- Soundness.- An abstraction of C.- Combining Value and Content Abstraction.- Combining Pointer and Value-Range Analysis.- Efficiency.- Completeness.- Analysing String Buffers.- Widening with Landmarks.- Further Refinements.- Related Tools.- The Astrée Analyser.- SLAM and ESPX.- CCured.- Other Approaches.- Contributions.-
A Semantics for C.- Core C.- Preliminaries.- The Environments.- Concrete Semantics.- Collecting Semantics.- Related Work.-
Abstracting Soundly.-
Abstract State Space.- An Introductory Example.- Points-To Analysis.- The Points-To Abstract Domain.- Related Work.- Numeric Domains.- The Domain of Convex Polyhedra.- Operations on Polyhedra.- Multiplicity Domain.- Combining the Polyhedral and Multiplicity Domain.- Related Work.-
Taming Casting and Wrapping.- Modelling the Wrapping of Integers.- A Language Featuring Finite Integer Arithmetic.- The Syntax of SubC.- The Semantics of SubC.- Polyhedral Analysis of Finite Integers.- Revisiting the Domain of Convex Polyhedra.- Implicit Wrapping of Polyhedral Variables.- Explicit Wrapping of Polyhedral Variables.- Wrapping Variables with a Finite Range.- Wrapping Variables with Infinite Ranges.- Wrapping Several Variables.- An Algorithm for Explicit Wrapping.- An Abstract Semantics for SubC.- Discussion.- Related Work.-
Overlapping Memory Accesses and Pointers.- Memory as a Set of Fields.- Memory Layout for Core C.- Access Trees.- Related Work.- Mixing Values and Pointers.- Abstraction Relation.-
Abstract Semantics.- Expressions and Simple Assignments.- Assigning Structures.- Casting, &-Operations and Dynamic Memory.- Discussion and Related Work.-
Ensuring Efficiency.-
Planar Polyhedra.- Operations on Inequalities.- Entailment on Single Inequalities.- Operations on Sets of Inequalities.- Entailment Checking.- Removing Redundancies.- Convex Hull.- Linear Programming and Planar Polyhedra.- Widening Planar Polyhedra.-
The TVPI Abstract Domain.- Principles of the TVPI Domain.- Entailment Check.- Convex Hull.- Projection.- Reduced Product Between Bounds and Inequalities.- Incremental Closure.- Approximating General Inequalities.- Linear Programming in the TVPI Domain.- Widening of TVPI Polyhedra.- Related Work.-
The Integral TVPI Domain.- The Merit of Z-Polyhedra.- Improving Precision.- Limiting the Growth of Coefficients.- Harvey's Integral Hull Algorithm.- Calculating Cuts Between Two Inequalities.- Integer Hull in the Reduced Product Domain.- Planar Z-Polyhedra and Closure.- Possible Implementations of a Z-TVPI Domain.- Tightening Bounds Across Projections.- Discussion and Implementation.- Related Work.-
Interfacing Analysis and Numeric Domain.- Separating Interval from Relational Information.- Inferring Relevant Fields and Addresses.- Typed Abstract Variables.- Populating the Field Map.- Applying Widening in Fixpoint Calculations.-
Improving Precision.-
Tracking String Lengths.- Manipulating Implicitly Terminated Strings.- Analysing the String Loop.- Calculating a Fixpoint of the Loop.- Prerequisites for String Buffer Analysis.- Incorporating String Buffer Analysis.- Extending the Abstraction Relation.- Related Work.-
Widening with Landmarks.- An Introduction to Widening/Narrowing.- The Limitations of Narrowing.- Improving Widening and Removing Narrowing.- Revisiting the Analysis of String Buffers.- Applying the Widening/Narrowing Approach.- The Rationale Behind Landmarks.- Creating Landmarks for Widening.- Using Landmarks in Widening.- Acquiring Landmarks.- Using Landmarks at a Widening Point.- Extrapolation Operator for Polyhedra.- Related Work.-
Combining Points-To and Numeric Analysis.- Boolean Flags in the Numeric Domain.- Incorporating Boolean Flags into Points-To Sets.- Practical Implementation.-
Implementation.- Technical Overview of the Analyser.- Calculating Fixpoints.- Scheduling of Code without Loops.- Scheduling in the Presence of Loops.- Related Work.- Limitations of String Buffer Analysis.- Weaknesses of Tracking First NUL Positions.- Handling Symbolic NUL Positions.-
Conclusion and Outlook.- Conclusion.- Outlook.- Replacing the Polyhedral Domain.- Analysing Assembler instead of Core C.- Better Analysis of Dynamically Allocated Memory.- Analysing Floating Point Arithmetic.- Context-Sensitive Analysis.-
Appendix A: Core C Example.-
References.-
Index