Synopses & Reviews
sed & awk describes two text processing programs that are mainstays of the UNIX programmer's toolbox. sed is a "stream editor" for editing streams of text that might be too large to edit as a single file, or that might be generated on the fly as part of a larger data processing step. The most common operation done with sed is substitution, replacing one block of text with another.
awk is a complete programming language. Unlike many conventional languages, awk is "data driven" -- you specify what kind of data you are interested in and the operations to be performed when that data is found. awk does many things for you, including automatically opening and closing data files, reading records, breaking the records up into fields, and counting the records. While awk provides the features of most conventional programming languages, it also includes some unconventional features, such as extended regular expression matching and associative arrays. sed & awk describes both programs in detail and includes a chapter of example sed and awk scripts.
This edition covers features of sed and awk that are mandated by the POSIX standard. This most notably affects awk, where POSIX standardized a new variable, CONVFMT, and new functions, toupper() and tolower(). The CONVFMT variable specifies the conversion format to use when converting numbers to strings (awk used to use OFMT for this purpose). The toupper() and tolower() functions each take a (presumably mixed case) string argument and return a new version of the string with all letters translated to the corresponding case.
In addition, this edition covers GNU sed, newly available since the first edition. It also updates the first edition coverage of Bell Labs nawk and GNU awk (gawk), covers mawk, an additional freely available implementation of awk, and briefly discusses three commercial versions of awk: MKS awk, Thompson Automation awk (tawk), and Videosoft (VSAwk).
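The features named in this synopsis (sed substitution, awk's automatic field splitting and record counting, toupper(), and CONVFMT) can be illustrated with a minimal sketch. The sample data and the variable names (data, sed_out, awk_out, conv_out) are made up for illustration and do not come from the book; the sketch assumes a POSIX-conforming sed and awk.

```shell
# Hypothetical sample input: three records with whitespace-separated fields.
data='alice 10
bob 20
carol 30'

# sed substitution: replace one block of text with another on each line.
sed_out=$(printf '%s\n' "$data" | sed 's/bob/robert/')

# awk is data driven: the pattern ($2 >= 20) selects records and the action
# runs only for those records. awk splits each record into fields ($1, $2)
# and counts records in NR without any explicit code.
awk_out=$(printf '%s\n' "$data" | awk '$2 >= 20 { print NR ": " toupper($1) }')

# CONVFMT sets the format used when a number is converted to a string,
# here forced by concatenating the number with an empty string.
conv_out=$(awk 'BEGIN { CONVFMT = "%.2f"; x = 3.14159; s = x ""; print s }')

printf '%s\n' "$sed_out" "$awk_out" "$conv_out"
```

With this input, the sed command changes only the second line, the awk program prints the record number and uppercased first field for the second and third records, and the last command prints the number formatted to two decimal places.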
This text describes two text manipulation programs that are mainstays of the UNIX programmer's toolbox. It covers the sed and awk programs as they are now mandated by the POSIX standard and includes discussion of the GNU versions of these programs.
This edition describes two text processing programs that are mainstays of the UNIX programmer's toolbox. The book lays a foundation for both programs by describing how they are used and by introducing the fundamental concepts of regular expressions and text matching.
The book begins with an overview and a tutorial that demonstrate a progression in functionality from grep to sed to awk. sed and awk share a similar command-line syntax, accepting user instructions in the form of a script. Because all three programs use UNIX regular expressions, an entire chapter is devoted to understanding UNIX regular expression syntax.
Next, the book describes how to write sed scripts. After getting started by writing a few simple scripts, you'll learn other basic commands that parallel manual editing actions, as well as advanced commands that introduce simple programming constructs. Among the advanced commands are those that manipulate the hold space, a set-aside temporary buffer.
The second part of the book has been extensively revised to include POSIX awk as well as coverage of three freely available and three commercial versions of awk. The book introduces the primary features of the awk language and how to write simple scripts. You'll also learn: common programming constructs; how to use awk's built-in functions; how to write user-defined functions; debugging techniques for awk programs; how to develop an application that processes an index, demonstrating much of the power of awk; and FTP and contact information for obtaining various versions of awk. Also included is a miscellany of user-contributed scripts that demonstrate a wide range of sed and awk scripting styles and techniques.
About the Author
Dale Dougherty is the publisher of the O'Reilly Network and Director of O'Reilly Research. Dale has been instrumental in many of O'Reilly's most important efforts, including founding O'Reilly & Associates with Tim O'Reilly. He was the developer and publisher of Global Network Navigator (GNN), the first commercial Web site. Dale was developer and publisher of Web Review, the online magazine for Web designers, and he was O'Reilly & Associates' first editor. Dale has written and edited numerous books at O'Reilly & Associates. Dougherty is a Lecturer in the School of Information Management and Systems (SIMS) at the University of California at Berkeley.
Arnold Robbins, an Atlanta native, is a professional programmer and technical author. He has worked with Unix systems since 1980, when he was introduced to a PDP-11 running a version of Sixth Edition Unix. He has been a heavy AWK user since 1987, when he became involved with gawk, the GNU project's version of AWK. As a member of the POSIX 1003.2 balloting group, he helped shape the POSIX standard for AWK. He is currently the maintainer of gawk and its documentation. He is also coauthor of the sixth edition of O'Reilly's Learning the vi Editor. Since late 1997, he and his family have been living happily in Israel.
Table of Contents
Dedication; Preface; Scope of This Handbook; Availability of sed and awk; Obtaining Example Source Code; Conventions Used in This Handbook; About the Second Edition; Acknowledgments from the First Edition; Comments and Questions
Chapter 1: Power Tools for Editing; 1.1 May You Solve Interesting Problems; 1.2 A Stream Editor; 1.3 A Pattern-Matching Programming Language; 1.4 Four Hurdles to Mastering sed and awk
Chapter 2: Understanding Basic Operations; 2.1 Awk, by Sed and Grep, out of Ed; 2.2 Command-Line Syntax; 2.3 Using sed; 2.4 Using awk; 2.5 Using sed and awk Together
Chapter 3: Understanding Regular Expression Syntax; 3.1 That's an Expression; 3.2 A Line-Up of Characters; 3.3 I Never Metacharacter I Didn't Like
Chapter 4: Writing sed Scripts; 4.1 Applying Commands in a Script; 4.2 A Global Perspective on Addressing; 4.3 Testing and Saving Output; 4.4 Four Types of sed Scripts; 4.5 Getting to the PromiSed Land
Chapter 5: Basic sed Commands; 5.1 About the Syntax of sed Commands; 5.2 Comment; 5.3 Substitution; 5.4 Delete; 5.5 Append, Insert, and Change; 5.6 List; 5.7 Transform; 5.8 Print; 5.9 Print Line Number; 5.10 Next; 5.11 Reading and Writing Files; 5.12 Quit
Chapter 6: Advanced sed Commands; 6.1 Multiline Pattern Space; 6.2 A Case for Study; 6.3 Hold That Line; 6.4 Advanced Flow Control Commands; 6.5 To Join a Phrase
Chapter 7: Writing Scripts for awk; 7.1 Playing the Game; 7.2 Hello, World; 7.3 Awk's Programming Model; 7.4 Pattern Matching; 7.5 Records and Fields; 7.6 Expressions; 7.7 System Variables; 7.8 Relational and Boolean Operators; 7.9 Formatted Printing; 7.10 Passing Parameters Into a Script; 7.11 Information Retrieval
Chapter 8: Conditionals, Loops, and Arrays; 8.1 Conditional Statements; 8.2 Looping; 8.3 Other Statements That Affect Flow Control; 8.4 Arrays; 8.5 An Acronym Processor; 8.6 System Variables That Are Arrays
Chapter 9: Functions; 9.1 Arithmetic Functions; 9.2 String Functions; 9.3 Writing Your Own Functions
Chapter 10: The Bottom Drawer; 10.1 The getline Function; 10.2 The close() Function; 10.3 The system() Function; 10.4 A Menu-Based Command Generator; 10.5 Directing Output to Files and Pipes; 10.6 Generating Columnar Reports; 10.7 Debugging; 10.8 Limitations; 10.9 Invoking awk Using the #! Syntax
Chapter 11: A Flock of awks; 11.1 Original awk; 11.2 Freely Available awks; 11.3 Commercial awks; 11.4 Epilogue
Chapter 12: Full-Featured Applications; 12.1 An Interactive Spelling Checker; 12.2 Generating a Formatted Index; 12.3 Spare Details of the masterindex Program
Chapter 13: A Miscellany of Scripts; 13.1 uutot.awk--Report UUCP Statistics; 13.2 phonebill--Track Phone Usage; 13.3 combine--Extract Multipart uuencoded Binaries; 13.4 mailavg--Check Size of Mailboxes; 13.5 adj--Adjust Lines for Text Files; 13.6 readsource--Format Program Source Files for troff; 13.7 gent--Get a termcap Entry; 13.8 plpr--lpr Preprocessor; 13.9 transpose--Perform a Matrix Transposition; 13.10 m1--Simple Macro Processor
Appendix A: Quick Reference for sed; A.1 Command-Line Syntax; A.2 Syntax of sed Commands; A.3 Command Summary for sed
Appendix B: Quick Reference for awk; B.1 Command-Line Syntax; B.2 Language Summary for awk; B.3 Command Summary for awk
Appendix C: Supplement for Chapter 12; C.1 Full Listing of spellcheck.awk; C.2 Listing of masterindex Shell Script; C.3 Documentation for masterindex; C.3.1 Background Details; C.3.2 Coding Index Entries; C.3.3 Output Format; C.3.4 Compiling a Master Index
Colophon