This text describes the design and implementation of the 4.4BSD operating system, presenting the technical information needed by BSD system programmers and application programmers.
Marshall Kirk McKusick writes books and articles, consults, and teaches classes on UNIX- and BSD-related subjects. While at the University of California at Berkeley, he implemented the 4.2BSD fast file system, and was the research computer scientist at the Berkeley Computer Systems Research Group (CSRG) overseeing the development and release of 4.3BSD and 4.4BSD. He has twice served as the president of the board of the Usenix Association.
Keith Bostic is a member of the technical staff at Berkeley Software Design, Inc. He spent 8 years as a member of the CSRG, overseeing the development of over 400 freely redistributable UNIX-compatible utilities, and is the recipient of the 1991 Distinguished Achievement Award from the University of California, Berkeley, for his work to make 4.4BSD freely redistributable. Concurrently, he was the principal architect of the 2.10BSD release of the Berkeley Software Distribution for PDP-11s, and the coauthor of the Berkeley Log-Structured Filesystem and the Berkeley database package (DB). He is also the author of the widely used vi implementation, nvi. He received his undergraduate degree in Statistics and his Master's degree in Electrical Engineering from George Washington University. He is a member of the ACM, the IEEE, and several POSIX working groups. In his spare time, he enjoys scuba diving in the South Pacific, mountain biking, and working on a tunnel into Kirk and Eric's specially constructed wine cellar. He lives in Massachusetts with his wife, Margo Seltzer, and their cats.
Michael J. Karels is the System Architect and Vice President of Engineering at Berkeley Software Design, Inc. He spent 8 years as the Principal Programmer of the CSRG at the University of California, Berkeley, as the system architect for 4.3BSD. Karels received his Bachelor's degree in Microbiology from the University of Notre Dame. While a graduate student in Molecular Biology at the University of California, he was the principal developer of the 2.9BSD UNIX release of the Berkeley Software Distribution for the PDP-11. He is a member of the ACM, the IEEE, and several POSIX working groups. He lives with his wife, Teri Karels, in the backwoods of Minnesota.
John S. Quarterman is Senior Technical Partner at Texas Internet Consulting, which consults in networks and open systems with particular emphasis on TCP/IP networks, UNIX systems, and standards. He is the author of The Matrix: Computer Networks and Conferencing Systems Worldwide (Digital Press, 1990), and is a coauthor of UNIX, POSIX, and Open Systems: The Open Standards Puzzle (1993), Practical Internetworking with TCP/IP and UNIX (1993), The Internet Connection: System Connectivity and Configuration (1994), and The E-Mail Companion: Communicating Effectively via the Internet and Other Global Networks (1994), all published by Addison-Wesley. He is editor of Matrix News, a monthly newsletter about issues that cross network, geographic, and political boundaries, and of Matrix Maps Quarterly; both are published by Matrix Information and Directory Services, Inc. (MIDS) of Austin, Texas. He is a partner in Zilker Internet Park, which provides Internet access from Austin. He and his wife, Gretchen Quarterman, split their time among his home in Austin, hers in Buffalo, New York, and various other locations.
I. OVERVIEW. 1. History and Goals.
History of the UNIX System.
AT&T UNIX System III and System V.
Berkeley Software Distributions.
UNIX in the World.
BSD and Other Systems.
The Influence of the User Community.
Design Goals of 4BSD.
4.2BSD Design Goals.
4.3BSD Design Goals.
4.4BSD Design Goals.
References. 2. Design Overview of 4.4BSD.
4.4BSD Facilities and the Kernel.
Process Groups and Sessions.
BSD Memory-Management Design Decisions.
Memory Management Inside the Kernel.
Descriptors and I/O.
Multiple Filesystem Support.
References. 3. Kernel Services.
Entry to the Kernel.
Return from the Kernel.
Returning from a System Call.
Traps and Interrupts.
I/O Device Interrupts.
Statistics and Process Scheduling.
Adjustment of the Time.
User, Group, and Other Identifiers.
Process Groups and Sessions.
II. PROCESSES. 4. Process Management.
Introduction to Process Management.
The Process Structure.
The User Structure.
Low-Level Context Switching.
Voluntary Context Switching.
Calculations of Process Priority.
Process Run Queues and Context Switching.
Comparison with POSIX Signals.
Posting of a Signal.
Delivering a Signal.
Process Groups and Sessions.
References. 5. Memory Management.
Processes and Memory.
Advantages of Virtual Memory.
Hardware Requirements for Virtual Memory.
Overview of the 4.4BSD Virtual-Memory System.
Kernel Memory Management.
Kernel Maps and Submaps.
Kernel Address-Space Allocation.
4.4BSD Process Virtual-Address Space.
Mapping to Objects.
Objects to Pages.
Collapsing of Shadow Chains.
Creation of a New Process.
Reserving Kernel Resources.
Duplication of the User Address Space.
Creation of a New Process Without Copying.
Execution of a File.
Process Manipulation of Its Address Space.
Change of Process Size.
Change of Protection.
Termination of a Process.
The Pager Interface.
The Pageout Daemon.
The Swap-In Process.
The Role of the pmap Module.
Initialization and Startup.
Mapping Allocation and Deallocation.
Change of Access and Wiring Attributes for Mappings.
Management of Page-Usage Information.
Initialization of Physical Pages.
Management of Internal Data Structures.
III. I/O SYSTEM. 6. I/O System Overview.
I/O Mapping from User to Device.
Entry Points for Block-Device Drivers.
Sorting of Disk I/O Requests.
Raw Devices and Physical I/O.
Entry Points for Character-Device Drivers.
Descriptor Management and Services.
Open File Entries.
Management of Descriptors.
Multiplexing I/O on Descriptors.
Implementation of Select.
Movement of Data Inside the Kernel.
The Virtual-Filesystem Interface.
Contents of a Vnode.
Exported Filesystem Services.
The Name Cache.
Implementation of Buffer Management.
Simple Filesystem Layers.
The Union Mount Filesystem.
References. 7. Local Filesystems.
Hierarchical Filesystem Management.
Structure of an Inode.
Finding of Names in Directories.
Other Filesystem Semantics.
Large File Sizes.
References. 8. Local Filestores.
Overview of the Filestore.
The Berkeley Fast Filesystem.
Organization of the Berkeley Fast Filesystem.
Optimization of Storage Utilization.
Reading and Writing to a File.
The Log-Structured Filesystem.
Organization of the Log-Structured Filesystem.
Reading of the Log.
Writing to the Log.
The Buffer Cache.
Creation of a File.
Reading and Writing to a File.
The Memory-Based Filesystem.
Organization of the Memory-Based Filesystem.
References. 9. The Network Filesystem.
History and Overview.
NFS Structure and Operation.
The NFS Protocol.
The 4.4BSD NFS Implementation.
RPC Transport Issues.
Techniques for Improving Performance.
References. 10. Terminal Handling.
The tty Structure.
Process Groups, Sessions, and Terminal Control.
RS-232 and Modem Control.
Output Line Discipline.
Output Top Half.
Output Bottom Half.
Input Bottom Half.
Input Top Half.
The stop Routine.
The ioctl Routine.
Closing of Terminal Devices.
Other Line Disciplines.
Serial Line IP Discipline.
Graphics Tablet Discipline.
IV. INTERPROCESS COMMUNICATION. 11. Interprocess Communication.
Use of Sockets.
Implementation Structure and Overview.
Mbuf Utility Routines.
Passing Access Rights.
Passing Access Rights in the Local Domain.
References. 12. Network Communication.
Protocol User-Request Routine.
Protocol Control-Output Routine.
Interface between Protocol and Network Interface.
Kernel Routing Tables.
User-Level Routing Policies.
User-Level Routing Interface: Routing Socket.
Buffering and Congestion Control.
Protocol Buffering Policies.
Additional Network-Subsystem Topics.
Address Resolution Protocol.
References. 13. Network Protocols.
Internet Network Protocols.
Internet Ports and Associations.
Protocol Control Blocks.
User Datagram Protocol (UDP).
Internet Protocol (IP).
Transmission Control Protocol (TCP).
TCP Connection States.
Estimation of Round-Trip Time.
TCP Input Processing.
TCP Output Processing.
Sending of Data.
Avoidance of the Silly-Window Syndrome.
Avoidance of Small Packets.
Delayed Acknowledgments and Window Updates.
Buffer and Window Sizing.
Avoidance of Congestion with Slow Start.
Internet Control Message Protocol (ICMP).
OSI Implementation Issues.
Summary of Networking and Interprocess Communication.
Creation of a Communication Channel.
Sending and Receiving of Data.
Termination of Data Transmission or Reception.
V. SYSTEM OPERATION. 14. System Startup.
The boot Program.
System Data Structures.
New Autoconfiguration Data Structures.
New Autoconfiguration Functions.
System Shutdown and Autoreboot.
Passage of Information To and From the Kernel.