- Preface
- Chapter 1 Introduction And Overview
- 1.1 The Importance Of Architecture
- 1.2 Learning The Essentials
- 1.3 Organization Of The Text
- 1.4 What We Will Omit
- 1.5 Terminology: Architecture And Design
- 1.6 Summary
- PART I Basics
- Chapter 2 Fundamentals Of Digital Logic
- 2.1 Introduction
- 2.2 Electrical Terminology: Voltage And Current
- 2.3 The Transistor
- 2.4 Logic Gates
- 2.5 Symbols Used For Gates
- 2.6 Construction Of Gates From Transistors
- 2.7 Example Interconnection Of Gates
- 2.8 Multiple Gates Per Integrated Circuit
- 2.9 The Need For More Than Combinatorial Circuits
- 2.10 Circuits That Maintain State
- 2.11 Transition Diagrams
- 2.12 Binary Counters
- 2.13 Clocks And Sequences
- 2.14 The Important Concept Of Feedback
- 2.15 Starting A Sequence
- 2.16 Iteration In Software Vs. Replication In Hardware
- 2.17 Gate And Chip Minimization
- 2.18 Using Spare Gates
- 2.19 Power Distribution And Heat Dissipation
- 2.20 Timing
- 2.21 Physical Size And Process Technologies
- 2.22 Circuit Boards And Layers
- 2.23 Levels Of Abstraction
- 2.24 Summary
- Chapter 3 Data And Program Representation
- 3.1 Introduction
- 3.2 Digital Logic And Abstraction
- 3.3 Bits And Bytes
- 3.4 Byte Size And Possible Values
- 3.5 Binary Arithmetic
- 3.6 Hexadecimal Notation
- 3.7 Notation For Hexadecimal And Binary Constants
- 3.8 Character Sets
- 3.9 Unicode
- 3.10 Unsigned Integers, Overflow, And Underflow
- 3.11 Numbering Bits And Bytes
- 3.12 Signed Integers
- 3.13 An Example Of Two's Complement Numbers
- 3.14 Sign Extension
- 3.15 Floating Point
- 3.16 Special Values
- 3.17 Range Of IEEE Floating Point Values
- 3.18 Data Aggregates
- 3.19 Program Representation
- 3.20 Summary
- PART II Processors
- Chapter 4 The Variety Of Processors And Computational Engines
- 4.1 Introduction
- 4.2 Von Neumann Architecture
- 4.3 Definition Of A Processor
- 4.4 The Range Of Processors
- 4.5 Hierarchical Structure And Computational Engines
- 4.6 Structure Of A Conventional Processor
- 4.7 Definition Of An Arithmetic Logic Unit (ALU)
- 4.8 Processor Categories And Roles
- 4.9 Processor Technologies
- 4.10 Stored Programs
- 4.11 The Fetch-Execute Cycle
- 4.12 Clock Rate And Instruction Rate
- 4.13 Control: Getting Started And Stopping
- 4.14 Starting The Fetch-Execute Cycle
- 4.15 Summary
- Chapter 5 Processor Types And Instruction Sets
- 5.1 Introduction
- 5.2 Mathematical Power, Convenience, And Cost
- 5.3 Instruction Set And Representation
- 5.4 Opcodes, Operands, And Results
- 5.5 Typical Instruction Format
- 5.6 Variable-Length Vs. Fixed-Length Instructions
- 5.7 General-Purpose Registers
- 5.8 Floating Point Registers And Register Identification
- 5.9 Programming With Registers
- 5.10 Register Banks
- 5.11 Complex And Reduced Instruction Sets
- 5.12 RISC Design And The Execution Pipeline
- 5.13 Pipelines And Instruction Stalls
- 5.14 Other Causes Of Pipeline Stalls
- 5.15 Consequences For Programmers
- 5.16 Programming, Stalls, And No-Op Instructions
- 5.17 Forwarding
- 5.18 Types Of Operations
- 5.19 Program Counter, Fetch-Execute, And Branching
- 5.20 Subroutine Calls, Arguments, And Register Windows
- 5.21 An Example Instruction Set
- 5.22 Minimalistic Instruction Set
- 5.23 The Principle Of Orthogonality
- 5.24 Condition Codes And Conditional Branching
- 5.25 Summary
- Chapter 6 Operand Addressing And Instruction Representation
- 6.1 Introduction
- 6.2 Zero, One, Two, Or Three Address Designs
- 6.3 Zero Operands Per Instruction
- 6.4 One Operand Per Instruction
- 6.5 Two Operands Per Instruction
- 6.6 Three Operands Per Instruction
- 6.7 Operand Sources And Immediate Values
- 6.8 The Von Neumann Bottleneck
- 6.9 Explicit And Implicit Operand Encoding
- 6.10 Operands That Combine Multiple Values
- 6.11 Tradeoffs In The Choice Of Operands
- 6.12 Values In Memory And Indirect Reference
- 6.13 Operand Addressing Modes
- 6.14 Summary
- Chapter 7 CPUs: Microcode, Protection, And Processor Modes
- 7.1 Introduction
- 7.2 A Central Processor
- 7.3 CPU Complexity
- 7.4 Modes Of Execution
- 7.5 Backward Compatibility
- 7.6 Changing Modes
- 7.7 Privilege And Protection
- 7.8 Multiple Levels Of Protection
- 7.9 Microcoded Instructions
- 7.10 Microcode Variations
- 7.11 The Advantage Of Microcode
- 7.12 Making Microcode Visible To Programmers
- 7.13 Vertical Microcode
- 7.14 Horizontal Microcode
- 7.15 Example Horizontal Microcode
- 7.16 A Horizontal Microcode Example
- 7.17 Operations That Require Multiple Cycles
- 7.18 Horizontal Microcode And Parallel Execution
- 7.19 Look-Ahead And High Performance Execution
- 7.20 Parallelism And Execution Order
- 7.21 Out-Of-Order Instruction Execution
- 7.22 Conditional Branches And Branch Prediction
- 7.23 Consequences For Programmers
- 7.24 Summary
- Chapter 8 Assembly Languages And Programming Paradigm
- 8.1 Introduction
- 8.2 Characteristics Of A High-Level Programming Language
- 8.3 Characteristics Of A Low-Level Programming Language
- 8.4 Assembly Language
- 8.5 Assembly Language Syntax And Opcodes
- 8.6 Operand Order
- 8.7 Register Names
- 8.8 Operand Types
- 8.9 Assembly Language Programming Paradigm And Idioms
- 8.10 Assembly Code For Conditional Execution
- 8.11 Assembly Code For A Conditional Alternative
- 8.12 Assembly Code For Definite Iteration
- 8.13 Assembly Code For Indefinite Iteration
- 8.14 Assembly Code For Procedure Invocation
- 8.15 Assembly Code For Parameterized Procedure Invocation
- 8.16 Consequence For Programmers
- 8.17 Assembly Code For Function Invocation
- 8.18 Interaction Between Assembly And High-Level Languages
- 8.19 Assembly Code For Variables And Storage
- 8.20 Two-Pass Assembler
- 8.21 Assembly Language Macros
- 8.22 Summary
- PART III Memories
- Chapter 9 Memory And Storage
- 9.1 Introduction
- 9.2 Definition
- 9.3 The Key Aspects Of Memory
- 9.4 Characteristics Of Memory Technologies
- 9.5 The Important Concept Of A Memory Hierarchy
- 9.6 Instruction And Data Store
- 9.7 The Fetch-Store Paradigm
- 9.8 Summary
- Chapter 10 Physical Memory And Physical Addressing
- 10.1 Introduction
- 10.2 Characteristics Of Computer Memory
- 10.3 Static And Dynamic RAM Technologies
- 10.4 Measures Of Memory Technology
- 10.5 Density
- 10.6 Separation Of Read And Write Performance
- 10.7 Latency And Memory Controllers
- 10.8 Synchronized Memory Technologies
- 10.9 Multiple Data Rate Memory Technologies
- 10.10 Examples Of Memory Technologies
- 10.11 Memory Organization
- 10.12 Memory Access And Memory Bus
- 10.13 Memory Transfer Size
- 10.14 Physical Addresses And Words
- 10.15 Physical Memory Operations
- 10.16 Word Size And Other Data Types
- 10.17 An Extreme Case: Byte Addressing
- 10.18 Byte Addressing With Word Transfers
- 10.19 Using Powers Of Two
- 10.20 Byte Alignment And Programming
- 10.21 Memory Size And Address Space
- 10.22 Programming With Word Addressing
- 10.23 Measures Of Memory Size
- 10.24 Pointers And Data Structures
- 10.25 A Memory Dump
- 10.26 Indirection And Indirect Operands
- 10.27 Memory Banks And Interleaving
- 10.28 Content Addressable Memory
- 10.29 Ternary CAM
- 10.30 Summary
- Chapter 11 Virtual Memory Technologies And Virtual Addressing
- 11.1 Introduction
- 11.2 Definition
- 11.3 A Virtual Example: Byte Addressing
- 11.4 Virtual Memory Terminology
- 11.5 An Interface To Multiple Physical Memory Systems
- 11.6 Address Translation Or Address Mapping
- 11.7 Avoiding Arithmetic Calculation
- 11.8 Discontiguous Address Spaces
- 11.9 Other Memory Organizations
- 11.10 Motivation For Virtual Memory
- 11.11 Multiple Virtual Spaces And Multiprogramming
- 11.12 Multiple Levels Of Virtualization
- 11.13 Creating Virtual Spaces Dynamically
- 11.14 Base-Bound Registers
- 11.15 Changing The Virtual Space
- 11.16 Virtual Memory, Base-Bound, And Protection
- 11.17 Segmentation
- 11.18 Demand Paging
- 11.19 Hardware And Software For Demand Paging
- 11.20 Page Replacement
- 11.21 Paging Terminology And Data Structures
- 11.22 Address Translation In A Paging System
- 11.23 Using Powers Of Two
- 11.24 Presence, Use, And Modified Bits
- 11.25 Page Table Storage
- 11.26 Paging Efficiency And A Translation Lookaside Buffer
- 11.27 Consequences For Programmers
- 11.28 Summary
- Chapter 12 Caches And Caching
- 12.1 Introduction
- 12.2 Definition
- 12.3 Characteristics Of A Cache
- 12.4 The Importance Of Caching
- 12.5 Examples Of Caching
- 12.6 Cache Terminology
- 12.7 Best And Worst Case Cache Performance
- 12.8 Cache Performance On A Typical Sequence
- 12.9 Cache Replacement Policy
- 12.10 LRU Replacement
- 12.11 Multi-level Cache Hierarchy
- 12.12 Preloading Caches
- 12.13 Caches Used With Memory
- 12.14 TLB As A Cache
- 12.15 Demand Paging As A Form Of Caching
- 12.16 Physical Memory Cache
- 12.17 Write Through And Write Back
- 12.18 Cache Coherence
- 12.19 L1, L2, And L3 Caches
- 12.20 Size Of L1, L2, And L3 Caches
- 12.21 Instruction And Data Caches
- 12.22 Virtual Memory Caching And A Cache Flush
- 12.23 Implementation Of Memory Caching
- 12.24 Direct Mapping Memory Cache
- 12.25 Using Powers Of Two For Efficiency
- 12.26 Set Associative Memory Cache
- 12.27 Consequences For Programmers
- 12.28 Summary
- PART IV I/O
- Chapter 13 Input/Output Concepts And Terminology
- 13.1 Introduction
- 13.2 Input And Output Devices
- 13.3 Control Of An External Device
- 13.4 Data Transfer
- 13.5 Serial And Parallel Data Transfers
- 13.6 Self-Clocking Data
- 13.7 Full-Duplex And Half-Duplex Interaction
- 13.8 Interface Latency And Throughput
- 13.9 The Fundamental Idea Of Multiplexing
- 13.10 Multiple Devices Per External Interface
- 13.11 A Processor's View Of I/O
- 13.12 Summary
- Chapter 14 Buses And Bus Architectures
- 14.1 Introduction
- 14.2 Definition Of A Bus
- 14.3 Processors, I/O Devices, And Buses
- 14.4 Proprietary And Standardized Buses
- 14.5 Shared Buses And An Access Protocol
- 14.6 Multiple Buses
- 14.7 A Parallel, Passive Mechanism
- 14.8 Physical Connections
- 14.9 Bus Interface
- 14.10 Address, Control, And Data Lines
- 14.11 The Fetch-Store Paradigm
- 14.12 Fetch-Store Over A Bus
- 14.13 The Width Of A Bus
- 14.14 Multiplexing
- 14.15 Bus Width And Size Of Data Items
- 14.16 Bus Address Space
- 14.17 Potential Errors
- 14.18 Address Configuration And Sockets
- 14.19 Many Buses Or One Bus
- 14.20 Using Fetch-Store With Devices
- 14.21 An Example Of Device Control Using Fetch-Store
- 14.22 Operation Of An Interface
- 14.23 Asymmetric Assignments
- 14.24 Unified Memory And Device Addressing
- 14.25 Holes In The Address Space
- 14.26 Address Map
- 14.27 Program Interface To A Bus
- 14.28 Bridging Between Two Buses
- 14.29 Main And Auxiliary Buses
- 14.30 Consequences For Programmers
- 14.31 Switching Fabrics
- 14.32 Summary
- Chapter 15 Programmed And Interrupt-Driven I/O
- 15.1 Introduction
- 15.2 I/O Paradigms
- 15.3 Programmed I/O
- 15.4 Synchronization
- 15.5 Polling
- 15.6 Code For Polling
- 15.7 Control And Status Registers
- 15.8 Processor Use And Polling
- 15.9 First, Second, And Third Generation Computers
- 15.10 Interrupt-Driven I/O
- 15.11 A Hardware Interrupt Mechanism
- 15.12 Interrupts And The Fetch-Execute Cycle
- 15.13 Handling An Interrupt
- 15.14 Interrupt Vectors
- 15.15 Initialization And Enabling And Disabling Interrupts
- 15.16 Preventing Interrupt Code From Being Interrupted
- 15.17 Multiple Levels Of Interrupts
- 15.18 Assignment Of Interrupt Vectors And Priorities
- 15.19 Dynamic Bus Connections And Pluggable Devices
- 15.20 The Advantage Of Interrupts
- 15.21 Smart Devices And Improved I/O Performance
- 15.22 Direct Memory Access (DMA)
- 15.23 Buffer Chaining
- 15.24 Scatter Read And Gather Write Operations
- 15.25 Operation Chaining
- 15.26 Summary
- Chapter 16 A Programmer's View Of Devices, I/O, And Buffering
- 16.1 Introduction
- 16.2 Definition Of A Device Driver
- 16.3 Device Independence, Encapsulation, And Hiding
- 16.4 Conceptual Parts Of A Device Driver
- 16.5 Two Types Of Devices
- 16.6 Example Flow Through A Device Driver
- 16.7 Queued Output Operations
- 16.8 Forcing An Interrupt
- 16.9 Queued Input Operations
- 16.10 Devices That Support Bi-Directional Transfer
- 16.11 Asynchronous Vs. Synchronous Programming Paradigm
- 16.12 Asynchrony, Smart Devices, And Mutual Exclusion
- 16.13 I/O As Viewed By An Application
- 16.14 Run-Time I/O Libraries
- 16.15 The Library/Operating System Dichotomy
- 16.16 I/O Operations The OS Supports
- 16.17 The Cost Of I/O Operations
- 16.18 Reducing The System Call Overhead
- 16.19 The Important Concept Of Buffering
- 16.20 Implementation Of Buffering
- 16.21 Flushing A Buffer
- 16.22 Buffering On Input
- 16.23 Effectiveness Of Buffering
- 16.24 Buffering In An Operating System
- 16.25 Relation To Caching
- 16.26 An Example: The Unix Standard I/O Library
- 16.27 Summary
- PART V Advanced Topics
- Chapter 17 Parallelism
- 17.1 Introduction
- 17.2 Parallel And Pipelined Architectures
- 17.3 Characterizations Of Parallelism
- 17.4 Microscopic Vs. Macroscopic
- 17.5 Examples Of Microscopic Parallelism
- 17.6 Examples Of Macroscopic Parallelism
- 17.7 Symmetric Vs. Asymmetric
- 17.8 Fine-grain Vs. Coarse-grain Parallelism
- 17.9 Explicit Vs. Implicit Parallelism
- 17.10 Parallel Architectures
- 17.11 Types Of Parallel Architectures (Flynn Classification)
- 17.12 Single Instruction Single Data (SISD)
- 17.13 Single Instruction Multiple Data (SIMD)
- 17.14 Multiple Instructions Multiple Data (MIMD)
- 17.15 Communication, Coordination, And Contention
- 17.16 Performance Of Multiprocessors
- 17.17 Consequences For Programmers
- 17.18 Redundant Parallel Architectures
- 17.19 Distributed And Cluster Computers
- 17.20 Summary
- Chapter 18 Pipelining
- 18.1 Introduction
- 18.2 The Concept Of Pipelining
- 18.3 Software Pipelining
- 18.4 Software Pipeline Performance And Overhead
- 18.5 Hardware Pipelining
- 18.6 How Hardware Pipelining Increases Performance
- 18.7 When Pipelining Can Be Used
- 18.8 The Conceptual Division Of Processing
- 18.9 Pipeline Architectures
- 18.10 Pipeline Setup, Stall, And Flush Times
- 18.11 Definition Of Superpipeline Architecture
- 18.12 Summary
- Chapter 19 Assessing Performance
- 19.1 Introduction
- 19.2 Measuring Power And Performance
- 19.3 Measures Of Computational Power
- 19.4 Application Specific Instruction Counts
- 19.5 Instruction Mix
- 19.6 Standardized Benchmarks
- 19.7 I/O And Memory Bottlenecks
- 19.8 Boundary Between Hardware And Software
- 19.9 Choosing Items To Optimize
- 19.10 Amdahl's Law And Parallel Systems
- 19.11 Summary
- Chapter 20 Architecture Examples And Hierarchy
- 20.1 Introduction
- 20.2 Architectural Levels
- 20.3 System-Level Architecture: A Personal Computer
- 20.4 Bus Interconnection And Bridging
- 20.5 Controller Chips And Physical Architecture
- 20.6 Virtual Buses
- 20.7 Connection Speeds
- 20.8 Bridging Functionality And Virtual Buses
- 20.9 Board-Level Architecture
- 20.10 Chip-Level Architecture
- 20.11 Structure Of Functional Units On A Chip
- 20.12 Summary
- 20.13 Hierarchy Beyond Computer Architectures
- Appendix 1 Lab Exercises For A Computer Architecture Course
- A 1.1 Introduction
- A 1.2 Digital Hardware For A Lab
- A 1.3 Solderless Breadboard
- A 1.4 Using A Solderless Breadboard
- A 1.5 Testing
- A 1.6 Power And Ground Connections
- A 1.7 Lab Exercises
- 1 Introduction and account configuration
- 2 Digital Logic: Use of a breadboard
- 3 Digital Logic: Building an adder from gates
- 4 Digital Logic: Clocks and demultiplexing
- 5 Representation: Testing big endian vs. little endian
- 6 Representation: A hex dump program in C
- 7 Processors: Learn a RISC assembly language
- 8 Processors: Function that can be called from C
- 9 Memory: row-major and column-major array storage
- 10 Input/Output: a buffered I/O library
- 11 A hex dump program in assembly language
- Bibliography
- Index