  1. Operating Systems:
  2. From 0 to 1
  3. Tu, Do Hoang
  4. Table of Contents
  5. Preface
  6. Why another book on Operating Systems?
  7. Prerequisites
  8. What you will learn in this book
  9. What this book is not about
  10. The organization of the book
  11. Acknowledgments
  12. Part I Preliminary
  13. Domain documents
  14. Problem domains
  15. Documents for implementing a problem domain
  16. Software Requirement Document
  17. Software Specification
  18. Documents for writing an x86 Operating System
  19. The physical implementation of a bit
  20. MOSFET transistors
  21. Beyond transistors: digital logic gates
  22. The theory behind logic gates
  23. Logic Gate implementation: CMOS circuit
  24. Beyond Logic Gates: Machine Language
  25. Machine language
  26. Assembly Language
  27. Programming Languages
  28. Abstraction
  29. Why abstraction works
  30. Why abstraction reduces complexity
  31. Computer Architecture
  32. What is a computer?
  33. Server
  34. Desktop Computer
  35. Mobile Computer
  36. Game Consoles
  37. Embedded Computer
  38. Field Programmable Gate Array
  39. Application-Specific Integrated Circuit
  40. Computer Architecture
  41. Instruction Set Architecture
  42. Computer organization
  43. Hardware
  44. x86 architecture
  45. Intel Q35 Chipset
  46. x86 Execution Environment
  47. x86 Assembly and C
  48. objdump
  49. Reading the output
  50. Intel manuals
  51. Experiment with assembly code
  52. Anatomy of an Assembly Instruction
  53. Understand an instruction in detail
  54. Example: jmp instruction
  55. Examine compiled data
  56. Fundamental data types
  57. Pointer Data Types
  58. Bit Field Data Type
  59. String Data Types
  60. Examine compiled code
  61. Data Transfer
  62. Expressions
  63. Stack
  64. Automatic variables
  65. Function Call and Return
  66. Loop
  67. Conditional
  68. The Anatomy of a Program
  69. Reference documents:
  70. ELF header
  71. Section header table
  72. Understand Section in-depth
  73. Program header table
  74. Segments vs sections
  75. Runtime inspection and debug
  76. A sample program
  77. Static inspection of a program
  78. Command: info target/info file/info files
  79. Command: maint info sections
  80. Command: info functions
  81. Command: info variables
  82. Command: disassemble/disas
  83. Command: x
  84. Command: print/p
  85. Runtime inspection of a program
  86. Command: run
  87. Command: break/b
  88. Command: next/n
  89. Command: step/s
  90. Command: ni
  91. Command: si
  92. Command: until
  93. Command: finish
  94. Command: bt
  95. Command: up
  96. Command: down
  97. Command: info registers
  98. How debuggers work: A brief introduction
  99. How breakpoints work
  100. Single stepping
  101. How a debugger understands high level so
  102. Part II Groundwork
  103. Bootloader
  104. x86 Boot Process
  105. Using BIOS services
  106. Boot process
  107. Example Bootloader
  108. Compile and load
  109. Debugging
  110. Loading a program from bootloader
  111. Floppy Disk Anatomy
  112. Read and load sectors from a floppy disk
  113. Improve productivity with scripts
  114. Automate build with GNU Make
  115. GNU Make Syntax summary
  116. Automate debugging steps with GDB script
  117. Linking and loading on bare metal
  118. Understand relocations with readelf
  119. Offset
  120. Info
  121. Type
  122. Sym.Value
  123. Sym. Name
  124. Crafting ELF binary with linker scripts
  125. Example linker script
  126. Understand the custom ELF structure
  127. Manipulate the program segments
  128. C Runtime: Hosted vs Freestanding
  129. Debuggable bootloader on bare metal
  130. Debuggable program on bare metal
  131. Loading an ELF binary from a bootloader
  132. Debugging the memory layout
  133. Testing the new binary
  134. Part III Kernel Programming
  135. x86 Descriptors
  136. Basic operating system concepts
  137. Hardware Abstraction Layer
  138. System programming interface
  139. The need for an Operating System
  140. Drivers
  141. Userspace and kernel space
  142. Memory Segment
  143. Segment Descriptor
  144. Types of Segment Descriptors
  145. Code and Data descriptors
  146. Task Descriptor
  147. Interrupt Descriptor
  148. Descriptor Scope
  149. Global Descriptor
  150. Local Descriptor
  151. Segment Selector
  152. Enhancement: Bootloader with descriptors
  153. Process
  154. Concepts
  155. Process
  156. Task
  157. Process
  158. Scheduler
  159. Context switch
  160. Priority
  161. Preemptive vs Non-preemptive
  162. Process states
  163. procfs
  164. Threads
  165. Task: x86 concept of a process
  166. Task Data Structure
  167. Task State Segment
  168. Task Descriptor
  169. Process Implementation
  170. Requirements
  171. Major Plan
  172. Stage 1: Switch to a task from bootloade
  173. Stage 2: Switch to a task with one funct
  174. Stage 3: Switch to a task with many func
  175. Milestone: Code Refactor
  176. Interrupt
  177. Memory management
  178. Address Space
  179. Virtual Memory
  180. File System
  181. Example: Ext2 filesystem
  182. Bibliography
  183. Preface
  184. Greetings!
  185. You've probably asked yourself at least once how an operating
  186. system is written from the ground up. You might even have years
  187. of programming experience under your belt, yet your understanding
  188. of operating systems may still be a collection of abstract
  189. concepts not grounded in actual implementation. To those who've
  190. never built one, an operating system may seem like magic: a
  191. mysterious thing that can control hardware while handling a
  192. programmer's requests via the API of their favorite programming
  193. language. Learning how to build an operating system seems
  194. intimidating and difficult; no matter how much you learn, it
  195. never feels like you know enough. You're probably reading this
  196. book right now to gain a better understanding of operating
  197. systems to be a better software engineer.
  198. If that is the case, this book is for you. By going through this
  199. book, you will be able to find the missing pieces that are
  200. essential and enable you to implement your own operating system
  201. from scratch! Yes, from scratch without going through any
  202. existing operating system layer to prove to yourself that you are
  203. an operating system developer. You may ask, “Isn't it more
  204. practical to learn the internals of Linux?”
  205. Yes...
  206. and no.
  207. Learning Linux can help your workflow at your day job. However,
  208. if you follow that route, you still won't achieve the ultimate
  209. goal of writing an actual operating system. By writing your own
  210. operating system, you will gain knowledge that you will not be
  211. able to glean just from learning Linux.
  212. Here's a list of some benefits of writing your own OS:
  213. • You will learn how a computer works at the hardware level, and
  214. you will learn to write software to manage that hardware
  215. directly.
  216. • You will learn the fundamentals of operating systems, allowing
  217. you to adapt to any operating system, not just Linux.
  218. • To hack on Linux internals suitably, you'll need to write at
  219. least one operating system on your own. This is just like
  220. applications programming: to write a large application, you'll
  221. need to start with simple ones.
  222. • You will open pathways to various low-level programming domains
  223. such as reverse engineering, exploits, building virtual
  224. machines, game console emulation and more. Assembly language
  225. will become one of your most indispensable tools for low-level
  226. analysis. (But that does not mean you have to write your
  227. operating system in Assembly!)
  228. • Writing an operating system is fun!
  229. Why another book on Operating Systems?
  230. There are many books and courses on this topic made by famous
  231. professors and experts out there already. Who am I to write a
  232. book on such an advanced topic? While it's true that many quality
  233. resources exist, I find them lacking. Do any of them show you how
  234. to compile your C code and the C runtime library independent of
  235. an existing operating system? Most books on operating system
  236. design and implementation only discuss the software side; how the
  237. operating system communicates with the hardware is skipped.
  238. Important hardware details are skipped, and it's difficult for a
  239. self-learner to find relevant resources on the Internet. The aim
  240. of this book is to bridge that gap: not only will you learn how
  241. to program hardware directly, but also how to read official
  242. documents from hardware vendors to program it. You no longer have
  243. to seek out resources to help yourself interpret hardware manuals
  244. and documentation: you can do it yourself. Lastly, I wrote this
  245. book from an autodidact's perspective. I made this book as
  246. self-contained as possible so you can spend more time learning
  247. and less time guessing or seeking out information on the
  248. Internet.
  249. One of the core focuses of this book is to guide you through the
  250. process of reading official documentation from vendors to
  251. implement your software. Official documents from hardware vendors
  252. like Intel are critical for implementing an operating system or
  253. any other software that directly controls the hardware. At a
  254. minimum, an operating system developer needs to be able to
  255. comprehend these documents and implement software based on a set
  256. of hardware requirements. Thus, the first chapter is dedicated to
  257. discussing relevant documents and their importance.
  258. Another distinct feature of this book is that it is “Hello World”
  259. centric. Most examples revolve around variants of a “Hello World”
  260. program, which will acquaint you with core concepts. These
  261. concepts must be learned before attempting to write an operating
  262. system. Anything beyond a simple “Hello World” example gets in
  263. the way of teaching the concepts, thus lengthening the time spent
  264. on getting started writing an operating system.
  265. Let's dive in. With this book, I hope to provide enough
  266. foundational knowledge that will open doors for you to make sense
  267. of other resources. This book will be especially beneficial to
  268. students who've just finished their first C/C++ course. Imagine
  269. how cool it would be to show prospective employers that you've
  270. already built an operating system.
  271. Prerequisites
  272. • Basic knowledge of circuits
  273. – Basic concepts of electricity: atoms, electrons, protons,
  274. neutrons, current flow.
  275. – Ohm's law
  276. If you are unfamiliar with these concepts, you can quickly
  277. learn them here: http://www.allaboutcircuits.com/textbook/, by
  278. reading chapter 1 and chapter 2.
  279. • C programming. In particular:
  280. – Variable and function declarations/definitions
  281. – While and for loops
  282. – Pointers and function pointers
  283. – Fundamental algorithms and data structures in C
  284. • Linux basics:
  285. – Know how to navigate directories with the command line
  286. – Know how to invoke a command with options
  287. – Know how to pipe output to another program
  288. • Touch typing. Since we are going to use Linux, touch typing
  289. helps. I know typing speed does not relate to problem-solving,
  290. but at least your typing speed should be fast enough to not let
  291. it get in the way and degrade the learning experience.
  292. In general, I assume that the reader has basic C programming
  293. knowledge, and can use an IDE to build and run a program.
  294. What you will learn in this book
  295. • How to write an operating system from scratch by reading
  296. hardware datasheets. In the real world, you will not be able to
  297. consult Google for a quick answer.
  298. • Write code independently. It's pointless to copy and paste
  299. code. Real learning happens when you solve problems on your
  300. own. Some examples are provided to help kick start your work,
  301. but most problems are yours to conquer. However, the solutions
  302. are available online once you have given each problem a good try.
  303. • A big picture of how the layers of a computer relate to each
  304. other, from hardware to software.
  305. • How to use Linux as a development environment and common tools
  306. for low-level programming.
  307. • How a program is structured so that an operating system can
  308. run it.
  309. • How to debug a program running directly on hardware with gdb
  310. and QEMU.
  311. • Linking and loading on bare metal x86_64, with pure C. No
  312. standard library. No runtime overhead.
  313. What this book is not about
  314. • Electrical Engineering: The book discusses some concepts from
  315. electronics and electrical engineering only to the extent of
  316. how software operates on bare metal.
  317. • How to use Linux (or any other OS): Though Linux is used
  318. as a development environment and as a medium to demonstrate
  319. high-level operating system concepts, it is not the focus of
  320. this book.
  321. • Linux Kernel development: There are already many high-quality
  322. books out there on this subject.
  323. • Operating system books focused on algorithms: This book focuses
  324. more on an actual hardware platform, Intel x86_64, and how to
  325. write an OS that utilizes the OS support provided by the hardware
  326. platform.
  327. The organization of the book
  328. Part 1 provides a foundation for learning operating systems.
  329. • Chapter 1 briefly explains the importance of domain
  330. documents. Documents are crucial for the learning experience,
  331. so they deserve a chapter.
  332. • Chapter 2 explains the layers of abstractions from hardware
  333. to software. The idea is to provide insight into how code
  334. runs physically.
  335. • Chapter 3 provides the general architecture of a computer,
  336. then introduces a sample computer model that you will use to
  337. write an operating system.
  338. • Chapter 4 introduces the x86 assembly language through the
  339. use of the Intel manuals, along with commonly used
  340. instructions. This chapter gives detailed examples of how
  341. high-level syntax corresponds to low-level assembly, enabling
  342. you to read generated assembly code comfortably. It is
  343. necessary to read assembly code when debugging an operating
  344. system.
  345. • Chapter 5 dissects ELF in detail. Only by understanding the
  346. structure of a program at the binary level can you build
  347. one that runs on bare metal.
  348. • Chapter 6 introduces the gdb debugger with extensive examples for
  349. commonly used commands. After acquainting the reader with
  350. gdb, it then provides insight on how a debugger works. This
  351. knowledge is essential for building a debuggable program on
  352. the bare metal.
  353. Part 2 presents how to write a bootloader to bootstrap a
  354. kernel. Hence the name “Groundwork”. After mastering this part,
  355. the reader can continue with the next part, which is a guide
  356. for writing an operating system. However, if the reader does not
  357. like the presentation, he or she can look elsewhere, such as
  358. the OSDev Wiki: http://wiki.osdev.org/.
  359. • Chapter 7 introduces what the bootloader is, how to write one
  360. in assembly, and how to load it on QEMU, a hardware emulator.
  361. This process involves typing repetitive and long commands, so
  362. GNU Make is applied to improve productivity by automating the
  363. repetitive parts and simplifying the interaction with the
  364. project. This chapter also demonstrates the use of GNU Make
  365. in context.
  366. • Chapter 8 introduces linking by explaining the relocation
  367. process when combining object files. Together with a
  368. bootloader written in Assembly and an operating system
  369. written in C, linking is the last piece of the puzzle
  370. required for building debuggable programs on bare
  371. metal.
  372. Part 3 provides guidance on how to write an operating system,
  373. as you should implement an operating system on your own and be
  374. proud of your creation. The guidance consists of simpler and
  375. coherent explanations of necessary concepts, from hardware to
  376. software, to implement the features of an operating system.
  377. Without such guidance, you will waste time gathering
  378. information spread through various documents and the Internet.
  379. It then provides a plan on how to map the concepts to code.
  380. Acknowledgments
  381. Thank you, my beloved family. Thank you, the contributors.
  382. Preliminary
  383. Domain documents
  384. Problem domains
  385. In the real world, software engineering is not only focused on
  386. software, but also the problem domain it is trying to solve.
  387. A problem domain is the part of the world where the computer is
  388. to produce effects, together with the means available to
  389. produce them, directly or indirectly (Kovitz, 1999).
  392. A problem domain is anything outside of programming
  393. that a software engineer needs to understand to produce correct
  394. code that can achieve the desired effects. “Directly” means
  395. anything that the software can control to produce the
  396. desired effects, e.g. keyboards, printers, monitors, other
  397. software... “Indirectly” means anything not part of the software
  398. but relevant to the problem domain, e.g. appropriate people to be
  399. informed by the software when some event happens, or students that
  400. move to correct classrooms according to the schedule generated by
  401. the software. To write a finance application, a software engineer
  402. needs to learn sufficient finance concepts to understand the
  403. requirements of a customer and implement such requirements
  404. correctly.
  406. Requirements are the effects that the machine is to exert in the
  407. problem domain by virtue of its programming.
  408. Programming alone is not too complicated; programming to solve a
  409. problem domain is [footnote:
  410. We refer to the concept of “programming” here as someone being able to
  411. write code in a language, but not necessarily knowing any or all
  412. software engineering knowledge.
  413. ]. Not only does a software engineer need to understand how to
  414. implement the software, but also the problem domain that it tries
  415. to solve, which might require in-depth expert knowledge. The
  416. software engineer must also select the right programming
  417. techniques that apply to the problem domain he is trying to
  418. solve because many techniques that are effective in one domain
  419. might not be in another. For example, many types of applications
  420. do not require high-performance code, but a short time to
  421. market. In this case, interpreted languages are widely popular
  422. because they can satisfy such a need. However, for writing huge 3D
  423. games or operating systems, compiled languages are dominant
  424. because they can generate the most efficient code required for such
  425. applications.
  426. Often, it is too much for a software engineer to learn
  427. non-trivial domains (that might require a bachelor's degree or
  428. above to understand). Also, it is easier for a
  429. domain expert to learn enough programming to break down the
  430. problem domain into parts small enough for the software engineers
  431. to implement. Sometimes, domain experts implement the software
  432. themselves.
  433. [Figure 0.1: Problem domains: Software and Non-software.]
  439. One example of such a scenario is the domain that is presented in
  440. this book: operating systems. A certain amount of electrical
  441. engineering (EE) knowledge is required to implement an operating
  442. system. If a computer science (CS) curriculum does not
  443. include a minimum of EE courses, students in the curriculum have
  444. little chance to implement a working operating system. Even if
  445. they can implement one, either they need to invest a significant
  446. amount of time to study on their own, or they fill code in a
  447. predefined framework just to understand high-level algorithms.
  448. For that reason, EE students have an easier time implementing an
  449. OS, as they only need to study a few core CS courses. In fact,
  450. only “C programming” and “Algorithms and Data Structures” classes
  451. are usually enough to get them started writing code for device
  452. drivers, and later generalizing it into an operating system.
  453. [Figure 0.2: Operating System domain.]
  459. One thing to note is that software is its own problem domain. A
  460. problem domain does not necessarily lie outside of
  461. software. Compilers, 3D graphics, games, cryptography, artificial
  462. intelligence, etc., are parts of software engineering domains
  463. (actually, they are more computer science domains than software
  464. engineering domains). In general, a software-exclusive domain
  465. creates software to be used by other software. Operating systems
  466. are also a domain, but one that overlaps with other domains such as
  467. electrical engineering. To effectively implement an operating
  468. system, it is required to learn enough of the external domain.
  469. How much learning is enough for a software engineer? At the
  470. minimum, a software engineer should be knowledgeable enough to
  471. understand the documents prepared by hardware engineers for using
  472. (i.e. programming) their devices.
  473. Learning a programming language, even C or Assembly, does not
  474. mean a software engineer can automatically be good at hardware
  475. programming or any related low-level programming domains. One can
  476. spend 10 years, 20 years or his entire life writing C/C++ code,
  477. and he still cannot write an operating system, simply because of
  478. the ignorance of relevant domain knowledge. Just like learning
  479. English does not mean a person automatically becomes good at
  480. reading Math books written in English. Much more than that is
  481. needed. Knowing one or two programming languages is not enough.
  482. If a programmer writes software for a living, he had better be
  483. specialized in one or two problem domains outside of software if
  484. he does not want his job taken by domain experts who learn
  485. programming in their spare time.
  486. Documents for implementing a problem domain
  487. Documents are essential for learning a problem domain (and
  488. actually, anything) since information can be passed down in a
  489. reliable way. It is evident that written text has been used
  490. for thousands of years to pass knowledge from generation to
  491. generation. Documents are integral parts of non-trivial
  492. projects. Without the documents:
  493. • New people will find it much harder to join a project.
  494. • It is harder to maintain a project because people may forget
  495. important unresolved bugs or quirks in their system.
  496. • It is challenging for customers to understand the product they
  497. are going to use.
  498. However, documents do not need to be written in book format. They can be
  499. anything from HTML format to database format displayed by a graphical user
  500. interface. Important information must be stored somewhere safe and readily accessible.
  501. There are many types of documents. However, to facilitate the
  502. understanding of a problem domain, these two documents need to be
  503. written: software requirement document and software
  504. specification.
  505. Software Requirement Document
  506. A software requirement document includes both a list of
  507. requirements and a description of the problem domain (Kovitz, 1999).
  511. Software solves a business problem, but which problems to
  512. solve is decided by a customer. Many of these requests make up a
  513. list of requirements that our software needs to fulfill. However,
  514. an enumerated list of features is seldom useful in delivering
  515. software. As stated in the previous section, the tricky part is
  516. not programming alone but programming according to a problem
  517. domain. The bulk of software design and implementation depends
  518. upon the knowledge of the problem domain. The better the domain is
  519. understood, the higher the quality of the software can be. For example,
  520. building houses has been practiced for thousands of years and is well
  521. understood, so it is easy to build a high-quality house;
  522. software is no different. Code that is difficult to understand
  523. is usually due to the author's ignorance of a problem domain. In
  524. the context of this book, we seek to understand the low-level
  525. workings of various hardware devices.
  526. Because software quality depends upon an understanding of the
  527. problem domain, a software requirement document should always
  528. include a description of the problem domain.
  529. Be aware that software requirements are not:
  530. What vs How
  531. “What” and “how” are vague terms. What is the “what”? Is it
  532. nouns only? If so, what if a customer requires his software to
  533. perform specific steps of operations, such as the purchasing
  534. procedure for a customer on a website? Does it include “verbs”
  535. now? However, isn't the “how” supposed to be step by step
  536. operations? Anything can be the “what” and anything can be the
  537. “how”.
  538. Sketches
  539. A software requirement document is all about the problem domain.
  540. It should not be a high-level description of an implementation.
  541. Some problems might seem straightforward to map directly from
  542. their domain description to the structure of an implementation.
  543. For example:
  544. • Users are given a list of books in a drop-down menu to
  545. choose from.
  546. • Books are stored in a linked list.
  547. • ...
  548. In the future, instead of a drop-down menu, all books might be
  549. listed directly on a page in thumbnails. Books might be
  550. reimplemented as a graph, where each node is a book, for finding
  551. related books, as a recommender is going to be added in the
  552. next version. The requirement document then needs updating again to
  553. remove all the outdated implementation details, thus requiring
  554. additional effort to maintain the requirement document, and
  555. when the effort for syncing with the implementation is too
  556. much, the developers give up on documentation, and everyone starts
  557. ranting about how useless documentation is.
  558. More often than not there is no straightforward one-to-one
  559. mapping. For example, a regular computer user expects an OS to
  560. be something that runs some program with GUI, or their favorite
  561. computer games. But for such requirements, an operating system
  562. is implemented as multiple layers, each hiding the details from
  563. the upper layers. To implement an operating system, a large
  564. body of knowledge from multiple fields is required, especially
  565. if the operating system runs on non-PC devices.
  566. It's best to include information related to the problem domain in
  567. the requirement document. A good way to test the quality of
  568. a requirement document is to provide it to a domain expert
  569. for proofreading, to ensure he can understand the material
  570. thoroughly. A requirement document is also useful as a help document later,
  571. or makes writing one much easier.
  572. Software Specification
  573. A software specification document states rules relating desired
  574. behavior of the output devices to all possible behavior of the
  575. input devices, as well as any rules that other parts of the
  576. problem domain must obey (Kovitz, 1999).
  579. Simply put, software specification is interface design, with
  580. constraints for the problem domain to follow, e.g. the software
  581. accepts only certain types of input, such as text written in
  582. English but in no other language. For a hardware
  583. device, a specification is always needed, as software depends on
  584. its hardwired behaviors. In fact, it is mostly the case that
  585. hardware specifications are well-defined, down to the tiniest
  586. details. It needs to be that way because once hardware is
  587. physically manufactured, there's no going back, and if defects
  588. exist, the damage to the company is devastating, both in finance
  589. and in reputation.
  590. Note that, similar to a requirement document, a specification
  591. only concerns interface design. If implementation details leak
  592. in, it is a burden to sync between the actual implementation and
  593. the specification, and the specification is soon abandoned.
  594. Another important remark is that, though a specification document
  595. is important, it does not have to be produced before the
  596. implementation. It can be prepared in any order: before or after
  597. a complete implementation; or at the same time as the
  598. implementation, when some part is done, and the interface is
  599. ready to be recorded in the specification. Regardless of methods,
  600. what matters is a complete specification at the end.
  601. Documents for writing an x86 Operating System
  602. When the problem domain is different from the software domain, the
  603. requirement document and the specification are usually separate.
  604. However, if the problem domain is inside software, the specification
  605. most often includes both, and the content of both can be mixed
  606. together. As previous sections demonstrated the importance
  607. of documents, to implement an OS, we will need to collect
  608. relevant documents to gain sufficient domain knowledge. These
  609. documents are as follows:
  610. • Intel® 64 and IA-32 Architectures Software Developer’s Manual
  611. (Volume 1, 2, 3)
  612. • Intel® 3 Series Express Chipset Family Datasheet
  613. • System V Application Binary Interface
  614. Aside from Intel's official website, the website of this book
  615. also hosts the documents for convenience[footnote:
  616. Intel may change the links to the documents as they update their
  617. website, so this book doesn't contain any link to the documents
  618. to avoid confusion for readers.
  619. ].
  620. Intel documents divide the requirement and specification sections
  621. clearly, but call the sections by different names. The section
  622. corresponding to the requirement document is called
  623. “Functional Description”, which consists mostly of domain
  624. description; for specification, the “Register Description” section
  625. describes all programming interfaces. Both documents carry no
  626. unnecessary implementation details[footnote:
  627. As it should be, those details are trade secrets.
  628. ]. Intel documents are also great examples of how to write good
  629. requirements/specifications, as explained in this chapter.
  630. Other than the Intel documents, other documents will be
  631. introduced in the relevant chapters.
  632. This chapter gives an intuition of how hardware and software
  633. are connected together, and how software is represented physically.
  634. The physical implementation of a bit
  635. All electronic devices, from simple to complex, manipulate the
  636. flow of electrons to achieve desired effects in the real world. Computers are
  637. no exception. When we write software, we indirectly manipulate
  638. electrical current at the physical level, in such a way that the
  639. underlying machine produces desired effects. To understand the
  640. process, we consider a simple light bulb. A light bulb can be
  641. switched between two states, on and off: off
  642. means the number 0, and on means 1.[float MarginFigure:
  643. [MarginFigure 1:
  644. A lightbulb
  645. ]
  646. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/bulb.svg>
  647. ]
  648. However, one problem is that such a switch requires manual
  649. intervention from a human. What is required is an automatic
  650. switch based on the voltage level. To enable
  651. automatic switching of electrical signals, a device called the
  652. transistor was invented by William Shockley, John Bardeen and Walter
  653. Brattain. This invention started the whole computer industry.
  654. At the core, a [margin:
  655. transistor
  656. ]transistor is just a resistor whose value can vary
  657. based on an input voltage[float MarginFigure:
  658. [MarginFigure 2:
  659. Modern transistor
  660. ]
  661. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/transistor.svg>
  662. ]. With this property, a transistor can be used as a current
  663. amplifier (more voltage, less resistance) or to switch electrical
  664. signals off and on (block and unblock an electron flow) based on
  665. a voltage level. At 0 V, no current can pass through a
  666. transistor, so it acts like a circuit with an open switch
  667. (light bulb off) because the resistance is high enough to block
  668. the electrical flow. Similarly, at +3.5 V, current can flow
  669. through a transistor because the resistance is lowered,
  670. effectively enabling electron flow, so it acts like a circuit with
  671. a closed switch.[margin:
  672. If you want a deeper explanation of transistors e.g. how
  673. electrons move, you should look at the video “How semiconductors
  674. work” on Youtube, by Ben Eater.
  675. ]
  676. A bit has two states, 0 and 1, and is the building block of all
  677. digital systems and software. Similar to a light bulb that can be
  678. turned on and off, bits are made out of the electrical stream
  679. from the power source: bit 0 is represented by 0 V (no
  680. electron flow), and bit 1 by +3.5 V to +5 V (electron flow).
  681. A transistor implements a bit correctly, as it can regulate the
  682. electron flow based on the voltage level.
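The mapping from voltage levels to bits described above can be written down as a small sketch. The thresholds come from the text; the function names and the decision to treat every other voltage as undefined are assumptions made only for illustration.

```python
from typing import Optional

LOW_VOLTS = 0.0        # bit 0, as stated in the text
HIGH_MIN_VOLTS = 3.5   # lower bound of bit 1, as stated in the text
HIGH_MAX_VOLTS = 5.0   # upper bound of bit 1, as stated in the text


def voltage_to_bit(volts: float) -> Optional[int]:
    """Map a voltage level to a bit; None marks a level the text does not define."""
    if volts == LOW_VOLTS:
        return 0
    if HIGH_MIN_VOLTS <= volts <= HIGH_MAX_VOLTS:
        return 1
    return None  # assumption: any other level is treated as undefined


def transistor_conducts(gate_volts: float) -> bool:
    """Idealized switch: current passes only when the input voltage encodes bit 1."""
    return voltage_to_bit(gate_volts) == 1
```

For instance, `transistor_conducts(0.0)` models the open switch (light bulb off) and `transistor_conducts(3.5)` the closed switch.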
  683. MOSFET transistors
  684. The classic transistors opened a whole new world of micro
  685. digital devices. Prior to the invention, vacuum tubes - which are
  686. just fancier light bulbs - were used to represent 0 and 1, and
  687. required a human to turn them on and off. [margin:
  688. MOSFET
  689. ]MOSFET, or Metal–Oxide–Semiconductor Field-Effect
  690. Transistor, invented in 1959 by Dawon Kahng and Martin M. (John)
  691. Atalla at Bell Labs, is an improved version of the classic
  692. transistor that is more suitable for digital devices, as it
  693. switches faster between the two states 0 and 1, is more
  694. stable, consumes less power and is easier to produce.
  695. There are also two types of MOSFETs analogous to two types of
  696. transistors: n-MOSFET and p-MOSFET. n-MOSFET and p-MOSFET are
  697. also called NMOS and PMOS transistors for short.
  698. Beyond transistors: digital logic gates
  699. All digital devices are designed with logic gates. A logic gate[margin:
  700. logic gate
  701. ] is a device that implements a Boolean function. Each
  702. logic gate has a number of inputs and an output. All
  703. computer operations are built from combinations of logic
  704. gates, which are just combinations of Boolean functions. [float MarginFigure:
  705. [MarginFigure 3:
  706. Example: NAND gate
  707. ]
  708. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/Nand-gate.svg>
  709. ]
  710. The theory behind logic gates
  711. Logic gates accept only binary inputs[footnote:
  712. Input that is either a 0 or 1.
  713. ] and produce binary outputs. In other words, logic gates are
  714. functions that transform binary values. Fortunately, a branch of
  715. math that deals exclusively with binary values already existed,
  716. called Boolean Algebra, developed in the 19th century by George Boole. With a sound mathematical theory as a
  717. foundation, logic gates were created. Logic gates implement
  718. Boolean functions. A set of Boolean functions is functionally complete
  719. [margin:
  720. functionally complete
  721. ] if all other Boolean functions can be constructed from
  722. this set. Later, Charles Sanders
  723. Peirce (during 1880 -- 1881) proved that either of the Boolean functions
  724. NOR or NAND alone is enough to create all other Boolean logic
  725. functions. Thus NOR and NAND gates are each functionally complete (Peirce, 1933).
  726. Gates are simply the implementations of Boolean logic
  727. functions, therefore a NAND or NOR gate is enough to implement all
  728. other logic gates. The simplest gates a CMOS circuit can implement
  729. are inverters (NOT gates), and from the inverters come NAND
  730. gates. With NAND gates, we are confident that we can implement everything
  731. else. This is why the invention of transistors, and then of CMOS
  732. circuits, revolutionized the computer industry.[margin:
  733. If you want to understand why and how from NAND gate we can
  734. create all Boolean functions and a computer, I suggest the course
  735. Build a Modern Computer from First Principles: From Nand to
  736. Tetris available on Coursera: https://www.coursera.org/learn/build-a-computer
  737. . To go even further, after the course, you can take the series
  738. Computational Structures on edX.
  739. ]
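The claim that NAND alone is enough can be checked with a short sketch in which every other function is written only in terms of `nand`. The function names are chosen here for illustration; the constructions are the standard ones, not taken from the book.

```python
def nand(a: int, b: int) -> int:
    """The only primitive: the output is 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1


def not_(a: int) -> int:
    # Tying both inputs of a NAND gate together gives an inverter.
    return nand(a, a)


def and_(a: int, b: int) -> int:
    # AND is NAND followed by an inverter.
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b).
    return nand(not_(a), not_(b))


def nor(a: int, b: int) -> int:
    return not_(or_(a, b))


def xor(a: int, b: int) -> int:
    # The classic four-NAND construction of XOR.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))
```

Each derived function only ever calls `nand`, directly or through another derived function, which is what functional completeness means in practice.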
  740. We should realize and appreciate how powerful the Boolean functions
  741. available in all programming languages are.
  742. Logic Gate implementation: CMOS circuit
  743. Underlying every logic gate is a circuit called [margin:
  744. CMOS
  745. ]CMOS - Complementary Metal–Oxide–Semiconductor. CMOS consists of two
  746. complementary transistors, NMOS and PMOS. The simplest CMOS
  747. circuit is an inverter or a NOT gate:
  748. From a NOT gate, a NAND gate can be created:
  749. From NAND gates, we have all other gates. As demonstrated, such
  750. simple circuitry performs the logical operators in day-to-day
  751. programming languages e.g. the NOT operator ~ is executed directly by an
  752. inverter circuit, the operator & is executed by an AND circuit,
  753. and so on. Code does not run on a magic black box. On the contrary,
  754. code execution is precise and transparent, often as simple as
  755. running some hardwired circuit. When we write software, we simply
  756. manipulate electrical current at the physical level to run
  757. appropriate circuits to produce desired outcomes. However, this
  758. whole process somehow does not relate to any thought involving
  759. electrical current. That is the real magic and will be explained
  760. soon.
  761. One interesting property of CMOS is that a k-input gate uses k
  762. PMOS and k NMOS transistors (Wakerly, 1999). All logic gates are
  763. built by pairs of NMOS and PMOS transistors, and gates are the
  764. building blocks of all digital devices from simple to complex,
  765. including any computer. Thanks to this pattern, it is possible to
  766. separate the actual physical circuit implementation from the
  767. logical implementation. Digital designs are made
  768. with logic gates and are later “compiled” into physical circuits.
  769. In fact, later we will see that logic gates become a language
  770. that describes how circuits operate. Understanding how CMOS works
  771. is important to understand how a computer is designed, and as a
  772. consequence, how a computer works[footnote:
  773. Again, if you want to understand how logic gates make a computer,
  774. consider the suggested courses on Coursera and Edx earlier.
  775. ].
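The "k PMOS and k NMOS" pattern can be illustrated with a switch-level model. The sketch below assumes the usual CMOS NAND topology, which the text does not spell out: the PMOS transistors sit in parallel between the supply and the output, and the NMOS transistors sit in series between the output and ground. All function names are hypothetical.

```python
from typing import Sequence


def nmos_conducts(gate: int) -> bool:
    """An NMOS transistor conducts when its gate input is 1."""
    return gate == 1


def pmos_conducts(gate: int) -> bool:
    """A PMOS transistor conducts when its gate input is 0."""
    return gate == 0


def cmos_nand(inputs: Sequence[int]) -> int:
    """Switch-level model of a k-input CMOS NAND gate (k PMOS + k NMOS)."""
    # Pull-up network: k PMOS in parallel, any one connects the output to the supply.
    pulled_up = any(pmos_conducts(x) for x in inputs)
    # Pull-down network: k NMOS in series, all must conduct to reach ground.
    pulled_down = all(nmos_conducts(x) for x in inputs)
    # The two networks are complementary: exactly one of them conducts.
    assert pulled_up != pulled_down
    return 1 if pulled_up else 0


def cmos_not(a: int) -> int:
    """An inverter is the k = 1 case: one PMOS and one NMOS."""
    return cmos_nand([a])
```

Because each input drives one transistor in each network, adding an input adds exactly one PMOS and one NMOS, which is the regularity that lets logic gates stand in for circuits.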
  776. Finally, an implemented circuit with its wires and transistors is
  777. stored physically in a package called a chip. A chip is a
  778. substrate that an integrated circuit is etched onto. However, a
  779. chip also refers to a completely packaged integrated circuit in
  780. the consumer market. Depending on the context, it is understood
  781. differently.[float MarginFigure:
  782. [MarginFigure 4:
  783. 74HC00 chip physical view
  784. ]
  785. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/74hc00_nxp_physical.jpg>
  786. ]
  787. -------------------------------------------
  788. 74HC00 is a chip with four 2-input NAND gates. The chip comes
  789. with 8 input pins and 4 output pins, 1 pin for connecting to a
  790. voltage source and 1 pin for connecting to the ground. This
  791. device is the physical implementation of NAND gates that we can
  792. physically touch and use. But instead of just a single gate, the
  793. chip comes with 4 gates that can be combined. Each combination
  794. enables a different logic function, effectively creating other
  795. logic gates. This feature is what makes the chip popular.
  796. [float Figure:
  797. [Figure 0.3:
  798. 74HC00 logic diagrams (Source: 74HC00 datasheet, https://neurophysics.ucsd.edu/courses/physics_120/74HC00_QUAD_NAND.pdf
  799. )
  800. ]
  801. [float Figure:
  802. [Sub-Figure a:
  803. Logic diagram of 74HC00
  804. ]
  805. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_block_diagram.png>
  806. ] [float Figure:
  807. [Sub-Figure b:
  808. Logic diagram of one NAND gate
  809. ]
  810. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_logic_diagram.png>
  811. ]
  812. ]
  813. Each of the gates above is just a simple NAND circuit with
  814. electron flows, as demonstrated earlier. Yet, many of these
  815. NAND-gate chips combined can build a simple computer.
  816. Software, at the physical level, is just electron flows.
  817. How can the above gates be created with 74HC00? It is
  818. simple: as every gate has 2 input pins and 1 output pin, we can
  819. wire the output of 1 NAND gate to an input of another NAND
  820. gate, thus chaining NAND gates together to produce the diagrams
  821. as above.
  822. -------------------------------------------
  823. Beyond Logic Gates: Machine Language
  824. Machine language
  825. Because a hardware device is built upon gates, and gates only accept a series of 0 and 1,
  826. the device only understands 0 and 1. However, a device
  827. only takes 0 and 1 in a systematic way. [margin:
  828. Machine language
  829. ]Machine language is a collection of unique bit
  830. patterns that a device can identify and respond to with a corresponding
  831. action. A machine instruction is a unique bit pattern that a
  832. device can identify. In a computer system, the device with such a
  833. language is called the CPU - Central Processing Unit, which controls
  834. all activities going on inside a computer. For example, in the x86
  835. architecture, the pattern 00000001 begins an instruction that adds two
  836. numbers, and 11110100 halts a computer. In the early days of
  837. computers, people had to write completely in binary.
  838. Why does such a bit pattern cause a device to do something? The
  839. reason is that underlying each instruction is a small circuit
  840. that implements the instruction. Similar to how a
  841. function/subroutine in a computer program is called by its name,
  842. a bit pattern is the name of a little function inside a CPU that
  843. gets executed when the CPU encounters it.
  844. Note that a CPU is not the only device with its own language. CPU is
  845. just a name to indicate a hardware device that controls a
  846. computer system. A hardware device may not be a CPU but still have
  847. its own language. A device with its own machine language is a
  848. programmable device, since a user can use the language to command
  849. the device to perform different actions. For example, a printer
  850. has its set of commands for instructing it how to print a page.
  851. -------------------------------------------
  852. <exa:74HC00-chip-can>A user can use the 74HC00 chip without knowing
  853. its internals, knowing only the interface for using the device. First,
  854. we need to know its layout:
  855. [float Figure:
  856. [Figure 0.4:
  857. 74HC00 Pin Layout (Source: 74HC00 datasheet, http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
  858. )
  859. ]
  860. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_pin_configuration.pdf>
  861. ]
  862. Then, the functionality of each pin:
  863. [float Table:
  864. [Table 1:
  865. Pin Description (Source: 74HC00 datasheet, http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
  866. )
  867. ]
  868. +----------+---------------+----------------+
  869. | Symbol   | Pin           | Description    |
  870. +----------+---------------+----------------+
  871. | 1A to 4A | 1, 4, 9, 12   | data input     |
  872. +----------+---------------+----------------+
  873. | 1B to 4B | 2, 5, 10, 13  | data input     |
  874. +----------+---------------+----------------+
  875. | 1Y to 4Y | 3, 6, 8, 11   | data output    |
  876. +----------+---------------+----------------+
  877. | GND      | 7             | ground (0 V)   |
  878. +----------+---------------+----------------+
  879. | Vcc      | 14            | supply voltage |
  880. +----------+---------------+----------------+
  881. ]
  882. Finally, how to use the pins:
  883. [float Table:
  884. [Table 2:
  885. Functional Description
  886. ]
  887. +------------+--------+
  888. | Input      | Output |
  889. +-----+------+--------+
  890. | nA  | nB   | nY     |
  891. +-----+------+--------+
  892. | L   | X    | H      |
  893. +-----+------+--------+
  894. | X   | L    | H      |
  895. +-----+------+--------+
  896. | H   | H    | L      |
  897. +-----+------+--------+
  898. ]
  899. [margin:
  900. • n is a number, either 1, 2, 3, or 4
  901. • H = HIGH voltage level; L = LOW voltage level; X = don’t care.
  902. ]The functional description provides a truth table with all
  903. possible pin inputs and outputs, which also describes the usage
  904. of all pins in the device. A user need not know the
  905. implementation, but only needs such a table to use the device. We can
  906. say that the truth table above is the machine language of the
  907. device. Since the device is digital, its language is a
  908. collection of binary strings:
  909. • The device has 8 input pins, and this means it accepts binary
  910. strings of 8 bits.
  911. • The device has 4 output pins, and this means it produces
  912. binary strings of 4 bits from the 8-bit inputs.
  913. The input strings are what the device understands, and
  914. the output strings are what the device can speak.
  915. Together, they make the language of the device. Even though
  916. this device is simple, its language contains
  917. quite a few binary strings: 2^{8}+2^{4}=272
  918. (256 input strings plus 16 output strings). However, this
  919. number is a tiny fraction of that of a complex device like a CPU, with
  920. hundreds of pins.
  921. When left as is, 74HC00 is simply a NAND device with two
  922. 4-bit inputs[footnote:
  923. Or simply 4-bit NAND gate, as it can only accept 4 bits of input
  924. at the maximum.
  925. ].
  926. +--------+-----------------------------------------------+-----------------------+
  927. |        | Input                                         | Output                |
  928. +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
  929. | Pin    | 1A  | 1B  | 2A  | 2B  | 3A  | 3B  | 4A  | 4B  | 1Y  | 2Y  | 3Y  | 4Y  |
  930. +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
  931. | Value  | 1   | 1   | 0   | 0   | 1   | 1   | 0   | 0   | 0   | 1   | 0   | 1   |
  932. +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
  933. The inputs and outputs as visually presented:
  934. [float Figure:
  935. [Figure 0.5:
  936. Pins when receiving digital signals that correspond to a binary
  937. string. Green signals are inputs; blue signals are outputs.
  938. ] <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_bin_string1.pdf>
  939. ]
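The behaviour of the chip when left as is can be sketched as a function from an 8-bit input string to a 4-bit output string. The bit order follows the pin order of the table above (1A, 1B, 2A, 2B, 3A, 3B, 4A, 4B); the function name `hc00` is hypothetical, and the model ignores everything electrical (supply, ground, timing).

```python
def nand(a: int, b: int) -> int:
    """Functional description of one gate: the output is LOW only when both inputs are HIGH."""
    return 0 if (a == 1 and b == 1) else 1


def hc00(input_bits: str) -> str:
    """Model the four independent NAND gates of a 74HC00.

    input_bits lists the input pins in the order 1A 1B 2A 2B 3A 3B 4A 4B;
    the result lists the output pins in the order 1Y 2Y 3Y 4Y.
    """
    if len(input_bits) != 8 or set(input_bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 8 binary digits")
    bits = [int(c) for c in input_bits]
    # Gate n reads the pair (nA, nB), i.e. two consecutive bits.
    outputs = [nand(bits[i], bits[i + 1]) for i in range(0, 8, 2)]
    return "".join(str(y) for y in outputs)


print(hc00("11001100"))  # prints 0101, the Value row of the table above
```

The device accepts `2 ** 8` distinct input strings and can produce `2 ** 4` distinct output strings, which is where the count of 272 comes from.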
  940. On the other hand, if an OR gate is implemented, we can only build
  941. a 2-input OR gate from 74HC00, as it requires 3 NAND gates: 2
  942. input NAND gates and 1 output NAND gate. Each input NAND gate
  943. represents only a 1-bit input of the OR gate. In the following
  944. figure, the pins of each input NAND gate are always set to the
  945. same value (either both inputs are A or both inputs are B) to
  946. represent a single-bit input for the final OR gate:
  947. [float Table:
  948. [Table 3:
  949. Truth table of OR logic diagram.
  950. ]
  951. +----+----+----+----+---+
  952. | A | B | C | D | Y |
  953. +----+----+----+----+---+
  954. | 0 | 0 | 1 | 1 | 0 |
  955. +----+----+----+----+---+
  956. | 0 | 1 | 1 | 0 | 1 |
  957. +----+----+----+----+---+
  958. | 1 | 0 | 0 | 1 | 1 |
  959. +----+----+----+----+---+
  960. | 1 | 1 | 0 | 0 | 1 |
  961. +----+----+----+----+---+
  962. ]
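The truth table above can be reproduced with a few lines. In this sketch, C and D are taken to be the outputs of the two input NAND gates (fed A, A and B, B respectively) and Y the output of the third gate; that reading of the column names is inferred from the table values.

```python
def nand(a: int, b: int) -> int:
    return 0 if (a == 1 and b == 1) else 1


def or_from_nand(a: int, b: int):
    """Return (C, D, Y) for a 2-input OR gate wired from three NAND gates."""
    c = nand(a, a)  # first input gate: both pins tied to A, so C = NOT A
    d = nand(b, b)  # second input gate: both pins tied to B, so D = NOT B
    y = nand(c, d)  # output gate: NAND(NOT A, NOT B) = A OR B
    return c, d, y


rows = [(a, b) + or_from_nand(a, b) for a in (0, 1) for b in (0, 1)]
for row in rows:
    print(*row)
# prints:
# 0 0 1 1 0
# 0 1 1 0 1
# 1 0 0 1 1
# 1 1 0 0 1
```

Three of the chip's four gates are used, which is why one 74HC00 yields only a single 2-input OR gate.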
  963. -------------------------------------------
  964. To implement a 4-bit OR gate, we need a total of four 74HC00
  965. chips configured as OR gates, packaged as a single chip as in
  966. figure [or-chip-74hc00].
  967. [float Figure:
  968. [Figure 0.6:
  969. 4-bit OR chip made from four 74HC00 devices
  970. ]<or-chip-74hc00>
  971. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/4-bit-or-gate-layout.pdf>
  972. ]
  973. Assembly Language
  974. Assembly language is the symbolic representation of binary
  975. machine code, giving bit patterns mnemonic names. It was a
  976. vast improvement over the time when programmers had to write 0 and 1. For
  977. example, instead of writing the bit pattern of the halt instruction, a programmer simply writes
  978. hlt to stop a computer. Such an abstraction makes instructions
  979. executed by a CPU easier to remember, and thus more instructions
  980. could be memorized, less time was spent looking up the CPU manual to find
  981. instructions in bit form and, as a result, code was written
  982. faster.
  983. Understanding assembly language is crucial for low-level programming
  984. domains, even to this day. The more instructions a programmer
  985. wants to understand, the deeper the understanding of machine
  986. architecture that is required.
  987. We can build a device with 2 assembly instructions:
  988. or <op1>, <op2>
  989. nand <op1>, <op2>
  990. • or accepts two 4-bit operands. This corresponds to a 4-bit
  991. OR gate device built from four 74HC00 chips.
  992. • nand accepts two 4-bit operands. This corresponds to a single
  993. 74HC00 chip, left as is.
  994. Essentially, the gates in the example [exa:74HC00-chip-can]
  995. implement the instructions. Up to this point, we have only specified
  996. input and output and manually fed them to a device. That is, to
  997. perform an operation:
  998. • Pick a device by hand.
  999. • Manually put electrical signals into pins.
  1000. First, we want to automate the process of device selection.
  1001. That is, we want to simply write an assembly instruction and have the
  1002. device that implements the instruction selected correctly.
  1003. Solving this problem is easy:
  1004. • Give each instruction an index in binary code, called
  1005. operation code or opcode for short, and embed it as part of
  1006. input. The value for each instruction is specified as in
  1007. table [ex-ins-ops].[float MarginTable:
  1008. [MarginTable 1:
  1009. Instruction-Opcode mapping.
  1010. ]<ex-ins-ops>
  1011. +-------------+-------------+
  1012. | Instruction | Binary Code |
  1013. +-------------+-------------+
  1015. | nand        | 00          |
  1016. +-------------+-------------+
  1017. | or          | 01          |
  1018. +-------------+-------------+
  1019. ]
  1020. Each input now contains additional data at the beginning: an
  1021. opcode. For example, the instruction:
  1022. nand 1100, 1100
  1023. corresponds to the binary string: 0011001100. The first two
  1024. bits 00 encode a nand instruction, as listed in the table
  1025. above.
  1026. • Add another device to select a device, based on the binary code
  1027. specific to an instruction.
  1028. Such a device is called a decoder, an important component in a
  1029. CPU that decides which circuit to use. In the above example,
  1030. when feeding 0011001100 to the decoder, because the opcode is
  1031. 00, the data are sent to the NAND device for computing.
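The two steps above, prepending an opcode and selecting a device from it, can be sketched in software. The opcode values come from the instruction-opcode table; the names `assemble` and `decode` and the 10-bit word layout (2-bit opcode followed by two 4-bit operands) follow the example in the text but are otherwise illustrative choices.

```python
OPCODES = {"nand": "00", "or": "01"}  # from the instruction-opcode table
DEVICE_FOR_OPCODE = {code: name for name, code in OPCODES.items()}


def assemble(line: str) -> str:
    """Translate a line such as 'nand 1100, 1100' into a 10-bit string."""
    mnemonic, operands = line.split(maxsplit=1)
    op1, op2 = (part.strip() for part in operands.split(","))
    for op in (op1, op2):
        if len(op) != 4 or set(op) - {"0", "1"}:
            raise ValueError(f"operand must be 4 binary digits: {op!r}")
    return OPCODES[mnemonic] + op1 + op2


def decode(word: str):
    """Split a 10-bit word into (selected device, operand 1, operand 2)."""
    if len(word) != 10:
        raise ValueError("expected a 10-bit word")
    opcode, op1, op2 = word[:2], word[2:6], word[6:]
    return DEVICE_FOR_OPCODE[opcode], op1, op2


word = assemble("nand 1100, 1100")
print(word)          # prints 0011001100
print(decode(word))  # prints ('nand', '1100', '1100')
```

A hardware decoder does the same selection with gates rather than a dictionary lookup: the opcode bits enable exactly one device.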
  1032. Finally, writing assembly code is just an easier way to write
  1033. binary strings that a device can understand. When we write
  1034. assembly code and save it in a text file, a program called an [margin:
  1035. assembler
  1036. ]assembler translates the text file into binary strings
  1037. that a device can understand. So, how can an assembler exist in
  1038. the first place? Assume this is the first assembler in the
  1039. world; then it is written in binary code. In the next version,
  1040. life is easier: the programmers write the assembler in
  1041. assembly code, then use the first version to assemble it.
  1042. These binary strings are then stored in another device, from which they
  1043. can later be retrieved and sent to a decoder. A storage device[margin:
  1044. storage device
  1045. ] is the device that stores machine instructions,
  1046. which is an array of circuits for saving 0 and 1 states.
  1047. A decoder is built out of logic gates, similar to other digital
  1048. devices. However, a storage device can be anything that can
  1049. store 0 and 1 and is retrievable. A storage device can be a
  1050. magnetized device that uses magnetism to store information, or
  1051. it can be made out of electrical circuits. Regardless of
  1052. the technology used, as long as the device can store data and
  1053. is accessible to retrieve data, it suffices. Indeed, the modern
  1054. devices are so complex that it is impossible and unnecessary to
  1055. understand every implementation detail. Instead, we only need
  1056. to learn the interfaces, e.g. the pins, that the devices
  1057. expose.
  1058. A computer essentially implements this process:
  1059. • Fetch an instruction from a storage device.
  1060. • Decode the instruction.
  1061. • Execute the instruction.
  1062. Or in short, a fetch -- decode --
  1063. execute cycle. The above device is extremely rudimentary, but
  1064. it already represents a computer with a fetch -- decode --
  1065. execute cycle. More instructions can be implemented by adding
  1066. more devices and allocating more opcodes for the instructions,
  1067. then updating the decoder accordingly. The Apollo Guidance
  1068. Computer, a digital computer produced for the Apollo space
  1069. program from 1961 -- 1972, was built entirely with NOR gates -
  1070. the alternative to NAND gates for creating other logic gates.
  1071. Similarly, if we keep improving our hypothetical device, it
  1072. eventually becomes a full-fledged computer.
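The fetch, decode, execute cycle of the hypothetical two-instruction device can be simulated as a loop over a list that plays the role of the storage device. This is a sketch under one loud assumption: the eight data bits drive the input pins in the order used in the 74HC00 example (1A, 1B, 2A, 2B, ...), so each gate pairs two consecutive bits, and the OR device is assumed to be wired the same way. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def gate_inputs(data: str) -> List[Tuple[str, str]]:
    """Group the data bits per gate: (1A, 1B), (2A, 2B), ... (assumed pin order)."""
    return [(data[i], data[i + 1]) for i in range(0, len(data), 2)]


def nand_device(data: str) -> str:
    return "".join("0" if (a, b) == ("1", "1") else "1" for a, b in gate_inputs(data))


def or_device(data: str) -> str:
    return "".join("1" if "1" in (a, b) else "0" for a, b in gate_inputs(data))


# The decoder's job: map an opcode to the device that implements it.
DEVICES: Dict[str, Callable[[str], str]] = {"00": nand_device, "01": or_device}


def run(storage: List[str]) -> List[str]:
    """Run every stored 10-bit instruction and collect the 4-bit outputs."""
    outputs = []
    address = 0
    while address < len(storage):
        word = storage[address]            # fetch
        address += 1
        opcode, data = word[:2], word[2:]  # decode
        device = DEVICES[opcode]
        outputs.append(device(data))       # execute
    return outputs


print(run(["0011001100", "0111000011"]))  # prints ['0101', '1001']
```

Adding an instruction means adding one function and one entry in `DEVICES`, which mirrors adding a device and updating the decoder.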
  1073. Programming Languages
  1074. Assembly language is a step up from writing 0 and 1. As time went
  1075. by, people realized that many pieces of assembly code had
  1076. repeating patterns of usage. It would be nice if, instead of
  1077. writing all the repeating blocks of code over and over again in all
  1078. places, we could simply refer to such blocks of code with easier-to-use
  1079. text forms. For example, a block of assembly code checks whether
  1080. one variable is greater than another and if so, executes a block
  1081. of code, else executes another block of code; in C, such a block of
  1082. assembly code is represented by an if statement that is close to
  1083. human language.
  1084. [float Figure:
  1085. [Figure 0.7:
  1086. Repeated assembly patterns are generalized into a new language.
  1087. ]
  1088. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/asm_to_proglang.pdf>
  1089. ]
  1090. People created text forms to represent common blocks of assembly
  1091. code, such as the if syntax above, then wrote a program to
  1092. translate the text forms into assembly code. The program that
  1093. translates such text forms to machine code is called a [margin:
  1094. compiler
  1095. ]compiler.
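As a toy illustration of this translation, the sketch below emits the compare-and-jump pattern that stands behind an if/else on "greater than". It is not how a real compiler works: the function name, the label names and the operand strings are hypothetical, and the output is x86-style assembly text in Intel operand order using the cmp, jle and jmp instructions.

```python
from typing import List


def compile_if_greater(left: str, right: str,
                       then_block: List[str], else_block: List[str]) -> List[str]:
    """Emit assembly text for: if (left > right) then_block else else_block."""
    lines = [
        f"cmp {left}, {right}",  # compare the two values
        "jle else_branch",       # left <= right: skip the 'then' block
    ]
    lines += then_block
    lines += ["jmp end_if", "else_branch:"]
    lines += else_block
    lines += ["end_if:"]
    return lines


for line in compile_if_greater("eax", "ebx", ["mov ecx, 1"], ["mov ecx, 0"]):
    print(line)
```

The seven emitted lines are the recurring pattern; the single if form is the shorter text that replaces them.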
  1096. Any software logic a programming language can implement, hardware
  1097. can also implement. The reverse is also true: any hardware logic
  1098. that is implemented in a circuit can be reimplemented in a
  1099. programming language. The simple reason is that programming
  1100. languages, or assembly languages, or machine languages, or logic
  1101. gates are just languages to express computations. It is
  1102. impossible for software to implement something hardware is
  1103. incapable of because a programming language is just a simpler way
  1104. to use the underlying hardware. At the end of the day,
  1105. programming languages are translated to machine instructions that
  1106. are valid to a CPU. Otherwise, the code is not runnable and is thus
  1107. useless software. In reverse, software can do everything the hardware
  1108. (that runs the software) can, as programming languages are just an
  1109. easier way to use the hardware.
  1110. In reality, even though all languages are equivalent in power,
  1111. not all of them are capable of expressing each other's programs.
  1112. Programming languages vary between two ends of a spectrum: high
  1113. level and low level.
  1114. The higher level a programming language is, the more distant it
  1115. becomes from the hardware. In some high-level programming languages,
  1116. such as Python, a programmer cannot manipulate underlying
  1117. hardware, despite being able to deliver the same computations as
  1118. low-level programming languages. The reason is that high-level
  1119. languages want to hide hardware details to free programmers from
  1120. dealing with irrelevant details not related to current problem
  1121. domains. Such convenience, however, is not free: it requires
  1122. software to carry extra code for managing hardware details
  1123. (e.g. memory), thus making the code run slower, and it makes
  1124. hardware programming difficult or impossible. The more
  1125. abstractions a programming language imposes, the more difficult
  1126. it is for writing low-level software, such as hardware drivers or
  1127. an operating system. This is the reason why C is usually the
  1128. language of choice for writing an operating system, since C is
  1129. just a thin wrapper over the underlying hardware, making it easy to
  1130. understand how exactly a hardware device runs when executing a
  1131. certain piece of C code.
  1132. Each programming language represents a way of thinking about
  1133. programs. Higher-level programming languages help to focus on
  1134. problem domains that are not related to hardware at all, and
  1135. where programmer performance is more important than computer
  1136. performance. Lower-level programming languages help to focus on
  1137. the inner workings of a machine, and thus are best suited for problem
  1138. domains that are related to controlling hardware. That is why so many
  1139. languages exist. Use the right tools for the right job to achieve
  1140. the best results.
  1141. Abstraction
  1142. Abstraction is a technique for hiding complexity that
  1143. is irrelevant to the problem in context. For example, imagine writing
  1144. programs without any layer other than the lowest one:
  1145. circuits. Not only does a person need an in-depth understanding of
  1146. how circuits work, but it also becomes much more obscure to design a
  1147. circuit, because the designer must look at the raw circuits but
  1148. think at a higher level, such as logic gates. It is a distracting
  1149. process, as a designer must constantly translate ideas into
  1150. circuits. It is better if a designer simply thinks through his
  1151. high-level ideas directly, and later translates the ideas into
  1152. circuits. Not only is it more efficient, but it is also more
  1153. accurate, as a designer can focus all his efforts on verifying
  1154. the design with high-level thinking. When a new designer arrives,
  1155. he can easily understand the high-level designs, and thus can
  1156. continue to develop or maintain existing systems.
  1157. Why abstraction works
  1158. In all the layers, abstraction manifests itself:
  1159. • Logic gates abstract away the details of CMOS.
  1160. • Machine language abstracts away the details of logic gates.
  1161. • Assembly language abstracts away the details of machine
  1162. languages.
  1163. • Programming language abstracts away the details of assembly
  1164. languages.
  1165. We see repeating patterns of how lower layers build upper layers:
  1166. • A lower layer has a recurring pattern. Then, this recurring
  1167. pattern is taken out and a language is built on top of it.
  1168. • A higher layer strips away layer-specific (non-recurring)
  1169. details to focus on the recurring details.
  1170. • The recurring details are given a new and simpler language than
  1171. the languages of the lower layers.
  1172. What to realize is that every layer is just a more convenient
  1173. language to describe the lower layer. Only after a description is
  1174. fully created with the language of the higher layer is it then
  1175. implemented with the language of the lower layer.
  1176. • CMOS layer has a recurring pattern that makes sure logic gates
  1177. are reliably translated to CMOS circuits: a k-input gate uses k
  1178. PMOS and k NMOS transistors (Wakerly, 1999). Since digital
  1179. devices use CMOS exclusively, a language arose to describe
  1180. higher level ideas while hiding CMOS circuits: Logic Gates.
  1181. • Logic Gates hides the language of circuits and focuses on how
  1182. to implement primitive Boolean functions and combine them to
  1183. create new functions. All logic gates receive input and
  1184. generate output as binary numbers. Thanks to this recurring
  1185. pattern, logic gates are hidden away behind a new language:
  1186. Assembly, which is a set of predefined binary patterns that
  1187. cause the underlying gates to perform an action.
  1188. • Soon, people realized that many recurring patterns arose
  1189. within Assembly language. Repeated blocks of Assembly code
  1190. that express the same or similar ideas appear in Assembly
  1191. source files. There were many such ideas that could be reliably
  1192. translated into Assembly code. Thus, the ideas were extracted
  1193. and built into the high-level programming languages that
  1194. every programmer learns today.
  1195. Recurring patterns are the key to abstraction. Recurring patterns
  1196. are why abstraction works. Without them, no language can be
  1197. built, and thus no abstraction. Fortunately, humans have already
  1198. developed a systematic discipline for studying patterns:
  1199. Mathematics. As quoted from the British mathematician G. H. Hardy
  1200. (2005):
  1201. A mathematician, like a painter or a poet, is a maker of
  1202. patterns. If his patterns are more permanent than theirs, it is
  1203. because they are made with ideas.
  1204. Isn't a mathematical formula a representation of a pattern?
  1205. Doesn't a variable represent values with the same properties
  1206. given by constraints? Mathematics provides a formal system to
  1207. identify and describe existing patterns in nature. For that
  1208. reason, this system can certainly be applied in the digital
  1209. world, which is just a subset of the real world. Mathematics can
  1210. be used as a common language to make translation between layers
  1211. easier, and to help with the understanding of layers.
  1212. Why abstraction reduces complexity
  1213. Abstraction by building languages certainly improves productivity
  1214. by stripping away details irrelevant to a problem. Imagine writing
  1215. programs without any layer except the lowest one: circuits. This
  1216. is how complexity emerges: when high-level ideas are expressed
  1217. with a lower-level language, as the example above demonstrated.
  1218. Unfortunately, this is the case with software, as programming
  1219. languages at the moment place more emphasis on software than on
  1220. the problem domains. That is, without prior knowledge, code
  1221. written in a language is unable to express the knowledge of its
  1222. target domain by itself. In other words, a language is expressive
  1223. if its syntax is designed to express the problem domain it is
  1224. trying to solve, that is, what it will do rather than how it
  1225. will do it. Consider this example:
  1226. -------------------------------------------
  1227. Graphviz (http://www.graphviz.org/) is visualization software
  1228. that provides a language, called dot, for describing graphs:
  1229. As can be seen, the code expresses by itself how the
  1230. graph is connected. Even a non-programmer can understand and
  1231. use such a language easily. An implementation in C
  1232. would be more troublesome, and that's assuming that the
  1233. functions for drawing graphs are already available. To draw a
  1234. line, in C we might write something like:
  1235. draw_line(a, b);
  1236. However, it is still verbose compared with:
  1237. a -> b;
  1238. Also, a and b must be defined in C, compared to the implicit
  1239. nodes in the dot language. However, if we do not factor in the
  1240. verbosity, then C still has a limitation: it cannot change its
  1241. syntax to suit the problem domain. A domain-specific language
  1242. might even be more verbose, but it makes a domain more
  1243. understandable. If a problem domain must be expressed in C,
  1244. then it is constrained by the syntax of C. Since C is not a
  1245. specialized language for that problem domain, but is a
  1246. general-purpose programming language, the domain knowledge is
  1247. buried within the implementation details. As a result, a C
  1248. programmer is needed to decipher and extract the domain
  1249. knowledge out. If the domain knowledge cannot be extracted,
  1250. then the software cannot be further developed.
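The point about C needing explicit definitions can be sketched as follows. This is only an illustration: the node type and this particular draw_line() signature are hypothetical, and instead of drawing anything the function writes the equivalent dot statement into a buffer so the two notations can be compared side by side.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node type: in C, every node must be declared explicitly,
 * whereas dot creates nodes implicitly when they first appear. */
struct node {
    const char *name;
};

/*
 * Hypothetical draw_line(): instead of drawing, it writes the equivalent
 * dot statement "from -> to;" into buf. Returns the number of characters
 * the statement needs (excluding the terminating NUL), as snprintf does.
 */
int draw_line(char *buf, size_t size, struct node from, struct node to)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s -> %s;", from.name, to.name);
}
```

Calling draw_line() with two nodes named a and b fills the buffer with the one-line dot statement, but only after both nodes and the buffer have been declared. That surrounding bookkeeping is the implementation detail in which the domain knowledge gets buried.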
  1251. Linux is full of applications, such as web servers, that are
  1252. controlled by domain-specific languages whose files are placed in
  1253. the /etc directory. Instead of reprogramming the software, a
  1254. domain-specific language is made for it.
  1255. -------------------------------------------
  1256. In general, code that can express a problem domain must be
  1257. understandable by a domain expert. Even within the software
  1258. domain, building a language out of repeated programming patterns
  1259. is useful. It makes people aware of the existence of such
  1260. patterns in code and thus makes software easier to maintain, as
  1261. software structure is visible as a language. Only a programming
  1262. language that is capable of morphing itself to suit a problem
  1263. domain can achieve that goal. Such a language is called a
  1264. programmable programming language. Unfortunately, this approach
  1265. of making software structure visible is not favored among
  1266. programmers, as a new language must be made along with a new
  1267. toolchain to support it. Thus, software structure and domain
  1268. knowledge are buried within code written in the syntax of a
  1269. general-purpose language, and if a programmer is not familiar
  1270. with, or even aware of, a code pattern, then it is hopeless to
  1271. understand the code. A prime example is reading C code that
  1272. controls hardware, e.g. an operating system: if a programmer
  1273. knows absolutely nothing about hardware, then it is impossible to
  1274. read and write operating system code in C, even with 20 years of
  1275. experience writing application code in C.
  1276. With abstraction, a software engineer can also understand the
  1277. inner workings of a device without specialized knowledge of
  1278. physical circuit design, which enables the engineer to write
  1279. code that controls the device. The separation between logical and
  1280. physical implementation also entails that gate designs can be
  1281. reused even when the underlying technologies change. For
  1282. example, in some distant future biological computers could be a
  1283. reality, and gates might be implemented not as CMOS circuits but
  1284. as some kind of living cells; in either technology, electrical
  1285. or biological, as long as logic gates are physically realized,
  1286. the same computer design could be implemented.
  1287. Computer Architecture
  1288. To write lower-level code, a programmer must understand the
  1289. architecture of a computer. It is similar to writing programs
  1290. in a software framework: one must know what kinds of problems
  1291. the framework solves, and how to use the framework through
  1292. its provided software interfaces. But before getting to the
  1293. definition of what computer architecture is, we must understand
  1294. what exactly a computer is, as many people still think that a
  1295. computer is a regular computer we put on a desk, or at best, a
  1296. server. Computers come in various shapes and sizes, including
  1297. devices that people never imagine are computers, or that code
  1298. can run on.
  1299. What is a computer?
  1300. A [margin:
  1301. computer
  1302. ]computer is a hardware device that consists of at least
  1303. a processor (CPU), a memory device and input/output interfaces.
  1304. All computers can be grouped into two types:
  1305. Single-purpose computer is a computer built at the hardware
  1306. level for specific tasks. For example, dedicated application
  1307. encoders/decoders, timers, image/video/sound processors.
  1308. General-purpose computer is a computer that can be programmed
  1309. (without modifying its hardware) to emulate various features of
  1310. single-purpose computers.
  1311. Server
  1312. A [margin:
  1313. server
  1314. ]server is a general-purpose high-performance computer with huge
  1315. resources to provide large-scale services for a broad audience.
  1316. The audience is people whose personal computers are connected to
  1317. the server.
  1318. [float Figure:
  1319. [Figure 0.8:
  1320. Blade servers. Each blade server is a computer with a modular
  1321. design optimized for the use of physical space and energy. The
  1322. enclosure of blade servers is called a chassis. (Source: [https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_Servers-8055_35.jpg||Wikimedia]
  1323. , author: Victorgrigas)
  1324. ]
  1325. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Wikimedia_Foundation_Servers-8055_35.jpg>
  1326. ]
  1327. Desktop Computer
  1328. A [margin:
  1329. desktop computer
  1330. ]desktop computer is a general-purpose computer
  1331. with an input and output system designed for a human user, with
  1332. moderate resources enough for regular use. The input system
  1333. usually includes a mouse and a keyboard, while the output system
  1334. usually consists of a monitor that can display a large number of
  1335. pixels. The computer is enclosed in a chassis large enough to
  1336. hold various computer components such as a processor, a
  1337. motherboard, a power supply, a hard drive, etc.
  1338. [float Figure:
  1339. [Figure 0.9:
  1340. A typical desktop computer.
  1341. ]
  1342. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/computer-158675.svg>
  1343. ]
  1344. Mobile Computer
  1345. A [margin:
  1346. mobile computer
  1347. ]mobile computer is similar to a desktop computer with fewer
  1348. resources but can be carried around.
  1349. Game Consoles
  1350. Game consoles are similar to desktop computers but are optimized
  1351. for gaming. Instead of a keyboard and a mouse, the input system
  1352. of a game console is a game controller, which is a device with a
  1353. few buttons for controlling on-screen objects; the output system
  1354. is a television. The chassis is similar to a desktop computer but
  1355. is smaller. Game consoles use custom processors and graphics
  1356. processors, but these are similar to the ones in desktop
  1357. computers. For example, the first Xbox uses a custom Intel
  1358. Pentium III processor.
  1359. Handheld game consoles are similar to game consoles, but
  1360. incorporate both the input and output systems along with the
  1361. computer in a single package.
  1362. Embedded Computer
  1363. An [margin:
  1364. embedded computer
  1365. ]embedded computer is a single-board or
  1366. single-chip computer with limited resources designed for
  1367. integrating into larger hardware devices. [float MarginFigure:
  1368. [MarginFigure 5:
  1369. An Intel 82815 Graphics and Memory Controller Hub embedded on a
  1370. PC motherboard. (Source: [https://commons.wikimedia.org/wiki/File:Intel_82815_GMCH.jpg||Wikimedia]
  1371. , author: Qurren)
  1372. ]
  1373. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Intel_82815_GMCH.jpg>
  1374. ][float MarginFigure:
  1375. [MarginFigure 6:
  1376. A PIC microcontroller. (Source: [http://www.microchip.com/wwwproducts/en/PIC18F4620||Microchip]
  1377. )
  1378. ]
  1379. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/medium-PIC18F4620-PDIP-40.png>
  1380. ]
  1381. A [margin:
  1382. microcontroller
  1383. ]microcontroller is an embedded computer designed
  1384. for controlling other hardware devices. A microcontroller is
  1385. mounted on a chip. Microcontrollers are general-purpose
  1386. computers, but with limited resources, so they are only able to
  1387. perform one or a few specialized tasks. These computers are used
  1388. for a single purpose, but they are still general-purpose since it
  1389. is possible to program them to perform different tasks, depending
  1390. on the requirements, without changing the underlying hardware.
  1391. Another type of embedded computer is system-on-chip. A
  1392. system-on-chip is a full computer on a single chip.
  1393. Though a microcontroller is housed on a chip, its purpose is
  1394. different: to control some hardware. A microcontroller is usually
  1395. simpler and more limited in hardware resources as it specializes
  1396. only in one purpose when running, whereas a system-on-chip is a
  1397. general-purpose computer that can serve multiple purposes. A
  1398. system-on-chip can run like a regular desktop computer that is
  1399. capable of loading an operating system and running various
  1400. applications. A system-on-chip is typically present in a
  1401. smartphone or tablet, such as the Apple A5 SoC used in the iPad 2
  1402. and iPhone 4S, or Qualcomm Snapdragon used in many Android phones.[float MarginFigure:
  1403. [MarginFigure 7:
  1404. Apple A5 SoC
  1405. ]
  1406. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/128px-Apple_A5_Chip.jpg>
  1407. ]
  1408. Be it a microcontroller or a system-on-chip, there must be an
  1409. environment where these devices can connect to other devices.
  1410. This environment is a circuit board called a PCB (Printed
  1411. Circuit Board). A printed circuit board
  1412. is a physical board that contains lines and pads to enable
  1413. electric current to flow between electrical and electronic
  1414. components. Without a PCB, devices cannot be combined to create a
  1415. larger device. As long as these devices are hidden inside a
  1416. larger device and contribute to a larger device that operates at
  1417. a higher-level layer for a higher-level purpose, they are
  1418. embedded devices. Writing a program for an embedded device is
  1419. therefore called embedded programming. Embedded
  1420. computers are used in automatically controlled devices including
  1421. power tools, toys, implantable medical devices, office machines,
  1422. engine control systems, appliances, remote controls and other
  1423. types of embedded systems.
  1424. The line between a microcontroller and a system-on-chip is
  1425. blurry. If hardware keeps growing more powerful, then a
  1426. microcontroller can get enough resources to run a minimal
  1427. operating system on it for multiple specialized purposes. In
  1428. contrast, a system-on-chip is powerful enough to handle the job
  1429. of a microcontroller. However, using a system-on-chip as a
  1430. microcontroller would not be a wise choice: not only does the
  1431. price rise significantly, but hardware resources are also wasted,
  1432. since the software written for a microcontroller requires little
  1433. computing resources.
  1434. Field Programmable Gate Array
  1435. [margin:
  1436. Field Programmable Gate Array
  1437. ]Field Programmable Gate Array (FPGA)
  1438. is a hardware array of reconfigurable gates that makes
  1439. circuit structure programmable after it is shipped away from the
  1440. factory[footnote:
  1441. This is why it is called Field Programmable Gate Array. It is
  1442. changeable “in the field” where it is applied.
  1443. ]. Recall that in the previous chapter, each 74HC00 chip can be
  1444. configured as a gate, and a more sophisticated device can be
  1445. built by combining multiple 74HC00 chips. In a similar manner,
  1446. each FPGA device contains thousands of units called logic blocks,
  1447. each of which is more complicated than a 74HC00 chip and can be
  1448. configured to implement a Boolean logic function. These logic
  1449. blocks can be chained together to create a high-level hardware
  1450. feature. This high-level feature is usually a dedicated algorithm
  1451. that needs high-speed processing.
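To make the idea of a configurable logic block concrete, here is a minimal software sketch in C. It assumes, purely for illustration, that a logic block is a 2-input lookup table whose 4 configuration bits form a truth table; the names logic_block, logic_block_configure and logic_block_eval are hypothetical, and real logic blocks have more inputs and extra circuitry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model of a configurable logic block (an assumption for illustration):
 * a 2-input lookup table. The 4 configuration bits are the truth table;
 * bit number ((a << 1) | b) holds the output for inputs a and b.
 */
struct logic_block {
    uint8_t config; /* only the low 4 bits are used */
};

/* Truth tables for a few Boolean functions, indexed as described above. */
enum {
    LUT_AND  = 0x8, /* output 1 only for a=1, b=1 (bit 3) */
    LUT_OR   = 0xE, /* output 1 for every input except a=0, b=0 */
    LUT_XOR  = 0x6, /* output 1 when a != b (bits 1 and 2) */
    LUT_NAND = 0x7  /* inverse of AND */
};

/* "Reprogram" the block by loading a new truth table. */
void logic_block_configure(struct logic_block *lb, uint8_t truth_table)
{
    assert(lb != NULL);
    lb->config = truth_table & 0xF;
}

/* Evaluate the block: look up the output bit selected by the inputs. */
int logic_block_eval(const struct logic_block *lb, int a, int b)
{
    assert(lb != NULL);
    unsigned index = ((unsigned)(a != 0) << 1) | (unsigned)(b != 0);
    return (lb->config >> index) & 1;
}
```

The same block behaves as an AND gate, an XOR gate or a NAND gate depending only on the configuration loaded into it, which is the sense in which the circuit structure is programmable after it leaves the factory.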
  1452. [float Figure:
  1453. [Figure 0.10:
  1454. FPGA Architecture (Source: [http://www.ni.com/tutorial/6097/en/||National Instruments]
  1455. )
  1456. ]
  1457. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/fpga_400x212.jpg>
  1458. ]
  1459. Digital devices can be designed by combining logic gates, without
  1460. regard to actual circuit components, since the physical circuits
  1461. are just multiple CMOS circuits. Digital hardware, including
  1462. various components in a computer, is designed by writing code,
  1463. as a regular programmer does, using a language to describe how
  1464. gates are wired together. This language is called a Hardware
  1465. Description Language. Later the
  1466. hardware description is compiled to a description of connected
  1467. electronic components called a netlist, which is a more
  1468. detailed description of how gates are connected.
  1469. The difference between FPGA and other embedded computers is that
  1470. programs in FPGA are implemented at the digital logic level,
  1471. while programs in embedded computers like microcontrollers or
  1472. system-on-chip devices are implemented at assembly code level. An
  1473. algorithm written for an FPGA device is a description of the
  1474. algorithm in logic gates; the FPGA device then follows the
  1475. description to configure itself to run the algorithm. An
  1476. algorithm written for a microcontroller is in assembly
  1477. instructions that a processor can understand and act on.
  1478. FPGA is applied in cases where the specialized operations are
  1479. unsuitable or costly to run on a regular computer, such as
  1480. real-time medical image processing, cruise control systems,
  1481. circuit prototyping, video encoding/decoding, etc. These
  1482. applications require high-speed processing that is not achievable
  1483. with a regular processor because a processor spends a significant
  1484. amount of time executing many non-specialized instructions -
  1485. which might add up to thousands of instructions or more - to
  1486. implement a specialized operation, and thus uses more circuits at
  1487. the physical level to carry out the same operation. An FPGA
  1488. device carries no such overhead; instead, it runs a single
  1489. specialized operation implemented in hardware directly.
  1490. Application-Specific Integrated Circuit
  1491. An Application-Specific
  1492. Integrated Circuit (or ASIC) is a chip designed for a
  1493. particular purpose rather than for general-purpose use. An ASIC
  1494. does not contain a generic array of logic blocks that can be
  1495. reconfigured to adapt to any operation like an FPGA; instead,
  1496. every logic block in an ASIC is made and optimized for the
  1497. circuit itself. An FPGA can be considered the prototyping stage
  1498. of an ASIC, and an ASIC the final stage of circuit production.
  1499. An ASIC is even more specialized than an FPGA, so it can achieve
  1500. even higher performance. However, ASICs are very costly to
  1501. manufacture, and once the circuits are made, if design errors are
  1502. found, everything is thrown away, unlike FPGA devices, which can
  1503. simply be reprogrammed because of the generic gate array.
  1504. Computer Architecture
  1505. The previous section examined various classes of computers.
  1506. Regardless of shapes and sizes, every computer is designed
  1507. according to an architecture, from high level to low level.
  1508. Computer Architecture = Instruction Set Architecture + Computer Organization + Hardware
  1509. At the highest level is the Instruction Set Architecture.
  1510. At the middle level is the Computer Organization.
  1511. At the lowest level is the Hardware.
  1512. Instruction Set Architecture
  1513. An instruction set is the basic set of commands
  1514. and instructions that a microprocessor understands and can carry
  1515. out.
  1516. An Instruction Set Architecture, or
  1517. ISA, is the design of an environment that implements an
  1518. instruction set. Essentially, it is a runtime environment similar
  1519. to the interpreters of high-level languages. The design includes
  1520. all the instructions, registers, interrupts, memory models (how
  1521. memory is arranged to be used by programs), addressing modes,
  1522. I/O... of a CPU. The more features (e.g. more instructions) a CPU
  1523. has, the more circuits are required to implement it.
  1524. Computer organization
  1525. [margin:
  1526. Computer organization
  1527. ]Computer organization is the functional
  1528. view of the design of a computer. In this view, hardware
  1529. components of a computer are presented as boxes with inputs and
  1530. outputs that connect to each other and form the design of a
  1531. computer. Two computers may have the same ISA, but different
  1532. organizations. For example, both AMD and Intel processors
  1533. implement x86 ISA, but the hardware components of each processor
  1534. that make up the environments for the ISA are not the same.
  1535. Computer organizations may vary depending on a manufacturer's
  1536. design, but they all originate from the Von Neumann
  1537. architecture[footnote:
  1538. John von Neumann was a mathematician and physicist who invented a
  1539. computer architecture.
  1540. ]:
  1541. [float Figure:
  1542. [Figure 0.11:
  1543. Von-Neumann Architecture
  1544. ]
  1545. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/von_neumann_architecture.pdf>
  1546. ]
  1547. CPU fetches instructions continuously from main memory and
  1548. executes them.
  1549. Memory stores program code and data.
  1550. Bus consists of electrical wires for sending raw bits between
  1551. the above components.
  1552. I/O Devices are devices that give input to a
  1553. computer, e.g. keyboard, mouse, sensor... and take the output
  1554. from a computer, e.g. a monitor takes information sent from the
  1555. CPU to display it, an LED turns on/off according to a pattern
  1556. computed by the CPU...
  1557. The Von Neumann computer operates by storing its instructions in
  1558. main memory, and the CPU repeatedly fetches those instructions
  1559. into its internal storage for execution, one after another. Data
  1560. are transferred through a data bus between CPU, memory and I/O
  1561. devices, and the location to store them in is sent through
  1562. the address bus by the CPU. This architecture completely
  1563. implements the fetch -- decode -- execute
  1564. cycle.
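The fetch -- decode -- execute cycle can be sketched as a small loop in C. The machine below is a made-up toy, not x86: the opcodes, the two-byte instruction format and the names toy_cpu and toy_run are all assumptions chosen to keep the sketch short. Note that the program and its data sit in the same memory array, as in a Von Neumann machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte opcodes of a toy machine (not x86). */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

struct toy_cpu {
    uint8_t pc;  /* program counter: address of the next instruction */
    uint8_t acc; /* accumulator register */
};

/*
 * Run until OP_HALT. Each instruction is two bytes: an opcode followed by
 * an operand address. Code and data live in the same memory array.
 * Returns the number of instructions executed (including the HALT),
 * or -1 if an unknown opcode is fetched.
 */
int toy_run(struct toy_cpu *cpu, uint8_t *mem, size_t mem_size)
{
    int executed = 0;
    assert(cpu != NULL && mem != NULL);
    for (;;) {
        /* Fetch: read the opcode and operand at the program counter. */
        assert((size_t)cpu->pc + 1 < mem_size);
        uint8_t opcode = mem[cpu->pc];
        uint8_t addr = mem[cpu->pc + 1];
        assert(addr < mem_size);
        cpu->pc += 2;
        executed++;

        /* Decode the opcode and execute the matching operation. */
        switch (opcode) {
        case OP_LOAD:  cpu->acc = mem[addr];  break;
        case OP_ADD:   cpu->acc += mem[addr]; break;
        case OP_STORE: mem[addr] = cpu->acc;  break;
        case OP_HALT:  return executed;
        default:       return -1;
        }
    }
}
```

For example, a 16-byte memory holding the program LOAD 10, ADD 11, STORE 12, HALT, with the values 2 and 3 at addresses 10 and 11, ends with the sum stored at address 12. If the program counter were pointed at the data bytes instead, the same loop would try to interpret them as instructions.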
  1565. Earlier computers were just exact implementations of the
  1566. Von Neumann architecture, with CPU, memory and I/O devices
  1567. communicating through the same bus. Today, a computer has more
  1568. buses, each specialized in a type of traffic. However, at the
  1569. core, they are still Von Neumann architecture. To write an OS for
  1570. a Von Neumann computer, a programmer needs to be able to
  1571. understand and write code that controls the core components:
  1572. CPU, memory, I/O devices, and bus.
  1573. CPU, or Central Processing Unit, is the
  1574. heart and brain of any computer system. Understanding a CPU is
  1575. essential to writing an OS from scratch:
  1576. • To use other devices, a programmer needs to control the CPU to
  1577. use the programming interfaces of those devices. CPU is the
  1578. only way, as CPU is the only device a programmer can use
  1579. directly and the only device that understands code written by a
  1580. programmer.
  1581. • In a CPU, many OS concepts are already implemented directly in
  1582. hardware, e.g. task switching, paging. A kernel programmer
  1583. needs to know how to use the hardware features, to avoid
  1584. duplicating such concepts in software, thus wasting computer
  1585. resources.
  1586. • CPU built-in OS features boost both OS performance and
  1587. developer productivity because those features are actual
  1588. hardware, the lowest possible level, and developers are freed
  1589. from implementing such features.
  1590. • To effectively use the CPU, a programmer needs to understand
  1591. the documentation provided by the CPU manufacturer. For example, [http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html||Intel® 64 and IA-32 Architectures Software Developer Manuals]
  1592. .
  1593. • After understanding one CPU architecture well, it is easier to
  1594. learn other CPU architectures.
  1595. A CPU is an implementation of an ISA, effectively the
  1596. implementation of an assembly language (and depending on the CPU
  1597. architecture, the language may vary). Assembly language is one of
  1598. the interfaces that are provided for software engineers to
  1599. control a CPU, and thus control a computer. But how can every
  1600. computer device be controlled with only the access to the CPU?
  1601. The simple answer is that a CPU can communicate with other
  1602. devices through these two interfaces, thus commanding them:
  1603. Registers [margin:
  1604. Registers
  1605. ]are hardware components for high-speed data access and
  1606. communication with other hardware devices. Registers allow
  1607. software to control hardware directly by writing to registers
  1608. of a device, or receive information from a hardware device by
  1609. reading from registers of the device.
  1610. Not all registers are used for communication with other
  1611. devices. In a CPU, most registers are used as high-speed
  1612. storage for temporary data. Other devices that a CPU can
  1613. communicate with always have a set of registers for interfacing
  1614. with the CPU.
  1615. Port [margin:
  1616. Port
  1617. ]is a specialized register in a hardware device used for
  1618. communication with other devices. When data are written to a
  1619. port, it causes a hardware device to perform some operation
  1620. according to the values written to the port. The difference
  1621. between a port and a register is that a port does not store data,
  1622. but delegates data to some other circuit.
  1623. These two interfaces are extremely important, as they are the
  1624. only interfaces for controlling hardware with software. Writing
  1625. device drivers is essentially learning the functionality of each
  1626. register and how to use them properly to control the device.
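The difference between a register and a port can be illustrated with a small software model. Everything here is hypothetical: the 8-LED device, its command values and the led_* function names are invented for this sketch, and real hardware is accessed through memory-mapped addresses or I/O instructions rather than a C struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: an 8-LED panel (not a real piece of hardware). */
struct led_device {
    uint8_t pattern_reg; /* register: stores the pattern to display */
    uint8_t leds;        /* internal state: which LEDs are currently lit */
};

/* Hypothetical command values accepted by the device's port. */
enum { CMD_SHOW = 1, CMD_CLEAR = 2 };

/* Writing to the register only stores the value. */
void led_write_reg(struct led_device *dev, uint8_t value)
{
    assert(dev != NULL);
    dev->pattern_reg = value;
}

/* Reading the register returns whatever was stored last. */
uint8_t led_read_reg(const struct led_device *dev)
{
    assert(dev != NULL);
    return dev->pattern_reg;
}

/*
 * Writing to the port stores nothing: the value is handed to the
 * device's circuitry, which performs an operation in response.
 */
void led_write_port(struct led_device *dev, uint8_t command)
{
    assert(dev != NULL);
    if (command == CMD_SHOW)
        dev->leds = dev->pattern_reg;
    else if (command == CMD_CLEAR)
        dev->leds = 0;
    /* Unknown commands are ignored in this sketch. */
}
```

A driver for this imaginary device would write a pattern to the register, then write CMD_SHOW to the port to make the LEDs change. The register keeps its value afterwards, while the command written to the port cannot be read back.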
  1627. [margin:
  1628. Memory
  1629. ]Memory is a storage device that stores information. Memory
  1630. consists of many cells. Each cell is a byte with its own address
  1631. number, so a CPU can use such an address number to access an exact
  1632. location in memory. Memory is where software instructions (in the
  1633. form of machine language) are stored and retrieved to be executed
  1634. by the CPU; memory also stores data needed by software. Memory
  1635. in a Von Neumann machine does not distinguish between which bytes
  1636. are data and which bytes are software instructions. It's up to
  1637. the software to decide, and if somehow data bytes are fetched and
  1638. executed as instructions, the CPU still executes them if such
  1639. bytes represent valid instructions, but will produce undesirable
  1640. results. To a CPU, there's no code and data; both are merely
  1641. different types of data for it to act on: one tells it how to do
  1642. something in a specific manner, and the other is the material
  1643. necessary for it to carry out such an action.
  1644. The RAM is controlled by a device called a memory controller.
  1645. Currently, most processors have this device embedded, so the
  1646. CPU has a dedicated memory bus connecting the processor to the
  1647. RAM. On older CPUs[footnote:
  1648. CPUs produced prior to 2009
  1649. ], however, this device was located in a chip also known as MCH
  1650. or Memory Controller Hub. In this case, the
  1651. CPU does not communicate directly with the RAM, but with the MCH
  1652. chip, and this chip then accesses the memory to read or write
  1653. data. The first option provides better performance since there is
  1654. no middleman in the communications between the CPU and the
  1655. memory.
  1656. At the physical level, RAM is implemented as a grid of cells that
  1657. each contain a transistor and an electrical device called a [margin:
  1658. capacitor
  1659. ]capacitor, which stores charge for short periods of
  1660. time. The transistor controls access to the capacitor; when
  1661. switched on, it allows a small charge to be read from or written
  1662. to the capacitor. The charge on the capacitor slowly dissipates,
  1663. requiring the inclusion of a refresh circuit to periodically read
  1664. values from the cells and write them back after amplification
  1665. from an external power source.
  1666. [margin:
  1667. Bus
  1668. ]Bus is a subsystem that transfers data between computer
  1669. components or between computers. Physically, buses are just
  1670. electrical wires that connect all components together, and each
  1671. wire transfers a single bit of data. The total number of wires is
  1672. called [margin:
  1673. bus width
  1674. ]bus width, and is dependent on how many wires a CPU can support.
  1675. If a CPU can only accept 16 bits at a time, then the bus has 16
  1676. wires connecting from a component to the CPU, which means the CPU
  1677. can only retrieve 16 bits of data at a time.
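The effect of bus width can be worked out with simple arithmetic. The helper below is a sketch under a simplifying assumption: every transfer moves exactly one bus width of bits, and a partially filled transfer still counts as a whole one. The name bus_transfers is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bus transfers needed to move data_bits over a bus that is
 * bus_width_bits wide, assuming one full-width transfer at a time.
 */
size_t bus_transfers(size_t data_bits, size_t bus_width_bits)
{
    assert(bus_width_bits > 0);
    /* Round up: a partially filled transfer still costs one transfer. */
    return (data_bits + bus_width_bits - 1) / bus_width_bits;
}
```

With a 16-bit bus, retrieving a 64-bit value takes 4 transfers, while a 64-bit bus moves the same value in a single transfer.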
  1678. Hardware
  1679. Hardware is a specific implementation of a computer. A line of
  1680. processors implements the same instruction set architecture and
  1681. uses nearly identical organizations, but its models differ in
  1682. hardware implementation. For example, the Core i7 family provides
  1683. a model for desktop computers that is more powerful but consumes
  1684. more energy, while another model for laptops is less performant
  1685. but more energy efficient. To write software for a hardware
  1686. device, we seldom need to understand the hardware implementation
  1687. if documents are available. Computer organization and especially
  1688. the instruction set architecture are more relevant to an
  1689. operating system programmer. For that reason, the next chapter is
  1690. devoted to studying the x86 instruction set architecture in depth.
  1691. x86 architecture
  1692. A chipset is a chip with multiple functions. Historically,
  1693. a chipset was actually a set of individual chips, each
  1694. responsible for a function, e.g. memory controller, graphics
  1695. controller, network controller, power controller, etc. As
  1696. hardware progressed, the set of chips was incorporated into a
  1697. single chip, which is more space, energy, and cost efficient. In a
  1698. desktop computer, various hardware devices are connected to each
  1699. other through a PCB called a motherboard. Each CPU
  1700. needs a compatible motherboard that can host it. Each motherboard
  1701. is defined by its chipset model, which determines the environment
  1702. that a CPU can control. This environment typically consists of
  1703. • one or more slots for CPUs
  1704. • a chipset of two chips, which are the Northbridge and
  1705. Southbridge chips
  1706. – Northbridge chip is responsible for the high-performance
  1707. communication between CPU, main memory and the graphics card.
  1708. – Southbridge chip is responsible for the communication with
  1709. I/O devices and other devices that are not performance
  1710. sensitive.
  1711. • slots for memory sticks
  1712. • one or more slots for graphics cards.
  1713. • generic slots for other devices, e.g. network card, sound card.
  1714. • ports for I/O devices, e.g. keyboard, mouse, USB.
  1715. [float Figure:
  1716. [Figure 0.12:
  1717. Motherboard organization.
  1718. ]<mobo-organization>
  1719. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Motherboard_diagram.svg>
  1720. ]
  1721. To write a complete operating system, a programmer needs to
  1722. understand how to program these devices. After all, an operating
  1723. system manages hardware automatically to free application
  1724. programs from doing so. However, of all the components, learning
  1725. to program the CPU is the most important, as it is the component
  1726. present in any computer, regardless of what type the computer is.
  1727. For this reason, the primary focus of this book will be on how to
  1728. program an x86 CPU. Even when focusing solely on this device, a
  1729. reasonably good minimal operating system can be written. The
  1730. reason is that not all computers include all the devices found in
  1731. a normal desktop computer. For example, an embedded computer might
  1732. only have a CPU and limited internal memory, with pins for
  1733. getting input and producing output; yet, operating systems
  1734. have been written for such devices.
  1735. However, learning how to program an x86 CPU is a daunting task,
  1736. with 3 primary manuals written for it: almost 500 pages for
  1737. volume 1, over 2000 pages for volume 2 and over 1000 pages for
  1738. volume 3. It is an impressive feat for a programmer to master
  1739. every aspect of x86 CPU programming.
  1740. Intel Q35 Chipset
  1741. Q35 is an Intel chipset released in September 2007. Q35 is used as
  1742. an example of a high-level computer organization because later we
  1743. will use QEMU to emulate a Q35 system, which is the latest Intel
  1744. system that QEMU can emulate. Though released in 2007, Q35 is
  1745. relatively modern compared with current hardware, and the knowledge
  1746. can still be reused for current chipset models. With a Q35 chipset,
  1747. the emulated CPU is also relatively up-to-date with features
  1748. present in current-day CPUs, so we can use the latest software
  1749. manuals from Intel.
  1750. Figure [mobo-organization] is a typical current-day motherboard
  1751. organization, and Q35 shares a similar organization.
  1752. x86 Execution Environment
  1753. An execution environment is an environment
  1754. that provides the facility to make code executable. The execution
  1755. environment needs to address the following questions:
  1756. • Supported operations? data transfer, arithmetic, control,
  1757. floating-point...
  1758. • Where are operands stored? registers, memory, stack,
  1759. accumulator
  1760. • How many explicit operands are there for each instruction? 0,
  1761. 1, 2, or 3
  1762. • How is the operand location specified? register, immediate,
  1763. indirect, . . .
  1764. • What type and size of operands are supported? byte, int, float,
  1765. double, string, vector...
  1766. • etc.
  1767. For the remainder of this chapter, please continue by reading
  1768. chapter 3 in Intel Manual Volume 1, “Basic Execution Environment”.
  1770. x86 Assembly and C
  1771. In this chapter, we will explore assembly language, and how it
  1772. connects to C. But why should we do so? Isn't it better to trust
  1773. the compiler, plus no one writes assembly anymore?
  1774. Not quite. Surely, the compiler at its current state of the art
  1775. is trustworthy, and we do not need to write code in assembly,
  1776. most of the time. A compiler can generate code, but as mentioned
  1777. previously, a high-level language is a collection of patterns of
  1778. a lower-level language. It does not cover everything that a
  1779. hardware platform provides. As a consequence, not every assembly
  1780. instruction can be generated by a compiler, so we still need to
  1781. write assembly code for these circumstances to access
  1782. hardware-specific features. Since hardware-specific features
  1783. require writing assembly code, debugging requires reading it. We
  1784. might spend even more time reading than writing. Working with
  1785. low-level code that interacts directly with hardware, assembly
  1786. code is unavoidable. Also, understanding how a compiler generates
  1787. assembly code could improve a programmer's productivity. For
  1788. example, if a job or school assignment requires us to write
  1789. assembly code, we can simply write it in C, then let gcc do the
  1790. hard work of writing the assembly code for us. We merely
  1791. collect the generated assembly code, modify as needed and be done
  1792. with the assignment.
  1793. We will learn objdump extensively, along with how to use Intel
  1794. documents to aid in understanding x86 assembly code.
  1795. objdump
  1796. objdump is a program that displays information about
  1797. object files. It will be handy later to debug incorrect layout
  1798. from manual linking. Now, we use objdump to examine how high-level
  1799. source code maps to assembly code. For now, we ignore the
  1800. output and learn how to use the command first. It is simple to
  1801. use objdump:
  1802. $ objdump -d hello
  1803. The -d option only displays the assembled contents of executable
  1804. sections. A section is a block of memory that contains
  1805. either program code or data. A code section is executable by the
  1806. CPU, while a data section is not executable. Non-executable
  1807. sections, such as .data and .bss (for storing program data),
  1808. debug sections, etc. are not displayed. We will learn more about
  1809. sections when studying the ELF binary file format in chapter [chap:The-Anatomy-of-a-program].
  1810. On the other hand:
  1811. $ objdump -D hello
  1812. where the -D option displays the assembly contents of all sections.
  1813. If -D is given, -d is implicitly assumed. objdump is mostly used for
  1814. inspecting assembly code, so -d is the option we will use most
  1815. often.
  1816. The output overruns the terminal screen. To make it easier to
  1817. read, send all the output to less:
  1818. $ objdump -d hello | less
  1819. To intermix source code and assembly, the binary must be compiled
  1820. with the -g option to include debugging information, then add the -S option:
  1821. $ objdump -S hello | less
  1822. The default syntax used by objdump is AT&T syntax. To change it
  1823. to the familiar Intel syntax:
  1824. $ objdump -M intel -D hello | less
  1825. When using the -M option, the -D or -d option must be explicitly
  1826. supplied. Next, we will use objdump to examine how compiled C
  1827. data and code are represented in machine code.
  1828. Finally, we will write a 32-bit kernel, so we will need to
  1829. compile a 32-bit binary and examine it in 32-bit mode:
  1830. $ objdump -M i386,intel -D hello | less
  1831. -M i386 tells objdump to display assembly content using 32-bit
  1832. layout. Knowing the difference between 32-bit and 64-bit is
  1833. crucial for writing kernel code. We will examine this matter
  1834. later on when writing our kernel.
  1835. Reading the output
  1836. The start of the output displays the file format of the object
  1837. file:
  1838. hello: file format elf64-x86-64
  1839. After the line is a series of disassembled sections:
  1840. Disassembly of section .interp:
  1841. ...
  1842. Disassembly of section .note.ABI-tag:
  1843. ...
  1844. Disassembly of section .note.gnu.build-id:
  1845. ...
  1846. ...
  1847. etc
  1848. Finally, each disassembled section displays its actual content -
  1849. which is a sequence of assembly instructions - with the following
  1850. format:
  1851. 4004d6: 55 push rbp
  1852. • The first column is the address of an assembly instruction. In
  1853. the above example, the address is 0x4004d6.
  1854. • The second column is the assembly instruction in raw hex values.
  1855. In the above example, the value is 0x55.
  1856. • The third column is the assembly instruction. Depending on the
  1857. section, the assembly instruction might be meaningful or
  1858. meaningless. For example, if the assembly instructions are in a
  1859. .text section, then the assembly instructions are actual
  1860. program code. On the other hand, if the assembly instructions
  1861. are displayed in a .data section, then we can safely ignore the
  1862. displayed instructions. The reason is that objdump doesn't know
  1863. which hex values are code and which are data, so it blindly
  1864. translates every hex value into assembly instructions. In the
  1865. above example, the assembly instruction is push rbp.
  1866. • The optional fourth column is a comment, which appears when
  1867. there is a reference to an address, to inform where the address
  1868. originates. For example, the comment after the # character:
  1869.     lea r12,[rip+0x2008ee] # 600e10
  1870. <__frame_dummy_init_array_entry>
  1871. informs us that the address referenced by [rip+0x2008ee] is
  1872. 0x600e10, where the variable __frame_dummy_init_array_entry
  1873. resides.
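The arithmetic behind such a comment can be sketched in C. In 64-bit code, a [rip+disp] operand is resolved relative to the address of the instruction that follows the one being executed. That next-instruction address is not shown in the excerpt above, so the value 0x400522 used in the usage note below is only what the comment's own arithmetic implies (0x600e10 - 0x2008ee), and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: resolve a RIP-relative operand such as
 * [rip+0x2008ee]. next_ip is the address of the instruction that
 * FOLLOWS the one being decoded; disp is the signed 32-bit
 * displacement stored in the instruction. */
uint64_t rip_relative_target(uint64_t next_ip, int32_t disp)
{
    /* The displacement is sign-extended to 64 bits, then added. */
    return next_ip + (uint64_t)(int64_t)disp;
}
```

Under that assumption, rip_relative_target(0x400522, 0x2008ee) evaluates to 0x600e10, the address objdump prints in the comment.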
  1874. A disassembled section may also contain labels. A label is
  1875. a name given to an assembly instruction. The label denotes the
  1876. purpose of an assembly block to a human reader, to make it easier
  1877. to understand. For example, the .text section carries many such
  1878. labels to denote where code in a program starts; the .text section
  1879. below carries two functions: _start and deregister_tm_clones. The
  1880. _start function starts at address 4003e0, which is annotated to the
  1881. left of the function name. Right below the _start label is the
  1882. instruction at address 4003e0. This whole thing means that a
  1883. label is simply a name for a memory address. The function
  1884. deregister_tm_clones follows the same format, as does every
  1885. function in the section.
  1886. 00000000004003e0 <_start>:
  1887. 4003e0: 31 ed xor ebp,ebp
  1888. 4003e2: 49 89 d1 mov r9,rdx
  1889. 4003e5: 5e pop rsi
  1890. ...more assembly code....
  1891. 0000000000400410 <deregister_tm_clones>:
  1892. 400410: b8 3f 10 60 00 mov eax,0x60103f
  1893. 400415: 55 push rbp
  1894. 400416: 48 2d 38 10 60 00 sub rax,0x601038
  1895. ...more assembly code....
  1896. Intel manuals
  1897. The best way to understand and use assembly language properly is
  1898. to understand precisely the underlying computer architecture and
  1899. what each machine instruction does. To do so, the most reliable
  1900. source is the documentation provided by vendors. After all,
  1901. hardware vendors are the ones who make their machines. To
  1902. understand Intel's instruction set, we need the document “Intel
  1903. 64 and IA-32 architectures software developer's manual combined
  1904. volumes 2A, 2B, 2C, and 2D: Instruction set reference, A-Z”. The
  1905. document can be retrieved here: https://software.intel.com/en-us/articles/intel-sdm
  1906. .
  1907. • Chapter 1 provides brief information about the manual, and the
  1908. notational conventions used in the book.
  1909. • Chapter 2 provides an in-depth explanation of the anatomy of an
  1910. assembly instruction, which we will investigate in the next
  1911. section.
  1912. • Chapters 3 to 5 provide the details of every instruction of the
  1913. x86_64 architecture.
  1914. • Chapter 6 provides information about safer mode extensions. We
  1915. won't need to use this chapter.
  1916. The first volume “Intel® 64 and IA-32 Architectures Software
  1917. Developer’s Manual Volume 1: Basic Architecture” describes the
  1918. basic architecture and programming environment of Intel
  1919. processors. In the book, Chapter 5 gives the summary of all Intel
  1920. instructions, by listing instructions into different categories.
  1921. We only need to learn the general-purpose instructions listed in
  1922. chapter 5.1 for our OS. Chapter 7 describes the purpose of each category.
  1923. Gradually, we will learn all of these instructions.
  1924. Read section 1.3 in volume 2, excluding sections 1.3.5 and 1.3.7.
  1925. Experiment with assembly code
  1926. The subsequent sections examine the anatomy of an assembly
  1927. instruction. To understand it fully, it is necessary to write code
  1928. and see the code in its actual form displayed as hex numbers. For
  1929. this purpose, we use the nasm assembler to write a few lines of
  1930. assembly code and see the generated code.
  1931. -------------------------------------------
  1932. Suppose we want to see the machine code generated for this
  1933. instruction:
  1934. jmp eax
  1935. Then, we use an editor, e.g. Emacs, to create a new file,
  1936. write the code and save it, e.g. as test.asm. Then, in
  1937. the terminal, run the command:
  1938. $ nasm -f bin test.asm -o test
  1939. The -f option specifies the file format, e.g. ELF, of the final
  1940. output file. But in this case, the format is bin, which means
  1941. this file is just a flat binary output without any extra
  1942. information. That is, the written assembly code is translated
  1943. to machine code as is, without the overhead of the metadata
  1944. from a file format like ELF. Indeed, after assembling, we can
  1945. examine the output using this command:
  1946. $ hd test
  1947. hd (short for hexdump) is a program that displays the content
  1948. of a file in hex format[margin:
  1949. Though its name is short for hexdump, hd can display in different
  1950. bases, e.g. binary, other than hex.
  1951. ]. We get the following output:
  1952. 00000000 66 ff e0 |f..|
  1953. 00000003
  1954. The file only consists of 3 bytes: 66 ff e0, which is
  1955. equivalent to the instruction jmp eax.
  1956. -------------------------------------------
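To connect the bytes in the file with what hd prints, here is a minimal C sketch that formats a byte buffer as space-separated hex pairs. It imitates only the middle column of hd's output, not the offset or the ASCII column, and the function name and interface are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of buf as space-separated
 * two-digit hex values into out, e.g. "66 ff e0".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small. */
int format_hex_bytes(const unsigned char *buf, size_t n,
                     char *out, size_t out_size)
{
    size_t pos = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* Two hex digits, plus one separating space before every
         * byte except the first, plus room for the final NUL. */
        size_t need = (i == 0) ? 2 : 3;
        if (pos + need + 1 > out_size)
            return -1;
        if (i > 0)
            out[pos++] = ' ';
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                "%02x", (unsigned)buf[i]);
    }
    return (int)pos;
}
```

Given the three bytes of the flat binary above, the buffer ends up holding the same "66 ff e0" that hd shows.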
  1957. If we were to use elf as the file format:
  1958. $ nasm -f elf test.asm -o test
  1959. it would be more challenging to learn and understand assembly
  1960. instructions with all the added noise[footnote:
  1961. The output from hd.
  1962. ]:
  1963. 00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  1964. |.ELF............|
  1965. 00000010 01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00
  1966. |................|
  1967. 00000020 40 00 00 00 00 00 00 00 34 00 00 00 00 00 28 00
  1968. |@.......4.....(.|
  1969. 00000030 05 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
  1970. |................|
  1971. 00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  1972. |................|
  1973. *
  1974. 00000060 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00
  1975. |................|
  1976. 00000070 06 00 00 00 00 00 00 00 10 01 00 00 02 00 00 00
  1977. |................|
  1978. 00000080 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00
  1979. |................|
  1980. 00000090 07 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00
  1981. |................|
  1982. 000000a0 20 01 00 00 21 00 00 00 00 00 00 00 00 00 00 00 |
  1983. ...!...........|
  1984. 000000b0 01 00 00 00 00 00 00 00 11 00 00 00 02 00 00 00
  1985. |................|
  1986. 000000c0 00 00 00 00 00 00 00 00 50 01 00 00 30 00 00 00
  1987. |........P...0...|
  1988. 000000d0 04 00 00 00 03 00 00 00 04 00 00 00 10 00 00 00
  1989. |................|
  1990. 000000e0 19 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00
  1991. |................|
  1992. 000000f0 80 01 00 00 0d 00 00 00 00 00 00 00 00 00 00 00
  1993. |................|
  1994. 00000100 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  1995. |................|
  1996. 00000110 ff e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  1997. |................|
  1998. 00000120 00 2e 74 65 78 74 00 2e 73 68 73 74 72 74 61 62
  1999. |..text..shstrtab|
  2000. 00000130 00 2e 73 79 6d 74 61 62 00 2e 73 74 72 74 61 62
  2001. |..symtab..strtab|
  2002. 00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  2003. |................|
  2004. *
  2005. 00000160 01 00 00 00 00 00 00 00 00 00 00 00 04 00 f1 ff
  2006. |................|
  2007. 00000170 00 00 00 00 00 00 00 00 00 00 00 00 03 00 01 00
  2008. |................|
  2009. 00000180 00 74 65 73 74 2e 61 73 6d 00 00 00 00 00 00 00
  2010. |.test.asm.......|
  2011. 00000190
  2012. Thus, it is better just to use flat binary format in this case,
  2013. to experiment instruction by instruction.
  2014. With such a simple workflow, we are ready to investigate the
  2015. structure of every assembly instruction.
  2016. Note: Using the bin format puts nasm by default into 16-bit mode.
  2017. To enable 32-bit code to be generated, we must add this line at
  2018. the beginning of an nasm source file:
  2019. bits 32
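The effect of the default mode on the generated bytes can be sketched in C. The sketch assumes the standard x86 rule that the byte 0x66 is the operand-size override prefix: in 16-bit code it selects a 32-bit operand, which is why jmp eax became 66 ff e0 in the flat binary, while the ELF dump above contains just ff e0. The function name and interface are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: emit the machine code for "jmp eax" into out
 * (which must hold at least 3 bytes) for a given default code size.
 * bits must be 16 or 32. Returns the number of bytes written, or 0
 * for an unsupported mode. */
size_t encode_jmp_eax(int bits, uint8_t *out)
{
    size_t n = 0;

    if (bits != 16 && bits != 32)
        return 0;

    /* eax is a 32-bit operand. In 16-bit code that is not the
     * default operand size, so the prefix 0x66 is required. */
    if (bits == 16)
        out[n++] = 0x66;

    out[n++] = 0xff; /* opcode byte */
    out[n++] = 0xe0; /* second byte: selects jmp with eax as target */
    return n;
}
```

Calling encode_jmp_eax(16, buf) yields the three bytes seen in the flat binary, and encode_jmp_eax(32, buf) yields the two bytes that a source file starting with bits 32 would produce.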
  2020. Anatomy of an Assembly Instruction
  2021. Chapter 2 of the instruction reference manual provides an
  2022. in-depth view of the instruction format. However, there is so much
  2023. information that it can overwhelm beginners. This section provides
  2024. an easier introduction before reading the actual chapter in the
  2025. manual.
  2026. [Figure: x86 instruction format.]
  2027. Recall that an assembly instruction is simply a series
  2028. of bits. The length of an instruction varies and depends on how
  2029. complicated an instruction is. What every instruction shares is a
  2030. common format described in the figure above that divides the bits
  2031. of an instruction into smaller parts that encode different types
  2032. of information. These parts are:
  2033. Instruction Prefixes appear at the beginning of an
  2034. instruction. Prefixes are optional. A programmer can choose to
  2035. use a prefix or not because, in practice, a so-called prefix is
  2036. just another assembly instruction inserted before the
  2037. assembly instruction to which the prefix applies.
  2038. Instructions with 2-byte or 3-byte opcodes include the prefixes by
  2039. default.
  2040. Opcode is a unique number that identifies an instruction. Each
  2041. opcode is given a mnemonic name that is human readable, e.g.
  2042. one of the opcodes for the instruction add is 04. When a CPU sees
  2043. the number 04 in its instruction cache, it sees the instruction add
  2044. and executes accordingly. An opcode can be 1, 2 or 3 bytes long and
  2045. includes an additional 3-bit field in the ModR/M byte when
  2046. needed.
  2047. This instruction:
  2048. jmp [0x1234]
  2049. generates the machine code:
  2050. ff 26 34 12
  2051. The very first byte, 0xff, is the opcode. Because 0xff is shared
  2052. by several instructions, the 3-bit reg/opcode field of the next byte (100 in 0x26) is what selects jmp.
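In the machine code above, the two bytes after ff 26 are the address 0x1234, stored least significant byte first, since x86 is little-endian. A minimal C sketch of reassembling such a 16-bit value follows; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit little-endian value, such as
 * the address operand in "ff 26 34 12". p points at the low byte. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Applied to the bytes 34 12, read_le16 returns 0x1234, the address written in the assembly source.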
  2053. ModR/M specifies the operands of an instruction. An operand can be
  2054. a register, a memory location or an immediate value. This
  2055. component of an instruction consists of 3 smaller parts:
  2056. • mod field, or modifier field, is combined with r/m field for
  2057. a total of 5 bits of information to encode 32 possible
  2058. values: 8 registers and 24 addressing modes.
  2059. • reg/opcode field encodes either a register operand, or
  2060. extends the Opcode field with 3 more bits.
  2061. • r/m field encodes either a register operand or can be
  2062. combined with mod field to encode an addressing mode.
  2063. The tables [mod-rm-16] and [mod-rm-32] list all possible 256
  2064. values of ModR/M byte and how each value maps to an addressing
  2065. mode and a register, in 16-bit and 32-bit modes.
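Every hexadecimal entry in these tables follows from how the three fields are packed into one byte: mod occupies bits 7-6, reg/opcode bits 5-3, and r/m bits 2-0. The C sketch below rebuilds a table entry from its row and column; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: pack the three ModR/M fields into one byte.
 * mod uses bits 7-6, reg/opcode bits 5-3, r/m bits 2-0. */
uint8_t make_modrm(unsigned mod, unsigned reg, unsigned rm)
{
    return (uint8_t)(((mod & 0x3u) << 6) |
                     ((reg & 0x7u) << 3) |
                     (rm & 0x7u));
}
```

For instance, make_modrm(0, 4, 6) gives 0x26 (row mod 00, r/m 110, column 100) and make_modrm(3, 1, 0) gives 0xc8 (row mod 11, r/m 000, column 001), matching the corresponding table cells.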
  2066. +---------------------------------------------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2067. | r8(/r) | AL | CL | DL | BL | AH | CH | DH | BH |
  2068. | r16(/r) | AX | CX | DX | BX | SP | BP¹ | SI | DI |
  2069. | r32(/r) | EAX | ECX | EDX | EBX | ESP | EBP | ESI | EDI |
  2070. | mm(/r) | MM0 | MM1 | MM2 | MM3 | MM4 | MM5 | MM6 | MM7 |
  2071. | xmm(/r) | XMM0 | XMM1 | XMM2 | XMM3 | XMM4 | XMM5 | XMM6 | XMM7 |
  2072. | (In decimal) /digit (Opcode) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
  2073. | (In binary) REG = | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
  2074. +---------------------------+--------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2075. |        Effective Address |   Mod |   R/M | Values of ModR/M Byte (In Hexadecimal) |
  2076. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2077. | [BX + SI] | 00 | 000 | 00 | 08 | 10 | 18 | 20 | 28 | 30 | 38 |
  2078. | [BX + DI] | | 001 | 01 | 09 | 11 | 19 | 21 | 29 | 31 | 39 |
  2079. | [BP + SI] | | 010 | 02 | 0A | 12 | 1A | 22 | 2A | 32 | 3A |
  2080. | [BP + DI] | | 011 | 03 | 0B | 13 | 1B | 23 | 2B | 33 | 3B |
  2081. | [SI] | | 100 | 04 | 0C | 14 | 1C | 24 | 2C | 34 | 3C |
  2082. | [DI] | | 101 | 05 | 0D | 15 | 1D | 25 | 2D | 35 | 3D |
  2083. | disp16² | | 110 | 06 | 0E | 16 | 1E | 26 | 2E | 36 | 3E |
  2084. | [BX] | | 111 | 07 | 0F | 17 | 1F | 27 | 2F | 37 | 3F |
  2085. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2086. | [BX + SI] + disp8³ | 01 | 000 | 40 | 48 | 50 | 58 | 60 | 68 | 70 | 78 |
  2087. | [BX + DI] + disp8 | | 001 | 41 | 49 | 51 | 59 | 61 | 69 | 71 | 79 |
  2088. | [BP + SI] + disp8 | | 010 | 42 | 4A | 52 | 5A | 62 | 6A | 72 | 7A |
  2089. | [BP + DI] + disp8 | | 011 | 43 | 4B | 53 | 5B | 63 | 6B | 73 | 7B |
  2090. | [SI] + disp8 | | 100 | 44 | 4C | 54 | 5C | 64 | 6C | 74 | 7C |
  2091. | [DI] + disp8 | | 101 | 45 | 4D | 55 | 5D | 65 | 6D | 75 | 7D |
  2092. | [BP] + disp8 | | 110 | 46 | 4E | 56 | 5E | 66 | 6E | 76 | 7E |
  2093. | [BX] + disp8 | | 111 | 47 | 4F | 57 | 5F | 67 | 6F | 77 | 7F |
  2094. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2095. | [BX + SI] + disp16 | 10 | 000 | 80 | 88 | 90 | 98 | A0 | A8 | B0 | B8 |
  2096. | [BX + DI] + disp16 | | 001 | 81 | 89 | 91 | 99 | A1 | A9 | B1 | B9 |
  2097. | [BP + SI] + disp16 | | 010 | 82 | 8A | 92 | 9A | A2 | AA | B2 | BA |
  2098. | [BP + DI] + disp16 | | 011 | 83 | 8B | 93 | 9B | A3 | AB | B3 | BB |
  2099. | [SI] + disp16 | | 100 | 84 | 8C | 94 | 9C | A4 | AC | B4 | BC |
  2100. | [DI] + disp16 | | 101 | 85 | 8D | 95 | 9D | A5 | AD | B5 | BD |
  2101. | [BP] + disp16 | | 110 | 86 | 8E | 96 | 9E | A6 | AE | B6 | BE |
  2102. | [BX] + disp16 | | 111 | 87 | 8F | 97 | 9F | A7 | AF | B7 | BF |
  2103. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2104. | EAX/AX/AL/MM0/XMM0 | 11 | 000 | C0 | C8 | D0 | D8 | E0 | E8 | F0 | F8 |
  2105. | ECX/CX/CL/MM1/XMM1 | | 001 | C1 | C9 | D1 | D9 | E1 | E9 | F1 | F9 |
  2106. | EDX/DX/DL/MM2/XMM2 | | 010 | C2 | CA | D2 | DA | E2 | EA | F2 | FA |
  2107. | EBX/BX/BL/MM3/XMM3 | | 011 | C3 | CB | D3 | DB | E3 | EB | F3 | FB |
  2108. | ESP/SP/AH/MM4/XMM4 | | 100 | C4 | CC | D4 | DC | E4 | EC | F4 | FC |
  2109. | EBP/BP/CH/MM5/XMM5 | | 101 | C5 | CD | D5 | DD | E5 | ED | F5 | FD |
  2110. | ESI/SI/DH/MM6/XMM6 | | 110 | C6 | CE | D6 | DE | E6 | EE | F6 | FE |
  2111. | EDI/DI/BH/MM7/XMM7 | | 111 | C7 | CF | D7 | DF | E7 | EF | F7 | FF |
  2112. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2113. 1. The default segment register is SS for the effective addresses
  2114. containing a BP index, DS for other effective addresses.
  2115. 2. The disp16 nomenclature denotes a 16-bit displacement that
  2116. follows the ModR/M byte and that is added to the index.
  2117. 3. The disp8 nomenclature denotes an 8-bit displacement that
  2118. follows the ModR/M byte and that is sign-extended and added to
  2119. the index.
  2120. <mod-rm-16>
  2121. +---------------------------------------------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2122. | r8(/r) | AL | CL | DL | BL | AH | CH | DH | BH |
  2123. | r16(/r) | AX | CX | DX | BX | SP | BP | SI | DI |
  2124. | r32(/r) | EAX | ECX | EDX | EBX | ESP | EBP | ESI | EDI |
  2125. | mm(/r) | MM0 | MM1 | MM2 | MM3 | MM4 | MM5 | MM6 | MM7 |
  2126. | xmm(/r) | XMM0 | XMM1 | XMM2 | XMM3 | XMM4 | XMM5 | XMM6 | XMM7 |
  2127. | (In decimal) /digit (Opcode) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
  2128. | (In binary) REG = | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
  2129. +---------------------------+--------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2130. |        Effective Address |   Mod |   R/M | Values of ModR/M Byte (In Hexadecimal) |
  2131. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2132. | [EAX] | 00 | 000 | 00 | 08 | 10 | 18 | 20 | 28 | 30 | 38 |
  2133. | [ECX] | | 001 | 01 | 09 | 11 | 19 | 21 | 29 | 31 | 39 |
  2134. | [EDX] | | 010 | 02 | 0A | 12 | 1A | 22 | 2A | 32 | 3A |
  2135. | [EBX] | | 011 | 03 | 0B | 13 | 1B | 23 | 2B | 33 | 3B |
  2136. | [-][-]¹ | | 100 | 04 | 0C | 14 | 1C | 24 | 2C | 34 | 3C |
  2137. | disp32² | | 101 | 05 | 0D | 15 | 1D | 25 | 2D | 35 | 3D |
  2138. | [ESI] | | 110 | 06 | 0E | 16 | 1E | 26 | 2E | 36 | 3E |
  2139. | [EDI] | | 111 | 07 | 0F | 17 | 1F | 27 | 2F | 37 | 3F |
  2140. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2141. | [EAX] + disp8³ | 01 | 000 | 40 | 48 | 50 | 58 | 60 | 68 | 70 | 78 |
  2142. | [ECX] + disp8 | | 001 | 41 | 49 | 51 | 59 | 61 | 69 | 71 | 79 |
  2143. | [EDX] + disp8 | | 010 | 42 | 4A | 52 | 5A | 62 | 6A | 72 | 7A |
  2144. | [EBX] + disp8 | | 011 | 43 | 4B | 53 | 5B | 63 | 6B | 73 | 7B |
  2145. | [-][-] + disp8 | | 100 | 44 | 4C | 54 | 5C | 64 | 6C | 74 | 7C |
  2146. | [EBP] + disp8 | | 101 | 45 | 4D | 55 | 5D | 65 | 6D | 75 | 7D |
  2147. | [ESI] + disp8 | | 110 | 46 | 4E | 56 | 5E | 66 | 6E | 76 | 7E |
  2148. | [EDI] + disp8 | | 111 | 47 | 4F | 57 | 5F | 67 | 6F | 77 | 7F |
  2149. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2150. | [EAX] + disp32 | 10 | 000 | 80 | 88 | 90 | 98 | A0 | A8 | B0 | B8 |
  2151. | [ECX] + disp32 | | 001 | 81 | 89 | 91 | 99 | A1 | A9 | B1 | B9 |
  2152. | [EDX] + disp32 | | 010 | 82 | 8A | 92 | 9A | A2 | AA | B2 | BA |
  2153. | [EBX] + disp32 | | 011 | 83 | 8B | 93 | 9B | A3 | AB | B3 | BB |
  2154. | [-][-] + disp32 | | 100 | 84 | 8C | 94 | 9C | A4 | AC | B4 | BC |
  2155. | [EBP] + disp32 | | 101 | 85 | 8D | 95 | 9D | A5 | AD | B5 | BD |
  2156. | [ESI] + disp32 | | 110 | 86 | 8E | 96 | 9E | A6 | AE | B6 | BE |
  2157. | [EDI] + disp32 | | 111 | 87 | 8F | 97 | 9F | A7 | AF | B7 | BF |
  2158. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2159. | EAX/AX/AL/MM0/XMM0 | 11 | 000 | C0 | C8 | D0 | D8 | E0 | E8 | F0 | F8 |
  2160. | ECX/CX/CL/MM1/XMM1 | | 001 | C1 | C9 | D1 | D9 | E1 | E9 | F1 | F9 |
  2161. | EDX/DX/DL/MM2/XMM2 | | 010 | C2 | CA | D2 | DA | E2 | EA | F2 | FA |
  2162. | EBX/BX/BL/MM3/XMM3 | | 011 | C3 | CB | D3 | DB | E3 | EB | F3 | FB |
  2163. | ESP/SP/AH/MM4/XMM4 | | 100 | C4 | CC | D4 | DC | E4 | EC | F4 | FC |
  2164. | EBP/BP/CH/MM5/XMM5 | | 101 | C5 | CD | D5 | DD | E5 | ED | F5 | FD |
  2165. | ESI/SI/DH/MM6/XMM6 | | 110 | C6 | CE | D6 | DE | E6 | EE | F6 | FE |
  2166. | EDI/DI/BH/MM7/XMM7 | | 111 | C7 | CF | D7 | DF | E7 | EF | F7 | FF |
  2167. +---------------------------+--------+--------+-------+-------+-------+-------+-------+-------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2168. 1. The [-][-] nomenclature means a SIB follows the ModR/M byte.
  2169. 2. The disp32 nomenclature denotes a 32-bit displacement that
  2170. follows the ModR/M byte (or the SIB byte if one is present) and
  2171. that is added to the index.
  2172. 3. The disp8 nomenclature denotes an 8-bit displacement that
  2173. follows the ModR/M byte (or the SIB byte if one is present) and
  2174. that is sign-extended and added to the index.
  2175. <mod-rm-32>
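Footnote 3 of both tables says that a disp8 is sign-extended before it is added. A C sketch of what that means for a 16-bit effective address follows; the function name is hypothetical and the base values in the usage note are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: compute base + disp8 the way 16-bit
 * addressing does. The 8-bit displacement is sign-extended, so the
 * bytes 0x80..0xff act as the negative offsets -128..-1. The result
 * wraps around to 16 bits. */
uint16_t add_disp8(uint16_t base, uint8_t disp8)
{
    /* Sign extension: treat bit 7 as the sign bit. */
    int offset = (disp8 & 0x80) ? (int)disp8 - 256 : (int)disp8;
    return (uint16_t)(base + offset);
}
```

With a base of 0x1000, a displacement byte of 0x05 gives 0x1005, while 0xfe (that is, -2) gives 0x0ffe.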
  2176. How to read the table:
  2177. In an instruction, next to the opcode is a ModR/M byte. Look
  2178. up the byte value in this table to get the corresponding
  2179. operands in the row and column.
  2180. -------------------------------------------
  2181. An instruction uses this addressing mode:
  2182. jmp [0x1234]
  2183. Then, the machine code is:
  2184. ff 26 34 12
  2185. 0xff is the opcode. Next to it, 0x26 is the ModR/M byte. Looking
  2186. it up in the 16-bit table [margin:
  2187. Remember, using bin format generates 16-bit code by default
  2188. ], the first operand is in the row, equivalent to a disp16, which
  2189. means a 16-bit offset; the bytes 34 12 that follow are that offset, 0x1234, in little-endian order. Since the instruction does not have a
  2190. second operand, the column can be ignored.
  2191. An instruction uses this addressing mode:
  2192. add ax, cx
  2193. Then the machine code is:
  2194. 01 c8
  2195. 0x01 is the opcode. Next to it, 0xc8 is the ModR/M byte. Look up
  2196. the value c8 in the 16-bit table: the row tells that the first
  2197. operand is ax [margin:
  2198. Remember, using bin format generates 16-bit code by default
  2199. ], and the column tells that the second operand is cx; the column
  2200. can't be ignored, as the instruction has a second operand.
  2201. Why is the first operand in the row and the second in a column?
  2202. Let's break down the ModR/M byte, with an example value c8,
  2203. into bits:
  2204. +----------+---------------------+-------------+
  2205. | mod | reg/opcode | r/m |
  2206. +----------+---------------------+-------------+
  2207. +----+-----+----+----+-----------+----+----+---+
  2208. | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
  2209. +----+-----+----+----+-----------+----+----+---+
  2210. The mod field divides addressing modes into 4 different
  2211. categories. Combined with the r/m field, it selects exactly one
  2212. row of the table: one of the 24 memory addressing modes, or one
  2213. of the 8 registers when mod is 11. If an instruction only requires
  2214. one operand, then the column can be ignored. Otherwise, the
  2215. reg/opcode field selects the column, which gives the second operand.
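
To make the bit layout concrete, here is a minimal C sketch that splits a ModR/M byte into its three fields. The names modrm and split_modrm are our own, chosen for illustration; they are not part of any Intel or nasm interface.

```c
#include <stdint.h>

/* The three fields of a ModR/M byte, from the most significant bits down. */
struct modrm {
    uint8_t mod; /* bits 7-6: addressing-mode category               */
    uint8_t reg; /* bits 5-3: register operand or opcode extension   */
    uint8_t rm;  /* bits 2-0: register or memory operand             */
};

/* Split one ModR/M byte into its mod, reg/opcode and r/m fields. */
struct modrm split_modrm(uint8_t byte)
{
    struct modrm f;
    f.mod = (byte >> 6) & 0x3;
    f.reg = (byte >> 3) & 0x7;
    f.rm  = byte & 0x7;
    return f;
}
```

For the value c8 used above, the fields come out as mod = 11, reg/opcode = 001 and r/m = 000 in binary, which matches the bit table.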
  2216. -------------------------------------------
  2217. SIB is the Scale-Index-Base byte. This byte encodes how to
  2218. calculate the memory position of an element in an array. The
  2219. name SIB comes from this formula for calculating an
  2220. effective address:
  2221. \mathtt{Effective\,address=scale*index+base}
  2222. • Index is an offset into an array.
  2223. • Scale is a factor of Index. Scale is one of the values 1, 2,
  2224. 4 or 8; any other value is invalid. To scale by a value
  2225. other than 2, 4 or 8, the scale factor must be set to 1, and
  2226. the offset must be calculated manually. For example, suppose we
  2227. want the address of the nth element of an array in which each
  2228. element is 12 bytes long. Because 12 is not 1, 2, 4 or 8, Scale
  2229. is set to 1 and a compiler needs to calculate the offset:
  2230. \mathtt{Effective\,address=1*(12*n)+base}
  2231. Why do we bother with SIB when we can manually calculate the
  2232. offset? The answer is that in the above scenario, an
  2233. additional mul instruction must be executed to get the
  2234. offset, and the mul instruction consumes more than 1 byte,
  2235. while the SIB only consumes 1 byte. More importantly, if the
  2236. element is accessed repeatedly in a loop, e.g.
  2237. millions of times, then the extra mul instruction can
  2238. hurt performance, as the CPU must spend time
  2239. executing millions of these additional mul instructions.
  2240. The values 2, 4 and 8 are not randomly chosen. They map to
  2241. 16-bit (or 2 bytes), 32-bit (or 4 bytes) and 64-bit (or 8
  2242. bytes) numbers that are often used for intensive numeric
  2243. calculations.
  2244. • Base is the starting address.
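
The formula can be sketched in C as follows. This is only an illustration of the arithmetic, assuming 32-bit addresses; the function names effective_address and element12_address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = scale * index + base.
   The hardware only accepts a scale of 1, 2, 4 or 8. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return scale * index + base;
}

/* Address of element n when each element is 12 bytes long.
   12 is not a valid scale, so the multiplication is done separately
   and the result is passed as the index with a scale of 1. */
uint32_t element12_address(uint32_t base, uint32_t n)
{
    return effective_address(base, 12 * n, 1);
}
```

With 4-byte elements the scale does the multiplication for free; with 12-byte elements the extra multiplication in element12_address stands for the additional instruction described above.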
  2245. Below is the table listing all 256 values of SIB byte, with the
  2246. lookup rule similar to ModR/M tables:

  2248. | r32 | EAX | ECX | EDX | EBX | ESP | [*] | ESI | EDI |
  2249. | (In decimal) Base = | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
  2250. | (In binary) Base = | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
  2251. +---------------------------+-------+--------+-------------------------------------------------+
  2252. | Scaled Index | SS | Index | Values of SIB Byte (In Hexadecimal) |

  2254. | [EAX] | 00 | 000 | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 |
  2255. | [ECX] | | 001 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
  2256. | [EDX] | | 010 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
  2257. | [EBX] | | 011 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F |
  2258. | none | | 100 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 |
  2259. | [EBP] | | 101 | 28 | 29 | 2A | 2B | 2C | 2D | 2E | 2F |
  2260. | [ESI] | | 110 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 |
  2261. | [EDI] | | 111 | 38 | 39 | 3A | 3B | 3C | 3D | 3E | 3F |

  2263. | [EAX*2] | 01 | 000 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 |
  2264. | [ECX*2] | | 001 | 48 | 49 | 4A | 4B | 4C | 4D | 4E | 4F |
  2265. | [EDX*2] | | 010 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 |
  2266. | [EBX*2] | | 011 | 58 | 59 | 5A | 5B | 5C | 5D | 5E | 5F |
  2267. | none | | 100 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 |
  2268. | [EBP*2] | | 101 | 68 | 69 | 6A | 6B | 6C | 6D | 6E | 6F |
  2269. | [ESI*2] | | 110 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 |
  2270. | [EDI*2] | | 111 | 78 | 79 | 7A | 7B | 7C | 7D | 7E | 7F |

  2272. | [EAX*4] | 10 | 000 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 |
  2273. | [ECX*4] | | 001 | 88 | 89 | 8A | 8B | 8C | 8D | 8E | 8F |
  2274. | [EDX*4] | | 010 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 |
  2275. | [EBX*4] | | 011 | 98 | 99 | 9A | 9B | 9C | 9D | 9E | 9F |
  2276. | none | | 100 | A0 | A1 | A2 | A3 | A4 | A5 | A6 | A7 |
  2277. | [EBP*4] | | 101 | A8 | A9 | AA | AB | AC | AD | AE | AF |
  2278. | [ESI*4] | | 110 | B0 | B1 | B2 | B3 | B4 | B5 | B6 | B7 |
  2279. | [EDI*4] | | 111 | B8 | B9 | BA | BB | BC | BD | BE | BF |

  2281. | [EAX*8] | 11 | 000 | C0 | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
  2282. | [ECX*8] | | 001 | C8 | C9 | CA | CB | CC | CD | CE | CF |
  2283. | [EDX*8] | | 010 | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 |
  2284. | [EBX*8] | | 011 | D8 | D9 | DA | DB | DC | DD | DE | DF |
  2285. | none | | 100 | E0 | E1 | E2 | E3 | E4 | E5 | E6 | E7 |
  2286. | [EBP*8] | | 101 | E8 | E9 | EA | EB | EC | ED | EE | EF |
  2287. | [ESI*8] | | 110 | F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 |
  2288. | [EDI*8] | | 111 | F8 | F9 | FA | FB | FC | FD | FE | FF |

  2290. 1. The [*] nomenclature means a disp32 with no base if the MOD is
  2291. 00B. Otherwise, [*] means disp8 or disp32 + [EBP]. This
  2292. provides the following address modes:
  2293. +-----------+---------------------------------+
  2294. | MOD bits | Effective Address |
  2295. +-----------+---------------------------------+
  2296. +-----------+---------------------------------+
  2297. | 00 | [scaled index] + disp32 |
  2298. +-----------+---------------------------------+
  2299. | 01 | [scaled index] + disp8 + [EBP] |
  2300. +-----------+---------------------------------+
  2301. | 10 | [scaled index] + disp32 + [EBP] |
  2302. +-----------+---------------------------------+
  2303. <sib>
  2304. This instruction:
  2305. jmp [eax*2 + ebx]
  2306. generates the following code:
  2307. 00000000 67 ff 24 43
  2308. First of all, the first byte, 0x67, is not an opcode but a
  2309. prefix: the predefined address-size override
  2310. prefix. After the prefix come the opcode 0xff and
  2311. the ModR/M byte 0x24. The value of the ModR/M byte indicates that
  2312. a SIB byte follows. The SIB byte is 0x43.
  2313. Looking it up in the SIB table, the row tells that eax is scaled by
  2314. 2, and the column tells that the base to be added is in ebx.
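
The table lookup is equivalent to splitting the SIB byte into bit fields. Below is a small C sketch of that split; the names sib and split_sib are ours, and the register numbers follow the header of the SIB table (0 is EAX, 1 is ECX, 2 is EDX, 3 is EBX, and so on).

```c
#include <stdint.h>

/* The three fields of a SIB byte. */
struct sib {
    uint8_t scale; /* 1, 2, 4 or 8, decoded from the SS bits 7-6 */
    uint8_t index; /* bits 5-3: number of the index register      */
    uint8_t base;  /* bits 2-0: number of the base register       */
};

/* Split one SIB byte; the SS field is a power of two exponent. */
struct sib split_sib(uint8_t byte)
{
    struct sib f;
    f.scale = (uint8_t)(1u << ((byte >> 6) & 0x3));
    f.index = (byte >> 3) & 0x7;
    f.base  = byte & 0x7;
    return f;
}
```

For the byte 0x43 this gives a scale of 2, index register 0 (eax) and base register 3 (ebx), the same answer as the row and column of the table.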
  2315. Displacement is the offset from the start of the base index.
  2316. This instruction:
  2317. jmp [0x1234]
  2318. generates the machine code:
  2319. ff 26 34 12
  2320. 0x1234, which is generated as 34 12 in raw machine code, is
  2321. the displacement and stands right next to 0x26, which is the
  2322. ModR/M byte.
  2323. This instruction:
  2324. jmp [eax * 4 + 0x1234]
  2325. generates the machine code:
  2326. 67 ff 24 85 34 12 00 00
  2327. • 0x67 is an address-size override prefix. If an instruction
  2328. runs with a default address size, e.g. 16-bit, the prefix
  2329. switches that instruction to the non-default address size,
  2330. e.g. 32-bit. Since the binary is supposed to be 16-bit, 0x67
  2331. changes the instruction to 32-bit addressing.
  2332. • 0xff is the opcode.
  2333. • 0x24 is the ModR/M byte. The value suggests that a SIB byte
  2334. follows, according to table [mod-rm-32].
  2335. • 0x85 is the SIB byte: eax scaled by 4, with a disp32 and no base.
  2336. • 34 12 00 00 is the displacement. As can be seen, the
  2337. displacement is 4 bytes in size, which is equivalent to
  2338. 32-bit, due to the address-size override prefix.
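
The walk through the bytes can also be written as code. The following C sketch is a simplification under stated assumptions, not a real decoder: it assumes a one-byte opcode followed by a ModR/M byte in its 32-bit addressing form, an optional 0x67 prefix and no immediate. The names fields and split_fields are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct fields {
    int      has_prefix; /* 1 if the address-size override prefix 0x67 is present */
    uint8_t  opcode;
    uint8_t  modrm;
    int      has_sib;    /* 1 if a SIB byte follows the ModR/M byte */
    uint8_t  sib;
    size_t   disp_size;  /* 0, 1 or 4 bytes */
    uint32_t disp;       /* raw displacement bytes read as little-endian */
    size_t   length;     /* total number of bytes consumed */
};

/* Assumption: one-byte opcode, ModR/M in 32-bit addressing form. */
struct fields split_fields(const uint8_t *code)
{
    struct fields f = {0};
    size_t i = 0;

    if (code[i] == 0x67) {          /* optional prefix comes first */
        f.has_prefix = 1;
        i++;
    }
    f.opcode = code[i++];
    f.modrm  = code[i++];

    uint8_t mod = f.modrm >> 6;
    uint8_t rm  = f.modrm & 0x7;

    if (mod != 3 && rm == 4) {      /* r/m = 100 in a memory form: SIB follows */
        f.has_sib = 1;
        f.sib = code[i++];
    }

    if (mod == 1)
        f.disp_size = 1;            /* disp8 */
    else if (mod == 2)
        f.disp_size = 4;            /* disp32 */
    else if (mod == 0 && (rm == 5 || (f.has_sib && (f.sib & 0x7) == 5)))
        f.disp_size = 4;            /* disp32 with no base */

    for (size_t k = 0; k < f.disp_size; k++)
        f.disp |= (uint32_t)code[i++] << (8 * k);

    f.length = i;
    return f;
}
```

Feeding it the bytes of jmp [eax*2 + ebx] yields a prefix, the opcode ff, the ModR/M byte 24, a SIB byte and no displacement. A 16-bit form such as ff 26 34 12 is outside what this sketch handles, since it follows the 16-bit table instead.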
  2339. Immediate When an instruction accepts a fixed value, e.g.
  2340. 0x1234, as an operand, this optional field holds the value.
  2341. Note that this field is different from displacement: the value
  2342. is not necessarily used as an offset; it can be an arbitrary
  2343. value of any kind.
  2344. This instruction:
  2345. mov eax, 0x1234
  2346. generates the code:
  2347. 66 b8 34 12 00 00
  2348. • 0x66 is the operand-size override prefix. Similar to the
  2349. address-size override prefix, this prefix enables the
  2350. operand size to be non-default.
  2351. • 0xb8 is one of the opcodes for the mov instruction.
  2352. • 0x1234 is the value to be stored in register eax. It is
  2353. just a value for storing directly into a register, and
  2354. nothing more. A displacement value, on the other hand, is an
  2355. offset for some address calculation.
  2356. Read section 2.1 in Volume 2 for even more details.
  2357. Skim through section 5.1 in volume 1. Read chapter 7 in volume
  2358. 1. If there are terms that you don't understand, e.g.
  2359. segmentation, don't worry, as the terms will either be explained
  2360. in later chapters or can be ignored.
  2361. Understand an instruction in detail
  2362. In the instruction reference manual (Volume 2), from chapter 3
  2363. onward, every x86 instruction is documented in detail. Whenever
  2364. the precise behavior of an instruction is needed, we always
  2365. consult this document first. However, before using the document,
  2366. we must know the writing conventions first. Every instruction has
  2367. the following common structure for organizing information:
  2368. Opcode table lists all possible opcodes of an assembly
  2369. instruction.
  2370. Each table contains the following fields, and can have one or
  2371. more rows:
  2372. +--------+-------------+-------+----------------+--------------------+-------------+
  2373. | Opcode | Instruction | Op/En | 64/32-bit Mode | CPUID Feature flag | Description |
  2374. +--------+-------------+-------+----------------+--------------------+-------------+
  2376. Opcode shows a unique hexadecimal number assigned to an
  2377. instruction. There can be more than one opcode for an
  2378. instruction, each encodes a variant of the instruction. For
  2379. example, one variant requires one operand, but another
  2380. requires two. In this column, there can be other notations
  2381. aside from hexadecimal numbers. For example, /r indicates
  2382. that the ModR/M byte of the instruction contains a reg
  2383. operand and an r/m operand. The detailed listing is in sections
  2384. 3.1.1.1 and 3.1.1.2 of the Intel manual, volume 2.
  2385. Instruction gives the syntax of the assembly instruction that a
  2386. programmer can use for writing code. Aside from the mnemonic
  2387. representation of the opcode, e.g. jmp, other symbols
  2388. represent operands with specific properties in the
  2389. instruction. For example, rel8 represents a relative address
  2390. from 128 bytes before the end of the instruction to 127 bytes
  2391. after the end of instruction; similarly rel16/rel32 also
  2392. represents relative addresses, but with the operand size of
  2393. 16/32-bit instead of 8-bit like rel8. For a detailed listing,
  2394. please refer to section 3.1.1.3 of volume 2.
  2395. Op/En is short for Operand/Encoding. An operand encoding
  2396. specifies how a ModR/M byte encodes the operands that an
  2397. instruction requires. If a variant of an instruction requires
  2398. operands, then an additional table named “Instruction Operand
  2399. Encoding” is added for explaining the operand encoding, with
  2400. the following structure:
  2401. +--------+------------+------------+------------+-----------+
  2402. | Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
  2403. +--------+------------+------------+------------+-----------+
  2404. Most instructions require one to two operands. We make use of
  2405. these instructions for our OS and skip the instructions that
  2406. require three or four operands. The operands can be readable
  2407. or writable or both. The symbol (r) denotes a readable
  2408. operand, and (w) denotes a writable operand. For example,
  2409. when Operand 1 field contains ModRM:r/m (r), it means the
  2410. first operand is encoded in r/m field of ModR/M byte, and is
  2411. only readable.
  2412. 64/32-bit mode indicates whether the opcode sequence is
  2413. supported in a 64-bit mode and possibly 32-bit mode.
  2414. CPUID Feature Flag indicates that a particular CPU feature
  2415. must be available to enable the instruction. An instruction
  2416. is invalid if a CPU does not support the required feature.[margin:
  2417. In Linux, the command:
  2418. cat /proc/cpuinfo
  2419. lists information about the available CPUs and their features in
  2420. the flags field.
  2421. ]
  2422. Compat/Leg Mode Many instructions do not have the CPUID
  2423. Feature Flag field; it is replaced with Compat/Leg Mode, which
  2424. stands for Compatibility or Legacy Mode. This field indicates
  2425. whether a variant of an instruction is supported in compatibility
  2426. mode or legacy (16 or 32-bit) mode. [float MarginTable:
  2427. [MarginTable 2:
  2428. Notations in Compat/Leg Mode
  2429. ]
  2430. +-----------+----------------------------------------------------------------------------------+
  2431. | Notation | Description |
  2432. +-----------+----------------------------------------------------------------------------------+
  2433. +-----------+----------------------------------------------------------------------------------+
  2434. | Valid | Supported |
  2435. +-----------+----------------------------------------------------------------------------------+
  2436. | I | Not supported |
  2437. +-----------+----------------------------------------------------------------------------------+
  2438. | N.E. | The 64-bit opcode cannot be encoded as it overlaps with existing 32-bit opcode. |
  2440. +-----------+----------------------------------------------------------------------------------+
  2441. ]
  2442. Description briefly explains the variant of an instruction in
  2443. the current row.
  2444. Description specifies the purpose of the instructions and how
  2445. an instruction works in detail.
  2446. Operation is pseudo-code that implements an instruction. If a
  2447. description is vague, this section is the next best source to
  2448. understand an assembly instruction. The syntax is described in
  2449. section 3.1.1.9 in volume 2.
  2450. Flags affected lists the possible changes to system flags in
  2451. EFLAGS register.
  2452. Exceptions lists the possible errors that can occur when an
  2453. instruction cannot run correctly. This section is valuable for
  2454. OS debugging. Exceptions fall into one of the following
  2455. categories:
  2456. • Protected Mode Exceptions
  2457. • Real-Address Mode Exceptions
  2458. • Virtual-8086 Mode Exceptions
  2459. • Floating-Point Exceptions
  2460. • SIMD Floating-Point Exceptions
  2461. • Compatibility Mode Exceptions
  2462. • 64-bit Mode Exceptions
  2463. For our OS, we only use Protected Mode Exceptions and
  2464. Real-Address Mode Exceptions. The details are in sections 3.1.1.13
  2465. and 3.1.1.14, volume 2.
  2466. Example: jmp instruction
  2467. Let's look at our good old jmp instruction. First, the opcode
  2468. table:
  2469. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2470. | Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description |
  2472. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2473. | EB cb | JMP rel8 | D | Valid | Valid | Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits |
  2475. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2476. | E9 cw | JMP rel16 | D | N.S. | Valid | Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode. |
  2478. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2479. | E9 cd | JMP rel32 | D | Valid | Valid | Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits |
  2481. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2482. | FF /4 | JMP r/m16 | M | N.S. | Valid | Jump near, absolute indirect, address = zero-extended r/m16. Not supported in 64-bit mode |
  2484. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2485. | FF /4 | JMP r/m32 | M | N.S. | Valid | Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode |
  2487. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2488. | FF /4 | JMP r/m64 | M | Valid | N.E. | Jump near, absolute indirect, RIP = 64-Bit offset from register or memory |
  2490. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2491. | EA cd | JMP ptr16:16 | D | Inv. | Valid | Jump far, absolute, address given in operand |
  2492. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2493. | EA cp | JMP ptr16:32 | D | Inv. | Valid | Jump far, absolute, address given in operand |
  2494. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2495. | FF /5 | JMP m16:16 | D | Valid | Valid | Jump far, absolute indirect, address given in m16:16 |
  2496. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2497. | FF /5 | JMP m16:32 | D | Valid | Valid | Jump far, absolute indirect, address given in m16:32 |
  2498. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2499. | REX.W + FF /5 | JMP m16:64 | D | Valid | N.E. | Jump far, absolute indirect, address given in m16:64 |
  2500. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2501. <jmp-instruction>
  2502. Each row lists a variant of the jmp instruction. The first row has
  2503. the opcode EB cb, with an equivalent symbolic form jmp rel8.
  2504. Here, rel8 means a signed 8-bit offset (from -128 to +127 bytes),
  2505. counted from the end of the instruction. The end of an instruction
  2506. is the next byte after its last byte. To make it more concrete,
  2507. consider this assembly code:
  2508. main:
  2509. jmp main
  2510. jmp main2
  2511. jmp main
  2512. main2:
  2513. jmp 0x1234
  2514. generates the machine code:
  2515. [float Table:
  2516. [Table 4:
  2517. Memory address of each opcode
  2518. ]
  2519.             main                         main2
  2520.              ↓                             ↓
  2521. +----------+----+----+----+----+----+----+----+----+----+----+
  2522. | Address  | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 |
  2523. +----------+----+----+----+----+----+----+----+----+----+----+
  2524. | Opcode   | eb | fe | eb | 02 | eb | fa | e9 | 2b | 12 | 00 |
  2525. +----------+----+----+----+----+----+----+----+----+----+----+
  2530. ]
  2531. The first jmp main instruction is generated into eb fe and
  2532. occupies the addresses 00 and 01; the end of the first jmp main
  2533. is at address 02, past the last byte of the first jmp main which
  2534. is located at the address 01. The value fe is equivalent to -2,
  2535. since eb opcode uses only a byte (8 bits) for relative
  2536. addressing. The offset is -2, and the end address of the first
  2537. jmp main is 02, adding them together we get 00 which is the
  2538. destination address for jumping to.
  2539. Similarly, the jmp main2 instruction is generated into eb 02,
  2540. which means the offset is +2; the end address of jmp main2 is at
  2541. 04, and adding together with the offset we get the destination
  2542. address is 06, which is the start instruction marked by the label
  2543. main2.
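
The rule "end of the instruction plus the signed offset" can be checked with a few lines of C. This is a sketch of the arithmetic only; rel8_target is a hypothetical helper name, and addr is the address of the eb opcode.

```c
#include <stdint.h>

/* Destination of a short jump (eb followed by one offset byte). */
uint32_t rel8_target(uint32_t addr, uint8_t rel8)
{
    /* The instruction is 2 bytes long, so it ends at addr + 2. */
    uint32_t end = addr + 2;

    /* Interpret the offset byte as a signed value: 0x80..0xff are negative. */
    int32_t offset = rel8 < 0x80 ? (int32_t)rel8 : (int32_t)rel8 - 0x100;

    /* Unsigned addition wraps around, which handles negative offsets. */
    return end + (uint32_t)offset;
}
```

Applied to the three short jumps of the example, the helper returns 00 for eb fe at address 00, 06 for eb 02 at address 02, and 00 again for eb fa at address 04.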
  2544. The same rule can be applied to rel16 and rel32 encoding. In the
  2545. example code, jmp 0x1234 uses rel16 (which means a 2-byte offset)
  2546. and is generated into e9 2b 12. As the table [jmp-instruction]
  2547. shows, the e9 opcode takes a cw operand, which is a 2-byte offset
  2548. (section 3.1.1.1, volume 2). Notice one strange issue here: the
  2549. offset value is 2b 12, while it is supposed to be 34 12. There is
  2550. nothing wrong. Remember, rel8/rel16/rel32 is an offset, not an
  2551. address. An offset is a distance from a point. Since a number is
  2552. given instead of a label, the number is treated as an address
  2553. counted from the start of the program. In this case, the start of
  2554. the program is address 00 and the end of jmp 0x1234 is address 09[footnote:
  2555. which means 9 bytes were consumed, starting from address 0.
  2556. ], so the offset is calculated as 0x1234 - 0x9 = 0x122b. That
  2557. solved the mystery!
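
The same calculation in the other direction, from a target address to the encoded bytes, might look like this in C. It is a sketch that assumes 16-bit code; encode_jmp_rel16 is a name made up for this example.

```c
#include <stdint.h>

/* Encode jmp rel16: opcode e9 followed by a 2-byte little-endian offset.
   addr is where the instruction starts; target is where it should jump.
   Returns the offset that was encoded. */
uint16_t encode_jmp_rel16(uint16_t addr, uint16_t target, uint8_t out[3])
{
    uint16_t end = (uint16_t)(addr + 3);     /* 1 opcode byte + 2 offset bytes */
    uint16_t rel = (uint16_t)(target - end); /* distance from the end, modulo 2^16 */

    out[0] = 0xe9;
    out[1] = (uint8_t)(rel & 0xff);          /* low byte first */
    out[2] = (uint8_t)(rel >> 8);
    return rel;
}
```

For an instruction that starts at address 06 and targets 0x1234, the offset is 0x122b and the bytes are e9 2b 12, as in the table.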
  2558. The jmp instructions with opcode FF /4 enable jumping to a near,
  2559. absolute address stored in a general-purpose register or a memory
  2560. location; or in short, as written in the description, absolute
  2561. indirect. The symbol /4 is the column with digit 4 in table [mod-rm-16]
  2562. [footnote:
  2563. The column with the following fields:
  2564. AH
  2565. SP
  2566. ESP
  2567. MM4
  2568. XMM4
  2569. 4
  2570. 100
  2571. ]. For example:
  2572. jmp [0x1234]
  2573. is generated into:
  2574. ff 26 34 12
  2575. Since this is 16-bit code, we use table [mod-rm-16]. Looking up
  2576. the table, ModR/M value 26 means disp16, which means a 16-bit
  2577. offset from the start of current index[footnote:
  2578. Look at the note under the table.
  2579. ], which is the base address stored in DS register. In this case,
  2580. jmp [0x1234] is implicitly understood as jmp [ds:0x1234], which
  2581. means the destination address is 0x1234 bytes away from the start
  2582. of a data segment.
  2583. The jmp instruction with opcode FF /5 enables jumping to a far,
  2584. absolute address stored in a memory location (as opposed to /4,
  2585. whose near address can also be held in a register); in short, a far
  2586. pointer. To generate such an instruction, the keyword far is needed
  2587. to tell nasm we are using a far pointer:
  2588. jmp far [eax]
  2589. is generated into:
  2590. 67 ff 28
  2591. Since 28 is the value in the column with digit 5 of the table [mod-rm-32][footnote:
  2592. Remember the prefix 67 indicates the instruction is used as
  2593. 32-bit. The prefix is only added if the default environment is
  2594. assumed to be 16-bit when an assembler generates code.
  2595. ], in the row that refers to [eax], we successfully generate an instruction
  2596. for a far jump. After the CPU runs the instruction, the program
  2597. counter eip and the code segment register cs are set to the far
  2598. address stored in the memory location that eax points to, and the
  2599. CPU starts fetching code from the new address in cs and eip. To
  2600. make it more concrete, here is an example:
  2601. <Graphics file: C:/Users/Tu Do/os01/book_src/images/04/far_jmp_ex.pdf>
  2602. The far address takes 6 bytes in total: a 16-bit segment and a
  2603. 32-bit address, which is written as m16:32 in the
  2604. table [jmp-instruction]. As can be seen from the figure above,
  2605. the blue part is the segment address, loaded into the cs register
  2606. with the value 0x5678; the red part is the memory address within
  2607. that segment, loaded into the eip register with the value 0x1234.
  2608. The CPU starts executing from there.
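
As a rough illustration, the 6 bytes of such a far pointer can be read in C like this. The sketch assumes the memory layout that the Intel manual gives for an m16:32 operand, with the 4 offset bytes first and the 2 segment bytes after them; far_ptr32 and read_far_ptr32 are names invented for this example.

```c
#include <stdint.h>

/* A far pointer as the CPU loads it: eip gets offset, cs gets segment. */
struct far_ptr32 {
    uint32_t offset;
    uint16_t segment;
};

/* Read an m16:32 operand: 4 little-endian offset bytes, then 2 segment bytes. */
struct far_ptr32 read_far_ptr32(const uint8_t mem[6])
{
    struct far_ptr32 p;
    p.offset  = (uint32_t)mem[0]
              | (uint32_t)mem[1] << 8
              | (uint32_t)mem[2] << 16
              | (uint32_t)mem[3] << 24;
    p.segment = (uint16_t)(mem[4] | (mem[5] << 8));
    return p;
}
```

With the bytes 34 12 00 00 78 56 in memory, the result is the offset 0x1234 and the segment 0x5678, the two values from the example.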
  2609. Finally, the jmp instructions with EA opcode jump to a direct
  2610. absolute address. For example, the instruction:
  2611. jmp 0x5678:0x1234
  2612. is generated into:
  2613. ea 34 12 78 56
  2614. The address 0x5678:0x1234 is right next to the opcode, unlike the
  2615. FF /5 instruction, which reads the address indirectly through the eax register.
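
The byte order of the direct form can be reproduced with a short C sketch. It assumes 16-bit code, where the operand is ptr16:16; encode_jmp_far16 is a hypothetical helper.

```c
#include <stdint.h>

/* Encode jmp ptr16:16: opcode ea, then the 16-bit offset,
   then the 16-bit segment, each in little-endian order. */
void encode_jmp_far16(uint16_t segment, uint16_t offset, uint8_t out[5])
{
    out[0] = 0xea;
    out[1] = (uint8_t)(offset & 0xff);
    out[2] = (uint8_t)(offset >> 8);
    out[3] = (uint8_t)(segment & 0xff);
    out[4] = (uint8_t)(segment >> 8);
}
```

Note that the offset comes before the segment in the machine code, even though the assembly syntax writes the segment first.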
  2616. We skip the jump instruction with REX prefix, as it is a 64-bit
  2617. instruction.
  2618. Examine compiled data
  2619. In this section, we will examine how data definitions in C map to
  2620. their assembly form. The generated code is extracted from the .data
  2621. section. That means the assembly code displayed has no meaning[footnote:
  2622. Actually, code is just a type of data, and is often used for
  2623. hijacking into a running program to execute such code. However,
  2624. we have no use for it in this book.
  2625. ], aside from showing that such a value has an equivalent
  2626. assembly opcode that represents an instruction.
  2627. The code-assembly listing is not random, but is based on Chapter
  2628. 4 of Volume 1, “Data Types”. The chapter lists the fundamental data
  2629. types that x86 hardware operates on. By studying the
  2630. generated assembly code, we can see how closely C maps
  2631. its syntax to hardware, and why C is
  2632. appropriate for OS programming. The specific objdump command used
  2633. in this section will be:
  2634. $ objdump -z -M intel -S -D <object file> | less
  2635. Note: by default, objdump hides runs of zero bytes behind three
  2636. dots (...). To show all the zero bytes, we add the -z option.
  2637. Fundamental data types
  2638. The most basic types that x86 architecture works with are based
  2639. on sizes, each is twice as large as the previous one: 1 byte (8
  2640. bits), 2 bytes (16 bits), 4 bytes (32 bits), 8 bytes (64 bits)
  2641. and 16 bytes (128 bits).
  2642. <Graphics file: C:/Users/Tu Do/os01/book_src/images/04/fundamental_data_types.pdf>
  2643. These types are the simplest: they are just chunks of memory of
  2644. different sizes that enable the CPU to access memory efficiently.
  2645. From the manual, section 4.1.1, volume 1:
  2646. Words, doublewords, and quadwords do not need to be aligned in
  2647. memory on natural boundaries. The natural boundaries for words,
  2648. double words, and quadwords are even-numbered addresses,
  2649. addresses evenly divisible by four, and addresses evenly
  2650. divisible by eight, respectively. However, to improve the
  2651. performance of programs, data structures (especially stacks)
  2652. should be aligned on natural boundaries whenever possible. The
  2653. reason for this is that the processor requires two memory
  2654. accesses to make an unaligned memory access; aligned accesses
  2655. require only one memory access. A word or doubleword operand that
  2656. crosses a 4-byte boundary or a quadword operand that crosses an
  2657. 8-byte boundary is considered unaligned and requires two separate
  2658. memory bus cycles for access.
  2659. Some instructions that operate on double quadwords require memory
  2660. operands to be aligned on a natural boundary. These instructions
  2661. generate a general-protection exception (#GP) if an unaligned
  2662. operand is specified. A natural boundary for a double quadword is
  2663. any address evenly divisible by 16. Other instructions that
  2664. operate on double quadwords permit unaligned access (without
  2665. generating a general-protection exception). However, additional
  2666. memory bus cycles are required to access unaligned data from
  2667. memory.
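
The two conditions in the quoted passage can be expressed as simple arithmetic. The C sketch below is our own reading of the quote; the function names are hypothetical and the size and boundary arguments are assumed to be nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural boundary: the address is evenly divisible by the operand size
   (2 for a word, 4 for a doubleword, 8 for a quadword, 16 for a double quadword). */
int is_naturally_aligned(uint32_t addr, size_t size)
{
    return addr % size == 0;
}

/* An operand crosses a boundary when its first and last bytes fall into
   different boundary-sized blocks; such an access needs two bus cycles. */
int crosses_boundary(uint32_t addr, size_t size, size_t boundary)
{
    return addr / boundary != (addr + size - 1) / boundary;
}
```

A word at an odd address is not on its natural boundary, yet it only costs a second bus cycle when it also straddles a 4-byte boundary, for example at an address ending in 3.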
  2668. In C, the following primitive types (stdint.h must be included)
  2669. map to the fundamental types:
  2670. Source
  2671. #include <stdint.h>
  2672. uint8_t byte = 0x12;
  2673. uint16_t word = 0x1234;
  2674. uint32_t dword = 0x12345678;
  2675. uint64_t qword = 0x123456789abcdef;
  2676. unsigned __int128 dqword1 = (__int128) 0x123456789abcdef;
  2678. unsigned __int128 dqword2 = (__int128) 0x123456789abcdef << 64;
  2680. int main(int argc, char *argv[]) {
  2681.     return 0;
  2682. }
  2683. Assembly
  2684. 0804a018 <byte>:
  2685. 804a018: 12 00 adc al,BYTE PTR [eax]
  2687. 0804a01a <word>:
  2688. 804a01a: 34 12 xor al,0x12
  2689. 0804a01c <dword>:
  2690. 804a01c: 78 56 js 804a074 <_end+0x48>
  2692. 804a01e: 34 12 xor al,0x12
  2693. 0804a020 <qword>:
  2694. 804a020: ef out dx,eax
  2695. 804a021: cd ab int 0xab
  2696. 804a023: 89 67 45 mov DWORD PTR [edi+0x45],esp
  2698. 804a026: 23 01 and eax,DWORD PTR [ecx]
  2700. 0000000000601040 <dqword1>:
  2701. 601040: ef out dx,eax
  2702. 601041: cd ab int 0xab
  2703. 601043: 89 67 45 mov DWORD PTR [rdi+0x45],esp
  2705. 601046: 23 01 and eax,DWORD PTR [rcx]
  2707. 601048: 00 00 add BYTE PTR [rax],al
  2709. 60104a: 00 00 add BYTE PTR [rax],al
  2711. 60104c: 00 00 add BYTE PTR [rax],al
  2713. 60104e: 00 00 add BYTE PTR [rax],al
  2715. 0000000000601050 <dqword2>:
  2716. 601050: 00 00 add BYTE PTR [rax],al
  2718. 601052: 00 00 add BYTE PTR [rax],al
  2720. 601054: 00 00 add BYTE PTR [rax],al
  2722. 601056: 00 00 add BYTE PTR [rax],al
  2724. 601058: ef out dx,eax
  2725. 601059: cd ab int 0xab
  2726. 60105b: 89 67 45 mov DWORD PTR [rdi+0x45],esp
  2728. 60105e: 23 01 and eax,DWORD PTR [rcx]
  2730. gcc generates the variables byte, word, dword, qword, dqword1 and
  2731. dqword2, written earlier, together with their respective values.
  2732. Since this is the data section, the
  2733. assembly listing carries no meaning. When byte is declared with
  2734. uint8_t, gcc guarantees that the size of byte is always 1 byte.
  2735. But an alert reader might notice the 00 value next to the 12
  2736. value in the byte variable. This is normal, as gcc avoids memory
  2737. misalignment by adding extra padding bytes. To make
  2738. it easier to see, we look at the readelf output of the .data section:
  2740. $ readelf -x .data hello
  2741. the output is (the colors mark which values belong to which
  2742. variables):
  2743. Hex dump of section '.data':
  2744. 0x00601020 00000000 00000000 00000000 00000000 ................
  2745. 0x00601030 12003412 78563412 efcdab89 67452301 ..4.xV4.....gE#.
  2746. 0x00601040 efcdab89 67452301 00000000 00000000 ....gE#.........
  2747. 0x00601050 00000000 00000000 efcdab89 67452301 ............gE#.
As can be seen in the readelf output, variables are allocated
storage space according to their types and in the order declared
by the programmer (the colors correspond to the variables).
Intel is a little-endian machine, which means that the least
significant byte of a value is stored at the smallest address and
the most significant byte at the largest address. For example,
0x1234 is displayed as 34 12; that is, 34 appears first at address
0x601032, then 12 at 0x601033. The order of the hexadecimal digits
within a byte is unchanged, so we see 34 12 instead of 43 21. This
is quite confusing at first, but you will get used to it soon.
Also, isn't it redundant to add int8_t when the char type is
always 1 byte already? The truth is, char is not guaranteed to be
8 bits in size, only at least 8 bits. In C, a byte is defined to
be the size of a char, and a char is defined to be the smallest
addressable unit of the underlying hardware platform. There are
hardware devices whose smallest addressable unit is 16 bits or
even bigger, which means char is 16 bits wide and a “byte” on
such platforms is actually 2 units of 8-bit bytes.
Not all architectures support the double quadword type. Still,
gcc does provide support for 128-bit numbers and generates code
for them when the target CPU supports it (that is, the target must
be 64-bit). By specifying a variable of type __int128 or unsigned
__int128, we get a 128-bit variable. If the code is compiled for a
target without 64-bit support, gcc reports an error.
The data types in C that represent the fundamental data types
are also called unsigned numbers. Other than numerical
calculations, unsigned numbers are used as a tool for structuring
data in memory; we will see this application later in the book,
when various data structures are organized into bit groups.
In all the examples above, when the value of a variable with
smaller size is assigned to a variable with larger size, the
value easily fits in the larger variable. In contrast, when the
value of a variable with larger size is assigned to a variable
with smaller size, two scenarios occur:
• The value is greater than the maximum value of the smaller
variable, so it is truncated to the size of that variable,
producing an incorrect value.
• The value is no greater than the maximum value of the smaller
variable, so it fits the variable.
However, the value might be unknown until runtime and can be any
value, so it is best not to leave such implicit conversions to
the compiler, but to have them explicitly controlled by the
programmer. Otherwise they will cause subtle bugs that are hard
to catch, as the erroneous values might occur too rarely to
reproduce the bugs.
  2793. Pointer Data Types
  2794. Pointers are variables that hold memory addresses. x86 works with
  2795. 2 types of pointers:
  2796. Near pointer is a 16-bit/32-bit offset within a segment, also
  2797. called effective address.
  2798. Far pointer is also an offset like a near pointer, but with an
  2799. explicit segment selector.
<Graphics file: pointer_data_type.pdf>
C only provides near pointers, since far pointers are platform
dependent and specific to architectures such as x86. In
application code, you can assume that the current segment starts
at address 0, so the offset is actually any memory address from 0
to the maximum address.
  2805. Source
#include <stdint.h>
int8_t i = 0;
int8_t *p1 = (int8_t *) 0x1234;
int8_t *p2 = &i;
int main(int argc, char *argv[]) {
    return 0;
}
  2813. Assembly
0000000000601030 <p1>:
601030: 34 12 xor al,0x12
601032: 00 00 add BYTE PTR [rax],al
601034: 00 00 add BYTE PTR [rax],al
601036: 00 00 add BYTE PTR [rax],al
0000000000601038 <p2>:
601038: 41 10 60 00 adc BYTE PTR [r8+0x0],spl
60103c: 00 00 add BYTE PTR [rax],al
60103e: 00 00 add BYTE PTR [rax],al
Disassembly of section .bss:
0000000000601040 <__bss_start>:
601040: 00 00 add BYTE PTR [rax],al
0000000000601041 <i>:
601041: 00 00 add BYTE PTR [rax],al
601043: 00 00 add BYTE PTR [rax],al
601045: 00 00 add BYTE PTR [rax],al
601047: 00 .byte 0x0
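The pointer p1 holds a direct address with the value 0x1234,
stored in little-endian order as 34 12 followed by zero bytes. The
pointer p2 holds the address of the variable i: its bytes 41 10 60
00 read as 0x601041, which is where i is placed in the .bss
section. Note that both pointers are 8 bytes in size (or 4 bytes,
if compiled for 32-bit).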
  2844. Bit Field Data Type
A bit field is a contiguous sequence of bits. Bit fields allow
data structuring at the bit level. For example, a 32-bit value
can hold multiple bit fields that represent multiple different
pieces of information, such as bits 0-4 specifying the size of a
data structure, bits 5-6 specifying permissions and so on. Data
structures at the bit level are common in low-level programming.
<Graphics file: bit_field_data_type.pdf>
  2852. Source
struct bit_field {
    int data1:8;
    int data2:8;
    int data3:8;
    int data4:8;
};
struct bit_field2 {
    int data1:8;
    int data2:8;
    int data3:8;
    int data4:8;
    char data5:4;
};
struct normal_struct {
    int data1;
    int data2;
    int data3;
    int data4;
};
struct normal_struct ns = {
    .data1 = 0x12345678,
    .data2 = 0x9abcdef0,
    .data3 = 0x12345678,
    .data4 = 0x9abcdef0,
};
int i = 0x12345678;
struct bit_field bf = {
    .data1 = 0x12,
    .data2 = 0x34,
    .data3 = 0x56,
    .data4 = 0x78
};
struct bit_field2 bf2 = {
    .data1 = 0x12,
    .data2 = 0x34,
    .data3 = 0x56,
    .data4 = 0x78,
    .data5 = 0xf
};
int main(int argc, char *argv[]) {
    return 0;
}
  2895. Assembly
  2896. Each variable and its value are given a unique color in the
  2897. assembly listing below:
0804a018 <ns>:
804a018: 78 56 js 804a070 <_end+0x34>
804a01a: 34 12 xor al,0x12
804a01c: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a023: 12
804a024: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a02b: 12
0804a028 <i>:
804a028: 78 56 js 804a080 <_end+0x44>
804a02a: 34 12 xor al,0x12
0804a02c <bf>:
804a02c: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a02f: 78 12 js 804a043 <_end+0x7>
0804a030 <bf2>:
804a030: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a033: 78 0f js 804a044 <_end+0x8>
804a035: 00 00 add BYTE PTR [eax],al
804a037: 00 .byte 0x0
The sample code creates 4 variables: ns, i, bf and bf2. The
definitions of the normal_struct and bit_field structs both
specify 4 integers. bit_field specifies additional information
next to each member name, separated by a colon, e.g. .data1 : 8.
This extra information is the bit width of each bit group. It
means that, even though it is defined as an int, .data1 only
consumes 8 bits of information. If additional data members are
specified after .data1, two scenarios happen:
• If the new data members fit within the remaining bits after
.data1, which are 24 bits[footnote:
Since .data1 is declared as an int, 32 bits are still allocated,
but .data1 can only access 8 bits of information.
], then the total size of the bit_field struct is still 4 bytes,
or 32 bits.
• If the new data members don't fit, then the remaining 24 bits
(3 bytes) are still allocated. However, the new data members
are allocated brand new storage, without using the previous 24
bits.
In the example, the 4 data members .data1, .data2, .data3 and
.data4 can each access 8 bits of information, and together they
can access all 4 bytes of the integer first declared by .data1.
As can be seen in the generated assembly code, the values of bf
follow the natural order as written in the C code: 12 34 56 78,
since each value is a separate member. In contrast, the value of
i is a number as a whole, so it is subject to the rule of little
endianness and thus is stored as 78 56 34 12. Note that 804a02f
is the address of the final byte in bf, but next to it is the
number 12, even though 78 is the last byte of bf. This extra 12
does not belong to the value of bf. objdump is just confused into
treating 78 as an opcode: 78 corresponds to the js instruction,
which requires an operand. For that reason, objdump grabs
whatever byte comes after 78 (here, the first byte of bf2) and
puts it there. objdump is a tool for displaying assembly code
after all. A better tool to use is gdb, which we will learn in
the next chapter. But for this chapter, objdump suffices.
Unlike bf, each data member in ns is allocated fully as an
integer, 4 bytes each, 16 bytes in total. As we can see, bit
fields and normal structs are different: a bit field structures
data at the bit level, while a normal struct works at the byte
level.
Finally, the struct of bf2[footnote:
bit_field2
] is the same as that of bf[footnote:
bit_field
], except that it contains one more data member, .data5, which is
defined as a char with a width of 4 bits. Since the first 4 bytes
are already fully used, another 4 bytes are allocated just for
.data5, even though it can only access 4 bits of information, and
the final value of bf2 is: 12 34 56 78 0f 00 00 00. The remaining
bits must be accessed by means of a pointer, or by casting to
another data type that can fully access all 4 bytes.
  2970. What happens when the definition of bit_field struct and bf
  2971. variable are changed to:
  2972. struct bit_field {
  2973. int data1:8;
  2974. };
  2975. struct bit_field bf = {
  2976. .data1 = 0x1234,
  2977. };
  2978. What will be the value of .data1?
  2979. What happens when the definition of bit_field2 struct is
  2980. changed to:
  2981. struct bit_field2 {
  2982. int data1:8;
  2983. int data5:32;
  2984. };
What is the layout of a variable of type bit_field2?
  2986. String Data Types
Although they share the same name, a string as defined by x86 is
different from a string in C. x86 defines strings as “continuous
sequences of bits, bytes, words, or doublewords”. On the other
hand, C defines a string as an array of 1-byte characters with a
zero as the last element of the array to make a null-terminated
string. This implies that strings in x86 are arrays, not C
strings. A programmer can define an array of bytes, words or
doublewords with char or uint8_t, short or uint16_t and int or
uint32_t respectively, but not an array of bits. However, such a
feature can be easily implemented, as an array of bits is
essentially an array of bytes, words or doublewords that is
operated on at the bit level.
  2998. The following code demonstrates how to define array (string) data
  2999. types:
  3000. Source
#include <stdint.h>
uint8_t a8[2] = {0x12, 0x34};
uint16_t a16[2] = {0x1234, 0x5678};
uint32_t a32[2] = {0x12345678, 0x9abcdef0};
uint64_t a64[2] = {0x123456789abcdef0, 0x123456789abcdef0};
int main(int argc, char *argv[])
{
    return 0;
}
  3012. Assembly
0804a018 <a8>:
804a018: 12 34 00 adc dh,BYTE PTR [eax+eax*1]
804a01b: 00 34 12 add BYTE PTR [edx+edx*1],dh
0804a01c <a16>:
804a01c: 34 12 xor al,0x12
804a01e: 78 56 js 804a076 <_end+0x3a>
0804a020 <a32>:
804a020: 78 56 js 804a078 <_end+0x3c>
804a022: 34 12 xor al,0x12
804a024: f0 de bc 9a f0 de bc lock fidivr WORD PTR [edx+ebx*4-0x65432110]
804a02b: 9a
0804a028 <a64>:
804a028: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a02f: 12
804a030: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a037: 12
Although a8 is an array with 2 elements, each 1 byte long, it is
still allocated 4 bytes. Again, to ensure natural alignment for
best performance, gcc pads extra zero bytes. As shown in the
assembly listing, the actual value of a8 is 12 34 00 00, with
a8[0] equal to 12 and a8[1] equal to 34.
Then comes a16 with 2 elements, each 2 bytes long. Since 2
elements are 4 bytes in total, which is naturally aligned, gcc
pads no bytes. The value of a16 is 34 12 78 56, with a16[0] equal
to 34 12 and a16[1] equal to 78 56.
Next is a32, with 2 elements, 4 bytes each. Similar to the above
arrays, the value of a32[0] is 78 56 34 12 and the value of
a32[1] is f0 de bc 9a, exactly what is assigned in the C code.
Note that objdump is confused again, as de is the opcode for the
instruction fidivr (short for reverse divide) that requires
another operand, so objdump grabs whatever following bytes make
sense to it for creating “an operand”. Only the first 8 bytes
listed under a32 belong to it.
Finally comes a64, also with 2 elements, but 8 bytes each. The
total size of a64 is 16 bytes, which is naturally aligned,
therefore no padding bytes are added. The values of both a64[0]
and a64[1] are the same: f0 de bc 9a 78 56 34 12, which objdump
misinterprets as a fidivr instruction.
  3055. [float Figure:
  3056. [Figure 0.13:
  3057. a8, a16, a32 and a64 memory layouts
  3058. ]
  3059. a8:  
  3060. +----------+
  3061. | 12 | 34 |
  3062. +----------+
  3063. a16:
  3064. +--------------------+
  3065. | 34 12   | 78 56    |
  3066. +--------------------+
  3067. a32:
  3068. +----------------------------------------+
  3069. | 78 56 34 12       | f0 de bc 9a        |
  3070. +----------------------------------------+
  3071. a64:
+---------------------------------------------------+
| f0 de bc 9a 78 56 34 12 | f0 de bc 9a 78 56 34 12 |
+---------------------------------------------------+
  3076. ]
  3077. However, beyond one-dimensional arrays that map directly to
  3078. hardware string type, C provides its own syntax for
  3079. multi-dimensional arrays:
  3080. Source
#include <stdint.h>
uint8_t a2[2][2] = {
    {0x12, 0x34},
    {0x56, 0x78}
};
uint8_t a3[2][2][2] = {
    {{0x12, 0x34},
     {0x56, 0x78}},
    {{0x9a, 0xbc},
     {0xde, 0xff}},
};
int main(int argc, char *argv[]) {
    return 0;
}
  3095. Assembly
0804a018 <a2>:
804a018: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a01b: 78 12 js 804a02f <_end+0x7>
0804a01c <a3>:
804a01c: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a01f: 78 9a js 8049fbb <_DYNAMIC+0xa7>
804a021: bc .byte 0xbc
804a022: de ff fdivrp st(7),st
Technically, multi-dimensional arrays are like normal arrays: in
the end, the total size is translated into flat allocated bytes.
A 2 x 2 array is allocated 4 bytes; a 2 x 2 x 2 array is
allocated 8 bytes, as can be seen in the assembly listing of
a2[footnote:
Again, objdump is confused and puts the number 12 next to 78 in
the a2 listing; that 12 is actually the first byte of a3.
] and a3. In low-level assembly code, the representation is the
same between a[4] and a[2][2]. However, in high-level C code, the
difference is tremendous. The syntax of multi-dimensional arrays
enables a programmer to think in higher-level concepts, instead
of having to translate manually from high-level concepts to
low-level code while keeping the high-level concepts in mind at
the same time.
The following two-dimensional array can hold a list of 2 names,
each up to 10 bytes long including the terminating null:
  3124. char names[2][10] = {
  3125. "John Doe",
  3126. "Jane Doe"
  3127. };
To access a name, we simply adjust the row index[footnote:
The left index is called the row index since it selects a row,
which here is a whole name.
] e.g. names[0], names[1]. To access an individual character
within a name, we use the column index[footnote:
Likewise, the right index is called the column index since it
selects a column within a row.
] e.g. names[0][0] gives the character “J”, names[0][1] gives the
character “o” and so on.
Without such syntax, we need to create a 20-byte array e.g.
names[20], and whenever we want to access a character e.g. to
check whether a name contains a digit, we need to calculate the
index manually. It would be distracting, since we constantly need
to switch our thinking between the actual problem and the
translation problem.
Since this is a repeating pattern, C abstracts away this
problem with the syntax for defining and manipulating
multi-dimensional arrays. Through this example, we can clearly
see the power that abstraction through language can give us. It
would be ideal if a programmer were equipped with such power to
define whatever syntax is suitable for the problem at hand. Not
many languages provide such a capacity. Fortunately, through C
macros, we can partially achieve that goal.
  3151. In all cases, an array is guaranteed to generate contiguous bytes
  3152. of memory, regardless of the dimensions it has.
What is the difference between a multi-dimensional array and an
array of pointers, or even pointers to pointers?
  3155. Examine compiled code
This section explores how a compiler transforms high-level code
into assembly code that the CPU can execute, and shows how common
assembly patterns help to create higher-level syntax. The -S
option is added to objdump to better demonstrate the connection
between high-level and low-level code.
In this section, the option --no-show-raw-insn is also added to
the objdump command to omit the opcodes for clarity:
  3163. $ objdump --no-show-raw-insn -M intel -S -D <object file> | less
  3164. Data Transfer
The previous section explored how various types of data are
created, and how they are laid out in memory. Once memory storage
is allocated for variables, it must be readable and writable.
Data transfer instructions move data (bytes, words, doublewords
or quadwords) between memory and registers, and between
registers, effectively reading from a storage source and writing
to a storage destination.
  3172. Source
  3173. #include <stdint.h>
  3174. int32_t i = 0x12345678;
  3175. int main(int argc, char *argv[]) {
  3176. int j = i;
  3177. int k = 0xabcdef;
  3178. return 0;
  3179. }
  3180. Assembly
  3181. 080483db <main>:
  3182. #include <stdint.h>
  3183. int32_t i = 0x12345678;
  3184. int main(int argc, char *argv[]) {
  3185. 80483db: push ebp
  3186. 80483dc: mov ebp,esp
  3187. 80483de: sub esp,0x10
  3188. int j = i;
  3189. 80483e1: mov eax,ds:0x804a018
  3190. 80483e6: mov DWORD PTR [ebp-0x8],eax
  3191. int k = 0xabcdef;
  3192. 80483e9: mov DWORD PTR [ebp-0x4],0xabcdef
  3193. return 0;
  3194. 80483f0: mov eax,0x0
  3195. }
  3196. 80483f5: leave
  3197. 80483f6: ret
  3198. 80483f7: xchg ax,ax
  3199. 80483f9: xchg ax,ax
  3200. 80483fb: xchg ax,ax
  3201. 80483fd: xchg ax,ax
  3202. 80483ff: nop
General data movement is performed with the mov instruction.
Note that despite the instruction being called mov, it actually
copies data from a source to a destination.
The red instruction, mov ebp,esp, copies data from the register
esp to the register ebp. This mov instruction moves data between
registers and is assigned the opcode 89.
The blue instructions, mov eax,ds:0x804a018 and mov DWORD PTR
[ebp-0x8],eax, copy data from one memory location (the i
variable) to another (the j variable). There exists no data
movement from memory to memory; it requires two mov instructions,
one for copying the data from a memory location to a register,
and one for copying the data from the register to the destination
memory location.
The pink instruction, mov DWORD PTR [ebp-0x4],0xabcdef, copies an
immediate value into memory.
Finally, the green instruction, mov eax,0x0, copies immediate
data into a register.
  3218. Expressions
  3219. Source
  3220. int expr(int i, int j)
  3221. {
  3222. int add = i + j;
  3223. int sub = i - j;
  3224. int mul = i * j;
  3225. int div = i / j;
  3226. int mod = i % j;
  3227. int neg = -i;
  3228. int and = i & j;
  3229. int or = i | j;
  3230. int xor = i ^ j;
  3231. int not = ~i;
  3232. int shl = i << 8;
  3233. int shr = i >> 8;
  3234. char equal1 = (i == j);
  3235. int equal2 = (i == j);
  3236. char greater = (i > j);
  3237. char less = (i < j);
  3238. char greater_equal = (i >= j);
  3239. char less_equal = (i <= j);
  3240. int logical_and = i && j;
  3241. int logical_or = i || j;
  3242. ++i;
  3243. --i;
  3244. int i1 = i++;
  3245. int i2 = ++i;
  3246. int i3 = i--;
  3247. int i4 = --i;
  3248. return 0;
  3249. }
  3250. int main(int argc, char *argv[]) {
  3251. return 0;
  3252. }
  3253. Assembly
  3254. The full assembly listing is really long. For that reason, we
  3255. examine expression by expression.
  3256. Expression: int add = i + j;
  3257. 80483e1: mov edx,DWORD PTR [ebp+0x8]
  3258. 80483e4: mov eax,DWORD PTR [ebp+0xc]
  3259. 80483e7: add eax,edx
  3260. 80483e9: mov DWORD PTR [ebp-0x34],eax
The assembly code is straightforward: the variables i and j are
loaded into edx and eax respectively, then added together with
the add instruction, and the final result is stored into eax.
Then, the result is saved into the local variable add, which
is at the location [ebp-0x34].
  3266. Expression: int sub = i - j;
  3267. 80483ec: mov eax,DWORD PTR [ebp+0x8]
  3268. 80483ef: sub eax,DWORD PTR [ebp+0xc]
  3269. 80483f2: mov DWORD PTR [ebp-0x30],eax
Similar to the add instruction, x86 provides a sub instruction
for subtraction. Hence, gcc translates a subtraction into a sub
instruction. eax is first reloaded with i, as eax still
carries the result from the previous expression. Then, j is
subtracted from i. After the subtraction, the value is saved
into the variable sub, at location [ebp-0x30].
  3276. Expression: int mul = i * j;
  3277. 80483f5: mov eax,DWORD PTR [ebp+0x8]
  3278. 80483f8: imul eax,DWORD PTR [ebp+0xc]
  3279. 80483fc: mov DWORD PTR [ebp-0x34],eax
Similar to the sub instruction, only eax is reloaded, since it
carries the result of the previous calculation. imul performs
signed multiplication[footnote:
Unsigned multiplication is performed by the mul instruction.
]. eax is first loaded with i, then multiplied by j with the
result stored back into eax, which is then saved into the
variable mul at location [ebp-0x34].
  3287. Expression: int div = i / j;
  3288. 80483ff: mov eax,DWORD PTR [ebp+0x8]
  3289. 8048402: cdq
  3290. 8048403: idiv DWORD PTR [ebp+0xc]
  3291. 8048406: mov DWORD PTR [ebp-0x30],eax
Similar to imul, idiv performs signed division. But, unlike the
imul above, idiv only takes one operand:
1. First, i is reloaded into eax.
2. Then, cdq converts the doubleword value in eax into a
quadword value stored in the pair of registers edx:eax, by
copying the sign bit (bit 31) of the value in eax into every bit
position in edx. The pair edx:eax is the dividend, which is the
variable i, and the operand to idiv is the divisor, which is the
variable j.
3. After the calculation, the result is stored into the pair
of registers edx:eax, with the quotient in eax and the remainder
in edx. The quotient is saved into the variable div, at
location [ebp-0x30].
  3304. Expression: int mod = i % j;
  3305. 8048409: mov eax,DWORD PTR [ebp+0x8]
  3306. 804840c: cdq
  3307. 804840d: idiv DWORD PTR [ebp+0xc]
  3308. 8048410: mov DWORD PTR [ebp-0x2c],edx
The same idiv instruction also performs the modulo operation,
since it also calculates a remainder. The remainder, left in
edx, is saved into the variable mod, at location [ebp-0x2c].
  3312. Expression: int neg = -i;
  3313. 8048413: mov eax,DWORD PTR [ebp+0x8]
  3314. 8048416: neg eax
  3315. 8048418: mov DWORD PTR [ebp-0x28],eax
neg replaces the value of its operand (the destination operand)
with its two's complement (this operation is equivalent to
subtracting the operand from 0). In this example, the value i
in eax is replaced with -i using the neg instruction.
Then, the new value is stored in the variable neg at
[ebp-0x28].
  3322. Expression: int and = i & j;
  3323. 804841b: mov eax,DWORD PTR [ebp+0x8]
  3324. 804841e: and eax,DWORD PTR [ebp+0xc]
  3325. 8048421: mov DWORD PTR [ebp-0x24],eax
and performs a bitwise AND operation on two operands, and
stores the result in the destination operand, eax. The result is
then saved into the variable and at [ebp-0x24].
  3329. Expression: int or = i | j;
  3330. 8048424: mov eax,DWORD PTR [ebp+0x8]
  3331. 8048427: or eax,DWORD PTR [ebp+0xc]
  3332. 804842a: mov DWORD PTR [ebp-0x20],eax
Similar to the and instruction, or performs a bitwise OR
operation on two operands, and stores the result in the
destination operand, eax. The result is then saved into the
variable or at [ebp-0x20].
  3337. Expression: int xor = i ^ j;
  3338. 804842d: mov eax,DWORD PTR [ebp+0x8]
  3339. 8048430: xor eax,DWORD PTR [ebp+0xc]
  3340. 8048433: mov DWORD PTR [ebp-0x1c],eax
Similar to the and/or instructions, xor performs a bitwise XOR
operation on two operands, and stores the result in the
destination operand, eax. The result is then saved into the
variable xor at [ebp-0x1c].
  3344. Expression: int not = ~i;
  3345. 8048436: mov eax,DWORD PTR [ebp+0x8]
  3346. 8048439: not eax
  3347. 804843b: mov DWORD PTR [ebp-0x18],eax
not performs a bitwise NOT operation (each 1 is set to 0, and
each 0 is set to 1) on the destination operand and stores the
result back in the same operand, eax. The result is then saved
into the variable not at [ebp-0x18].
  3352. Expression: int shl = i << 8;
  3353. 804843e: mov eax,DWORD PTR [ebp+0x8]
  3354. 8048441: shl eax,0x8
  3355. 8048444: mov DWORD PTR [ebp-0x14],eax
shl (shift logical left) shifts the bits in the destination
operand to the left by the number of bits specified in the
source operand. In this case, eax stores i and shl shifts eax
by 8 bits to the left. A different name for shl is sal (shift
arithmetic left). Both can be used synonymously. Finally, the
result is stored in the variable shl at [ebp-0x14].
Here is a visual demonstration of shl/sal and shr
instructions:
After shifting to the left, the last bit shifted out of the
left end is placed in the Carry Flag of the EFLAGS register.
  3366. Expression: int shr = i >> 8;
  3367. 8048447: mov eax,DWORD PTR [ebp+0x8]
  3368. 804844a: sar eax,0x8
  3369. 804844d: mov DWORD PTR [ebp-0x10],eax
sar is similar to shl/sal, but shifts bits to the right and
extends the sign bit. For right shifts, shr and sar are two
different instructions. shr differs from sar in that it does
not extend the sign bit. Finally, the result is stored in the
variable shr at [ebp-0x10].
In the figure (b), notice that initially, the sign bit is 1,
but after the 1-bit and 10-bit shifts, the vacated bits on the
left are filled with zeros.
  3378. [float Figure:
  3379. [Figure 0.14:
  3380. SAR Instruction Operation (Source: Figure 7-8, Volume 1)
  3381. ]
<Graphics file: sar.pdf>
  3383. ]
  3384. With sar, the sign bit (the most significant bit) is
  3385. preserved. That is, if the sign bit is 0, the new bits always
  3386. get the value 0; if the sign bit is 1, the new bits always
  3387. get the value 1.
  3388. Expression: char equal1 = (i == j);
  3389. 8048450: mov eax,DWORD PTR [ebp+0x8]
  3390. 8048453: cmp eax,DWORD PTR [ebp+0xc]
  3391. 8048456: sete al
  3392. 8048459: mov BYTE PTR [ebp-0x41],al
cmp and the variants of the set instruction make up
all the logical comparisons. In this expression, cmp compares
the variables i and j; then sete stores the value 1 in the al
register if the comparison from cmp earlier is equal, or stores
0 otherwise. The general name for the variants of the set
instruction is SETcc. The suffix cc denotes the condition being
tested for in the EFLAGS register. Appendix B in volume 1,
“EFLAGS Condition Codes”, lists the conditions it is possible
to test for with this instruction. Finally, the result is
stored in the variable equal1 at [ebp-0x41].
  3403. Expression: int equal2 = (i == j);
  3404. 804845c: mov eax,DWORD PTR [ebp+0x8]
  3405. 804845f: cmp eax,DWORD PTR [ebp+0xc]
  3406. 8048462: sete al
  3407. 8048465: movzx eax,al
  3408. 8048468: mov DWORD PTR [ebp-0xc],eax
Similar to the previous expression, this expression also compares
for equality, with the exception that the result is stored in
an int type. For that reason, one more instruction is
added: the movzx instruction, a variant of mov that copies the
source into a larger destination operand and fills the remaining
bytes with 0. In this case, since eax is 4 bytes wide, after
copying the first byte in al, the remaining bytes of eax are
filled with 0 to ensure that eax carries the same value as al.
  3417. [float Figure:
  3418. [Figure 0.15:
  3419. movzx instruction
  3420. ] [float Figure:
  3421. [Sub-Figure a:
  3422. eax before movzx
  3423. ]
  3424. +-----+-----+-----+----+
  3425. | 12 | 34 | 56 | 78 |
  3426. +-----+-----+-----+----+
  3427. ] [float Figure:
  3428. [Sub-Figure b:
  3429. after movzx eax, al
  3430. ]
  3431. +-----+-----+-----+----+
  3432. | 00 | 00 | 00 | 78 |
  3433. +-----+-----+-----+----+
  3434. ]
  3435. ]
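The effect shown in the figure can be written as a small C sketch. The function name movzx_eax_al is made up for this illustration; it models the register as a plain 32-bit integer.

```c
#include <stdint.h>

/* Hypothetical helper: model "movzx eax, al" on a 32-bit register value. */
uint32_t movzx_eax_al(uint32_t eax)
{
    uint8_t al = (uint8_t)(eax & 0xFFu);  /* al is the lowest byte of eax */
    return (uint32_t)al;                  /* the upper three bytes become 0 */
}
```

With the value from the figure, movzx_eax_al(0x12345678) returns 0x00000078.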
  3436. Expression: char greater = (i > j);
  3437. 804846b: mov eax,DWORD PTR [ebp+0x8]
  3438. 804846e: cmp eax,DWORD PTR [ebp+0xc]
  3439. 8048471: setg al
  3440. 8048474: mov BYTE PTR [ebp-0x40],al
  3441. Similar to the equality comparison, but setg is used for the
  3442. greater-than comparison instead.
  3443. Expression: char less = (i < j);
  3444. 8048477: mov eax,DWORD PTR [ebp+0x8]
  3445. 804847a: cmp eax,DWORD PTR [ebp+0xc]
  3446. 804847d: setl al
  3447. 8048480: mov BYTE PTR [ebp-0x3f],al
  3448. setl is used for the less-than comparison.
  3449. Expression: char greater_equal = (i >= j);
  3450. 8048483: mov eax,DWORD PTR [ebp+0x8]
  3451. 8048486: cmp eax,DWORD PTR [ebp+0xc]
  3452. 8048489: setge al
  3453. 804848c: mov BYTE PTR [ebp-0x3e],al
  3454. setge is used for the greater-than-or-equal comparison.
  3455. Expression: char less_equal = (i <= j);
  3456. 804848f: mov eax,DWORD PTR [ebp+0x8]
  3457. 8048492: cmp eax,DWORD PTR [ebp+0xc]
  3458. 8048495: setle al
  3459. 8048498: mov BYTE PTR [ebp-0x3d],al
  3460. setle is used for the less-than-or-equal comparison.
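The conditions tested by these SETcc variants can be modeled in C. The sketch below is an illustration under stated assumptions, not a description of the hardware: struct flags, cmp32 and the set* functions are names invented here, and only the three flags used by signed comparisons are modeled.

```c
#include <stdint.h>

/* Hypothetical model of the three EFLAGS bits used by signed comparisons. */
struct flags {
    int zf;  /* zero flag: the result was 0 */
    int sf;  /* sign flag: most significant bit of the result */
    int of;  /* overflow flag: signed overflow occurred */
};

/* Model of "cmp a, b": compute a - b and keep only the flags. */
struct flags cmp32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;           /* wraps around like the hardware */
    struct flags f;

    f.zf = (diff == 0);
    f.sf = (int)((diff >> 31) & 1);
    /* Signed overflow on subtraction: the operands have different signs
     * and the sign of the result differs from the sign of a. */
    f.of = (int)((((ua ^ ub) & (ua ^ diff)) >> 31) & 1);
    return f;
}

int sete(struct flags f)  { return f.zf; }                     /* equal            */
int setg(struct flags f)  { return !f.zf && f.sf == f.of; }    /* greater          */
int setl(struct flags f)  { return f.sf != f.of; }             /* less             */
int setge(struct flags f) { return f.sf == f.of; }             /* greater or equal */
int setle(struct flags f) { return f.zf || f.sf != f.of; }     /* less or equal    */
```

Each function returns the 0 or 1 that the corresponding instruction would store in al, so setg(cmp32(i, j)) plays the role of the cmp and setg pair in the listing for i > j.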
  3461. Expression: int logical_and = (i && j);
  3462. 804849b: cmp DWORD PTR [ebp+0x8],0x0
  3463. 804849f: je 80484ae <expr+0xd3>
  3464. 80484a1: cmp DWORD PTR [ebp+0xc],0x0
  3465. 80484a5: je 80484ae <expr+0xd3>
  3466. 80484a7: mov eax,0x1
  3467. 80484ac: jmp 80484b3 <expr+0xd8>
  3468. 80484ae: mov eax,0x0
  3469. 80484b3: mov DWORD PTR [ebp-0x8],eax
  3470. The logical AND operator && is one of the syntaxes that is
  3471. implemented entirely in software[footnote:
  3472. That is, there is no equivalent assembly instruction implemented
  3473. in hardware.
  3474. ] with simpler instructions. The algorithm in the assembly code
  3475. is simple:
  3476. 1. First, check if i is 0 with the instruction at 0x804849b.
  3477. (a) If true, jump to 0x80484ae and set eax to 0.
  3478. (b) Set the variable logical_and to 0 with the instruction at
  3479. 0x80484b3, the next instruction after 0x80484ae.
  3480. 2. If i is not 0, check if j is 0 with the instruction at
  3481. 0x80484a1.
  3482. (a) If true, jump to 0x80484ae and set eax to 0.
  3483. (b) Set the variable logical_and to 0 with the instruction at
  3484. 0x80484b3, the next instruction after 0x80484ae.
  3485. 3. If both i and j are not 0, the result is certainly 1, or
  3486. true.
  3487. (a) Set eax to 1 with the instruction at 0x80484a7.
  3488. (b) Then jump to the instruction at 0x80484b3 to set the
  3489. variable logical_and at [ebp-0x8] to 1.
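The steps above can be mirrored in C with goto. This is only a sketch of the control flow: the function name and the labels, which are named after the jump targets in the listing, are made up for illustration.

```c
/* Hypothetical C rendering of the jumps generated for (i && j). */
int logical_and_sketch(int i, int j)
{
    int eax;

    if (i == 0) goto addr_80484ae;   /* cmp [ebp+0x8],0x0 ; je 80484ae */
    if (j == 0) goto addr_80484ae;   /* cmp [ebp+0xc],0x0 ; je 80484ae */
    eax = 1;                         /* mov eax,0x1                    */
    goto addr_80484b3;               /* jmp 80484b3                    */
addr_80484ae:
    eax = 0;                         /* mov eax,0x0                    */
addr_80484b3:
    return eax;                      /* mov [ebp-0x8],eax              */
}
```

Note that j is never examined when i is 0, which is the short-circuit behavior of && in C.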
  3490. Expression: int logical_or = (i || j);
  3491. 80484b6: cmp DWORD PTR [ebp+0x8],0x0
  3492. 80484ba: jne 80484c2 <expr+0xe7>
  3493. 80484bc: cmp DWORD PTR [ebp+0xc],0x0
  3494. 80484c0: je 80484c9 <expr+0xee>
  3495. 80484c2: mov eax,0x1
  3496. 80484c7: jmp 80484ce <expr+0xf3>
  3497. 80484c9: mov eax,0x0
  3498. 80484ce: mov DWORD PTR [ebp-0x4],eax
  3499. The logical OR operator || is similar to the logical AND above.
  3500. Understanding the algorithm is left as an exercise for readers.
  3501. Expression: ++i; and --i; (or i++ and i--)
  3502. 80484d1: add DWORD PTR [ebp+0x8],0x1
  3503. 80484d5: sub DWORD PTR [ebp+0x8],0x1
  3504. The syntax of increment and decrement is similar to logical
  3505. AND and logical OR in that it is made from existing
  3506. instructions, namely add and sub. The difference is that the CPU
  3507. actually does have built-in instructions, inc and dec, but gcc
  3508. decided not to use them because inc and dec cause a partial
  3509. flag register stall, which occurs when an instruction modifies a
  3510. part of the flag register and the following instruction is
  3511. dependent on the outcome of the flags (section 3.5.2.6, Intel Optimization Manual, 2016
  3512. ). The manual even suggests that inc and dec should be
  3513. replaced with add and sub instructions (section 3.5.1.1, Intel Optimization Manual, 2016
  3514. ).
  3515. Expression: int i1 = i++;
  3516. 80484d9: mov eax,DWORD PTR [ebp+0x8]
  3517. 80484dc: lea edx,[eax+0x1]
  3518. 80484df: mov DWORD PTR [ebp+0x8],edx
  3519. 80484e2: mov DWORD PTR [ebp-0x10],eax
  3520. First, i is copied into eax at 80484d9. Then, the value of
  3521. eax + 0x1 is copied into edx as an effective address at
  3522. 80484dc. The lea (load effective address) instruction copies
  3523. a memory address into a register. According to Volume 2, the
  3524. source operand is a memory address specified with one of the
  3525. processor's addressing modes. This means the source operand
  3526. must be specified by the addressing modes defined in the
  3527. 16-bit/32-bit ModR/M Byte tables, [mod-rm-16] and [mod-rm-32]
  3528. .
  3529. After loading the incremented value into edx, the value of i
  3530. is increased by 1 at 80484df. Finally, the previous value of i
  3531. is stored into i1 at [ebp-0x10] by the instruction at
  3532. 80484e2.
  3533. Expression: int i2 = ++i;
  3534. 80484e5: add DWORD PTR [ebp+0x8],0x1
  3535. 80484e9: mov eax,DWORD PTR [ebp+0x8]
  3536. 80484ec: mov DWORD PTR [ebp-0xc],eax
  3537. The primary differences between this increment syntax and the
  3538. previous one are:
  3539. • add is used instead of lea to increase i directly.
  3540. • the newly incremented i is stored into i2 instead of the
  3541. old value.
  3542. • the expression only costs 3 instructions instead of 4.
  3543. In this listing, the prefix-increment syntax is faster than the
  3544. postfix one used previously. It might not matter much which version to
  3545. use if the increment is only used once or a few hundred times
  3546. in a small loop, but it matters when a loop runs millions or
  3547. more times. Also, depending on the circumstances, it is
  3548. more convenient to use one over the other, e.g. if i is an
  3549. index for accessing an array, we use the old value
  3550. to access the previous array element and the newly incremented i
  3551. to access the current element.
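The two listings can be compared side by side in C. The functions below are hypothetical helpers written for this comparison; each statement corresponds to one instruction of the listing it models.

```c
/* Models "int i1 = i++;": four steps, the old value is the result. */
int post_increment(int *i)
{
    int eax = *i;        /* mov eax,[ebp+0x8]  : load the old value        */
    int edx = eax + 1;   /* lea edx,[eax+0x1]  : compute old value + 1     */
    *i = edx;            /* mov [ebp+0x8],edx  : store the incremented i   */
    return eax;          /* mov [ebp-0x10],eax : i1 receives the old value */
}

/* Models "int i2 = ++i;": three steps, the new value is the result. */
int pre_increment(int *i)
{
    *i = *i + 1;         /* add [ebp+0x8],0x1  : increment i in place      */
    int eax = *i;        /* mov eax,[ebp+0x8]  : load the new value        */
    return eax;          /* mov [ebp-0xc],eax  : i2 receives the new value */
}
```

Starting from i = 5, post_increment returns 5 and leaves i at 6, while a following pre_increment returns 7 and leaves i at 7.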
  3552. Expression: int i3 = i--;
  3553. 80484ef: mov eax,DWORD PTR [ebp+0x8]
  3554. 80484f2: lea edx,[eax-0x1]
  3555. 80484f5: mov DWORD PTR [ebp+0x8],edx
  3556. 80484f8: mov DWORD PTR [ebp-0x8],eax
  3557. Similar to the i++ syntax; it is left as an exercise for readers.
  3558. Expression: int i4 = --i;
  3559. 80484fb: sub DWORD PTR [ebp+0x8],0x1
  3560. 80484ff: mov eax,DWORD PTR [ebp+0x8]
  3561. 8048502: mov DWORD PTR [ebp-0x4],eax
  3562. Similar to the ++i syntax; it is left as an exercise for readers.
  3563. Read section 3.5.2.4, “Partial Register Stalls” to understand
  3564. register stalls in general.
  3565. Read the sections from 7.3.1 to 7.3.7 in volume 1.
  3566. Stack
  3567. A stack is a contiguous array of memory locations that holds a
  3568. collection of discrete data. When a new element is added, a stack
  3569. grows down in memory toward lesser addresses, and shrinks up
  3570. toward greater addresses when an element is removed. x86 uses the
  3571. esp register to point to the top of the stack, at the newest
  3572. element. A stack can start anywhere in main memory, as
  3573. esp can be set to any memory address. x86 provides two operations
  3574. for manipulating stacks:
  3575. • the push instruction and its variants add a new element on top of
  3576. the stack.
  3577. • the pop instruction and its variants remove the top-most element
  3578. from the stack.
  3579. +---------+----+
  3580. | 0x10000 | 00 |
  3581. +---------+----+
  3582. | 0x10001 | 00 |
  3583. +---------+----+
  3584. | 0x10002 | 00 |
  3585. +---------+----+
  3586. | 0x10003 | 00 |
  3587. +---------+----+
  3588. | 0x10004 | 12 | <- esp
  3589. +---------+----+
  3590. +---------+----+
  3591. | 0x10000 | 00 |
  3592. +---------+----+
  3593. | 0x10001 | 00 |
  3594. +---------+----+
  3595. | 0x10002 | 78 | <- esp
  3596. +---------+----+
  3597. | 0x10003 | 56 |
  3598. +---------+----+
  3599. | 0x10004 | 12 |
  3600. +---------+----+
  3601. +---------+----+
  3602. | 0x10000 | 00 |
  3603. +---------+----+
  3604. | 0x10001 | 00 |
  3605. +---------+----+
  3606. | 0x10002 | 00 |
  3607. +---------+----+
  3608. | 0x10003 | 00 |
  3609. +---------+----+
  3610. | 0x10004 | 12 | <- esp
  3611. +---------+----+
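The behavior of push and pop can be imitated with a small C model. Everything here is an assumption made for illustration: memory is a 16-byte array, esp is an index into it rather than a real address, only 32-bit pushes and pops are modeled, and no bounds checking is done.

```c
#include <stdint.h>

#define STACK_SIZE 16

/* Hypothetical toy machine: a byte array as memory and an index as esp. */
struct machine {
    uint8_t  mem[STACK_SIZE];
    uint32_t esp;   /* index of the newest element; STACK_SIZE when empty */
};

void machine_init(struct machine *m)
{
    for (int k = 0; k < STACK_SIZE; k++)
        m->mem[k] = 0;
    m->esp = STACK_SIZE;
}

/* push: esp moves down by 4, then the value is stored in little-endian order. */
void push32(struct machine *m, uint32_t value)
{
    m->esp -= 4;
    for (int k = 0; k < 4; k++)
        m->mem[m->esp + k] = (uint8_t)(value >> (8 * k));
}

/* pop: the value at esp is read, then esp moves up by 4. */
uint32_t pop32(struct machine *m)
{
    uint32_t value = 0;
    for (int k = 0; k < 4; k++)
        value |= (uint32_t)m->mem[m->esp + k] << (8 * k);
    m->esp += 4;
    return value;
}
```

After machine_init, pushing 0x12345678 moves esp from 16 down to 12 and stores the bytes 78 56 34 12 at indices 12 to 15; popping returns the same value and moves esp back up to 16.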
  3616. Automatic variables
  3617. Local variables are variables that exist within a scope. A scope
  3618. is delimited by a pair of braces: {..}. The most common scope to
  3619. define local variables is function scope. However, a scope can
  3620. be unnamed, and variables created inside an unnamed scope exist
  3621. only in that scope and its inner scopes.
  3622. Function scope:
  3623. void foo() {
  3624.     int a;
  3625.     int b;
  3626. }
  3627. a and b are variables local to the function foo.
  3628. Unnamed scope:
  3629. int foo() {
  3630.     int i;
  3631.     {
  3632.         int a = 1;
  3633.         int b = 2;
  3634.         {
  3635.             return i = a + b;
  3636.         }
  3637.     }
  3638. }
  3639. a and b are local to the scope where they are defined and to its
  3640. inner child scope that returns i = a + b. However, they do not
  3641. exist at the function scope that creates i.
  3642. When a local variable is created, it is pushed on the stack; when
  3643. a local variable goes out of scope, it is popped off the stack,
  3644. thus destroyed. When an argument is passed from a caller to a
  3645. callee, it is pushed on the stack; when a callee returns to the
  3646. caller, the arguments are popped off the stack. The local
  3647. variables and arguments are automatically allocated upon entering a
  3648. function and destroyed after exiting the function; that is why they
  3649. are called automatic variables.
  3650. A base frame pointer points to the start of the current function
  3651. frame, and is kept in the ebp register. Whenever a function is
  3652. called, it is allocated its own dedicated storage on the stack,
  3653. called a stack frame. A stack frame is where all local variables
  3654. and arguments of a function are placed on a stack[footnote:
  3655. Data and only data are exclusively allocated on stack for every
  3656. stack frame. No code resides here.
  3657. ].
  3658. When a function needs a local variable or an argument, it uses
  3659. ebp to access it:
  3660. • All local variables are allocated after the ebp pointer. Thus,
  3661. to access a local variable, a number is subtracted from ebp to
  3662. reach the location of the variable.
  3663. • All arguments are allocated before the ebp pointer. To access an
  3664. argument, a number is added to ebp to reach the location of the
  3665. argument.
  3666. • The ebp pointer itself points to the saved ebp of its caller;
  3667. the return address to the caller sits right before it, at ebp+0x4.
  3668. +--------------------------------------+---------------------------------------------------------------------------+
  3669. | Previous Frame | Current Frame |
  3670. +--------------------------------------+-----------------------------+----------+----------------------------------+
  3671. | Function Arguments | | ebp | Local variables |
  3672. +-----+-----+-----+-----------+--------+-----------------------------+----------+-----+-----+-----+-----------+----+
  3673. | A1 | A2 | A3 | ........ | An | Return Address | Old ebp | L1 | L2 | L3 | ........ | Ln |
  3674. +-----+-----+-----+-----------+--------+-----------------------------+----------+-----+-----+-----+-----------+----+
  3675. A = Argument
  3676. L = Local Variable
  3677. Here is an example to make it more concrete:
  3678. Source
  3679. int add(int a, int b) {
  3680.     int i = a + b;
  3681.     return i;
  3682. }
  3685. Assembly
  3686. 080483db <add>:
  3687. #include <stdint.h>
  3688. int add(int a, int b) {
  3689. 80483db: push ebp
  3690. 80483dc: mov ebp,esp
  3691. 80483de: sub esp,0x10
  3692. int i = a + b;
  3693. 80483e1: mov edx,DWORD PTR [ebp+0x8]
  3694. 80483e4: mov eax,DWORD PTR [ebp+0xc]
  3695. 80483e7: add eax,edx
  3696. 80483e9: mov DWORD PTR [ebp-0x4],eax
  3697. return i;
  3698. 80483ec: mov eax,DWORD PTR [ebp-0x4]
  3699. }
  3700. 80483ef: leave
  3701. 80483f0: ret
  3702. In the assembly listing, [ebp-0x4] is the local variable i, since
  3703. it is allocated after ebp, with the length of 4 bytes (an int).
  3704. On the other hand, a and b are arguments and can be accessed with
  3705. ebp:
  3706. • [ebp+0x8] accesses a.
  3707. • [ebp+0xc] accesses b.
  3708. For accessing arguments, the rule is that the closer an argument
  3709. on the stack is to ebp, the closer it is to the function name.
  3710.                ebp+0xc          ebp+0x8          ebp+0x4            ebp
  3711.                   |                |                |                |
  3712.                   v                v                v                v
  3713. +---------+----------------+----------------+----------------+----------------+
  3714. |         | 00  01  02  03 | 04  05  06  07 | 08  09  0a  0b | 0c  0d  0e  0f |
  3715. +---------+----------------+----------------+----------------+----------------+
  3716. | 0x10000 |       b        |       a        | Return Address |    Old ebp     |
  3717. +---------+----------------+----------------+----------------+----------------+
  3718.                                                  ebp-0x8          ebp-0x4
  3719.                                                     |                |
  3720.                                                     v                v
  3721. +---------+----------------+----------------+----------------+----------------+
  3722. |         | 00  01  02  03 | 04  05  06  07 | 08  09  0a  0b | 0c  0d  0e  0f |
  3723. +---------+----------------+----------------+----------------+----------------+
  3724. | 0xffe0  |                |                |              N |       i        |
  3725. +---------+----------------+----------------+----------------+----------------+
  3726. N = Next local variable starts here
  3735. From the figure, we can see that a and b are laid out in memory
  3736. with the exact order as written in C, relative to the return
  3737. address.
  3738. Function Call and Return
  3739. Source
  3740. #include <stdio.h>
  3741. int add(int a, int b) {
  3742.     int local = 0x12345;
  3743.     return a + b;
  3744. }
  3745. int main(int argc, char *argv[]) {
  3746.     add(1,2);
  3747.     return 0;
  3748. }
  3749. Assembly
  3750. For every function call, gcc pushes arguments on the stack in
  3751. reversed order with push instructions. That is, the
  3752. arguments are pushed on the stack in the reverse of the order in
  3753. which they are written in high level C code, to ensure the relative order
  3754. between arguments, as seen in the previous section on how function
  3755. arguments and local variables are laid out. Then, gcc generates
  3756. a call instruction, which implicitly pushes a return
  3757. address before transferring control to the add function:
  3758. 080483f2 <main>:
  3759. int main(int argc, char *argv[]) {
  3760. 80483f2: push ebp
  3761. 80483f3: mov ebp,esp
  3762. add(1,2);
  3763. 80483f5: push 0x2
  3764. 80483f7: push 0x1
  3765. 80483f9: call 80483db <add>
  3766. 80483fe: add esp,0x8
  3767. return 0;
  3768. 8048401: mov eax,0x0
  3769. }
  3770. 8048406: leave
  3771. 8048407: ret
  3772. Upon finishing the call to the add function, the stack is restored by
  3773. adding 0x8 to the stack pointer esp (which is equivalent to 2 pop
  3774. instructions). Finally, a leave instruction is executed and main
  3775. returns with a ret instruction. A ret instruction transfers
  3776. program execution back to the caller, to the instruction right
  3777. after the call instruction; when add returns, that is the add esp,0x8
  3778. instruction at 80483fe. The reason ret can return to such a location
  3779. is that the return address, which is the address right after
  3780. the call instruction, is implicitly pushed by the call instruction;
  3781. whenever the CPU executes a ret instruction, it retrieves the return
  3782. address that sits right after all the arguments on the stack.
  3783. At the end of a function, gcc places a leave instruction to clean
  3784. up all space allocated for local variables and restore the frame
  3785. pointer to the frame pointer of the caller.
  3786. 080483db <add>:
  3787. #include <stdio.h>
  3788. int add(int a, int b) {
  3789. 80483db: push ebp
  3790. 80483dc: mov ebp,esp
  3791. 80483de: sub esp,0x10
  3792. int local = 0x12345;
  3793. 80483e1: mov DWORD PTR [ebp-0x4],0x12345
  3794. return a + b;
  3795. 80483e8: mov edx,DWORD PTR [ebp+0x8]
  3796. 80483eb: mov eax,DWORD PTR [ebp+0xc]
  3797. 80483ee: add eax,edx
  3798. }
  3799. 80483f0: leave
  3800. 80483f1: ret
  3801. The above code that gcc generated for function calling is
  3802. actually the standard method x86 defined. Read chapter 6,
  3803. “Procedure Calls, Interrupts, and Exceptions”, Intel manual volume
  3804. 1.
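The whole sequence, from pushing the arguments to restoring the stack, can be traced with a toy model in C. This is a sketch under several assumptions: the stack is an array of 32-bit words, esp and ebp are word indices rather than byte addresses, and struct cpu, push, pop, callee_add and call_add are names invented for this example.

```c
#include <stdint.h>

#define WORDS 32   /* size of the toy stack in 32-bit words (an assumption) */

/* Hypothetical toy CPU: esp and ebp are word indices into stack[].
 * The stack grows toward index 0, like the x86 stack grows toward
 * lesser addresses. */
struct cpu {
    uint32_t stack[WORDS];
    uint32_t esp;
    uint32_t ebp;
};

static void push(struct cpu *c, uint32_t v) { c->stack[--c->esp] = v; }
static uint32_t pop(struct cpu *c)          { return c->stack[c->esp++]; }

/* Models the body of add, from its first instruction to ret. */
static uint32_t callee_add(struct cpu *c)
{
    push(c, c->ebp);                      /* push ebp                           */
    c->ebp = c->esp;                      /* mov  ebp,esp                       */
    c->esp -= 4;                          /* sub  esp,0x10 (16 bytes = 4 words) */
    c->stack[c->ebp - 1] = 0x12345;       /* mov  DWORD PTR [ebp-0x4],0x12345   */

    uint32_t edx = c->stack[c->ebp + 2];  /* [ebp+0x8]: argument a              */
    uint32_t eax = c->stack[c->ebp + 3];  /* [ebp+0xc]: argument b              */
    eax += edx;                           /* add  eax,edx                       */

    c->esp = c->ebp;                      /* leave: discard local variables     */
    c->ebp = pop(c);                      /* leave: restore the caller's ebp    */
    (void)pop(c);                         /* ret: pop the return address        */
    return eax;
}

/* Models the call site in main. */
uint32_t call_add(struct cpu *c, uint32_t a, uint32_t b)
{
    push(c, b);                           /* arguments in reversed order        */
    push(c, a);
    push(c, 0x80483fe);                   /* call: push the return address      */
    uint32_t eax = callee_add(c);
    c->esp += 2;                          /* add  esp,0x8 (8 bytes = 2 words)   */
    return eax;
}
```

Starting with esp and ebp both equal to WORDS, call_add(&c, 1, 2) returns 3 and leaves both registers at their initial values, which is the point of the prologue and epilogue: the callee leaves the caller's frame exactly as it found it.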
  3805. Loop
  3806. A loop simply resets the instruction pointer to an already
  3807. executed instruction and starts from there all over again. A
  3808. loop is just one application of the jmp instruction. However, because
  3809. looping is a pervasive pattern, it earned its own syntax in C.
  3810. Source
  3811. #include <stdio.h>
  3812. int main(int argc, char *argv[]) {
  3813.     for (int i = 0; i < 10; i++) {
  3814.     }
  3815.     return 0;
  3816. }
  3817. Assembly
  3818. 080483db <main>:
  3819. #include <stdio.h>
  3820. int main(int argc, char *argv[]) {
  3821. 80483db: push ebp
  3822. 80483dc: mov ebp,esp
  3823. 80483de: sub esp,0x10
  3824. for (int i = 0; i < 10; i++) {
  3825. 80483e1: mov DWORD PTR [ebp-0x4],0x0
  3826. 80483e8: jmp 80483ee <main+0x13>
  3827. 80483ea: add DWORD PTR [ebp-0x4],0x1
  3828. 80483ee: cmp DWORD PTR [ebp-0x4],0x9
  3829. 80483f2: jle 80483ea <main+0xf>
  3830. }
  3831. return 0;
  3832. 80483f4: mov eax,0x0
  3833. }
  3834. 80483f9: leave
  3835. 80483fa: ret
  3836. 80483fb: xchg ax,ax
  3837. 80483fd: xchg ax,ax
  3838. 80483ff: nop
  3839. The colors mark corresponding high level code to assembly code:
  3840. 1. The red instruction, at 80483e1, initializes i to 0.
  3841. 2. The green instructions, at 80483ee and 80483f2, check i < 10 by
  3842. comparing i to 9 with cmp and using jle. If true, jump to 80483ea
  3843. for another iteration.
  3844. 3. The blue instruction, at 80483ea, increases i by 1, making the loop
  3845. able to terminate once the termination condition is satisfied.
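The layout of the generated loop can be reproduced in C with goto. This is only a sketch: the function name, the labels and the iterations counter are additions for illustration (the original loop body is empty), and the comments give the instruction each statement stands for.

```c
/* Hypothetical C rendering of the jumps generated for the for loop.
 * Returns how many times the loop body would run. */
int loop_sketch(void)
{
    int i;
    int iterations = 0;       /* not in the original; makes the loop observable */

    i = 0;                    /* 80483e1: mov DWORD PTR [ebp-0x4],0x0 */
    goto check;               /* 80483e8: jmp 80483ee                 */
body:
    iterations++;             /* stands in for the empty loop body    */
    i += 1;                   /* 80483ea: add DWORD PTR [ebp-0x4],0x1 */
check:
    if (i <= 9) goto body;    /* 80483ee: cmp ...,0x9 ; 80483f2: jle 80483ea */
    return iterations;
}
```

The first jump goes straight to the comparison, so the condition is tested before the first iteration, as a for loop requires.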
  3846. Why does the increment instruction (the blue instruction)
  3847. appear before the compare instructions (the green
  3848. instructions)?
  3849. What assembly code can be generated for while and do...while?
  3850. Conditional
  3851. Again, a conditional in C with the if...else... construct is just
  3852. another application of the jmp instruction under the hood. It is also
  3853. a pervasive pattern that earned its own syntax in C.
  3854. Source
  3855. #include <stdio.h>
  3856. int main(int argc, char *argv[]) {
  3857.     int i = 0;
  3858.     if (argc) {
  3859.         i = 1;
  3860.     } else {
  3861.         i = 0;
  3862.     }
  3863.     return 0;
  3864. }
  3865. Assembly
  3866. int main(int argc, char *argv[]) {
  3867. 80483db: push ebp
  3868. 80483dc: mov ebp,esp
  3869. 80483de: sub esp,0x10
  3870. int i = 0;
  3871. 80483e1: mov DWORD PTR [ebp-0x4],0x0
  3872. if (argc) {
  3873. 80483e8: cmp DWORD PTR [ebp+0x8],0x0
  3874. 80483ec: je 80483f7 <main+0x1c>
  3875. i = 1;
  3876. 80483ee: mov DWORD PTR [ebp-0x4],0x1
  3877. 80483f5: jmp 80483fe <main+0x23>
  3878. } else {
  3879. i = 0;
  3880. 80483f7: mov DWORD PTR [ebp-0x4],0x0
  3881. }
  3882. return 0;
  3883. 80483fe: mov eax,0x0
  3884. }
  3885. 8048403: leave
  3886. 8048404: ret
  3887. The generated assembly code follows the same order as the
  3888. corresponding high level syntax:
  3889. • red instructions (80483e8 to 80483f5) represent the if branch.
  3890. • the blue instruction (80483f7) represents the else branch.
  3891. • the green instruction (80483fe) is the exit point for both the if
  3892. and else branches.
  3893. The if branch first checks whether argc is false (equal to 0)
  3894. with the cmp instruction. If true, it proceeds to the else branch at
  3895. 80483f7. Otherwise, the if branch continues with the code of its
  3896. branch, which is the next instruction at 80483ee for copying 1
  3897. to i. Finally, it skips over the else branch and proceeds to
  3898. 80483fe, which is the next instruction past the if...else...
  3899. construct.
  3900. The else branch is entered when the cmp instruction from the if branch
  3901. finds argc equal to 0. The else branch starts at 80483f7, which is the
  3902. first instruction of the else branch. The instruction copies 0 to i, and
  3903. proceeds naturally to the next instruction past the
  3904. if...else... construct without any jump.
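The same control flow can be written in C with goto. The function name and labels are invented for this sketch, and unlike the original main it returns i so that the branch taken can be observed.

```c
/* Hypothetical C rendering of the jumps generated for if...else. */
int conditional_sketch(int argc)
{
    int i = 0;                         /* 80483e1: mov DWORD PTR [ebp-0x4],0x0 */

    if (argc == 0) goto else_branch;   /* 80483e8: cmp ; 80483ec: je 80483f7   */
    i = 1;                             /* 80483ee: if branch                   */
    goto end;                          /* 80483f5: jmp 80483fe                 */
else_branch:
    i = 0;                             /* 80483f7: else branch                 */
end:
    return i;                          /* the original returns 0 at 80483fe    */
}
```

Only the if branch needs an explicit jump to the exit point; the else branch reaches it by falling through.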
  3905. The Anatomy of a Program
  3906. Every program consists of code and data, and only those two
  3907. components make up a program. However, if a program consisted
  3908. purely of its own code and data, an operating system (as well as
  3909. a human) would not know which block of binary in the
  3910. program is code and which is just raw
  3911. data, where in the program to start execution, which region of
  3912. memory should be protected and which is free to modify. For that
  3913. reason, each program carries extra metadata to tell
  3914. the operating system how to handle the program.
  3915. When a source file is compiled, the generated machine code is
  3916. stored into an object file[margin:
  3917. object file
  3918. ], which is just a block of binary. One or more object
  3919. files can be combined to produce an executable binary[margin:
  3920. executable binary
  3921. ], which is a complete program runnable in an
  3922. operating system.
  3923. readelf is a program that recognizes and displays the ELF
  3924. metadata of a binary file, be it an object file or an executable
  3925. binary. ELF, or Executable and Linkable Format, defines the metadata,
  3926. starting at the very beginning of an executable, that gives an operating
  3927. system the information necessary to load the executable into main memory
  3928. and run it. ELF can be thought of as similar to the table of
  3929. contents of a book. In a book, a table of contents lists the page
  3930. numbers of the main sections, subsections, sometimes even figures
  3931. and tables for easy lookup. Similarly, ELF lists various sections
  3932. used for code and data, and the memory addresses of each symbol
  3933. along with other information.
  3934. An ELF binary is composed of:
  3935. • An ELF header[margin:
  3936. ELF header
  3937. ]: the very first section of an executable that
  3938. describes the file's organization.
  3939. • A program header table[margin:
  3940. program header table
  3941. ]: an array of fixed-size structures that
  3942. describe segments of an executable.
  3943. • A section header table[margin:
  3944. section header table
  3945. ]: an array of fixed-size structures that
  3946. describe sections of an executable.
  3947. • Segments and sections[margin:
  3948. Segments and sections
  3949. ]: the main content of an ELF binary,
  3950. which are the code and data, divided into chunks of different
  3951. purposes.
  3952. A segment is a composition of zero or more sections and
  3953. is directly loaded by an operating system at runtime.
  3954. A section is a block of binary that is either:
  3955. – actual program code and data that is available in memory when
  3956. a program runs.
  3957. – metadata about other sections used only in the linking
  3958. process, which disappears from the final executable.
  3959. The linker uses sections to build segments.
  3960. [float Figure:
  3961. [Figure 0.16:
  3962. ELF - Linking View vs Executable View (Source: Wikipedia)
  3963. ]
  3965. ]
  3966. Later we will compile our kernel as an ELF executable with GCC,
  3967. and explicitly specify how segments are created and where they
  3968. are loaded in memory through the use of a linker script, a text file
  3969. that instructs a linker how to generate a binary. For now, we
  3970. will examine the anatomy of an ELF executable in detail.
  3971. Reference documents:
  3972. The [margin:
  3973. ELF specification
  3974. ]ELF specification is bundled as a man page in Linux:
  3975. $ man elf
  3976. It is a useful resource to understand and implement ELF. However,
  3977. it will be much easier to use after you finish this chapter, as
  3978. the specification mixes implementation details in it.
  3979. The default specification is a generic one, which every ELF
  3980. implementation follows. However, each platform provides extra
  3981. features unique to it. The ELF specification for x86 is currently
  3982. maintained on GitHub by H.J. Lu: https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
  3983. .
  3984. Platform-dependent details are referred to as “processor specific”
  3985. in the generic ELF specification. We will not explore these
  3986. details, but study the generic details, which are enough for
  3987. crafting an ELF binary image for our operating system.
  3988. ELF header
  3989. To see the information of an ELF header:
  3990. $ readelf -h hello
  3991. The output:
  3992. ELF Header:
  3993.   Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  3994.   Class:                             ELF64
  3995.   Data:                              2's complement, little endian
  3996.   Version:                           1 (current)
  3997.   OS/ABI:                            UNIX - System V
  3998.   ABI Version:                       0
  3999.   Type:                              EXEC (Executable file)
  4000.   Machine:                           Advanced Micro Devices X86-64
  4001.   Version:                           0x1
  4002.   Entry point address:               0x400430
  4003.   Start of program headers:          64 (bytes into file)
  4004.   Start of section headers:          6648 (bytes into file)
  4005.   Flags:                             0x0
  4006.   Size of this header:               64 (bytes)
  4007.   Size of program headers:           56 (bytes)
  4008.   Number of program headers:         9
  4009.   Size of section headers:           64 (bytes)
  4010.   Number of section headers:         31
  4011.   Section header string table index: 28
  4014. Let's go through each field:
  4015. Magic
  4016. Displays the raw bytes that uniquely identify a file as an ELF
  4017. binary. Each byte carries a brief piece of information.
  4018. In the example, we have the following magic bytes:
  4019. Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  4020. Examine byte by byte:
  4021. Byte                    Description
  4022. ------------------------------------------------------------------------------------------
  4023. 7f 45 4c 46             Predefined values. The first byte is always 7F, the remaining 3
  4024.                         bytes represent the string “ELF”.
  4025. 02                      See Class field below.
  4026. 01                      See Data field below.
  4027. 01                      See Version field below.
  4028. 00                      See OS/ABI field below.
  4029. 00                      ABI version, shown as the ABI Version field in the output above.
  4030. 00 00 00 00 00 00 00    Padding bytes, added for proper alignment and reserved for future
  4031.                         use. These bytes are unused and are always set to 0.
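As a sketch of how a loader might use these bytes, the C functions below check the four magic bytes and read the identification bytes that follow them. The function names are invented for this example; the byte positions are those listed in the table above.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf starts with the four ELF magic bytes, 0 otherwise. */
int is_elf(const uint8_t *buf, size_t len)
{
    return len >= 4 &&
           buf[0] == 0x7f &&   /* always 7F */
           buf[1] == 0x45 &&   /* 'E' */
           buf[2] == 0x4c &&   /* 'L' */
           buf[3] == 0x46;     /* 'F' */
}

/* Hypothetical helpers: read the identification bytes after the magic.
 * ident must point to at least 8 bytes. */
int elf_class(const uint8_t *ident)   { return ident[4]; }  /* 1 = 32-bit, 2 = 64-bit */
int elf_data(const uint8_t *ident)    { return ident[5]; }  /* 1 = little endian, 2 = big endian */
int elf_version(const uint8_t *ident) { return ident[6]; }  /* 1 = current version */
int elf_osabi(const uint8_t *ident)   { return ident[7]; }  /* target operating system ABI */
```

For the magic bytes of the example, is_elf returns 1, elf_class returns 2 (64-bit objects), and elf_data returns 1 (little endian).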
  4032. Class
  4033. A byte in Magic field. It specifies the class or capacity of a
  4034. file.
  4035. Possible values:
  4036. Value Description
  4037. ---------------------------
  4038. 0 Invalid class
  4039. 1 32-bit objects
  4040. 2 64-bit objects
  4041. Data
  4042. A byte in Magic field. It specifies the data encoding of the
  4043. processor-specific data in the object file.
  4044. Possible values:
  4045. Value Description
  4046. ------------------------------------------
  4047. 0 Invalid data encoding
  4048. 1 Little endian, 2's complement
  4049. 2 Big endian, 2's complement
  4050. Version
  4051. A byte in Magic. It specifies the ELF header version number.
  4052. Possible values:
  4053. Value Description
  4054. ----------------------------
  4055. 0 Invalid version
  4056. 1 Current version
  4057. OS/ABI
  4058. A byte in Magic field. It specifies the target operating system
  4059. ABI. Originally, it was a padding byte.
  4060. Possible values: Refer to the latest ABI document, as it is a
  4061. long list of different operating systems.
  4062. Type
  4063. Identifies the object file type.
  4064. Value     Description
  4065. ---------------------------------------------
  4067. 0         No file type
  4068. 1         Relocatable file
  4069. 2         Executable file
  4070. 3         Shared object file
  4071. 4         Core file
  4072. 0xff00    Processor specific, lower bound
  4073. 0xffff    Processor specific, upper bound
  4074. The values from 0xff00 to 0xffff are reserved for a processor
  4075. to define additional file types meaningful to it.
  4076. Machine
  4077. Specifies the required architecture value for an ELF file e.g.
  4078. x86_64, MIPS, SPARC, etc. In the example, the machine is of x86_64
  4079. architecture.
  4080. Possible values: Please refer to the latest ABI document, as it
  4081. is a long list of different architectures.
  4082. Version
  4083. Specifies the version number of the current object file (not
  4084. the version of the ELF header, as the above Version field
  4085. specified).
  4086. Entry point address
  4087. Specifies the memory address of the very first code to be
  4088. executed. In a normal application program built with gcc, the default
  4089. is the address of the startup routine _start, which eventually calls
  4090. main, but it can be any function by explicitly specifying the function
  4091. name to gcc. For the operating system we are going to write, this is
  4092. the single most important field that we need to retrieve to bootstrap
  4093. our kernel, and everything else can be ignored.
  4094. Start of program headers
  4095. The offset of the program header table, in bytes. In the
  4096. example, this number is 64 bytes, which means the 65th byte, or
  4097. <start address> + 64, is the start address of the program
  4098. header table. That is, if a program is loaded at address 0x10000
  4099. in memory, then the start address is 0x10000 (the very first
  4100. byte of Magic field, where the value 0x7f resides) and the
  4101. start address of program header table is 0x10000 + 0x40 = 0x10040
  4102. .
  4103. Start of section headers
  4104. The offset of the section header table in bytes, similar to the
  4105. start of program headers. In the example, it is 6648 bytes into
  4106. file.
  4107. Flags
  4108. Holds processor-specific flags associated with the file. The x86
  4109. architecture defines no flags for this field, so in the example
  4110. the value is 0x0. Despite the similar name, this field is not
  4111. related to the EFLAGS register.
  4112. Size of this header
  4113. Specifies the total size of the ELF header in bytes. In the
  4114. example, it is 64 bytes, which is equal to Start of
  4115. program headers. Note that these two numbers are not necessarily
  4116. equal, as the program header table might be placed far away
  4117. from the ELF header. The only fixed component in the ELF
  4118. executable binary is the ELF header, which appears at the very
  4119. beginning of the file.
  4120. Size of program headers
  4121. Specifies the size of each program header in bytes. In the
  4122. example, it is 64 bytes.
  4123. Number of program headers
  4124. Specifies the total number of program headers. In the example,
  4125. the file has a total of 9 program headers.
  4126. Size of section headers
  4127. Specifies the size of each section header in bytes. In the
  4128. example, it is 64 bytes.
  4129. Number of section headers
  4130. Specifies the total number of section headers. In the example,
  4131. the file has a total of 31 section headers. In a section header
  4132. table, the first entry in the table is always an empty section.
  4133. Section header string table index
  4134. Specifies the index of the entry in the section header table
  4135. that describes the section holding the null-terminated names of
  4136. all sections. In the example, the index is 28, which means it is
  4137. the entry at index 28 of the table (counting from 0).
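To make the header fields above concrete, here is a minimal C sketch of the 64-bit ELF header as a struct, with two helpers that compute the locations described above. The names elf64_header, ph_table_addr and sh_entry_offset are hypothetical and chosen only for illustration; the field order and sizes follow the 64-bit ELF header, which is 64 bytes in total.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written mirror of the 64-bit ELF header (64 bytes in total).
 * The names are illustrative; on Linux, <elf.h> provides Elf64_Ehdr. */
struct elf64_header {
    uint8_t  ident[16];   /* Magic, Class, Data, Version, OS/ABI, ... */
    uint16_t type;        /* Type */
    uint16_t machine;     /* Machine */
    uint32_t version;     /* Version */
    uint64_t entry;       /* Entry point address */
    uint64_t phoff;       /* Start of program headers (file offset) */
    uint64_t shoff;       /* Start of section headers (file offset) */
    uint32_t flags;       /* Flags */
    uint16_t ehsize;      /* Size of this header */
    uint16_t phentsize;   /* Size of program headers */
    uint16_t phnum;       /* Number of program headers */
    uint16_t shentsize;   /* Size of section headers */
    uint16_t shnum;       /* Number of section headers */
    uint16_t shstrndx;    /* Section header string table index */
};

/* Address of the program header table when the file image is loaded
 * at `base`: <start address> + Start of program headers. */
uint64_t ph_table_addr(uint64_t base, const struct elf64_header *h)
{
    return base + h->phoff;
}

/* File offset of the section header with the given index. */
uint64_t sh_entry_offset(const struct elf64_header *h, uint16_t index)
{
    assert(index < h->shnum);  /* the index must refer to an existing entry */
    return h->shoff + (uint64_t)index * h->shentsize;
}
```

With Start of program headers equal to 64 and a load address of 0x10000, ph_table_addr returns 0x10040, the same value computed by hand above.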
  4138. Section header table
  4139. As we know already, code and data compose a program. However, not
  4140. all types of code and data have the same purpose. For that
  4141. reason, instead of one big chunk of code and data, they are divided
  4142. into smaller chunks called sections, and each section must satisfy
  4143. these conditions (according to gABI):
  4144. • Every section in an object file has exactly one section header
  4145. describing it. But, section headers may exist that do not have
  4146. a section.
  4147. • Each section occupies one contiguous (possibly empty) sequence
  4148. of bytes within a file. That is, a section is never split into
  4149. two separate regions of bytes.
  4150. • Sections in a file may not overlap. No byte in a file resides
  4151. in more than one section.
  4152. • An object file may have inactive space. The various headers and
  4153. the sections might not “cover” every byte in an object file.
  4154. The contents of the inactive data are unspecified.
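The non-overlap condition can be checked mechanically. The following is a minimal sketch, assuming each section is described only by its type, file offset and size; the struct and function names are hypothetical. It also assumes that a section of type NOBITS (value 8 in the ELF specification, used for .bss) occupies no bytes in the file, so it can never overlap anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTION_TYPE_NOBITS 8u  /* SHT_NOBITS in the ELF specification */

/* Only the fields needed for the check; the names are illustrative. */
struct section_extent {
    uint32_t type;    /* section type, e.g. NOBITS */
    uint64_t offset;  /* offset of the section in the file */
    uint64_t size;    /* size of the section in bytes */
};

/* Number of bytes the section actually occupies in the file. */
uint64_t file_size_of(const struct section_extent *s)
{
    return s->type == SECTION_TYPE_NOBITS ? 0 : s->size;
}

/* True if any two sections share at least one byte of the file. */
bool sections_overlap(const struct section_extent *s, size_t n)
{
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            uint64_t a_size = file_size_of(&s[i]);
            uint64_t b_size = file_size_of(&s[j]);
            if (a_size == 0 || b_size == 0)
                continue;  /* an empty sequence of bytes overlaps nothing */
            /* Two half-open ranges [start, end) intersect when each
             * one starts before the other one ends. */
            if (s[i].offset < s[j].offset + b_size &&
                s[j].offset < s[i].offset + a_size)
                return true;
        }
    }
    return false;
}
```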
  4155. To get all the section headers from an executable binary, e.g.
  4156. hello, use the following command:
  4157. $ readelf -S hello
  4158. Here is a sample output (do not worry if you don't understand the
  4159. output. Just skim to get your eyes familiar with it. We will
  4160. dissect it soon enough):
  4161. There are 31 section headers, starting at offset 0x19c8:
  4162. Section Headers:
  4163. [Nr] Name              Type             Address           Offset
  4164.      Size              EntSize          Flags  Link  Info  Align
  4165. [ 0]                   NULL             0000000000000000  00000000
  4166.      0000000000000000  0000000000000000           0     0     0
  4167. [ 1] .interp           PROGBITS         0000000000400238  00000238
  4168.      000000000000001c  0000000000000000   A       0     0     1
  4169. [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
  4170.      0000000000000020  0000000000000000   A       0     0     4
  4171. [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
  4172.      0000000000000024  0000000000000000   A       0     0     4
  4173. [ 4] .gnu.hash         GNU_HASH         0000000000400298  00000298
  4174.      000000000000001c  0000000000000000   A       5     0     8
  4175. [ 5] .dynsym           DYNSYM           00000000004002b8  000002b8
  4176.      0000000000000048  0000000000000018   A       6     1     8
  4177. [ 6] .dynstr           STRTAB           0000000000400300  00000300
  4178.      0000000000000038  0000000000000000   A       0     0     1
  4179. [ 7] .gnu.version      VERSYM           0000000000400338  00000338
  4180.      0000000000000006  0000000000000002   A       5     0     2
  4181. [ 8] .gnu.version_r    VERNEED          0000000000400340  00000340
  4182.      0000000000000020  0000000000000000   A       6     1     8
  4183. [ 9] .rela.dyn         RELA             0000000000400360  00000360
  4184.      0000000000000018  0000000000000018   A       5     0     8
  4185. [10] .rela.plt         RELA             0000000000400378  00000378
  4186.      0000000000000018  0000000000000018  AI       5    24     8
  4187. [11] .init             PROGBITS         0000000000400390  00000390
  4188.      000000000000001a  0000000000000000  AX       0     0     4
  4189. [12] .plt              PROGBITS         00000000004003b0  000003b0
  4190.      0000000000000020  0000000000000010  AX       0     0     16
  4191. [13] .plt.got          PROGBITS         00000000004003d0  000003d0
  4192.      0000000000000008  0000000000000000  AX       0     0     8
  4193. [14] .text             PROGBITS         00000000004003e0  000003e0
  4194.      0000000000000192  0000000000000000  AX       0     0     16
  4195. [15] .fini             PROGBITS         0000000000400574  00000574
  4196.      0000000000000009  0000000000000000  AX       0     0     4
  4197. [16] .rodata           PROGBITS         0000000000400580  00000580
  4198.      0000000000000004  0000000000000004  AM       0     0     4
  4199. [17] .eh_frame_hdr     PROGBITS         0000000000400584  00000584
  4200.      000000000000003c  0000000000000000   A       0     0     4
  4201. [18] .eh_frame         PROGBITS         00000000004005c0  000005c0
  4202.      0000000000000114  0000000000000000   A       0     0     8
  4203. [19] .init_array       INIT_ARRAY       0000000000600e10  00000e10
  4204.      0000000000000008  0000000000000000  WA       0     0     8
  4205. [20] .fini_array       FINI_ARRAY       0000000000600e18  00000e18
  4206.      0000000000000008  0000000000000000  WA       0     0     8
  4207. [21] .jcr              PROGBITS         0000000000600e20  00000e20
  4208.      0000000000000008  0000000000000000  WA       0     0     8
  4209. [22] .dynamic          DYNAMIC          0000000000600e28  00000e28
  4210.      00000000000001d0  0000000000000010  WA       6     0     8
  4211. [23] .got              PROGBITS         0000000000600ff8  00000ff8
  4212.      0000000000000008  0000000000000008  WA       0     0     8
  4213. [24] .got.plt          PROGBITS         0000000000601000  00001000
  4214.      0000000000000020  0000000000000008  WA       0     0     8
  4215. [25] .data             PROGBITS         0000000000601020  00001020
  4216.      0000000000000010  0000000000000000  WA       0     0     8
  4217. [26] .bss              NOBITS           0000000000601030  00001030
  4218.      0000000000000008  0000000000000000  WA       0     0     1
  4219. [27] .comment          PROGBITS         0000000000000000  00001030
  4220.      0000000000000034  0000000000000001  MS       0     0     1
  4221. [28] .shstrtab         STRTAB           0000000000000000  000018b6
  4222.      000000000000010c  0000000000000000           0     0     1
  4223. [29] .symtab           SYMTAB           0000000000000000  00001068
  4224.      0000000000000648  0000000000000018          30    47     8
  4225. [30] .strtab           STRTAB           0000000000000000  000016b0
  4226.      0000000000000206  0000000000000000           0     0     1
  4227. Key to Flags:
  4228.   W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  4229.   I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  4230.   O (extra OS processing required) o (OS specific), p (processor specific)
  4269. The first line:
  4270. There are 31 section headers, starting at offset 0x19c8
  4271. summarizes the total number of sections in the file and the file
  4272. offset where the section header table starts. Then comes the
  4273. listing, section by section, under the following header, which is
  4274. also the format of each section's output:
  4275. [Nr] Name Type Address Offset
  4276. Size EntSize Flags Link Info Align
  4277. Each section has two lines with different fields:
  4278. Nr The index of each section.
  4279. Name The name of each section.
  4280. Type This field (in a section header) identifies the type of
  4281. each section. Types classify sections (similar to how types in
  4282. programming languages are used by a compiler).
  4283. Address The starting virtual address of each section. Note that
  4284. the addresses are virtual only when a program runs in an OS
  4285. with support for virtual memory enabled. In our OS, since we
  4286. run on bare metal, the addresses will all be physical.
  4287. Offset The offset of each section into a file. An offset is a
  4288. distance in bytes, from the first byte of a file to the start of
  4289. an object, such as a section or a segment in the context of an
  4290. ELF binary file.
  4292. Size The size in bytes of each section.
  4293. EntSize Some sections hold a table of fixed-size entries, such
  4294. as a symbol table. For such a section, this member gives the
  4295. size in bytes of each entry. The member contains 0 if the
  4296. section does not hold a table of fixed-size entries.
  4297. Flags describe the attributes of a section. Flags together with
  4298. a type define the purpose of a section. Two sections can be of
  4299. the same type, but serve different purposes. For example, even
  4300. though .data and .text share the same type, .data holds the
  4301. initialized data of a program while .text holds executable
  4302. instructions of a program. For that reason, .data is given read
  4303. and write permission, but not execute permission. Any attempt to
  4304. execute code in .data is denied by the running OS: in Linux,
  4305. such invalid section usage gives a segmentation fault.
  4306. ELF gives an OS the information it needs to enable such a
  4307. protection mechanism. However, when running on bare metal, nothing
  4308. prevents us from doing anything. Our OS can execute code in a data
  4309. section and, conversely, write to a code section.
  4310. [Table 5: Section Flags]
  4311. +------+--------------------------------------------------------------------+
  4312. | Flag | Description |
  4313. +------+--------------------------------------------------------------------+
  4314. | W | Bytes in this section are writable during execution. |
  4315. +------+--------------------------------------------------------------------+
  4316. | A | Memory is allocated for this section during process execution.
  4317. Some control sections do not reside in the memory image of an
  4318. object file; this attribute is off for those sections. |
  4319. +------+--------------------------------------------------------------------+
  4320. | X | The section contains executable instructions. |
  4321. +------+--------------------------------------------------------------------+
  4322. | M | The data in the section may be merged to eliminate duplication.
  4323. Each element in the section is compared against other elements in
  4324. sections with the same name, type and flags. Elements that would
  4325. have identical values at program run-time may be merged. |
  4326. +------+--------------------------------------------------------------------+
  4327. | S | The data elements in the section consist of null-terminated
  4328. character strings. The size of each character is specified in the
  4329. section header's EntSize field. |
  4330. +------+--------------------------------------------------------------------+
  4331. | l | Specific large section for x86_64 architecture. This flag is not
  4332. specified in the Generic ABI but in x86_64 ABI. |
  4333. +------+--------------------------------------------------------------------+
  4334. | I | The Info field of this section header holds an index of a section
  4335. header. Otherwise, the number is the index of something else. |
  4336. +------+--------------------------------------------------------------------+
  4337. | L | Preserve section ordering when linking. If this section is
  4338. combined with other sections in the output file, it must appear
  4339. in the same relative order with respect to those sections, as the
  4340. linked-to section appears with respect to sections the linked-to
  4341. section is combined with. This applies when the Link field of this
  4342. section's header references another section (the linked-to
  4343. section). |
  4344. +------+--------------------------------------------------------------------+
  4345. | G | This section is a member (perhaps the only one) of a section
  4346. group. |
  4347. +------+--------------------------------------------------------------------+
  4348. | T | This section holds Thread-Local Storage, meaning that each thread
  4349. has its own distinct instance of this data. A thread is a
  4350. distinct execution flow of code. A program can have multiple
  4351. threads that pack different pieces of code and execute
  4352. separately, at the same time. We will learn more about threads
  4353. when writing our kernel. |
  4354. +------+--------------------------------------------------------------------+
  4355. | E | Link editor is to exclude this section from executable and shared
  4356. library that it builds when those objects are not to be further
  4357. relocated. |
  4358. +------+--------------------------------------------------------------------+
  4359. | x | Unknown flag to readelf. It happens because the linking process
  4360. can be done manually with a linker like GNU ld (we will learn about
  4361. it later). That is, section flags can be specified manually, and
  4362. some flags are for a customized ELF that the open-source readelf
  4363. doesn't know of. |
  4364. +------+--------------------------------------------------------------------+
  4365. | O | This section requires special OS-specific processing (beyond the
  4366. standard linking rules) to avoid incorrect behavior. If a link
  4367. editor does not recognize the OS-specific Type or Flags values of
  4368. such a section, it should reject the object file that contains the
  4369. section with an error. |
  4370. +------+--------------------------------------------------------------------+
  4371. | o | All bits included in this flag are reserved for operating
  4372. system-specific semantics. |
  4373. +------+--------------------------------------------------------------------+
  4374. | p | All bits included in this flag are reserved for
  4375. processor-specific semantics. If meanings are specified, the
  4376. processor supplement explains them. |
  4377. +------+--------------------------------------------------------------------+
  4381. Link and Info are numbers that reference the indexes of
  4382. sections, symbol table entries, or hash table entries. The Link
  4383. field holds the index of a section, while the Info field holds
  4384. the index of a section, a symbol table entry or a hash table
  4385. entry, depending on the type of the section.
  4386. Later when writing our OS, we will handcraft the kernel image
  4387. by explicitly linking the object files (produced by gcc)
  4388. through a linker script. We will specify the memory layout of
  4389. sections by specifying at what addresses they will appear in
  4390. the final image. But we will not assign any section flag and
  4391. let the linker take care of it. Nevertheless, knowing which
  4392. flag does what is useful.
  4393. Align is a value that requires the address of a section to be
  4394. divisible by the value. Only 0 and positive integral powers
  4395. of two are allowed. Values 0 and 1 mean the section has no
  4396. alignment constraint.
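The Flags column is a compact rendering of a bit mask stored in each section header. As a minimal sketch, the function below turns such a mask into the letters readelf prints. The bit values are those assigned by the generic ELF specification, not something stated in the text above; the function name is hypothetical, and the lowercase and processor-specific flags (l, x, o, p) as well as E are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits from the generic ELF specification, paired with the
 * letters readelf prints for them. */
static const struct { uint64_t bit; char letter; } flag_letters[] = {
    { 0x001, 'W' },  /* SHF_WRITE */
    { 0x002, 'A' },  /* SHF_ALLOC */
    { 0x004, 'X' },  /* SHF_EXECINSTR */
    { 0x010, 'M' },  /* SHF_MERGE */
    { 0x020, 'S' },  /* SHF_STRINGS */
    { 0x040, 'I' },  /* SHF_INFO_LINK */
    { 0x080, 'L' },  /* SHF_LINK_ORDER */
    { 0x100, 'O' },  /* SHF_OS_NONCONFORMING */
    { 0x200, 'G' },  /* SHF_GROUP */
    { 0x400, 'T' },  /* SHF_TLS */
};

/* Writes the letters of all set flags into `out`, which must hold at
 * least 11 bytes, and returns `out`. Bits not in the table are ignored. */
char *section_flags_to_letters(uint64_t flags, char *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < sizeof flag_letters / sizeof flag_letters[0]; i++) {
        if (flags & flag_letters[i].bit)
            out[n++] = flag_letters[i].letter;
    }
    out[n] = '\0';
    return out;
}
```

For example, a mask of 0x6 (alloc and execute) yields "AX", the flags of .text in the sample output, and 0x3 yields "WA", the flags of .data.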
  4397. Output of .interp section:
  4398. [Nr] Name              Type             Address           Offset
  4399.      Size              EntSize          Flags  Link  Info  Align
  4400. [ 1] .interp           PROGBITS         0000000000400238  00000238
  4401.      000000000000001c  0000000000000000   A       0     0     1
  4405. Nr is 1.
  4406. Type is PROGBITS, which means this section is part of the
  4407. program.
  4408. Address is 0x0000000000400238, which means the section is
  4409. loaded at this virtual memory address at runtime.
  4410. Offset is 0x00000238 bytes into file.
  4411. Size is 0x000000000000001c in bytes.
  4412. EntSize is 0, which means this section does not have any
  4413. fixed-size entry.
  4414. Flags are A (Allocatable), which means this section consumes
  4415. memory at runtime.
  4416. Info and Link are 0 and 0, which means this section links to no
  4417. section or entry in any table.
  4418. Align is 1, which means no alignment.
  4419. Output of the .text section:
  4420. [14] .text             PROGBITS         00000000004003e0  000003e0
  4421.      0000000000000192  0000000000000000  AX       0     0     16
  4424. Nr is 14.
  4425. Type is PROGBITS, which means this section is part of the
  4426. program.
  4427. Address is 0x00000000004003e0, which means the section is
  4428. loaded at this virtual memory address at runtime.
  4429. Offset is 0x000003e0 bytes into file.
  4430. Size is 0x0000000000000192 in bytes.
  4431. EntSize is 0, which means this section does not have any
  4432. fixed-size entry.
  4433. Flags are A (Allocatable) and X (Executable), which means this
  4434. section consumes memory and can be executed as code at runtime.
  4435. Info and Link are 0 and 0, which means this section links to no
  4436. section or entry in any table.
  4437. Align is 16, which means the starting address of the section
  4438. should be divisible by 16, or 0x10. Indeed, it is: 0x4003e0 / 0x10 = 0x4003e.
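The divisibility check for Align is easy to express in code. Below is a minimal sketch with hypothetical function names; because a valid Align value is 0 or a power of two, the remainder can be computed with a bit mask instead of a division.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An Align value is valid if it is 0 or a power of two. */
bool align_is_valid(uint64_t align)
{
    return (align & (align - 1)) == 0;
}

/* True if `addr` satisfies the alignment constraint `align`. */
bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align_is_valid(align));
    if (align <= 1)
        return true;                   /* 0 and 1 mean "no constraint" */
    return (addr & (align - 1)) == 0;  /* same as addr % align == 0 */
}
```

For the .text section above, is_aligned(0x4003e0, 16) holds, while the address of .fini, 0x400574, satisfies an alignment of 4 but would not satisfy an alignment of 16.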
  4440. Understand Section in-depth
  4441. In this section, we will learn the details of different section
  4442. types and the purposes of special sections, e.g. .bss, .text and
  4443. .data, by looking at each section one by one. We will also examine
  4444. the content of each section as a hexdump with the command:
  4445. $ readelf -x <section name|section number> <file>
  4446. For example, if you want to examine the content of the section
  4447. with index 25 (the .data section in the sample output) in the file
  4448. hello:
  4449. $ readelf -x 25 hello
  4450. Equivalently, using name instead of index works:
  4451. $ readelf -x .data hello
  4452. If a section contains strings, e.g. a string table, the option
  4453. -x can be replaced with -p.
  4454. NULL marks a section header as inactive, without an associated
  4455. section. The NULL section is always the first entry of the
  4456. section header table. That means the index of any useful section
  4457. starts from 1.
  4458. The sample output of NULL section:
  4459. [Nr] Name              Type             Address           Offset
  4460.      Size              EntSize          Flags  Link  Info  Align
  4461. [ 0]                   NULL             0000000000000000  00000000
  4462.      0000000000000000  0000000000000000           0     0     0
  4467. Examining the content, the section is empty:
  4468. Section '' has no data to dump.
  4469. NOTE marks a section that holds special information, supplied by a
  4470. vendor or a system builder, that other programs check for
  4471. conformance, compatibility, and so on.
  4472. In the sample output, we have 2 NOTE sections:
  4473. [Nr] Name              Type             Address           Offset
  4474.      Size              EntSize          Flags  Link  Info  Align
  4475. [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
  4476.      0000000000000020  0000000000000000   A       0     0     4
  4477. [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
  4478.      0000000000000024  0000000000000000   A       0     0     4
  4485. Examine 2nd section with the command:
  4486. $ readelf -x 2 hello
  4487. we have:
  4488. Hex dump of section '.note.ABI-tag':
  4489. 0x00400254 04000000 10000000 01000000 474e5500 ............GNU.
  4490. 0x00400264 00000000 02000000 06000000 20000000 ............ ...
  4493. PROGBITS indicates a section holding the main content of a
  4494. program, either code or data.
  4495. There are many PROGBITS sections:
  4496. [Nr] Name              Type             Address           Offset
  4497.      Size              EntSize          Flags  Link  Info  Align
  4498. [ 1] .interp           PROGBITS         0000000000400238  00000238
  4499.      000000000000001c  0000000000000000   A       0     0     1
  4500. ...
  4501. [11] .init             PROGBITS         0000000000400390  00000390
  4502.      000000000000001a  0000000000000000  AX       0     0     4
  4503. [12] .plt              PROGBITS         00000000004003b0  000003b0
  4504.      0000000000000020  0000000000000010  AX       0     0     16
  4505. [13] .plt.got          PROGBITS         00000000004003d0  000003d0
  4506.      0000000000000008  0000000000000000  AX       0     0     8
  4507. [14] .text             PROGBITS         00000000004003e0  000003e0
  4508.      0000000000000192  0000000000000000  AX       0     0     16
  4509. [15] .fini             PROGBITS         0000000000400574  00000574
  4510.      0000000000000009  0000000000000000  AX       0     0     4
  4511. [16] .rodata           PROGBITS         0000000000400580  00000580
  4512.      0000000000000004  0000000000000004  AM       0     0     4
  4513. [17] .eh_frame_hdr     PROGBITS         0000000000400584  00000584
  4514.      000000000000003c  0000000000000000   A       0     0     4
  4515. [18] .eh_frame         PROGBITS         00000000004005c0  000005c0
  4516.      0000000000000114  0000000000000000   A       0     0     8
  4517. ...
  4518. [23] .got              PROGBITS         0000000000600ff8  00000ff8
  4519.      0000000000000008  0000000000000008  WA       0     0     8
  4520. [24] .got.plt          PROGBITS         0000000000601000  00001000
  4521.      0000000000000020  0000000000000008  WA       0     0     8
  4522. [25] .data             PROGBITS         0000000000601020  00001020
  4523.      0000000000000010  0000000000000000  WA       0     0     8
  4524. [27] .comment          PROGBITS         0000000000000000  00001030
  4525.      0000000000000034  0000000000000001  MS       0     0     1
  4554. For our operating system, we only need the following sections:
  4555. .text
  4556. This section holds all the compiled code of a program.
  4557. .data
  4558. This section holds the initialized data of a program. Since
  4559. the data are initialized with actual values, gcc allocates
  4560. the section with actual bytes in the executable binary.
  4561. .rodata
  4562. This section holds read-only data, such as fixed-size strings
  4563. in a program, e.g. “Hello World”, and others.
  4564. .bss
  4565. This section, short for Block Started by Symbol, holds
  4566. uninitialized data of a program. Unlike other sections, no
  4567. space is allocated for this section in the image of the
  4568. executable binary on disk. The section is allocated only when
  4569. the program is loaded into main memory.
  4570. Other sections are mainly needed for dynamic linking, that is,
  4571. linking code at runtime so that it can be shared between many
  4572. programs. To enable such a feature, an OS must be present as a
  4573. runtime environment. Since we run our OS on bare metal, we are
  4574. effectively creating such an environment. For simplicity, we won't
  4575. add dynamic linking to our OS.
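As an illustration of the four sections we need, the following C fragment notes in comments where gcc typically places each definition. The placement is an assumption about common gcc defaults on Linux and can vary with the compiler version and options (for instance, some gcc versions first emit uninitialized globals as COMMON symbols, which the linker then assigns to .bss). All names are hypothetical.

```c
#include <assert.h>

int boot_count = 3;                   /* initialized data: .data */
int tick_counter;                     /* uninitialized data: .bss */
const char banner[] = "Hello World";  /* read-only data: .rodata */

/* Compiled code of a function: .text */
int next_tick(void)
{
    assert(tick_counter >= 0);  /* the counter starts at 0 and only grows */
    return ++tick_counter;
}
```

Compiling this fragment with gcc -c and running readelf -S on the resulting object file lists the sections, and readelf -s shows the section index (Ndx) each name ended up in.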
  4576. SYMTAB and DYNSYM These sections hold symbol tables. A symbol
  4577. table is an array of entries that describe symbols in a
  4578. program. A symbol is a name assigned to an entity in a program.
  4579. The types of these entities are also the types of symbols; the
  4580. possible types are listed under the Type field below.
  4581. In the sample output, sections 5 and 29 are symbol tables:
  4582. [Nr] Name              Type             Address           Offset
  4583.      Size              EntSize          Flags  Link  Info  Align
  4584. [ 5] .dynsym           DYNSYM           00000000004002b8  000002b8
  4585.      0000000000000048  0000000000000018   A       6     1     8
  4586. ...
  4587. [29] .symtab           SYMTAB           0000000000000000  00001068
  4588.      0000000000000648  0000000000000018          30    47     8
  4595. To show the symbol table:
  4596. $ readelf -s hello
  4597. Output consists of 2 symbol tables, corresponding to the two
  4598. sections above, .dynsym and .symtab:
  4599. Symbol table '.dynsym' contains 4 entries:
  4600.    Num:    Value          Size Type    Bind   Vis      Ndx Name
  4601.      0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
  4602.      1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (2)
  4603.      2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
  4604.      3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
  4605. Symbol table '.symtab' contains 67 entries:
  4606.    Num:    Value          Size Type    Bind   Vis      Ndx Name
  4607. ..........................................
  4608.     59: 0000000000601040     0 NOTYPE  GLOBAL DEFAULT   26 _end
  4609.     60: 0000000000400430    42 FUNC    GLOBAL DEFAULT   14 _start
  4610.     61: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   26 __bss_start
  4611.     62: 0000000000400526    32 FUNC    GLOBAL DEFAULT   14 main
  4612.     63: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
  4613.     64: 0000000000601038     0 OBJECT  GLOBAL HIDDEN    25 __TMC_END__
  4614.     65: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
  4615.     66: 00000000004003c8     0 FUNC    GLOBAL DEFAULT   11 _init
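Each row of a symbol table corresponds to one fixed-size entry in the section. As a minimal sketch, assuming the 64-bit entry layout of 24 bytes (which matches the EntSize of 0x18 shown for .symtab), the struct below mirrors one entry. The Type and Bind columns are packed into a single byte. The struct, enum and function names are hypothetical; the numeric values are those of the ELF specification.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of one 64-bit symbol table entry (24 bytes). */
struct elf64_symbol {
    uint32_t name;   /* offset of the name in the linked string table */
    uint8_t  info;   /* Bind in the high 4 bits, Type in the low 4 bits */
    uint8_t  other;  /* Vis (visibility) */
    uint16_t shndx;  /* Ndx: index of the section the symbol belongs to */
    uint64_t value;  /* Value */
    uint64_t size;   /* Size */
};

enum { BIND_LOCAL = 0, BIND_GLOBAL = 1, BIND_WEAK = 2 };
enum { TYPE_NOTYPE = 0, TYPE_OBJECT = 1, TYPE_FUNC = 2, TYPE_SECTION = 3,
       TYPE_FILE = 4, TYPE_COMMON = 5, TYPE_TLS = 6 };

unsigned symbol_bind(const struct elf64_symbol *s) { return s->info >> 4; }
unsigned symbol_type(const struct elf64_symbol *s) { return s->info & 0xf; }

/* Number of entries in a table section: Size / EntSize. */
uint64_t entry_count(uint64_t size, uint64_t entsize)
{
    assert(entsize != 0);  /* only meaningful for fixed-size tables */
    return size / entsize;
}
```

For .symtab in the sample output, entry_count(0x648, 0x18) gives 67, the number of entries readelf reports for that table.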
  4629. Num is the index of an entry in a table.
  4630. Value is the virtual memory address where the symbol is
  4631. located.
  4632. Size is the size of the entity associated with a symbol.
  4633. Type is the type of a symbol, which is one of the following:
  4634. NOTYPE The type of a symbol is not specified.
  4635. OBJECT The symbol is associated with a data object. In C, any
  4636. variable definition is of OBJECT type.
  4637. FUNC The symbol is associated with a function or other
  4638. executable code.
  4639. SECTION The symbol is associated with a section, and exists
  4640. primarily for relocation.
  4641. FILE The symbol is the name of a source file associated with
  4642. an executable binary.
  4643. COMMON The symbol labels an uninitialized common block, that
  4644. is, a global variable in C that is defined without an
  4645. initial value. After linking, such variables stay in the
  4646. .bss section.
  4647. TLS The symbol is associated with a Thread-Local Storage
  4648. entity.
  4650. Bind is the scope of a symbol.
  4651. LOCAL are symbols that are only visible in the object files
  4652. that defined them. In C, the static modifier marks a symbol
  4653. (e.g. a variable/function) as local to only the file that
  4654. defines it.
  4655. If we define variables and functions with the static modifier:
  4656. static int global_static_var = 0;
  4657. static void local_func() {
  4658. }
  4659. int main(int argc, char *argv[])
  4660. {
  4661.     static int local_static_var = 0;
  4662.     return 0;
  4663. }
  4664. Then we get the static variables and the static function listed
  4665. as local symbols after compiling:
  4666. $ gcc -m32 hello.c -o hello
  4667. $ readelf -s hello
  4668. Symbol table '.dynsym' contains 5 entries:
  4669. Num: Value Size Type Bind Vis Ndx Name
  4670. 0: 00000000 0 NOTYPE LOCAL DEFAULT UND
  4671. 1: 00000000 0 FUNC GLOBAL DEFAULT UND
  4672. puts@GLIBC_2.0 (2)
  4673. 2: 00000000 0 NOTYPE WEAK DEFAULT UND
  4674. __gmon_start__
  4675. 3: 00000000 0 FUNC GLOBAL DEFAULT UND
  4676. __libc_start_main@GLIBC_2.0 (2)
  4677. 4: 080484bc 4 OBJECT GLOBAL DEFAULT 16
  4678. _IO_stdin_used
  4679. Symbol table '.symtab' contains 72 entries:
  4680. Num: Value Size Type Bind Vis Ndx Name
  4681. 0: 00000000 0 NOTYPE LOCAL DEFAULT UND
  4682. ......... output omitted .........
  4683. 38: 0804a020 4 OBJECT LOCAL DEFAULT 26
  4684. global_static_var
  4685. 39: 0804840b 6 FUNC LOCAL DEFAULT 14
  4686. local_func
  4687. 40: 0804a024 4 OBJECT LOCAL DEFAULT 26
  4688. local_static_var.1938
  4689. ......... output omitted .........
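The entry local_static_var.1938 shows that a static variable declared inside a function still gets a symbol: it has static storage duration and lives in a data section, not on the stack. The numeric suffix appears to be added by gcc so that same-named static locals in different functions do not clash. A minimal sketch of the observable consequence follows; the function name next_id is hypothetical.

```c
/* Hypothetical example: counter is stored in a data section, so it
 * keeps its value between calls, unlike an automatic variable. */
int next_id(void)
{
    static int counter = 0;  /* initialized once, not on every call */
    counter++;
    return counter;
}
```

Calling next_id() three times in a row returns 1, 2 and 3, because every call updates the same memory location that the LOCAL symbol labels.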
  4690. GLOBAL are symbols that are accessible by other object files
  4691. when linking together. These symbols are primarily
  4692. non-static functions and non-static global data. The extern
  4693. modifier marks a symbol as defined elsewhere but
  4694. accessible in the final executable binary, so an extern
  4695. variable is also considered GLOBAL.
  4696. Similar to the LOCAL example above, the output lists many
  4697. GLOBAL symbols such as main:
  4698. Num: Value Size Type Bind Vis Ndx Name
  4699. ......... output omitted .........
  4700. 66: 080483e1 10 FUNC GLOBAL DEFAULT 14 main
  4701. ......... output omitted .........
  4702. WEAK are symbols whose definitions can be overridden.
  4703. Normally, a symbol with multiple definitions is reported
  4704. as an error by the compiler or linker. However, this constraint
  4705. is relaxed when a definition is explicitly marked as weak, which
  4706. means the default implementation can be replaced by a different
  4707. definition at link time.
  4708. Suppose we have a default implementation of the function
  4709. add:
  4710. #include <stdio.h>
  4711. __attribute__((weak)) int add(int a, int b) {
  4712.     printf("warning: function is not implemented.\n");
  4713.     return 0;
  4714. }
  4715. int main(int argc, char *argv[])
  4716. {
  4717.     printf("add(1,2) is %d\n", add(1,2));
  4718.     return 0;
  4719. }
  4720. __attribute__((weak)) is a function attribute. A
  4721. function attribute is extra information that tells
  4722. a compiler to handle a function differently from a
  4723. normal function. In this example, the weak attribute
  4724. makes the function add a weak function, which means
  4725. the default implementation can be replaced by a
  4726. different definition at link time. Function
  4727. attributes are a feature of the compiler, not of
  4728. standard C.
  4729. If we do not supply a different function definition in a
  4730. different file (it must be in a different file, otherwise
  4731. gcc reports an error), then the default implementation
  4732. is used. When the function add is called, it only
  4733. prints the message "warning: function is not
  4734. implemented." and returns 0:
  4735. $ ./hello
  4736. warning: function is not implemented.
  4737. add(1,2) is 0
  4738. However, if we supply a different definition in another
  4739. file e.g. math.c:
  4740. int add(int a, int b) {
  4741. return a + b;
  4742. }
  4743. and compile the two files together:
  4744. $ gcc math.c hello.c -o hello
  4745. Then, when running hello, no warning message is printed
  4746. and the correct value is returned.
  4747. A weak symbol is a mechanism to provide a default
  4748. implementation that can be replaced when a better
  4749. implementation (e.g. more specialized and optimized)
  4750. is available at link time.
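The Bind column is stored in the same st_info byte as the symbol type, in its high four bits. The sketch below assumes a Linux system with glibc's <elf.h>; the helper name symbol_bind_name is hypothetical and only shows how the three bindings described above are encoded.

```c
#include <elf.h>   /* ELF32_ST_BIND and the STB_* constants (Linux/glibc) */

/* Hypothetical helper: map the binding bits of st_info to the label
 * that readelf prints in the Bind column. */
const char *symbol_bind_name(unsigned char st_info)
{
    switch (ELF32_ST_BIND(st_info)) {   /* high 4 bits of st_info */
    case STB_LOCAL:  return "LOCAL";
    case STB_GLOBAL: return "GLOBAL";
    case STB_WEAK:   return "WEAK";
    default:         return "UNKNOWN";  /* OS- or processor-specific values */
    }
}
```

A function defined with __attribute__((weak)), such as add in the example, is expected to carry the WEAK binding in its symbol table entry, while main carries GLOBAL.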
  4751. Vis is the visibility of a symbol. The following values are
  4752. available:
  4753. [Table 6:
  4754. Symbol Visibility
  4755. ]
  4756. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4757. | Value | Description |
  4758. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4759. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4760. | DEFAULT | The visibility is specified by the binding type of a symbol.
  4761. • Global and weak symbols are visible outside of their defining
  4762. component (executable file or shared object).
  4763. • Local symbols are hidden. See HIDDEN below. |
  4764. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4765. | HIDDEN | A symbol is hidden when the name is not visible to any other
  4766. program outside of its running program. |
  4767. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4768. | PROTECTED | A symbol is protected when it is shared outside of its running
  4769. program or shared library and cannot be overridden. That is, there
  4770. can only be one definition for this symbol across running
  4771. programs that use it. No program can define its own definition of
  4772. the same symbol. |
  4773. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4774. | INTERNAL | Visibility is processor-specific and is defined by
  4775. processor-specific ABI. |
  4776. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
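The Vis column is read from the low two bits of the st_other field of a symbol table entry. The sketch below assumes a Linux system with glibc's <elf.h>; the helper name symbol_visibility_name is hypothetical and simply maps the stored value to the names in Table 6.

```c
#include <elf.h>   /* ELF32_ST_VISIBILITY and the STV_* constants (Linux/glibc) */

/* Hypothetical helper: map the visibility bits of st_other to the
 * label that readelf prints in the Vis column. */
const char *symbol_visibility_name(unsigned char st_other)
{
    switch (ELF32_ST_VISIBILITY(st_other)) {  /* low 2 bits of st_other */
    case STV_DEFAULT:   return "DEFAULT";
    case STV_INTERNAL:  return "INTERNAL";
    case STV_HIDDEN:    return "HIDDEN";
    case STV_PROTECTED: return "PROTECTED";
    default:            return "UNKNOWN";     /* not reachable with 2 bits */
    }
}
```

Every symbol in the readelf listings above shows DEFAULT, which corresponds to a value of 0 in these bits.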
  4777. Ndx is the index of a section that the symbol is in. Aside from
  4778. fixed index numbers that represent section indexes, index has
  4779. these special values:
  4780. [Table 7:
  4781. Symbol Index
  4782. ]
  4783. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4784. | Value | Description |
  4785. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4786. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4787. | ABS | The index will not be changed by any symbol relocation. |
  4788. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4789. | COM | The index refers to an unallocated common block. |
  4790. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4791. | UND | The symbol is undefined in the current object file, which means
  4792. the symbol depends on the actual definition in another file.
  4793. Undefined symbols appear when the object file refers to symbols
  4794. that are available at runtime, from a shared library. |
  4795. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4796. | LORESERVE
  4797. HIRESERVE | LORESERVE is the lower boundary of the reserved indexes. Its value
  4798. is 0xff00.
  4799. HIRESERVE is the upper boundary of the reserved indexes. Its value
  4800. is 0xffff.
  4801. The operating system reserves exclusive indexes between LORESERVE
  4802. and HIRESERVE, which do not map to any actual section header. |
  4803. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4804. | XINDEX | The index is LORESERVE or larger. The actual value will be
  4805. contained in the section SYMTAB_SHNDX, where each entry is a
  4806. mapping between a symbol, whose Ndx field is an XINDEX value, and
  4807. the actual index value. |
  4808. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4809. | Others | Sometimes, values such as ANSI_COM, LARGE_COM, SCOM, SUND appear.