  1. Operating Systems:
  2. From 0 to 1
  3. Tu, Do Hoang
  4. Table of Contents
  5. Preface
  6. Why another book on Operating Systems?
  7. Prerequisites
  8. What you will learn in this book
  9. What this book is not about
  10. The organization of the book
  11. Acknowledgments
  12. Part I Preliminary
  13. Domain documents
  14. Problem domains
  15. Documents for implementing a problem domain
  16. Software Requirement Document
  17. Software Specification
  18. Documents for writing an x86 Operating System
  19. The physical implementation of a bit
  20. MOSFET transistors
  21. Beyond transistors: digital logic gates
  22. The theory behind logic gates
  23. Logic Gate implementation: CMOS circuit
  24. Beyond Logic Gates: Machine Language
  25. Machine language
  26. Assembly Language
  27. Programming Languages
  28. Abstraction
  29. Why abstraction works
  30. Why abstraction reduces complexity
  31. Computer Architecture
  32. What is a computer?
  33. Server
  34. Desktop Computer
  35. Mobile Computer
  36. Game Consoles
  37. Embedded Computer
  38. Field-Programmable Gate Array
  39. Application-Specific Integrated Circuit
  40. Computer Architecture
  41. Instruction Set Architecture
  42. Computer organization
  43. Hardware
  44. x86 architecture
  45. Intel Q35 Chipset
  46. x86 Execution Environment
  47. x86 Assembly and C
  48. objdump
  49. Reading the output
  50. Intel manuals
  51. Experiment with assembly code
  52. Anatomy of an Assembly Instruction
  53. Understand an instruction in detail
  54. Example: jmp instruction
  55. Examine compiled data
  56. Fundamental data types
  57. Pointer Data Types
  58. Bit Field Data Type
  59. String Data Types
  60. Examine compiled code
  61. Data Transfer
  62. Expressions
  63. Stack
  64. Automatic variables
  65. Function Call and Return
  66. Loop
  67. Conditional
  68. The Anatomy of a Program
  69. Reference documents:
  70. ELF header
  71. Section header table
  72. Understand Section in-depth
  73. Program header table
  74. Segments vs sections
  75. Runtime inspection and debug
  76. A sample program
  77. Static inspection of a program
  78. Command: info target/info file/info file
  79. Command: maint info sections
  80. Command: info functions
  81. Command: info variables
  82. Command: disassemble/disas
  83. Command: x
  84. Command: print/p
  85. Runtime inspection of a program
  86. Command: run
  87. Command: break/b
  88. Command: next/n
  89. Command: step/s
  90. Command: ni
  91. Command: si
  92. Command: until
  93. Command: finish
  94. Command: bt
  95. Command: up
  96. Command: down
  97. Command: info registers
  98. How debuggers work: A brief introduction
  99. How breakpoints work
  100. Single stepping
  101. How a debugger understands high level so
  102. Part II Groundwork
  103. Bootloader
  104. x86 Boot Process
  105. Using BIOS services
  106. Boot process
  107. Example Bootloader
  108. Compile and load
  109. Debugging
  110. Loading a program from bootloader
  111. Floppy Disk Anatomy
  112. Read and load sectors from a floppy disk
  113. Improve productivity with scripts
  114. Automate build with GNU Make
  115. GNU Make Syntax summary
  116. Automate debugging steps with GDB script
  117. Linking and loading on bare metal
  118. Understand relocations with readelf
  119. Offset
  120. Info
  121. Type
  122. Sym.Value
  123. Sym. Name
  124. Crafting ELF binary with linker scripts
  125. Example linker script
  126. Understand the custom ELF structure
  127. Manipulate the program segments
  128. C Runtime: Hosted vs Freestanding
  129. Debuggable bootloader on bare metal
  130. Debuggable program on bare metal
  131. Loading an ELF binary from a bootloader
  132. Debugging the memory layout
  133. Testing the new binary
  134. Part III Kernel Programming
  135. x86 Descriptors
  136. Basic operating system concepts
  137. Hardware Abstraction Layer
  138. System programming interface
  139. The need for an Operating System
  140. Drivers
  141. Userspace and kernel space
  142. Memory Segment
  143. Segment Descriptor
  144. Types of Segment Descriptors
  145. Code and Data descriptors
  146. Task Descriptor
  147. Interrupt Descriptor
  148. Descriptor Scope
  149. Global Descriptor
  150. Local Descriptor
  151. Segment Selector
  152. Enhancement: Bootloader with descriptors
  153. Process
  154. Concepts
  155. Process
  156. Task
  157. Process
  158. Scheduler
  159. Context switch
  160. Priority
  161. Preemptive vs Non-preemptive
  162. Process states
  163. procfs
  164. Threads
  165. Task: x86 concept of a process
  166. Task Data Structure
  167. Task State Segment
  168. Task Descriptor
  169. Process Implementation
  170. Requirements
  171. Major Plan
  172. Stage 1: Switch to a task from bootloade
  173. Stage 2: Switch to a task with one funct
  174. Stage 3: Switch to a task with many func
  175. Milestone: Code Refactor
  176. Interrupt
  177. Memory management
  178. Address Space
  179. Virtual Memory
  180. File System
  181. Example: Ext2 filesystem
  182. Bibliography
  183. Preface
  184. Greetings!
  185. You've probably asked yourself at least once how an operating
  186. system is written from the ground up. You might even have years
  187. of programming experience under your belt, yet your understanding
  188. of operating systems may still be a collection of abstract
  189. concepts not grounded in actual implementation. To those who've
  190. never built one, an operating system may seem like magic: a
  191. mysterious thing that can control hardware while handling a
  192. programmer's requests via the API of their favorite programming
  193. language. Learning how to build an operating system seems
  194. intimidating and difficult; no matter how much you learn, it
  195. never feels like you know enough. You're probably reading this
  196. book right now to gain a better understanding of operating
  197. systems to be a better software engineer.
  198. If that is the case, this book is for you. By going through this
  199. book, you will be able to find the missing pieces that are
  200. essential and enable you to implement your own operating system
  201. from scratch! Yes, from scratch without going through any
  202. existing operating system layer to prove to yourself that you are
  203. an operating system developer. You may ask, “Isn't it more
  204. practical to learn the internals of Linux?”
  205. Yes...
  206. and no.
  207. Learning Linux can help your workflow at your day job. However,
  208. if you follow that route, you still won't achieve the ultimate
  209. goal of writing an actual operating system. By writing your own
  210. operating system, you will gain knowledge that you will not be
  211. able to glean just from learning Linux.
  212. Here's a list of some benefits of writing your own OS:
  213. • You will learn how a computer works at the hardware level, and
  214. you will learn to write software to manage that hardware
  215. directly.
  216. • You will learn the fundamentals of operating systems, allowing
  217. you to adapt to any operating system, not just Linux.
  218. • To hack on Linux internals effectively, you'll need to write at
  219. least one operating system on your own. This is just like
  220. applications programming: to write a large application, you'll
  221. need to start with simple ones.
  222. • You will open pathways to various low-level programming domains
  223. such as reverse engineering, exploits, building virtual
  224. machines, game console emulation and more. Assembly language
  225. will become one of your most indispensable tools for low-level
  226. analysis. (But that does not mean you have to write your
  227. operating system in Assembly!)
  228. • Writing an operating system is fun!
  229. Why another book on Operating Systems?
  230. There are many books and courses on this topic made by famous
  231. professors and experts out there already. Who am I to write a
  232. book on such an advanced topic? While it's true that many quality
  233. resources exist, I find them lacking. Do any of them show you how
  234. to compile your C code and the C runtime library independent of
  235. an existing operating system? Most books on operating system
  236. design and implementation discuss only the software side and skip
  237. how the operating system communicates with the hardware. A
  238. self-learner is then left to hunt for those hardware details, and
  239. relevant resources are difficult to find on the Internet. The aim
  240. of this book is to bridge that gap: not only will you learn how
  241. to program hardware directly, but also how to read official
  242. documents from hardware vendors to program it. You no longer have
  243. to seek out resources to help yourself interpret hardware manuals
  244. and documentation: you can do it yourself. Lastly, I wrote this
  245. book from an autodidact's perspective. I made this book as
  246. self-contained as possible so you can spend more time learning
  247. and less time guessing or seeking out information on the
  248. Internet.
  249. One of the core focuses of this book is to guide you through the
  250. process of reading official documentation from vendors to
  251. implement your software. Official documents from hardware vendors
  252. like Intel are critical for implementing an operating system or
  253. any other software that directly controls the hardware. At a
  254. minimum, an operating system developer needs to be able to
  255. comprehend these documents and implement software based on a set
  256. of hardware requirements. Thus, the first chapter is dedicated to
  257. discussing relevant documents and their importance.
  258. Another distinct feature of this book is that it is “Hello World”
  259. centric. Most examples revolve around variants of a “Hello World”
  260. program, which will acquaint you with core concepts. These
  261. concepts must be learned before attempting to write an operating
  262. system. Anything beyond a simple “Hello World” example gets in
  263. the way of teaching the concepts, thus lengthening the time spent
  264. on getting started writing an operating system.
  265. Let's dive in. With this book, I hope to provide enough
  266. foundational knowledge that will open doors for you to make sense
  267. of other resources. This book will be especially beneficial to
  268. students who've just finished their first C/C++ course. Imagine
  269. how cool it would be to show prospective employers that you've
  270. already built an operating system.
  271. Prerequisites
  272. • Basic knowledge of circuits
  273. – Basic concepts of electricity: atoms, electrons, protons,
  274. neutrons, current flow.
  275. – Ohm's law
  276. If you are unfamiliar with these concepts, you can quickly
  277. learn them here: http://www.allaboutcircuits.com/textbook/, by
  278. reading chapter 1 and chapter 2.
  279. • C programming. In particular:
  280. – Variable and function declarations/definitions
  281. – While and for loops
  282. – Pointers and function pointers
  283. – Fundamental algorithms and data structures in C
  284. • Linux basics:
  285. – Know how to navigate directories with the command line
  286. – Know how to invoke a command with options
  287. – Know how to pipe output to another program
  288. • Touch typing. Since we are going to use Linux, touch typing
  289. helps. I know typing speed does not relate to problem-solving,
  290. but your typing should at least be fast enough that it does not
  291. get in the way and degrade the learning experience.
  292. In general, I assume that the reader has basic C programming
  293. knowledge, and can use an IDE to build and run a program.
  294. What you will learn in this book
  295. • How to write an operating system from scratch by reading
  296. hardware datasheets. In the real world, you will not be able to
  297. consult Google for a quick answer.
  298. • How to write code independently. It's pointless to copy and
  299. paste code. Real learning happens when you solve problems on your
  300. own. Some examples are provided to help kick-start your work,
  301. but most problems are yours to conquer. However, the solutions
  302. are available online once you have given each problem a good try.
  303. • A big picture of how the layers of a computer relate to each
  304. other, from hardware to software.
  305. • How to use Linux as a development environment and common tools
  306. for low-level programming.
  307. • How a program is structured so that an operating system can
  308. run it.
  309. • How to debug a program running directly on hardware with gdb
  310. and QEMU.
  311. • Linking and loading on bare metal x86_64, with pure C. No
  312. standard library. No runtime overhead.
  313. What this book is not about
  314. • Electrical Engineering: The book discusses some concepts from
  315. electronics and electrical engineering only to the extent of
  316. how software operates on bare metal.
  317. • How to use Linux or any other OS: Though Linux is used
  318. as a development environment and as a medium to demonstrate
  319. high-level operating system concepts, it is not the focus of
  320. this book.
  321. • Linux Kernel development: There are already many high-quality
  322. books out there on this subject.
  323. • Operating system books focused on algorithms: This book focuses
  324. more on an actual hardware platform (Intel x86_64) and on how to
  325. write an OS that utilizes the OS support provided by the hardware
  326. platform.
  327. The organization of the book
  328. Part 1 provides a foundation for learning operating systems.
  329. • Chapter 1 briefly explains the importance of domain
  330. documents. Documents are crucial for the learning experience,
  331. so they deserve a chapter.
  332. • Chapter 2 explains the layers of abstractions from hardware
  333. to software. The idea is to provide insight into how code
  334. runs physically.
  335. • Chapter 3 provides the general architecture of a computer,
  336. then introduces a sample computer model that you will use to
  337. write an operating system.
  338. • Chapter 4 introduces the x86 assembly language through the
  339. use of the Intel manuals, along with commonly used
  340. instructions. This chapter gives detailed examples of how
  341. high-level syntax corresponds to low-level assembly, enabling
  342. you to read generated assembly code comfortably. It is
  343. necessary to read assembly code when debugging an operating
  344. system.
  345. • Chapter 5 dissects ELF in detail. Only by understanding the
  346. structure of a program at the binary level can you build
  347. one that runs on bare metal.
  348. • Chapter 6 introduces the gdb debugger with extensive examples for
  349. commonly used commands. After acquainting the reader with
  350. gdb, it then provides insight into how a debugger works. This
  351. knowledge is essential for building a debuggable program on
  352. bare metal.
  353. Part 2 presents how to write a bootloader to bootstrap a
  354. kernel. Hence the name “Groundwork”. After mastering this part,
  355. the reader can continue with the next part, which is a guide
  356. for writing an operating system. However, if the reader does not
  357. like the presentation, he or she can look elsewhere, such as
  358. the OSDev Wiki: http://wiki.osdev.org/.
  359. • Chapter 7 introduces what the bootloader is, how to write one
  360. in assembly, and how to load it on QEMU, a hardware emulator.
  361. This process involves typing repetitive and long commands, so
  362. GNU Make is applied to improve productivity by automating the
  363. repetitive parts and simplifying the interaction with the
  364. project. This chapter also demonstrates the use of GNU Make
  365. in context.
  366. • Chapter 8 introduces linking by explaining the relocation
  367. process when combining object files. This is the last
  368. piece of the puzzle required for building debuggable
  369. programs on bare metal, including the bootloader
  370. written in Assembly and an operating system
  371. written in C.
  372. Part 3 provides guidance on how to write an operating system,
  373. as you should implement an operating system on your own and be
  374. proud of your creation. The guidance consists of simple and
  375. coherent explanations of necessary concepts, from hardware to
  376. software, to implement the features of an operating system.
  377. Without such guidance, you will waste time gathering
  378. information spread through various documents and the Internet.
  379. It then provides a plan on how to map the concepts to code.
  380. Acknowledgments
  381. Thank you, my beloved family. Thank you, the contributors.
  382. Preliminary
  383. Domain documents
  384. Problem domains
  385. In the real world, software engineering focuses not only on
  386. software, but also on the problem domain it is trying to solve.
  387. A problem domain is the part of the world where the computer is to
  390. produce effects, together with the means available to produce
  391. them, directly or indirectly. (Kovitz, 1999)
  392. A problem domain is anything outside of programming
  393. that a software engineer needs to understand to produce correct
  394. code that can achieve the desired effects. “Directly” covers
  395. anything that the software can control to produce the
  396. desired effects, e.g. keyboards, printers, monitors, other
  397. software. “Indirectly” covers anything that is not part of the software
  398. but is relevant to the problem domain, e.g. the appropriate people to be
  399. informed by the software when some event happens, or students who
  400. move to the correct classrooms according to the schedule generated by
  401. the software. To write a finance application, a software engineer
  402. needs to learn enough finance concepts to understand the
  404. requirements of a customer and implement those
  405. requirements correctly.
  406. Requirements are the effects that the machine is to exert in the
  407. problem domain by virtue of its programming.
  408. Programming alone is not too complicated; programming to solve a
  409. problem in a domain is[footnote:
  410. We refer to the concept of “programming” here as someone being able to
  411. write code in a language, but not necessarily knowing any or all
  412. software engineering knowledge.
  413. ]. Not only does a software engineer need to understand how to
  414. implement the software, but also the problem domain that it tries
  415. to solve, which might require in-depth expert knowledge. The
  416. software engineer must also select the right programming
  417. techniques that apply to the problem domain he is trying to
  418. solve because many techniques that are effective in one domain
  419. might not be in another. For example, many types of applications
  420. do not require high-performance code, but a short time to
  421. market. In this case, interpreted languages are widely popular
  422. because they can satisfy such a need. However, for writing huge 3D
  423. games or operating systems, compiled languages are dominant
  424. because they can generate the most efficient code required for such
  425. applications.
  426. Often, it is too much for a software engineer to learn a
  427. non-trivial domain (one that might require a bachelor's degree or
  428. above to understand). It is often easier for a
  429. domain expert to learn enough programming to break down the
  430. problem domain into parts small enough for the software engineers
  431. to implement. Sometimes, domain experts implement the software
  432. themselves.
  434. [Figure 0.1: Problem domains: Software and Non-software.]
  439. One example of such a scenario is the domain presented in
  440. this book: operating systems. A certain amount of electrical
  441. engineering (EE) knowledge is required to implement an operating
  442. system. If a computer science (CS) curriculum does not
  443. include a minimum of EE courses, students in the curriculum have
  444. little chance to implement a working operating system. Even if
  445. they can implement one, either they need to invest a significant
  446. amount of time to study on their own, or they fill in code in a
  447. predefined framework just to understand high-level algorithms.
  448. For that reason, EE students have an easier time implementing an
  449. OS, as they only need to study a few core CS courses. In fact,
  450. the “C programming” and “Algorithms and Data Structures” classes alone
  451. are usually enough to get them started writing code for device
  452. drivers, which they can later generalize into an operating system.
  454. [Figure 0.2: Operating System domain.]
  459. One thing to note is that software is its own problem domain. A
  460. problem domain does not necessarily lie outside of software.
  461. Compilers, 3D graphics, games, cryptography, artificial
  462. intelligence, etc., are software engineering domains
  463. (actually, they are more computer science domains than software
  464. engineering domains). In general, a software-exclusive domain
  465. creates software to be used by other software. Operating systems
  466. are also a domain, but one that overlaps with other domains such as
  467. electrical engineering. To implement an operating system
  468. effectively, one must learn enough of the external domain.
  469. How much learning is enough for a software engineer? At the
  470. minimum, a software engineer should be knowledgeable enough to
  471. understand the documents prepared by hardware engineers for using
  472. (i.e. programming) their devices.
  473. Learning a programming language, even C or Assembly, does not
  474. mean a software engineer can automatically be good at hardware
  475. programming or any related low-level programming domain. One can
  476. spend 10 years, 20 years or an entire life writing C/C++ code
  477. and still be unable to write an operating system, simply for
  478. lack of the relevant domain knowledge. Likewise, learning
  479. English does not mean a person automatically becomes good at
  480. reading Math books written in English. Much more than that is
  481. needed. Knowing one or two programming languages is not enough.
  482. If a programmer writes software for a living, he had better
  483. specialize in one or two problem domains outside of software if
  484. he does not want his job taken by domain experts who learn
  485. programming in their spare time.
  486. Documents for implementing a problem domain
  487. Documents are essential for learning a problem domain (and
  488. actually, anything) since they pass information down in a
  489. reliable way. Written text has been used
  490. for thousands of years to pass knowledge from generation to
  491. generation. Documents are integral parts of non-trivial
  492. projects. Without the documents:
  493. • New people will find it much harder to join a project.
  494. • It is harder to maintain a project because people may forget
  495. important unresolved bugs or quirks in their system.
  496. • It is challenging for customers to understand the product they
  497. are going to use.
  498. However, documents do not need to be written in book format. They can be
  499. anything from HTML pages to database records displayed by a graphical user
  500. interface. Important information must be stored somewhere safe and readily accessible.
  501. There are many types of documents. However, to facilitate the
  502. understanding of a problem domain, these two documents need to be
  503. written: software requirement document and software
  504. specification.
  505. Software Requirement Document
  506. Software requirement document includes both a list of
  509. requirements and a description of the problem domain (Kovitz, 1999).
  511. Software solves a business problem, but which problems to
  512. solve is decided by a customer. The customer's requests make up a
  513. list of requirements that our software needs to fulfill. However,
  514. an enumerated list of features is seldom useful in delivering
  515. software. As stated in the previous section, the tricky part is
  516. not programming alone but programming according to a problem
  517. domain. The bulk of software design and implementation depends
  518. upon the knowledge of the problem domain. The better understood
  519. the domain, the higher the quality of the software can be. For example,
  520. building houses has been practiced for thousands of years and is well
  521. understood, so it is easy to build a high-quality house;
  522. software is no different. Code that is difficult to understand is
  523. usually so because of ignorance of the problem domain. In the
  524. context of this book, we seek to understand the low-level
  525. workings of various hardware devices.
  526. Because software quality depends upon the understanding of the
  527. problem domain, most of a software requirement document
  528. should consist of problem domain description.
  529. Be aware that software requirements are not:
  530. What vs How
  531. “What” and “how” are vague terms. What is the “what”? Is it
  532. nouns only? If so, what if a customer requires his software to
  533. perform specific steps of operations, such as the purchasing
  534. procedure for a customer on a website? Does it include “verbs”
  535. now? However, isn't the “how” supposed to be step-by-step
  536. operations? Anything can be the “what” and anything can be the
  537. “how”.
  538. Sketches
  539. A software requirement document is all about the problem domain.
  540. It should not be a high-level description of an implementation.
  541. Some problems might seem straightforward to map directly from
  542. their domain description to the structure of an implementation.
  543. For example:
  544. • Users are given a list of books in a drop-down menu to
  545. choose from.
  546. • Books are stored in a linked list.
  547. • ...
  548. In the future, instead of a drop-down menu, all books may be
  549. listed directly on a page as thumbnails. Books might be
  550. reimplemented as a graph in which each node is a book, for finding
  551. related books, as a recommender is going to be added in the
  552. next version. The requirement document then needs updating again to
  553. remove all the outdated implementation details, which requires
  554. additional effort to maintain the requirement document, and
  555. when the effort of syncing with the implementation is too
  556. much, the developers give up on documentation, and everyone starts
  557. ranting about how useless documentation is.
  558. More often than not there is no straightforward one-to-one
  559. mapping. For example, a regular computer user expects an OS to be
  560. something that runs programs with a GUI, or their favorite
  561. computer games. But to meet such requirements, an operating system
  562. is implemented as multiple layers, each hiding its details from
  563. the upper layers. To implement an operating system, a large
  564. body of knowledge from multiple fields is required, especially
  565. if the operating system runs on non-PC devices.
  566. It's better to put anything related to the problem domain in
  567. the requirement document. A good way to test the quality of a
  568. requirement document is to hand it to a domain expert for
  569. proofreading and see whether he can understand the material thoroughly.
  570. A requirement document is also useful later as a help document,
  571. or as a basis that makes writing one much easier.
  572. Software Specification
  573. Software specification document states rules relating desired
  576. behavior of the output devices to all possible behavior of the
  577. input devices, as well as any rules that other parts of the
  578. problem domain must obey. (Kovitz, 1999)
  579. Simply put, a software specification is interface design, with
  580. constraints for the problem domain to follow, e.g. the software
  581. accepts only certain types of input, such as text in
  582. English but no other language. For a hardware
  583. device, a specification is always needed, as software depends on
  584. its hardwired behaviors. In fact, hardware specifications are
  585. mostly well-defined, down to the tiniest
  586. details. It needs to be that way because once hardware is
  587. physically manufactured, there's no going back, and if defects
  588. exist, the damage to the company's finances and
  589. reputation is devastating.
  590. Note that, similar to a requirement document, a specification
  591. only concerns interface design. If implementation details leak
  592. in, it is a burden to keep the specification in sync with the actual
  593. implementation, and the specification is soon abandoned.
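The separation between interface and implementation can be sketched in C. This is only an illustration under assumed names: `catalog`, `catalog_create`, `catalog_add`, `catalog_count`, `catalog_title_at` and `catalog_destroy` are hypothetical and come from no real library. The prototypes at the top are the kind of thing a specification would record; the linked list underneath (echoing the book-list example earlier in this chapter) is an implementation detail that could later be replaced, say by a graph, without changing a single prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ---- Interface: the only part a specification would record. ---- */
struct catalog;                                  /* opaque to callers */
struct catalog *catalog_create(void);            /* NULL on failure */
int catalog_add(struct catalog *c, const char *title); /* 0 on success, -1 on failure */
size_t catalog_count(const struct catalog *c);
const char *catalog_title_at(const struct catalog *c, size_t index); /* NULL if out of range */
void catalog_destroy(struct catalog *c);

/* ---- Implementation: a singly linked list (an assumption of this
 * sketch; callers never see these types). ---- */
struct node {
    char *title;
    struct node *next;
};

struct catalog {
    struct node *head;
    struct node *tail;
    size_t count;
};

struct catalog *catalog_create(void)
{
    /* calloc zeroes head, tail and count. */
    return calloc(1, sizeof(struct catalog));
}

int catalog_add(struct catalog *c, const char *title)
{
    assert(c != NULL && title != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->title = malloc(strlen(title) + 1);
    if (n->title == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->title, title);
    n->next = NULL;
    /* Append at the tail so titles keep insertion order. */
    if (c->tail != NULL)
        c->tail->next = n;
    else
        c->head = n;
    c->tail = n;
    c->count++;
    return 0;
}

size_t catalog_count(const struct catalog *c)
{
    assert(c != NULL);
    return c->count;
}

const char *catalog_title_at(const struct catalog *c, size_t index)
{
    assert(c != NULL);
    const struct node *n = c->head;
    /* Walk the list; stop early if the index is out of range. */
    while (n != NULL && index > 0) {
        n = n->next;
        index--;
    }
    return n != NULL ? n->title : NULL;
}

void catalog_destroy(struct catalog *c)
{
    if (c == NULL)
        return;
    struct node *n = c->head;
    while (n != NULL) {
        struct node *next = n->next;
        free(n->title);
        free(n);
        n = next;
    }
    free(c);
}
```

A caller would create a catalog with `catalog_create()`, add titles with `catalog_add()`, read them back with `catalog_title_at()` and release everything with `catalog_destroy()`. Because the caller depends only on the prototypes, a document that records just those prototypes stays valid when the storage changes.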
  594. Another important remark is that, though a specification document
  595. is important, it does not have to be produced before the
  596. implementation. It can be prepared in any order: before or after
  597. a complete implementation, or alongside the
  598. implementation, when some part is done and its interface is
  599. ready to be recorded in the specification. Regardless of method,
  600. what matters is a complete specification at the end.
  601. Documents for writing an x86 Operating System
  602. When the problem domain is different from the software domain, the
  603. requirement document and the specification are usually separate.
  604. However, if the problem domain is inside software, the specification
  605. most often includes both, and the content of both can be mixed with
  606. each other. As the previous sections demonstrated the importance
  607. of documents, to implement an OS we will need to collect
  608. relevant documents to gain sufficient domain knowledge. These
  609. documents are as follows:
  610. • Intel® 64 and IA-32 Architectures Software Developer’s Manual
  611. (Volume 1, 2, 3)
  612. • Intel® 3 Series Express Chipset Family Datasheet
  613. • System V Application Binary Interface
  614. Aside from Intel's official website, the website of this book
  615. also hosts the documents for convenience[footnote:
  616. Intel may change the links to the documents as they update their
  617. website, so this book doesn't contain any link to the documents
  618. to avoid confusion for readers.
  619. ].
  620. Intel documents divide the requirement and specification sections
  621. clearly, but call the sections by different names. The section
  622. corresponding to the requirement document is called
  623. “Functional Description”, which consists mostly of domain
  624. description; for the specification, the “Register Description” section
  625. describes all programming interfaces. Both documents carry no
  626. unnecessary implementation details[footnote:
  627. As it should be, since those details are trade secrets.
  628. ]. Intel documents are also great examples of how to write good
  629. requirements/specifications, as explained in this chapter.
  630. Other than the Intel documents, other documents will be
  631. introduced in the relevant chapters.
  632. This chapter gives an intuition of how hardware and software
  633. are connected together, and how software is represented physically.
  634. The physical implementation of a bit
  635. All electronic devices, from simple to complex, manipulate the
  636. flow of electrons to achieve desired effects in the real world. Computers are
  637. no exception. When we write software, we indirectly manipulate
  638. electrical current at the physical level, in such a way that the
  639. underlying machine produces the desired effects. To understand the
  640. process, we consider a simple light bulb. With a switch, a light bulb
  641. can alternate between two states, on and off: off
  642. means the number 0, and on means 1.[float MarginFigure:
  643. [MarginFigure 1:
  644. A lightbulb
  645. ]
  646. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/bulb.svg>
  647. ]
  648. However, one problem is that such a switch requires manual
  649. intervention from a human. What is required is an automatic
  650. switch based on the voltage level, as described above. To enable
  651. automatic switching of electrical signals, a device called the
  652. transistor was invented by William Shockley, John Bardeen and Walter
  653. Brattain. This invention started the whole computer industry.
  654. At the core, a [margin:
  655. transistor
  656. ]transistor is just a resistor whose value can vary
  657. based on an input voltage[float MarginFigure:
  658. [MarginFigure 2:
  659. Modern transistor
  660. ]
  661. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/transistor.svg>
  662. ]. With this property, a transistor can be used as a current
  663. amplifier (more voltage, less resistance) or as a switch that turns
  664. electrical signals off and on (blocking and unblocking electron flow) based on
  665. a voltage level. At 0 V, no current can pass through a
  666. transistor, so it acts like a circuit with an open switch
  667. (light bulb off) because the resistance is high enough to block
  668. the electrical flow. Similarly, at +3.5 V, current can flow
  669. through a transistor because the resistance is lowered, which
  670. enables electron flow, so the transistor acts like a circuit with
  671. a closed switch.[margin:
  672. If you want a deeper explanation of transistors, e.g. how
  673. electrons move, you should look at the video “How semiconductors
  674. work” on YouTube, by Ben Eater.
  675. ]
  676. A bit has two states, 0 and 1, and is the building block of all
  677. digital systems and software. Similar to a light bulb that can be
  678. turned on and off, bits are made out of the electrical stream
  679. from the power source: bit 0 is represented by 0 V (no
  680. electron flow), and bit 1 by +3.5 V to +5 V (electron flow).
  681. A transistor implements a bit correctly, as it can regulate the
  682. electron flow based on the voltage level.
  683. MOSFET transistors
  684. The invention of the classic transistor opened a whole new world of micro
  685. digital devices. Prior to the invention, vacuum tubes - which are
  686. just fancier light bulbs - were used to represent 0 and 1, but
  687. they were bulky, power-hungry and unreliable. [margin:
  688. MOSFET
  689. ]MOSFET, or Metal–Oxide–Semiconductor Field-Effect
  690. Transistor, invented in 1959 by Dawon Kahng and Martin M. (John)
  691. Atalla at Bell Labs, is an improved version of the classic
  692. transistor that is more suitable for digital devices, as it
  693. switches faster between the two states 0 and 1, is more
  694. stable, consumes less power and is easier to produce.
  695. There are also two types of MOSFETs analogous to two types of
  696. transistors: n-MOSFET and p-MOSFET. n-MOSFET and p-MOSFET are
  697. also called NMOS and PMOS transistors for short.
  698. Beyond transistors: digital logic gates
  699. All digital devices are designed with logic gates. A logic gate[margin:
  700. logic gate
  701. ] is a device that implements a Boolean function. Each
  702. logic gate has a number of inputs and an output. All
  703. computer operations are built from combinations of logic
  704. gates, which are just combinations of Boolean functions. [float MarginFigure:
  705. [MarginFigure 3:
  706. Example: NAND gate
  707. ]
  708. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/Nand-gate.svg>
  709. ]
  710. The theory behind logic gates
  711. Logic gates accept only binary inputs[footnote:
  712. Input that is either a 0 or 1.
  713. ] and produce binary outputs. In other words, logic gates are
  714. functions that transform binary values. Fortunately, a branch of
  715. math that deals exclusively with binary values already existed:
  716. Boolean algebra, developed in the 19th century by George Boole. With a sound mathematical theory as a
  717. foundation, logic gates were created. Logic gates implement
  718. Boolean functions. A set of Boolean functions is
  719. [margin:
  720. functionally complete
  721. ]functionally complete if all other
  722. Boolean functions can be constructed from this set. Later, Charles Sanders
  723. Peirce (during 1880 -- 1881) proved that either of the Boolean functions
  724. NOR or NAND alone is enough to create all other Boolean logic
  725. functions. Thus NOR and NAND gates are each functionally complete (Peirce, 1933).
  726. Gates are simply the implementations of Boolean logic
  727. functions, therefore a NAND or NOR gate is enough to implement all
  728. other logic gates. The simplest gates a CMOS circuit can implement
  729. are inverters (NOT gates), and from the inverters come NAND
  730. gates. With NAND gates, we can be confident of implementing everything
  731. else. This is why the invention of the transistor, and then of the CMOS
  732. circuit, revolutionized the computer industry.[margin:
  733. If you want to understand why and how we can create all Boolean
  734. functions, and a computer, from NAND gates, I suggest the course
  735. Build a Modern Computer from First Principles: From Nand to
  736. Tetris, available on Coursera: https://www.coursera.org/learn/build-a-computer
  737. . To go even further, after the course, you should take the series
  738. Computational Structures on edX.
  739. ]
  740. We should realize and appreciate how powerful the Boolean functions
  741. available in all programming languages are.
  742. Logic Gate implementation: CMOS circuit
  743. Underlying every logic gate is a circuit called [margin:
  744. CMOS
  745. ]CMOS (Complementary Metal–Oxide–Semiconductor). CMOS consists of two
  746. complementary transistors, NMOS and PMOS. The simplest CMOS
  747. circuit is an inverter, or NOT gate:
  748. From a NOT gate, a NAND gate can be created:
  749. From NAND gates, we have all other gates. As demonstrated, such
  750. simple circuitry performs the logical operators of day-to-day
  751. programming languages, e.g. the NOT operator ~ is executed directly by an
  752. inverter circuit, the operator & is executed by an AND circuit,
  753. and so on. Code does not run on a magic black box. On the contrary,
  754. code execution is precise and transparent, often as simple as
  755. running some hardwired circuit. When we write software, we simply
  756. manipulate electrical current at the physical level to run the
  757. appropriate circuits and produce desired outcomes. However, this
  758. whole process somehow does not involve any thought about
  759. electrical current. That is the real magic and will be explained
  760. soon.
  761. One interesting property of CMOS is that a k-input gate uses k
  762. PMOS and k NMOS transistors (Wakerly, 1999). All logic gates are
  763. built by pairs of NMOS and PMOS transistors, and gates are the
  764. building blocks of all digital devices from simple to complex,
  765. including any computer. Thanks to this pattern, it is possible to
  766. separate the actual physical circuit implementation from the
  767. logical implementation. Digital designs are made by designing
  768. with logic gates, and are later “compiled” into physical circuits.
  769. In fact, later we will see that logic gates become a language
  770. that describes how circuits operate. Understanding how CMOS works
  771. is important to understand how a computer is designed, and as a
  772. consequence, how a computer works[footnote:
  773. Again, if you want to understand how logic gates make a computer,
  774. consider the suggested courses on Coursera and Edx earlier.
  775. ].
  776. Finally, an implemented circuit with its wires and transistors is
  777. stored physically in a package called a chip. A chip is a
  778. substrate that an integrated circuit is etched onto. However,
  779. chip also refers to a completely packaged integrated circuit in the
  780. consumer market. Depending on the context, it is understood
  781. differently.[float MarginFigure:
  782. [MarginFigure 4:
  783. 74HC00 chip physical view
  784. ]
  785. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/74hc00_nxp_physical.jpg>
  786. ]
  787. -------------------------------------------
  788. 74HC00 is a chip with four 2-input NAND gates. The chip comes
  789. with 8 input pins and 4 output pins, 1 pin for connecting to a
  790. voltage source and 1 pin for connecting to the ground. This
  791. device is the physical implementation of NAND gates that we can
  792. physically touch and use. But instead of just a single gate, the
  793. chip comes with 4 gates that can be combined. Each combination
  794. enables a different logic function, effectively creating other
  795. logic gates. This feature is what makes the chip popular.
  796. [float Figure:
  797. [Figure 0.3:
  798. 74HC00 logic diagrams (Source: 74HC00 datasheet, http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
  799. )
  800. ]
  801. [float Figure:
  802. [Sub-Figure a:
  803. Logic diagram of 74HC00
  804. ]
  805. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_block_diagram.png>
  806. ] [float Figure:
  807. [Sub-Figure b:
  808. Logic diagram of one NAND gate
  809. ]
  810. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_logic_diagram.png>
  811. ]
  812. ]
  813. Each of the gates above is just a simple NAND circuit with
  814. electron flows, as demonstrated earlier. Yet many of these
  815. NAND-gate chips combined can build a simple computer.
  816. Software, at the physical level, is just electron flows.
  817. How can the above gates be created with the 74HC00? It is
  818. simple: as every gate has 2 input pins and 1 output pin, we can
  819. wire the output of 1 NAND gate to an input of another NAND
  820. gate, thus chaining NAND gates together to produce the diagrams
  821. above.
  822. -------------------------------------------
  823. Beyond Logic Gates: Machine Language
  824. Machine language
  825. Because it is built upon gates, and gates only accept a series of 0 and 1,
  826. a hardware device only understands 0 and 1. However, a device
  827. only takes 0 and 1 in a systematic way. [margin:
  828. Machine language
  829. ]Machine language is a collection of unique bit
  830. patterns that a device can identify and respond to with a corresponding
  831. action. A machine instruction is a unique bit pattern that a
  832. device can identify. In a computer system, a device with its own
  833. language that controls all activities going on inside the computer is
  834. called a CPU (Central Processing Unit). For example, in the x86
  835. architecture, the pattern 00000100 tells a CPU to add a number to a
  836. register, and 11110100 tells it to halt. In the early days of
  837. computers, people had to write programs completely in binary.
  838. Why does such a bit pattern cause a device to do something? The
  839. reason is that underlying each instruction is a small circuit
  840. that implements the instruction. Similar to how a
  841. function/subroutine in a computer program is called by its name,
  842. a bit pattern is the name of a little function inside a CPU that
  843. gets executed when the CPU encounters the pattern.
  844. Note that a CPU is not the only device with its own language. CPU is
  845. just a name for a hardware device that controls a
  846. computer system. A hardware device may not be a CPU but still have
  847. its own language. A device with its own machine language is a
  848. programmable device, since a user can use the language to command
  849. the device to perform different actions. For example, a printer
  850. has its own set of commands for instructing it how to print a page.
  851. -------------------------------------------
  852. <exa:74HC00-chip-can>A user can use the 74HC00 chip without knowing
  853. its internals, knowing only the interface for using the device. First,
  854. we need to know its layout:
  855. [float Figure:
  856. [Figure 0.4:
  857. 74HC00 Pin Layout (Source: 74HC00 datasheet, http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
  858. )
  859. ]
  860. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_pin_configuration.pdf>
  861. ]
  862. Then, the functionality of each pin:
  863. [float Table:
  864. [Table 1:
  865. Pin Description (Source: 74HC00 datasheet, http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
  866. )
  867. ]
  868. +----------+--------------+----------------+
  869. | Symbol   | Pin          | Description    |
  870. +----------+--------------+----------------+
  871. | 1A to 4A | 1, 4, 9, 12  | data input     |
  872. +----------+--------------+----------------+
  873. | 1B to 4B | 2, 5, 10, 13 | data input     |
  874. +----------+--------------+----------------+
  875. | 1Y to 4Y | 3, 6, 8, 11  | data output    |
  876. +----------+--------------+----------------+
  877. | GND      | 7            | ground (0 V)   |
  878. +----------+--------------+----------------+
  879. | VCC      | 14           | supply voltage |
  880. +----------+--------------+----------------+
  881. ]
  882. Finally, how to use the pins:
  883. [float Table:
  884. [Table 2:
  885. Functional Description
  886. ]
  887. +-----------+--------+
  888. |   Input   | Output |
  889. +-----+-----+--------+
  890. | nA  | nB  |   nY   |
  891. +-----+-----+--------+
  892. | L   | X   |   H    |
  893. +-----+-----+--------+
  894. | X   | L   |   H    |
  895. +-----+-----+--------+
  896. | H   | H   |   L    |
  897. +-----+-----+--------+
  898. ]
  899. [margin:
  900. • n is a number, either 1, 2, 3, or 4
  901. • H = HIGH voltage level; L = LOW voltage level; X = don’t care.
  902. ]The functional description provides a truth table with all
  903. possible pin inputs and outputs, which also describes the usage
  904. of all pins in the device. A user need not know the
  905. implementation, but only such a table, to use the device. We can
  906. say that the truth table above is the machine language of the
  907. device. Since the device is digital, its language is a
  908. collection of binary strings:
  909. • The device has 8 input pins, and this means it accepts binary
  910. strings of 8 bits.
  911. • The device has 4 output pins, and this means it produces
  912. binary strings of 4 bits from the 8-bit inputs.
  913. The input strings are what the device understands, and
  914. the output strings are what the device can speak.
  915. Together, they make up the language of the device. Even though
  916. this device is simple, its language contains
  917. quite a few binary strings: 2^{8}+2^{4}=272.
  918. However, this number is a tiny fraction of that of a complex
  919. device like a CPU, with
  920. hundreds of pins.
  921. When left as is, the 74HC00 is simply a NAND device with two
  922. 4-bit inputs[footnote:
  923. Or simply a 4-bit NAND gate, as it can accept at most 4 bits
  924. per input.
  925. ].
  926. +-------+---------------------------------------+-------------------+
  927. |       |                 Input                 |      Output       |
  928. +-------+----+----+----+----+----+----+----+----+----+----+----+----+
  929. | Pin   | 1A | 1B | 2A | 2B | 3A | 3B | 4A | 4B | 1Y | 2Y | 3Y | 4Y |
  930. +-------+----+----+----+----+----+----+----+----+----+----+----+----+
  931. | Value | 1  | 1  | 0  | 0  | 1  | 1  | 0  | 0  | 0  | 1  | 0  | 1  |
  932. +-------+----+----+----+----+----+----+----+----+----+----+----+----+
  933. The inputs and outputs, presented visually:
  934. [float Figure:
  935. [Figure 0.5:
  936. Pins when receiving digital signals that correspond to a binary
  937. string. Green signals are inputs; blue signals are outputs.
  938. ] <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/7400_bin_string1.pdf>
  939. ]
  940. On the other hand, if an OR gate is implemented, we can only build
  941. one 2-input OR gate from a 74HC00, as it requires 3 NAND gates: 2
  942. input NAND gates and 1 output NAND gate. Each input NAND gate
  943. represents only a 1-bit input of the OR gate. In the following
  944. table, the pins of each input NAND gate are always set to the
  945. same value (either both inputs are A or both inputs are B) to
  946. represent a single-bit input for the final OR gate:
  947. [float Table:
  948. [Table 3:
  949. Truth table of OR logic diagram.
  950. ]
  951. +----+----+----+----+---+
  952. | A | B | C | D | Y |
  953. +----+----+----+----+---+
  954. | 0 | 0 | 1 | 1 | 0 |
  955. +----+----+----+----+---+
  956. | 0 | 1 | 1 | 0 | 1 |
  957. +----+----+----+----+---+
  958. | 1 | 0 | 0 | 1 | 1 |
  959. +----+----+----+----+---+
  960. | 1 | 1 | 0 | 0 | 1 |
  961. +----+----+----+----+---+
  962. ]
  963. -------------------------------------------
  964. To implement a 4-bit OR gate, we need a total of four 74HC00
  965. chips configured as OR gates, packaged as a single chip as in
  966. figure [or-chip-74hc00].
  967. [float Figure:
  968. [Figure 0.6:
  969. 4-bit OR chip made from four 74HC00 devices
  970. ]<or-chip-74hc00>
  971. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/4-bit-or-gate-layout.pdf>
  972. ]
  973. Assembly Language
  974. Assembly language is the symbolic representation of binary
  975. machine code, which gives bit patterns mnemonic names. It was a
  976. vast improvement over the time when programmers had to write 0 and 1. For
  977. example, instead of writing 11110100, a programmer simply writes
  978. hlt to stop a computer. Such an abstraction makes instructions
  979. executed by a CPU easier to remember, and thus more instructions
  980. could be memorized, less time was spent looking up the CPU manual to find
  981. instructions in bit form and, as a result, code was written
  982. faster.
  983. Understanding assembly language is crucial for low-level programming
  984. domains, even to this day. The more instructions a programmer
  985. wants to understand, the deeper the understanding of machine
  986. architecture that is required.
  987. We can build a device with 2 assembly instructions:
  988. or <op1>, <op2>
  989. nand <op1>, <op2>
  990. • or accepts two 4-bit operands. This corresponds to a 4-bit
  991. OR device built from 4 74HC00 chips.
  992. • nand accepts two 4-bit operands. This corresponds to a single
  993. 74HC00 chip, left as is.
  994. Essentially, the gates in the example [exa:74HC00-chip-can]
  995. implement the instructions. Up to this point, we have only specified
  996. input and output and manually fed them to a device. That is, to
  997. perform an operation:
  998. • Pick a device by hand.
  999. • Manually put electrical signals into the pins.
  1000. First, we want to automate the process of device selection.
  1001. That is, we want to simply write an assembly instruction and have the
  1002. device that implements the instruction selected correctly.
  1003. Solving this problem is easy:
  1004. • Give each instruction an index in binary code, called an
  1005. operation code, or opcode for short, and embed it as part of the
  1006. input. The value for each instruction is specified in
  1007. table [ex-ins-ops].[float MarginTable:
  1008. [MarginTable 1:
  1009. Instruction-Opcode mapping.
  1010. ]<ex-ins-ops>
  1011. +--------------+-------------+
  1012. | Instruction | Binary Code |
  1013. +--------------+-------------+
  1014. +--------------+-------------+
  1015. | nand | 00 |
  1016. +--------------+-------------+
  1017. | or | 01 |
  1018. +--------------+-------------+
  1019. ]
  1020. Each input now contains additional data at the beginning: an
  1021. opcode. For example, the instruction:
  1022. nand 1100, 1100
  1023. corresponds to the binary string 0011001100. The first two
  1024. bits, 00, encode a nand instruction, as listed in the table
  1025. above.
  1026. • Add another device that selects a device based on the binary code
  1027. specific to an instruction.
  1028. Such a device is called a decoder, an important component in a
  1029. CPU that decides which circuit to use. In the above example,
  1030. when 0011001100 is fed to the decoder, the opcode is
  1031. 00, so the data are sent to the NAND device for computing.
  1032. Finally, writing assembly code is just an easier way to write
  1033. binary strings that a device can understand. When we write
  1034. assembly code and save it in a text file, a program called an [margin:
  1035. assembler
  1036. ]assembler translates the text file into binary strings
  1037. that a device can understand. So, how can an assembler exist in
  1038. the first place? If it is the first assembler in the
  1039. world, then it must be written in binary code. For the next version,
  1040. life is easier: the programmers write the assembler in
  1041. assembly code, then use the first version to assemble it.
  1042. These binary strings are then stored in another device, from which
  1043. they can later be retrieved and sent to a decoder. A storage device[margin:
  1044. storage device
  1045. ] is the device that stores machine instructions;
  1046. it is an array of circuits for saving 0 and 1 states.
  1047. A decoder is built out of logic gates, similar to other digital
  1048. devices. However, a storage device can be anything that can
  1049. store 0 and 1 and is retrievable. A storage device can be a
  1050. magnetized device that uses magnetism to store information, or
  1051. it can be made out of electrical circuits. Regardless of
  1052. the technology used, as long as the device can store data and
  1053. is accessible to retrieve data, it suffices. Indeed, the modern
  1054. devices are so complex that it is impossible and unnecessary to
  1055. understand every implementation detail. Instead, we only need
  1056. to learn the interfaces, e.g. the pins, that the devices
  1057. expose.
  1058. A computer essentially implements this process:
  1059. • Fetch an instruction from a storage device.
  1060. • Decode the instruction.
  1061. • Execute the instruction.
  1062. Or in short, a fetch -- decode --
  1063. execute cycle. The above device is extremely rudimentary, but
  1064. it already represents a computer with a fetch -- decode --
  1065. execute cycle. More instructions can be implemented by adding
  1066. more devices and allocating more opcodes for the instructions,
  1067. then updating the decoder accordingly. The Apollo Guidance
  1068. Computer, a digital computer produced for the Apollo space
  1069. program from 1961 -- 1972, was built entirely with NOR gates -
  1070. the alternative to the NAND gate for creating other logic gates.
  1071. Similarly, if we keep improving our hypothetical device, it
  1072. eventually becomes a full-fledged computer.
  1073. Programming Languages
  1074. Assembly language is a step up from writing 0 and 1. As time went
  1075. by, people realized that many pieces of assembly code had
  1076. repeating usage patterns. It would be nice if, instead of
  1077. writing the repeating blocks of code all over again in every
  1078. place, we could simply refer to such blocks with easier-to-use
  1079. text forms. For example, a block of assembly code checks whether
  1080. one variable is greater than another and, if so, executes a block
  1081. of code, or else executes another block of code; in C, such a block of
  1082. assembly code is represented by an if statement that is close to
  1083. human language.
  1084. [float Figure:
  1085. [Figure 0.7:
  1086. Repeated assembly patterns are generalized into a new language.
  1087. ]
  1088. <Graphics file: C:/Users/Tu Do/os01/book_src/images/02/asm_to_proglang.pdf>
  1089. ]
  1090. People created text forms to represent common blocks of assembly
  1091. code, such as the if syntax above, then wrote a program to
  1092. translate the text forms into assembly code. The program that
  1093. translates such text forms to machine code is called a [margin:
  1094. compiler
  1095. ]compiler.
  1096. Any software logic a programming language can implement, hardware
  1097. can also implement. The reverse is also true: any hardware logic
  1098. that is implemented in a circuit can be reimplemented in a
  1099. programming language. The simple reason is that programming
  1100. languages, or assembly languages, or machine languages, or logic
  1101. gates are just languages to express computations. It is
  1102. impossible for software to implement something hardware is
  1103. incapable of because programming language is just a simpler way
  1104. to use the underlying hardware. At the end of the day,
  1105. programming languages are translated to machine instructions that
  1106. are valid to a CPU. Otherwise, the code is not runnable, and thus is
  1107. useless software. Conversely, software can do everything the hardware
  1108. (that runs the software) can, as programming languages are just an
  1109. easier way to use the hardware.
  1110. In reality, even though all languages are equivalent in power,
  1111. not all of them are capable of expressing each other's programs.
  1112. Programming languages vary between two ends of a spectrum: high
  1113. level and low level.
  1114. The higher-level a programming language is, the more distant it
  1115. becomes from the hardware. In some high-level programming languages,
  1116. such as Python, a programmer cannot manipulate underlying
  1117. hardware, despite being able to deliver the same computations as
  1118. low-level programming languages. The reason is that high-level
  1119. languages want to hide hardware details to free programmers from
  1120. dealing with irrelevant details not related to current problem
  1121. domains. Such convenience, however, is not free: it requires
  1122. software to carry extra code for managing hardware details
  1123. (e.g. memory), thus making the code run slower, and it makes
  1124. hardware programming difficult or impossible. The more
  1125. abstractions a programming language imposes, the more difficult
  1126. it is for writing low-level software, such as hardware drivers or
  1127. an operating system. This is the reason why C is usually the
  1128. language of choice for writing an operating system: C is
  1129. just a thin wrapper over the underlying hardware, making it easy to
  1130. understand how exactly a hardware device runs when executing a
  1131. certain piece of C code.
  1132. Each programming language represents a way of thinking about
  1133. programs. Higher-level programming languages help to focus on
  1134. problem domains that are not related to hardware at all, and
  1135. where programmer performance is more important than computer
  1136. performance. Lower-level programming languages help to focus on
  1137. the inner workings of a machine, and thus are best suited for problem
  1138. domains that are related to controlling hardware. That is why so many
  1139. languages exist. Use the right tools for the right job to achieve
  1140. the best results.
  1141. Abstraction
  1142. Abstraction is a technique for hiding complexity that
  1143. is irrelevant to the problem in context. For example, imagine writing
  1144. programs without any layer except the lowest one:
  1145. circuits. Not only does a person need an in-depth understanding of
  1146. how circuits work, but designing a circuit also becomes much more
  1147. obscure, because the designer must look at the raw circuits but
  1148. think at a higher level, such as logic gates. It is a distracting
  1149. process, as the designer must constantly translate ideas into
  1150. circuits. With abstraction, a designer can instead simply think through the
  1151. high-level ideas directly, and later translate them into
  1152. circuits. Not only is this more efficient, but it is also more
  1153. accurate, as the designer can focus all efforts on verifying
  1154. the design with high-level thinking. When a new designer arrives,
  1155. they can easily understand the high-level designs, and thus can
  1156. continue to develop or maintain existing systems.
  1157. Why abstraction works
  1158. In all the layers, abstraction manifests itself:
  1159. • Logic gates abstract away the details of CMOS.
  1160. • Machine language abstracts away the details of logic gates.
  1161. • Assembly language abstracts away the details of machine
  1162. languages.
  1163. • Programming language abstracts away the details of assembly
  1164. languages.
  1165. We see a repeating pattern in how lower layers build upper layers:
  1166. • A lower layer has a recurring pattern. Then, this recurring
  1167. pattern is taken out and a language is built on top of it.
  1168. • A higher layer strips away layer-specific (non-recurring)
  1169. details to focus on the recurring details.
  1170. • The recurring details are given a new and simpler language than
  1171. the languages of the lower layers.
  1172. What to realize is that every layer is just a more convenient
  1173. language to describe the lower layer. Only after a description is
  1174. fully created with the language of the higher layer is it
  1175. implemented with the language of the lower layer.
  1176. • The CMOS layer has a recurring pattern that makes sure logic gates
  1177. are reliably translated to CMOS circuits: a k-input gate uses k
  1178. PMOS and k NMOS transistors (Wakerly, 1999). Since digital
  1179. devices use CMOS exclusively, a language arose to describe
  1180. higher-level ideas while hiding CMOS circuits: Logic Gates.
  1181. • Logic Gates hide the language of circuits and focus on how
  1182. to implement primitive Boolean functions and combine them to
  1183. create new functions. All logic gates receive input and
  1184. generate output as binary numbers. Thanks to this recurring
  1185. pattern, logic gates are hidden away behind a new language:
  1186. Assembly, which is a set of predefined binary patterns that
  1187. cause the underlying gates to perform an action.
  1188. • Soon, people realized that many recurring patterns arose
  1189. within Assembly language. Repeated blocks of Assembly code
  1190. appear in Assembly source files that express the same or
  1191. similar ideas. There were many such ideas that could be reliably
  1192. translated into Assembly code. Thus, the ideas were extracted
  1193. and built into the high-level programming languages that
  1194. every programmer learns today.
  1195. Recurring patterns are the key to abstraction. Recurring patterns
  1196. are why abstraction works. Without them, no language can be
  1197. built, and thus no abstraction. Fortunately, humans have already
  1198. developed a systematic discipline for studying patterns:
  1199. Mathematics. As the British mathematician G. H. Hardy
  1200. (2005) wrote:
  1201. A mathematician, like a painter or a poet, is a maker of
  1202. patterns. If his patterns are more permanent than theirs, it is
  1203. because they are made with ideas.
  1204. Isn't a mathematical formula a representation of a pattern?
  1205. Doesn't a variable represent values with the same properties
  1206. given by constraints? Mathematics provides a formal system to
  1207. identify and describe existing patterns in nature. For that
  1208. reason, this system can certainly be applied in the digital
  1209. world, which is just a subset of the real world. Mathematics can
  1210. be used as a common language to make translation between layers
  1211. easier, and to help with the understanding of layers.
  1212. Why abstraction reduces complexity
  1213. Abstraction by building languages certainly raises productivity
  1214. by stripping away details irrelevant to a problem. Imagine writing
  1215. programs with no layer other than the lowest one: circuits.
  1216. This is how complexity emerges: when high-level ideas
  1217. are expressed with lower-level language, as the example above
  1218. demonstrated. Unfortunately, this is the case with software, as
  1219. programming languages at the moment put more emphasis on
  1220. software than on problem domains. That is, without prior
  1221. knowledge, code written in a language is unable to express by
  1222. itself the knowledge of its target domain. In other words, a
  1223. language is expressive if its syntax is designed to express the
  1224. problem domain it is trying to solve: that is, what it will do
  1225. rather than how it will do it. Consider this example:
  1226. -------------------------------------------
  1227. Graphviz (http://www.graphviz.org/) is visualization software
  1228. that provides a language, called dot, for describing graphs:
  1229. As can be seen, the code perfectly expresses by itself how the
  1230. graph is connected. Even a non-programmer can understand and
  1231. use such a language easily. If it were implemented in C, it
  1232. would be more troublesome, and this is assuming that the
  1233. functions for drawing graphs are already available. To draw a
  1234. line, in C we might write something like:
  1235. draw_line(a, b);
  1236. However, it is still verbose compared with:
  1237. a -> b;
  1238. Also, a and b must be defined in C, compared to the implicit
  1239. nodes in the dot language. However, even if we do not factor in
  1240. the verbosity, C still has a limitation: it cannot change its
  1241. syntax to suit the problem domain. A domain-specific language
  1242. might even be more verbose, but it makes a domain more
  1243. understandable. If a problem domain must be expressed in C,
  1244. then it is constrained by the syntax of C. Since C is not a
  1245. specialized language for that problem domain, but is a
  1246. general-purpose programming language, the domain knowledge is
  1247. buried within the implementation details. As a result, a C
  1248. programmer is needed to decipher and extract the domain
  1249. knowledge. If the domain knowledge cannot be extracted,
  1250. then the software cannot be further developed.
  1251. Linux is full of applications, such as web servers, that are
  1252. controlled by domain-specific languages in configuration files
  1253. placed in the /etc directory. Instead of reprogramming the
  1254. software, a domain-specific language is made for it.
  1255. -------------------------------------------
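The C side of the Graphviz comparison above can be sketched as follows. This is only an illustration: struct graph, add_node, draw_line and build_example are hypothetical helpers invented here, not the Graphviz C API, and draw_line merely records an edge instead of rendering it.

```c
#include <stddef.h>

#define MAX_NODES 8
#define MAX_EDGES 16

/* Hypothetical in-memory graph. Nodes must be declared before use,
 * unlike the implicit nodes of the dot language. */
struct graph {
    const char *nodes[MAX_NODES];
    size_t      node_count;
    size_t      edges[MAX_EDGES][2];  /* pairs of node indices */
    size_t      edge_count;
};

/* Registers a node and returns its index, or -1 if the graph is full. */
int add_node(struct graph *g, const char *name)
{
    if (g->node_count == MAX_NODES)
        return -1;
    g->nodes[g->node_count] = name;
    return (int)g->node_count++;
}

/* Records a directed edge from node `from` to node `to`.
 * Returns 0 on success, -1 on bad indices or a full edge table. */
int draw_line(struct graph *g, int from, int to)
{
    if (from < 0 || to < 0 ||
        (size_t)from >= g->node_count || (size_t)to >= g->node_count ||
        g->edge_count == MAX_EDGES)
        return -1;
    g->edges[g->edge_count][0] = (size_t)from;
    g->edges[g->edge_count][1] = (size_t)to;
    g->edge_count++;
    return 0;
}

/* Everything that "a -> b;" says in dot. */
int build_example(struct graph *g)
{
    int a = add_node(g, "a");
    int b = add_node(g, "b");
    return draw_line(g, a, b);
}
```

The single fact that a connects to b is still in there, but it is surrounded by index checks, array bookkeeping and type declarations that belong to C rather than to the graph.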
  1256. In general, code that can express a problem domain must be
  1257. understandable by a domain expert. Even within the software
  1258. domain, building a language out of repeated programming patterns
  1259. is useful. It makes people aware of the existence of such patterns
  1260. in code and thus makes software easier to maintain, as software
  1261. structure is visible as a language. Only a programming language
  1262. that is capable of morphing itself to suit a problem domain can
  1263. achieve that goal. Such a language is called a programmable
  1264. programming language. Unfortunately, this approach of making
  1265. software structure visible is not favored among programmers, as a
  1266. new language must be made out of it along with a new toolchain to
  1267. support it. Thus, software structure and domain knowledge are
  1268. buried within code written in the syntax of a general-purpose
  1269. language, and if a programmer is not familiar with, or not even
  1270. aware of, a code pattern, then it is hopeless to
  1271. understand the code. A prime example is reading C code that
  1272. controls hardware, e.g. an operating system: if a programmer
  1273. knows absolutely nothing about hardware, then it is impossible to
  1274. read and write operating system code in C, even with
  1275. 20 years of experience writing application code in C.
  1276. With abstraction, a software engineer can also understand the
  1277. inner workings of a device without specialized knowledge of
  1278. physical circuit design, which enables the software engineer to
  1279. write code that controls a device. The separation between logical
  1280. and physical implementation also entails that gate designs can be
  1281. reused even when the underlying technologies change. For
  1282. example, in some distant future biological computers could be a
  1283. reality, and gates might not be implemented as CMOS but as some
  1284. kind of biological component, e.g. living cells; in either
  1285. technology, electrical or biological, as long as logic gates are
  1286. physically realized, the same computer design could be implemented.
  1287. Computer Architecture
  1288. To write lower level code, a programmer must understand the
  1289. architecture of a computer. It is similar to writing programs
  1290. in a software framework: one must know what kinds of
  1291. problems the framework solves, and how to use the framework through
  1292. its provided software interfaces. But before getting to the
  1293. definition of what computer architecture is, we must understand
  1294. what exactly a computer is, as many people still think that a
  1295. computer is the regular machine we put on a desk, or at best, a
  1296. server. Computers come in various shapes and sizes, and include
  1297. devices that people never imagine are computers, or that
  1298. code can run on.
  1299. What is a computer?
  1302. A computer is a hardware device that consists of at least
  1303. a processor (CPU), a memory device and input/output interfaces.
  1304. All computers can be grouped into two types:
  1305. Single-purpose computer is a computer built at the hardware
  1306. level for specific tasks. For example, dedicated application
  1307. encoders/decoders, timers, image/video/sound processors.
  1308. General-purpose computer is a computer that can be programmed
  1309. (without modifying its hardware) to emulate various features of
  1310. single-purpose computers.
  1311. Server
  1314. A server is a general-purpose high-performance computer with huge
  1315. resources to provide large-scale services for a broad audience.
  1316. The audience is people whose personal computers are connected to
  1317. the server.
  1318. [float Figure:
  1319. [Figure 0.8:
  1320. Blade servers. Each blade server is a computer with a modular
  1321. design optimized for the use of physical space and energy. The
  1322. enclosure of blade servers is called a chassis. (Source: [https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_Servers-8055_35.jpg||Wikimedia]
  1323. , author: Victorgrigas)
  1324. ]
  1325. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Wikimedia_Foundation_Servers-8055_35.jpg>
  1326. ]
  1327. Desktop Computer
  1330. A desktop computer is a general-purpose computer
  1331. with an input and output system designed for a human user, with
  1332. moderate resources enough for regular use. The input system
  1333. usually includes a mouse and a keyboard, while the output system
  1334. usually consists of a monitor that can display a large number of
  1335. pixels. The computer is enclosed in a chassis large enough to
  1336. hold various computer components such as a processor, a
  1337. motherboard, a power supply, a hard drive, etc.
  1338. [float Figure:
  1339. [Figure 0.9:
  1340. A typical desktop computer.
  1341. ]
  1342. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/computer-158675.svg>
  1343. ]
  1344. Mobile Computer
  1347. A mobile computer is similar to a desktop computer with fewer
  1348. resources but can be carried around.
  1349. Game Consoles
  1350. Game consoles are similar to desktop computers but are optimized
  1351. for gaming. Instead of a keyboard and a mouse, the input system
  1352. of a game console is a game controller, a device with a
  1353. few buttons for controlling on-screen objects; the output system
  1354. is a television. The chassis is similar to that of a desktop
  1355. computer but smaller. Game consoles use custom processors and
  1356. graphics processors, but these are similar to the ones in desktop
  1357. computers. For example, the first Xbox uses a custom Intel
  1358. Pentium III processor.
  1359. Handheld game consoles are similar to game consoles, but
  1360. incorporate both the input and output systems along with the
  1361. computer in a single package.
  1362. Embedded Computer
  1365. An embedded computer is a single-board or
  1366. single-chip computer with limited resources designed for
  1367. integrating into larger hardware devices. [float MarginFigure:
  1368. [MarginFigure 5:
  1369. An Intel 82815 Graphics and Memory Controller Hub embedded on a
  1370. PC motherboard. (Source: [https://commons.wikimedia.org/wiki/File:Intel_82815_GMCH.jpg||Wikimedia]
  1371. , author: Qurren)
  1372. ]
  1373. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Intel_82815_GMCH.jpg>
  1374. ][float MarginFigure:
  1375. [MarginFigure 6:
  1376. A PIC microcontroller. (Source: [http://www.microchip.com/wwwproducts/en/PIC18F4620||Microchip]
  1377. )
  1378. ]
  1379. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/medium-PIC18F4620-PDIP-40.png>
  1380. ]
  1383. A microcontroller is an embedded computer designed
  1384. for controlling other hardware devices. A microcontroller is
  1385. mounted on a chip. Microcontrollers are general-purpose
  1386. computers, but with resources so limited that they are only able
  1387. to perform one or a few specialized tasks. These computers are used
  1388. for a single purpose, but they are still general-purpose since it
  1389. is possible to program them to perform different tasks, depending
  1390. on the requirements, without changing the underlying hardware.
  1391. Another type of embedded computer is system-on-chip. A
  1392. system-on-chip is a full computer on a single chip.
  1393. Though a microcontroller is also housed on a chip, its purpose is
  1394. different: to control some hardware. A microcontroller is usually
  1395. simpler and more limited in hardware resources as it specializes
  1396. only in one purpose when running, whereas a system-on-chip is a
  1397. general-purpose computer that can serve multiple purposes. A
  1398. system-on-chip can run like a regular desktop computer that is
  1399. capable of loading an operating system and running various
  1400. applications. A system-on-chip is typically found in a
  1401. smartphone or tablet, such as the Apple A5 SoC used in the iPad 2
  1402. and iPhone 4S, or Qualcomm Snapdragon used in many Android phones.[float MarginFigure:
  1403. [MarginFigure 7:
  1404. Apple A5 SoC
  1405. ]
  1406. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/128px-Apple_A5_Chip.jpg>
  1407. ]
  1408. Be it a microcontroller or a system-on-chip, there must be an
  1409. environment where these devices can connect to other devices.
  1410. This environment is a circuit board called a PCB (Printed
  1411. Circuit Board). A printed circuit board
  1412. is a physical board that contains lines and pads to enable
  1413. electron flows between electrical and electronic components.
  1414. Without a PCB, devices cannot be combined to create a larger
  1415. device. As long as these devices are hidden inside a larger
  1416. device and contribute to a larger device that operates at a
  1417. higher level layer for a higher level purpose, they are embedded
  1418. devices. Writing a program for an embedded device is therefore
  1419. called embedded programming. Embedded
  1420. computers are used in automatically controlled devices including
  1421. power tools, toys, implantable medical devices, office machines,
  1422. engine control systems, appliances, remote controls and other
  1423. types of embedded systems.
  1424. The line between a microcontroller and a system-on-chip is
  1425. blurry. If hardware keeps growing more powerful, then a
  1426. microcontroller can get enough resources to run a minimal
  1427. operating system on it for multiple specialized purposes.
  1428. Conversely, a system-on-chip is powerful enough to handle the job
  1429. of a microcontroller. However, using a system-on-chip as a
  1430. microcontroller would not be a wise choice: the price would rise
  1431. significantly, and hardware resources would be wasted, since the
  1432. software written for a microcontroller requires little computing
  1433. resources.
  1434. Field Programmable Gate Array
  1437. A Field Programmable Gate Array (FPGA)
  1438. is a hardware array of reconfigurable gates that makes
  1439. circuit structure programmable after it is shipped away from the
  1440. factory[footnote:
  1441. This is why it is called Field Programmable Gate Array. It is
  1442. changeable “in the field” where it is applied.
  1443. ]. Recall that in the previous chapter, each 74HC00 chip can be
  1444. configured as a gate, and a more sophisticated device can be
  1445. built by combining multiple 74HC00 chips. In a similar manner,
  1446. each FPGA device contains thousands of units called logic blocks,
  1447. each more complicated than a 74HC00 chip, that can be
  1448. configured to implement a Boolean logic function. These logic
  1449. blocks can be chained together to create a high-level hardware
  1450. feature. This high-level feature is usually a dedicated algorithm
  1451. that needs high-speed processing.
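To make the idea of a configurable logic block concrete, here is a minimal C sketch that models a 2-input block as a lookup table, a common way such blocks are built. The names (struct logic_block, lut_eval, lut_chain, the LUT_* constants) and the 4-bit encoding are assumptions chosen for this illustration, not the format of any real FPGA.

```c
#include <stdint.h>

/* A hypothetical 2-input logic block modeled as a lookup table (LUT).
 * Bit i of `config` holds the output for input combination i,
 * where i = (b << 1) | a. Reprogramming the block means writing a
 * different 4-bit truth table; the hardware itself does not change. */
struct logic_block {
    uint8_t config;  /* only the low 4 bits are used */
};

/* Truth tables for a few 2-input functions. */
enum {
    LUT_AND  = 0x8,  /* 1000: output 1 only when a = 1 and b = 1 */
    LUT_OR   = 0xE,  /* 1110 */
    LUT_XOR  = 0x6,  /* 0110 */
    LUT_NAND = 0x7   /* 0111: the function of one 74HC00 gate */
};

/* Evaluates the block for inputs a and b (each 0 or 1). */
unsigned lut_eval(const struct logic_block *blk, unsigned a, unsigned b)
{
    unsigned index = ((b & 1u) << 1) | (a & 1u);
    return (blk->config >> index) & 1u;
}

/* Chains two blocks: the output of `first` feeds input a of `second`. */
unsigned lut_chain(const struct logic_block *first,
                   const struct logic_block *second,
                   unsigned a, unsigned b, unsigned c)
{
    return lut_eval(second, lut_eval(first, a, b), c);
}
```

Two blocks configured as XOR and chained this way compute the parity of three bits; changing one `config` value turns the same chain into a different circuit.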
  1452. [float Figure:
  1453. [Figure 0.10:
  1454. FPGA Architecture (Source: [http://www.ni.com/tutorial/6097/en/||National Instruments]
  1455. )
  1456. ]
  1457. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/fpga_400x212.jpg>
  1458. ]
  1459. Digital devices can be designed by combining logic gates, without
  1460. regard to actual circuit components, since the physical circuits
  1461. are just multiples of CMOS circuits. Digital hardware, including
  1462. various components in a computer, is designed by writing code,
  1463. as a regular programmer does, using a language to describe how
  1464. gates are wired together. This language is called a Hardware
  1465. Description Language. Later the
  1466. hardware description is compiled to a description of connected
  1467. electronic components called a netlist, which is a more
  1468. detailed description of how gates are connected.
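A netlist can be pictured as plain data: a list of gates plus the wires attached to each. The C sketch below is a rough illustration under that assumption. The format (struct gate, wire indices) and the simulator netlist_eval are invented for this example and are much simpler than the netlists produced by real hardware tools.

```c
#include <stddef.h>

/* Hypothetical netlist format: each entry names a gate type and the
 * wires (by index) attached to its two inputs and its output. */
enum gate_type { GATE_AND, GATE_OR, GATE_XOR, GATE_NAND };

struct gate {
    enum gate_type type;
    size_t in_a, in_b;  /* input wire indices */
    size_t out;         /* output wire index  */
};

/* A half adder as a netlist: wires 0 and 1 are the inputs a and b,
 * wire 2 is the sum, wire 3 is the carry. */
const struct gate half_adder[] = {
    { GATE_XOR, 0, 1, 2 },
    { GATE_AND, 0, 1, 3 },
};

/* Simulates a netlist whose gates are listed in dependency order:
 * every gate reads wires that were set before it runs.
 * Each wire holds 0 or 1. */
void netlist_eval(const struct gate *gates, size_t count,
                  unsigned char *wires)
{
    for (size_t i = 0; i < count; i++) {
        unsigned char a = wires[gates[i].in_a];
        unsigned char b = wires[gates[i].in_b];
        unsigned char r = 0;
        switch (gates[i].type) {
        case GATE_AND:  r = a & b;    break;
        case GATE_OR:   r = a | b;    break;
        case GATE_XOR:  r = a ^ b;    break;
        case GATE_NAND: r = !(a & b); break;
        }
        wires[gates[i].out] = r;
    }
}
```

Setting wires 0 and 1 and calling netlist_eval(half_adder, 2, wires) leaves the sum in wires[2] and the carry in wires[3].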
  1469. The difference between an FPGA and other embedded computers is that
  1470. programs in an FPGA are implemented at the digital logic level,
  1471. while programs in embedded computers like microcontrollers or
  1472. system-on-chip devices are implemented at assembly code level. An
  1473. algorithm written for an FPGA device is a description of the
  1474. algorithm in logic gates; the FPGA device then follows the
  1475. description to configure itself to run the algorithm. An
  1476. algorithm written for a microcontroller is in assembly
  1477. instructions that a processor can understand and act on.
  1478. An FPGA is applied in cases where the specialized operations are
  1479. unsuitable and costly to run on a regular computer, such as
  1480. real-time medical image processing, cruise control systems,
  1481. circuit prototyping, video encoding/decoding, etc. These
  1482. applications require high-speed processing that is not achievable
  1483. with a regular processor because a processor wastes a significant
  1484. amount of time in executing many non-specialized instructions -
  1485. which might add up to thousands of instructions or more - to
  1486. implement a specialized operation, thus using more circuits at the
  1487. physical level to carry out the same operation. An FPGA device
  1488. carries no such overhead; instead, it runs a single specialized
  1489. operation implemented in hardware directly.
  1490. Application-Specific Integrated Circuit
  1491. An Application-Specific
  1492. Integrated Circuit (or ASIC) is a chip designed for a
  1493. particular purpose rather than for general-purpose use. An ASIC
  1494. does not contain a generic array of logic blocks that can be
  1495. reconfigured to adapt to any operation like an FPGA; instead,
  1496. every logic block in an ASIC is made and optimized for the
  1497. circuit itself. An FPGA can be considered the prototyping stage
  1498. of an ASIC, and an ASIC the final stage of circuit production.
  1499. An ASIC is even more specialized than an FPGA, so it can achieve
  1500. even higher performance. However, ASICs are very costly to
  1501. manufacture, and once the circuits are made, if design errors
  1502. are found, everything is thrown away, unlike FPGA devices, which
  1503. can simply be reprogrammed because of the generic gate array.
  1504. Computer Architecture
  1505. The previous section examined various classes of computers.
  1506. Regardless of shapes and sizes, every computer is designed
  1507. according to an architecture, from high level to low level:
  1508. \text{Computer Architecture} = \text{Instruction Set Architecture} + \text{Computer Organization} + \text{Hardware}
  1509. At the highest level is the Instruction Set Architecture.
  1510. At the middle level is the Computer Organization.
  1511. At the lowest level is the Hardware.
  1512. Instruction Set Architecture
  1513. An instruction set is the basic set of commands
  1514. and instructions that a microprocessor understands and can carry
  1515. out.
  1516. An Instruction Set Architecture, or
  1517. ISA, is the design of an environment that implements an
  1518. instruction set. Essentially, it is a runtime environment similar
  1519. to the interpreters of high-level languages. The design includes
  1520. all the instructions, registers, interrupts, memory models (how
  1521. memory is arranged to be used by programs), addressing modes,
  1522. I/O, etc. of a CPU. The more features (e.g. more instructions) a
  1523. CPU has, the more circuits are required to implement it.
  1524. Computer organization
  1527. Computer organization is the functional
  1528. view of the design of a computer. In this view, hardware
  1529. components of a computer are presented as boxes with inputs and
  1530. outputs that connect to each other and form the design of a
  1531. computer. Two computers may have the same ISA, but different
  1532. organizations. For example, both AMD and Intel processors
  1533. implement the x86 ISA, but the hardware components of each processor
  1534. that make up the environments for the ISA are not the same.
  1535. Computer organizations may vary depending on a manufacturer's
  1536. design, but they all originate from the Von Neumann
  1537. architecture[footnote:
  1538. John von Neumann was a mathematician and physicist who invented a
  1539. computer architecture.
  1540. ]:
  1541. [float Figure:
  1542. [Figure 0.11:
  1543. Von Neumann Architecture
  1544. ]
  1545. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/von_neumann_architecture.pdf>
  1546. ]
  1547. CPU fetches instructions continuously from main memory and
  1548. executes them.
  1549. Memory stores program code and data.
  1550. Bus consists of electrical wires for sending raw bits between the
  1551. above components.
  1552. I/O Devices are devices that give input to a
  1553. computer, e.g. keyboard, mouse, sensor, and take the output
  1554. from a computer, e.g. a monitor takes information sent from the CPU
  1555. and displays it, an LED turns on/off according to a pattern computed
  1556. by the CPU.
  1557. The Von Neumann computer operates by storing its instructions in
  1558. main memory, and the CPU repeatedly fetches those instructions into
  1559. its internal storage for executing, one after another. Data are
  1560. transferred through a data bus between CPU, memory and I/O
  1561. devices, and the location at which to store data in a device is
  1562. sent by the CPU through the address bus. This architecture
  1563. completely implements the fetch -- decode -- execute
  1564. cycle.
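The fetch -- decode -- execute cycle can be sketched in a few lines of C. The machine below is a toy: its four opcodes, its two-byte instruction format and the names struct machine and machine_run are all invented for this illustration and do not correspond to any real instruction set.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcodes of a toy Von Neumann machine (assumed for this sketch). */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

struct machine {
    uint8_t memory[256];  /* code and data share one memory        */
    uint8_t pc;           /* address of the next instruction       */
    uint8_t acc;          /* accumulator register                  */
};

/* Runs until OP_HALT (or an unknown opcode) is executed, or until
 * max_steps instructions have run. Each instruction is two bytes:
 * an opcode followed by an operand address.
 * Returns the number of instructions executed. */
size_t machine_run(struct machine *m, size_t max_steps)
{
    size_t steps = 0;
    while (steps < max_steps) {
        /* fetch */
        uint8_t opcode  = m->memory[m->pc];
        uint8_t operand = m->memory[(uint8_t)(m->pc + 1)];
        m->pc = (uint8_t)(m->pc + 2);
        steps++;
        /* decode and execute */
        switch (opcode) {
        case OP_LOAD:
            m->acc = m->memory[operand];
            break;
        case OP_ADD:
            m->acc = (uint8_t)(m->acc + m->memory[operand]);
            break;
        case OP_STORE:
            m->memory[operand] = m->acc;
            break;
        case OP_HALT:
        default:
            return steps;  /* halt, also on unknown opcodes */
        }
    }
    return steps;
}
```

A program such as LOAD 16, ADD 17, STORE 18, HALT is just eight bytes placed at address 0, sitting in the same array as the data at addresses 16 to 18.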
  1565. The earlier computers were just exact implementations of the
  1566. Von Neumann architecture, with CPU, memory and I/O devices
  1567. communicating through the same bus. Today, a computer has more
  1568. buses, each specialized in a type of traffic. However, at the
  1569. core, they are still Von Neumann architecture. To write an OS for
  1570. a Von Neumann computer, a programmer needs to be able to
  1571. understand and write code that controls the core components:
  1572. CPU, memory, I/O devices, and bus.
  1573. CPU, or Central Processing Unit, is the
  1574. heart and brain of any computer system. Understanding a CPU is
  1575. essential to writing an OS from scratch:
  1576. • To use other devices, a programmer needs to control the CPU to
  1577. use the programming interfaces of those devices. The CPU is the
  1578. only way, as the CPU is the only device a programmer can use
  1579. directly and the only device that understands code written by a
  1580. programmer.
  1581. • In a CPU, many OS concepts are already implemented directly in
  1582. hardware, e.g. task switching, paging. A kernel programmer
  1583. needs to know how to use the hardware features, to avoid
  1584. duplicating such concepts in software, thus wasting computer
  1585. resources.
  1586. • CPU built-in OS features boost both OS performance and
  1587. developer productivity because those features are actual
  1588. hardware, the lowest possible level, and developers are freed from
  1589. implementing such features.
  1590. • To effectively use the CPU, a programmer needs to understand
  1591. the documentation provided by the CPU manufacturer. For example, [http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html||Intel® 64 and IA-32 Architectures Software Developer Manuals]
  1592. .
  1593. • After understanding one CPU architecture well, it is easier to
  1594. learn other CPU architectures.
  1595. A CPU is an implementation of an ISA, effectively the
  1596. implementation of an assembly language (depending on the CPU
  1597. architecture, the language may vary). Assembly language is one of
  1598. the interfaces that are provided for software engineers to
  1599. control a CPU, and thus control a computer. But how can every
  1600. computer device be controlled with access only to the CPU?
  1601. The simple answer is that a CPU can communicate with other
  1602. devices through these two interfaces, thus commanding them what
  1603. to do:
  1606. Registers are a hardware component for high-speed data access and
  1607. communication with other hardware devices. Registers allow
  1608. software to control hardware directly by writing to registers
  1609. of a device, or receive information from a hardware device when
  1610. reading from registers of a device.
  1611. Not all registers are used for communication with other
  1612. devices. In a CPU, most registers are used as high-speed
  1613. storage for temporary data. Other devices that a CPU can
  1614. communicate with always have a set of registers for interfacing
  1615. with the CPU.
  1618. Port is a specialized register in a hardware device used for
  1619. communication with other devices. When data are written to a
  1620. port, it causes a hardware device to perform some operation
  1621. according to values written to the port. The difference between
  1622. a port and a register is that a port does not store data, but
  1623. delegates data to some other circuit.
  1624. These two interfaces are extremely important, as they are the
  1625. only interfaces for controlling hardware with software. Writing
  1626. device drivers is essentially learning the functionality of each
  1627. register and how to use them properly to control the device.
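As a hedged sketch of what controlling a device through its registers looks like in C, consider a hypothetical serial device with one status register and one data register. The register offsets, the STATUS_READY bit and the function device_send are assumptions for illustration; a real device's layout comes from its documentation. On x86, ports are additionally reached through the dedicated in and out instructions, which are not shown here.

```c
#include <stdint.h>

/* Register layout of a hypothetical serial device (assumed, not a
 * real chip's register map). */
enum {
    REG_STATUS = 0,  /* read: bit 0 is set when a byte can be accepted */
    REG_DATA   = 1   /* write: the byte to transmit                    */
};
#define STATUS_READY 0x1u

/* Sends one byte by polling the status register and then writing the
 * data register. `regs` is the base address of the device registers.
 * `volatile` forces every access to really reach the device instead
 * of being cached or removed by the compiler.
 * Returns 0 on success, -1 if the device never became ready. */
int device_send(volatile uint32_t *regs, uint8_t byte, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (regs[REG_STATUS] & STATUS_READY) {
            regs[REG_DATA] = byte;
            return 0;
        }
    }
    return -1;
}
```

Because the base address is a parameter, the same function can be exercised against an ordinary array standing in for the device, which is convenient before real hardware or an emulator is available.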
  1630. Memory is a storage device that stores information. Memory
  1631. consists of many cells. Each cell is a byte with its address
  1632. number, so a CPU can use such address number to access an exact
  1633. location in memory. Memory is where software instructions (in the
  1634. form of machine language) are stored and retrieved to be executed
  1635. by the CPU; memory also stores data needed by some software. Memory
  1636. in a Von Neumann machine does not distinguish between which bytes
  1637. are data and which bytes are software instructions. It's up to
  1638. the software to decide, and if somehow data bytes are fetched and
  1639. executed as instructions, the CPU still does it if such bytes
  1640. represent valid instructions, but will produce undesirable
  1641. results. To a CPU, there's no code and data; both are merely
  1642. different types of data for it to act on: one tells it how to do
  1643. something in a specific manner, and the other is the material
  1644. it needs to carry out that action.
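The point that memory holds only bytes, and that their meaning depends on how they are used, can be shown with a short C sketch. The helper mem_read32 and the array sample_bytes are names made up for this example. The six bytes happen to be the x86 machine code for mov eax, 42 followed by ret, yet the function below reads them as plain numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Six bytes in memory. Executed by an x86 CPU they mean
 * "mov eax, 42; ret"; read as data they are just numbers. */
const uint8_t sample_bytes[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

/* Reads a 32-bit little-endian value starting at address `addr` in a
 * flat, byte-addressed memory of `size` bytes.
 * Returns 0 if the read would run past the end of memory. */
uint32_t mem_read32(const uint8_t *memory, size_t size, size_t addr)
{
    if (size < 4 || addr > size - 4)
        return 0;
    return (uint32_t)memory[addr]
         | (uint32_t)memory[addr + 1] << 8
         | (uint32_t)memory[addr + 2] << 16
         | (uint32_t)memory[addr + 3] << 24;
}
```

Reading at address 1 yields 42, the immediate operand of the instruction; reading at address 0 mixes the opcode byte into the number. Nothing in memory marks one reading as right and the other as wrong.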
  1645. The RAM is controlled by a device called a memory controller.
  1646. Currently, most processors have this device embedded, so the
  1647. CPU has a dedicated memory bus connecting the processor to the
  1648. RAM. On older CPUs[footnote:
  1649. Prior to the CPUs produced in 2009
  1650. ], however, this device was located in a chip also known as MCH
  1651. or Memory Controller Hub. In this case, the
  1652. CPU does not communicate directly with the RAM, but with the MCH
  1653. chip, and this chip then accesses the memory to read or write
  1654. data. The first option provides better performance since there is
  1655. no middleman in the communications between the CPU and the
  1656. memory.
  1657. At the physical level, RAM is implemented as a grid of cells that
  1658. each contain a transistor and an electrical device called a
  1660. capacitor, which stores charge for short periods of
  1661. time. The transistor controls access to the capacitor; when
  1662. switched on, it allows a small charge to be read from or written
  1663. to the capacitor. The charge on the capacitor slowly dissipates,
  1664. requiring the inclusion of a refresh circuit to periodically read
  1665. values from the cells and write them back after amplification
  1666. from an external power source.
  1669. Bus is a subsystem that transfers data between computer
  1670. components or between computers. Physically, buses are just
  1671. electrical wires that connect all components together and each
  1672. wire transfers a single bit of data. The total number of wires is
  1675. called bus width, and is dependent on how many wires a CPU can support.
  1676. If a CPU can only accept 16 bits at a time, then the bus has 16
  1677. wires connecting from a component to the CPU, which means the CPU
  1678. can only retrieve 16 bits of data at a time.
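The effect of bus width can be worked out with simple arithmetic. The C function below is a sketch under the simplifying assumption that each transfer moves exactly one bus width of data; bus_transfers is a name chosen for this example.

```c
#include <stddef.h>

/* Number of bus transfers needed to move `bytes` bytes over a bus
 * that is `bus_width_bits` wide (one bit per wire).
 * Assumes the width is a nonzero multiple of 8 bits;
 * returns 0 for any other width. */
size_t bus_transfers(size_t bytes, unsigned bus_width_bits)
{
    if (bus_width_bits == 0 || bus_width_bits % 8 != 0)
        return 0;
    size_t bytes_per_transfer = bus_width_bits / 8;
    /* Round up: a partial final transfer still occupies the bus once. */
    return bytes / bytes_per_transfer
         + (bytes % bytes_per_transfer != 0);
}
```

With a 16-bit bus, moving 8 bytes takes bus_transfers(8, 16), that is 4 transfers, while a 64-bit bus moves the same 8 bytes in a single transfer.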
  1679. Hardware
  1680. Hardware is a specific implementation of a computer. A line of
  1681. processors implements the same instruction set architecture and
  1682. uses nearly identical organizations, but its models differ in
  1683. hardware implementation. For example, the Core i7 family provides
  1684. a model for desktop computers that is more powerful but consumes
  1685. more energy, while another model for laptops is less performant
  1686. but more energy efficient. To write software for a hardware
  1687. device, we seldom need to understand a hardware implementation if
  1688. documents are available. Computer organization and especially the
  1689. instruction set architecture are more relevant to an operating
  1690. system programmer. For that reason, the next chapter is devoted
  1691. to studying the x86 instruction set architecture in depth.
  1692. x86 architecture
  1693. A chipset is a chip with multiple functions. Historically,
  1694. a chipset was actually a set of individual chips, each
  1695. responsible for a function, e.g. memory controller, graphics
  1696. controller, network controller, power controller, etc. As
  1697. hardware progressed, the set of chips was incorporated into a
  1698. single chip, making it more space, energy, and cost efficient. In a
  1699. desktop computer, various hardware devices are connected to each
  1700. other through a PCB called a motherboard. Each CPU
  1701. needs a compatible motherboard that can host it. Each motherboard
  1702. is defined by its chipset model, which determines the environment
  1703. that a CPU can control. This environment typically consists of
  1704. • one or more slots for the CPU
  1705. • a chipset of two chips, the Northbridge and
  1706. Southbridge chips
  1707. – Northbridge chip is responsible for the high-performance
  1708. communication between CPU, main memory and the graphics card.
  1709. – Southbridge chip is responsible for the communication with
  1710. I/O devices and other devices that are not performance
  1711. sensitive.
  1712. • slots for memory sticks
  1713. • one or more slots for graphics cards
  1714. • generic slots for other devices, e.g. network card, sound card
  1715. • ports for I/O devices, e.g. keyboard, mouse, USB
  1716. [float Figure:
  1717. [Figure 0.12:
  1718. Motherboard organization.
  1719. ]<mobo-organization>
  1720. <Graphics file: C:/Users/Tu Do/os01/book_src/images/03/Motherboard_diagram.svg>
  1721. ]
  1722. To write a complete operating system, a programmer needs to
  1723. understand how to program these devices. After all, an operating
  1724. system manages hardware automatically to free application
  1725. programs from doing so. However, of all the components, learning
  1726. to program the CPU is the most important, as it is the component
  1727. present in any computer, regardless of what type a computer is.
  1728. For this reason, the primary focus of this book will be on how to
  1729. program an x86 CPU. Even with the focus solely on this device, a
  1730. reasonably good minimal operating system can be written. The
  1731. reason is that not all computers include all the devices as in a
  1732. normal desktop computer. For example, an embedded computer might
  1733. only have a CPU and limited internal memory, with pins for
  1734. getting input and producing an output; yet, operating systems
  1735. were written for such devices.
  1736. However, learning how to program an x86 CPU is a daunting task,
  1737. with 3 primary manuals written for it: almost 500 pages for
  1738. volume 1, over 2000 pages for volume 2 and over 1000 pages for
  1739. volume 3. It is an impressive feat for a programmer to master
  1740. every aspect of x86 CPU programming.
  1741. Intel Q35 Chipset
Q35 is an Intel chipset released in September 2007. Q35 is used as
an example of a high-level computer organization because later we
will use QEMU to emulate a Q35 system, which is the latest Intel
system that QEMU can emulate. Though released in 2007, Q35 is
still relatively close to current hardware, and the knowledge can
be reused for current chipset models. With a Q35 chipset,
the emulated CPU is also relatively up-to-date, with features
present in current-day CPUs, so we can use the latest software
manuals from Intel.
Figure [mobo-organization] shows a typical current-day motherboard
organization; Q35 shares a similar organization.
  1753. x86 Execution Environment
An execution environment is an environment
that provides the facilities to make code executable. The execution
environment needs to address the following questions:
  1757. • Supported operations? data transfer, arithmetic, control,
  1758. floating-point...
  1759. • Where are operands stored? registers, memory, stack,
  1760. accumulator
  1761. • How many explicit operands are there for each instruction? 0,
  1762. 1, 2, or 3
  1763. • How is the operand location specified? register, immediate,
  1764. indirect, . . .
  1765. • What type and size of operands are supported? byte, int, float,
  1766. double, string, vector...
  1767. • etc.
For the remainder of this chapter, please continue by reading
Chapter 3 of Intel Manual Volume 1, “Basic Execution Environment”.
  1771. x86 Assembly and C
In this chapter, we will explore assembly language and how it
connects to C. But why should we do so? Isn't it better to trust
the compiler, given that no one writes assembly anymore?
Not quite. Surely, the compiler at its current state of the art
is trustworthy, and we do not need to write code in assembly
most of the time. A compiler can generate code, but as mentioned
previously, a high-level language is a collection of patterns of
a lower-level language. It does not cover everything that a
hardware platform provides. As a consequence, not every assembly
instruction can be generated by a compiler, so we still need to
write assembly code in these circumstances to access
hardware-specific features. Since hardware-specific features
require writing assembly code, debugging requires reading it. We
might spend even more time reading than writing. When working with
low-level code that interacts directly with hardware, assembly
code is unavoidable. Also, understanding how a compiler generates
assembly code can improve a programmer's productivity. For
example, if a job or school assignment requires us to write
assembly code, we can simply write it in C, then let gcc do the
hard work of writing the assembly code for us. We merely
collect the generated assembly code, modify it as needed and are done
with the assignment.
We will use objdump extensively, along with Intel
documents, to aid in understanding x86 assembly code.
  1796. objdump
objdump is a program that displays information about
object files. It will be handy later for debugging an incorrect layout
from manual linking. For now, we use objdump to examine how
high-level source code maps to assembly code. We ignore the
output at first and learn how to use the command. objdump is simple
to use:
  1803. $ objdump -d hello
The -d option displays the assembled contents of executable
sections only. A section is a block of memory that contains
either program code or data. A code section is executable by the
CPU, while a data section is not. Non-executable
sections, such as .data and .bss (for storing program data) and
debug sections, are not displayed. We will learn more about
sections when studying the ELF binary file format in chapter [chap:The-Anatomy-of-a-program].
On the other hand:
  1812. $ objdump -D hello
where the -D option displays the assembly contents of all sections. If -D
is given, -d is implied. objdump is mostly used for inspecting
program code, so -d is usually the more useful of the two.
The output overruns the terminal screen. To make it easier to
read, send all the output to less:
  1819. $ objdump -d hello | less
To intermix source code and assembly, the binary must be compiled
with the -g option to include debugging information, then add the -S option:
  1822. $ objdump -S hello | less
  1823. The default syntax used by objdump is AT&T syntax. To change it
  1824. to the familiar Intel syntax:
  1825. $ objdump -M intel -D hello | less
When using the -M option, option -D or -d must be explicitly
supplied. Next, we will use objdump to examine how compiled C
data and code are represented in machine code.
Finally, because we will write a 32-bit kernel, we will need to
compile a 32-bit binary and examine it in 32-bit mode:
  1831. $ objdump -M i386,intel -D hello | less
  1832. -M i386 tells objdump to display assembly content using 32-bit
  1833. layout. Knowing the difference between 32-bit and 64-bit is
  1834. crucial for writing kernel code. We will examine this matter
  1835. later on when writing our kernel.
  1836. Reading the output
The start of the output displays the file format of the object
file:
  1839. hello: file format elf64-x86-64
After that line comes a series of disassembled sections:
  1841. Disassembly of section .interp:
  1842. ...
  1843. Disassembly of section .note.ABI-tag:
  1844. ...
  1845. Disassembly of section .note.gnu.build-id:
  1846. ...
  1847. ...
  1848. etc
  1849. Finally, each disassembled section displays its actual content -
  1850. which is a sequence of assembly instructions - with the following
  1851. format:
  1852. 4004d6: 55 push rbp
• The first column is the address of an assembly instruction. In
the above example, the address is 0x4004d6.
• The second column is the assembly instruction in raw hex values. In
the above example, the value is 0x55.
• The third column is the assembly instruction. Depending on the
section, the assembly instruction might be meaningful or
meaningless. For example, if the assembly instructions are in a
.text section, then they are actual
program code. On the other hand, if the assembly instructions
are displayed in a .data section, then we can safely ignore
them. The reason is that objdump doesn't know
which hex values are code and which are data, so it blindly
translates every hex value into assembly instructions. In the
above example, the assembly instruction is push rbp.
• The optional fourth column is a comment, which appears when there is
a reference to an address, to inform where the address
originates. For example, the comment after the # sign:
    lea r12,[rip+0x2008ee] # 600e10 <__frame_dummy_init_array_entry>
informs us that the address referenced by [rip+0x2008ee] is
0x600e10, where the variable __frame_dummy_init_array_entry
resides.
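The four columns described above can be separated mechanically. The following is a minimal Python sketch, not part of objdump itself: the function name parse_objdump_line is hypothetical, and the rule "leading two-digit hex tokens are the raw bytes" is a heuristic assumed for illustration.

```python
import re

# A raw byte is printed as exactly two lowercase hex digits.
BYTE = re.compile(r"^[0-9a-f]{2}$")

def parse_objdump_line(line):
    """Split one disassembly line into (address, raw bytes, instruction, comment)."""
    addr_text, rest = line.strip().split(":", 1)
    # The optional fourth column starts at the '#' sign.
    body, _, comment = rest.partition("#")
    tokens = body.split()
    # Heuristic: leading two-digit hex tokens are the raw bytes,
    # everything after them is the instruction text.
    n = 0
    while n < len(tokens) and BYTE.match(tokens[n]):
        n += 1
    raw = bytes(int(t, 16) for t in tokens[:n])
    instruction = " ".join(tokens[n:])
    return int(addr_text, 16), raw, instruction, comment.strip() or None

address, raw, instruction, comment = parse_objdump_line("4004d6: 55 push rbp")
print(hex(address), raw.hex(), instruction, comment)
```

Lines without a comment return None in the fourth position, which mirrors the fact that the fourth column is optional.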
A disassembled section may also contain labels. A label is
a name given to an assembly instruction. The label denotes the
purpose of an assembly block to a human reader, to make it easier
to understand. For example, the .text section carries many such
labels to denote where the code of each function starts; the .text section
below carries two functions: _start and deregister_tm_clones. The
_start function starts at address 4003e0, which is shown to the
left of the function name. Right below the _start label is
the instruction at address 4003e0. In other words, a
label is simply a name for a memory address. The function
deregister_tm_clones follows the same format, as does every
function in the section.
  1887. 00000000004003e0 <_start>:
  1888. 4003e0: 31 ed xor ebp,ebp
  1889. 4003e2: 49 89 d1 mov r9,rdx
  1890. 4003e5: 5e pop rsi
  1891. ...more assembly code....
  1892. 0000000000400410 <deregister_tm_clones>:
  1893. 400410: b8 3f 10 60 00 mov eax,0x60103f
  1894. 400415: 55 push rbp
  1895. 400416: 48 2d 38 10 60 00 sub rax,0x601038
  1896. ...more assembly code....
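Since a label is simply a name for a memory address, the label lines in a listing can be collected into a name-to-address table. This is a small Python sketch under the assumption that label lines look exactly like the ones above (hex address, then the name in angle brackets, then a colon); collect_labels is a hypothetical helper name.

```python
import re

# A label line has the form: <hex address> <name>:
LABEL = re.compile(r"^([0-9a-f]+) <([^>]+)>:$")

def collect_labels(listing):
    """Map each label name in a disassembly listing to its address."""
    labels = {}
    for line in listing.splitlines():
        match = LABEL.match(line.strip())
        if match:
            labels[match.group(2)] = int(match.group(1), 16)
    return labels

listing = """\
00000000004003e0 <_start>:
4003e0: 31 ed xor ebp,ebp
4003e2: 49 89 d1 mov r9,rdx
0000000000400410 <deregister_tm_clones>:
400410: b8 3f 10 60 00 mov eax,0x60103f
"""

for name, address in collect_labels(listing).items():
    print(f"{name} = {address:#x}")
```

Note that the address of each label equals the address of the first instruction printed below it.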
  1897. Intel manuals
The best way to understand and use assembly language properly is
to understand precisely the underlying computer architecture and
what each machine instruction does. To do so, the most reliable
sources are the documents provided by vendors. After all,
hardware vendors are the ones who make their machines. To
understand Intel's instruction set, we need the document “Intel
64 and IA-32 architectures software developer's manual combined
volumes 2A, 2B, 2C, and 2D: Instruction set reference, A-Z”. The
document can be retrieved here: https://software.intel.com/en-us/articles/intel-sdm
.
• Chapter 1 provides brief information about the manual, and the
comment notations used in the book.
• Chapter 2 provides an in-depth explanation of the anatomy of an
assembly instruction, which we will investigate in the next
section.
• Chapters 3 to 5 provide the details of every instruction of the
x86_64 architecture.
• Chapter 6 provides information about safer mode extensions. We
won't need to use this chapter.
The first volume “Intel® 64 and IA-32 Architectures Software
Developer’s Manual Volume 1: Basic Architecture” describes the
basic architecture and programming environment of Intel
processors. In that book, Chapter 5 gives a summary of all Intel
instructions, grouping the instructions into different categories.
We only need to learn the general-purpose instructions listed in
section 5.1 for our OS. Chapter 7 describes the purpose of each category.
Gradually, we will learn all of these instructions.
Read section 1.3 in volume 2, excluding sections 1.3.5 and 1.3.7.
  1926. Experiment with assembly code
The subsequent sections examine the anatomy of an assembly
instruction. To fully understand it, it is necessary to write code
and see the code in its actual form, displayed as hex numbers. For
this purpose, we use the nasm assembler to write a few lines of
assembly code and see the generated code.
  1932. -------------------------------------------
Suppose we want to see the machine code generated for this
instruction:
jmp eax
We open an editor, e.g. Emacs, create a new file,
write the code and save it in a file, e.g. test.asm. Then, in
the terminal, we run the command:
$ nasm -f bin test.asm -o test
The -f option specifies the file format, e.g. ELF, of the final
output file. In this case, the format is bin, which means
the file is just a flat binary output without any extra
information. That is, the written assembly code is translated
to machine code as is, without the overhead of the metadata
from a file format like ELF. Indeed, after assembling, we can
examine the output using this command:
$ hd test
hd (short for hexdump) is a program that displays the content
of a file in hex format[margin:
Though its name is short for hexdump, hd can also display in
bases other than hex, e.g. octal or decimal.
], and get the following output:
00000000 66 ff e0 |f..|
00000003
The file consists of only 3 bytes: 66 ff e0, which is
equivalent to the instruction jmp eax.
  1957. -------------------------------------------
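To make the layout of the hd output concrete, here is a rough Python sketch that formats bytes in the same three parts: offset, hex bytes, and an ASCII column. It is only an approximation written for illustration (the real hd pads and groups its columns differently), and the function name hd is reused here purely as a hypothetical helper.

```python
def hd(data):
    """Format bytes roughly the way hd prints them: offset, hex bytes, ASCII."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; every other byte becomes a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x} {hex_part} |{text}|")
    # The final line is the total size of the input.
    lines.append(f"{len(data):08x}")
    return "\n".join(lines)

machine_code = bytes([0x66, 0xff, 0xe0])  # jmp eax assembled in 16-bit mode
print(hd(machine_code))
```

The byte 0x66 happens to be the ASCII code of the letter f, which is why the ASCII column reads f.. for these three bytes.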
If we were to use elf as the file format:
  1959. $ nasm -f elf test.asm -o test
  1960. It would be more challenging to learn and understand assembly
  1961. instructions with all the added noise[footnote:
  1962. The output from hd.
  1963. ]:
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020 40 00 00 00 00 00 00 00 34 00 00 00 00 00 28 00 |@.......4.....(.|
00000030 05 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000060 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 |................|
00000070 06 00 00 00 00 00 00 00 10 01 00 00 02 00 00 00 |................|
00000080 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 |................|
00000090 07 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
000000a0 20 01 00 00 21 00 00 00 00 00 00 00 00 00 00 00 | ...!...........|
000000b0 01 00 00 00 00 00 00 00 11 00 00 00 02 00 00 00 |................|
000000c0 00 00 00 00 00 00 00 00 50 01 00 00 30 00 00 00 |........P...0...|
000000d0 04 00 00 00 03 00 00 00 04 00 00 00 10 00 00 00 |................|
000000e0 19 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 80 01 00 00 0d 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000110 ff e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000120 00 2e 74 65 78 74 00 2e 73 68 73 74 72 74 61 62 |..text..shstrtab|
00000130 00 2e 73 79 6d 74 61 62 00 2e 73 74 72 74 61 62 |..symtab..strtab|
00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000160 01 00 00 00 00 00 00 00 00 00 00 00 04 00 f1 ff |................|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 03 00 01 00 |................|
00000180 00 74 65 73 74 2e 61 73 6d 00 00 00 00 00 00 00 |.test.asm.......|
00000190
  2013. Thus, it is better just to use flat binary format in this case,
  2014. to experiment instruction by instruction.
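To see why the ELF output is noisy, it may help to decode the first 52 bytes of the dump above, which form the file header. The sketch below assumes the standard ELF32 little-endian header layout; the field names are shortened forms of the usual ELF names and parse_elf32_header is a hypothetical helper, so treat it as an illustration only.

```python
import struct

# First 0x34 bytes of the hd output above (the ELF32 file header).
header = bytes.fromhex(
    "7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 "
    "01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00 "
    "40 00 00 00 00 00 00 00 34 00 00 00 00 00 28 00 "
    "05 00 02 00"
)

# Field names in header order (assumed abbreviations of e_type, e_machine, ...).
FIELDS = ("ident type machine version entry phoff shoff flags "
          "ehsize phentsize phnum shentsize shnum shstrndx").split()

def parse_elf32_header(data):
    """Unpack a little-endian ELF32 file header into a dictionary."""
    values = struct.unpack("<16sHHIIIIIHHHHHH", data[:52])
    return dict(zip(FIELDS, values))

info = parse_elf32_header(header)
print("magic:", info["ident"][:4])
print("header size:", info["ehsize"])
print("section headers:", info["shnum"], "starting at offset", hex(info["shoff"]))
```

All of this is metadata; the two bytes of actual code, ff e0, only appear at offset 0x110 of the dump.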
  2015. With such a simple workflow, we are ready to investigate the
  2016. structure of every assembly instruction.
  2017. Note: Using the bin format puts nasm by default into 16-bit mode.
  2018. To enable 32-bit code to be generated, we must add this line at
  2019. the beginning of an nasm source file:
  2020. bits 32
  2021. Anatomy of an Assembly Instruction
Chapter 2 of the instruction reference manual provides an
in-depth view of the instruction format. However, there is so much
information that it can overwhelm beginners. This section provides
a gentler introduction before reading the actual chapter in the
manual.
[Figure: x86 instruction format]
Recall that an assembly instruction is simply a series
of bits. The length of an instruction varies and depends on how
complicated the instruction is. What every instruction shares is a
common format, described in the figure above, that divides the bits
of an instruction into smaller parts that encode different types
of information. These parts are:
Instruction Prefixes appear at the beginning of an
instruction. Prefixes are optional. A programmer can choose to
use a prefix or not because, in practice, a so-called prefix is
just another assembly instruction inserted before the
assembly instruction to which the prefix applies.
Instructions with 2- or 3-byte opcodes include the prefixes by
default.
Opcode is a unique number that identifies an instruction. Each
opcode is given a mnemonic name that is human readable, e.g.
one of the opcodes for the instruction add is 04. When a CPU sees
the number 04 in its instruction cache, it sees the instruction add
and executes accordingly. An opcode can be 1, 2 or 3 bytes long and
includes an additional 3-bit field in the ModR/M byte when
needed.
  2048. This instruction:
  2049. jmp [0x1234]
  2050. generates the machine code:
  2051. ff 26 34 12
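The very first byte, 0xff, is the opcode. The same byte is shared by
a few other instructions, so the 3-bit reg/opcode field of the
following ModR/M byte (100 in 0x26) completes the opcode and
identifies this jmp instruction.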
ModR/M specifies the operands of an instruction. An operand can either
be a register, a memory location or an immediate value. This
component of an instruction consists of 3 smaller parts:
• The mod field, or modifier field, is combined with the r/m field for
a total of 5 bits of information to encode 32 possible
values: 8 registers and 24 addressing modes.
• The reg/opcode field encodes either a register operand, or
extends the Opcode field with 3 more bits.
• The r/m field encodes either a register operand or can be
combined with the mod field to encode an addressing mode.
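The three fields are plain bit ranges of one byte: mod is the top 2 bits, reg/opcode the middle 3 bits, and r/m the low 3 bits. A minimal Python sketch of that split follows; split_modrm is a hypothetical helper name, and 0x26 is the ModR/M byte from the jmp [0x1234] example above, while 0xc8 is just a second sample value.

```python
def split_modrm(byte):
    """Return the (mod, reg/opcode, r/m) fields of a ModR/M byte."""
    mod = (byte >> 6) & 0b11    # top 2 bits
    reg = (byte >> 3) & 0b111   # middle 3 bits
    rm = byte & 0b111           # low 3 bits
    return mod, reg, rm

for value in (0x26, 0xc8):
    mod, reg, rm = split_modrm(value)
    print(f"{value:#04x}: mod={mod:02b} reg/opcode={reg:03b} r/m={rm:03b}")
```

For 0x26 the reg/opcode field is 100, which is the 3-bit opcode extension mentioned in the Opcode description.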
The tables [mod-rm-16] and [mod-rm-32] list all 256 possible
values of the ModR/M byte and how each value maps to an addressing
mode and a register, in 16-bit and 32-bit modes.
+--------------------------------+------+------+------+------+------+------+------+------+
| r8(/r)                         | AL   | CL   | DL   | BL   | AH   | CH   | DH   | BH   |
| r16(/r)                        | AX   | CX   | DX   | BX   | SP   | BP¹  | SI   | DI   |
| r32(/r)                        | EAX  | ECX  | EDX  | EBX  | ESP  | EBP  | ESI  | EDI  |
| mm(/r)                         | MM0  | MM1  | MM2  | MM3  | MM4  | MM5  | MM6  | MM7  |
| xmm(/r)                        | XMM0 | XMM1 | XMM2 | XMM3 | XMM4 | XMM5 | XMM6 | XMM7 |
| (In decimal) /digit (Opcode)   | 0    | 1    | 2    | 3    | 4    | 5    | 6    | 7    |
| (In binary) REG =              | 000  | 001  | 010  | 011  | 100  | 101  | 110  | 111  |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| Effective Address  | Mod | R/M | Values of ModR/M Byte (In Hexadecimal)                |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [BX + SI]          | 00  | 000 | 00   | 08   | 10   | 18   | 20   | 28   | 30   | 38   |
| [BX + DI]          |     | 001 | 01   | 09   | 11   | 19   | 21   | 29   | 31   | 39   |
| [BP + SI]          |     | 010 | 02   | 0A   | 12   | 1A   | 22   | 2A   | 32   | 3A   |
| [BP + DI]          |     | 011 | 03   | 0B   | 13   | 1B   | 23   | 2B   | 33   | 3B   |
| [SI]               |     | 100 | 04   | 0C   | 14   | 1C   | 24   | 2C   | 34   | 3C   |
| [DI]               |     | 101 | 05   | 0D   | 15   | 1D   | 25   | 2D   | 35   | 3D   |
| disp16²            |     | 110 | 06   | 0E   | 16   | 1E   | 26   | 2E   | 36   | 3E   |
| [BX]               |     | 111 | 07   | 0F   | 17   | 1F   | 27   | 2F   | 37   | 3F   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [BX + SI] + disp8³ | 01  | 000 | 40   | 48   | 50   | 58   | 60   | 68   | 70   | 78   |
| [BX + DI] + disp8  |     | 001 | 41   | 49   | 51   | 59   | 61   | 69   | 71   | 79   |
| [BP + SI] + disp8  |     | 010 | 42   | 4A   | 52   | 5A   | 62   | 6A   | 72   | 7A   |
| [BP + DI] + disp8  |     | 011 | 43   | 4B   | 53   | 5B   | 63   | 6B   | 73   | 7B   |
| [SI] + disp8       |     | 100 | 44   | 4C   | 54   | 5C   | 64   | 6C   | 74   | 7C   |
| [DI] + disp8       |     | 101 | 45   | 4D   | 55   | 5D   | 65   | 6D   | 75   | 7D   |
| [BP] + disp8       |     | 110 | 46   | 4E   | 56   | 5E   | 66   | 6E   | 76   | 7E   |
| [BX] + disp8       |     | 111 | 47   | 4F   | 57   | 5F   | 67   | 6F   | 77   | 7F   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [BX + SI] + disp16 | 10  | 000 | 80   | 88   | 90   | 98   | A0   | A8   | B0   | B8   |
| [BX + DI] + disp16 |     | 001 | 81   | 89   | 91   | 99   | A1   | A9   | B1   | B9   |
| [BP + SI] + disp16 |     | 010 | 82   | 8A   | 92   | 9A   | A2   | AA   | B2   | BA   |
| [BP + DI] + disp16 |     | 011 | 83   | 8B   | 93   | 9B   | A3   | AB   | B3   | BB   |
| [SI] + disp16      |     | 100 | 84   | 8C   | 94   | 9C   | A4   | AC   | B4   | BC   |
| [DI] + disp16      |     | 101 | 85   | 8D   | 95   | 9D   | A5   | AD   | B5   | BD   |
| [BP] + disp16      |     | 110 | 86   | 8E   | 96   | 9E   | A6   | AE   | B6   | BE   |
| [BX] + disp16      |     | 111 | 87   | 8F   | 97   | 9F   | A7   | AF   | B7   | BF   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| EAX/AX/AL/MM0/XMM0 | 11  | 000 | C0   | C8   | D0   | D8   | E0   | E8   | F0   | F8   |
| ECX/CX/CL/MM1/XMM1 |     | 001 | C1   | C9   | D1   | D9   | E1   | E9   | F1   | F9   |
| EDX/DX/DL/MM2/XMM2 |     | 010 | C2   | CA   | D2   | DA   | E2   | EA   | F2   | FA   |
| EBX/BX/BL/MM3/XMM3 |     | 011 | C3   | CB   | D3   | DB   | E3   | EB   | F3   | FB   |
| ESP/SP/AH/MM4/XMM4 |     | 100 | C4   | CC   | D4   | DC   | E4   | EC   | F4   | FC   |
| EBP/BP/CH/MM5/XMM5 |     | 101 | C5   | CD   | D5   | DD   | E5   | ED   | F5   | FD   |
| ESI/SI/DH/MM6/XMM6 |     | 110 | C6   | CE   | D6   | DE   | E6   | EE   | F6   | FE   |
| EDI/DI/BH/MM7/XMM7 |     | 111 | C7   | CF   | D7   | DF   | E7   | EF   | F7   | FF   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
  2114. 1. The default segment register is SS for the effective addresses
  2115. containing a BP index, DS for other effective addresses.
  2116. 2. The disp16 nomenclature denotes a 16-bit displacement that
  2117. follows the ModR/M byte and that is added to the index.
  2118. 3. The disp8 nomenclature denotes an 8-bit displacement that
  2119. follows the ModR/M byte and that is sign-extended and added to
  2120. the index.
  2121. <mod-rm-16>
+--------------------------------+------+------+------+------+------+------+------+------+
| r8(/r)                         | AL   | CL   | DL   | BL   | AH   | CH   | DH   | BH   |
| r16(/r)                        | AX   | CX   | DX   | BX   | SP   | BP   | SI   | DI   |
| r32(/r)                        | EAX  | ECX  | EDX  | EBX  | ESP  | EBP  | ESI  | EDI  |
| mm(/r)                         | MM0  | MM1  | MM2  | MM3  | MM4  | MM5  | MM6  | MM7  |
| xmm(/r)                        | XMM0 | XMM1 | XMM2 | XMM3 | XMM4 | XMM5 | XMM6 | XMM7 |
| (In decimal) /digit (Opcode)   | 0    | 1    | 2    | 3    | 4    | 5    | 6    | 7    |
| (In binary) REG =              | 000  | 001  | 010  | 011  | 100  | 101  | 110  | 111  |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| Effective Address  | Mod | R/M | Values of ModR/M Byte (In Hexadecimal)                |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [EAX]              | 00  | 000 | 00   | 08   | 10   | 18   | 20   | 28   | 30   | 38   |
| [ECX]              |     | 001 | 01   | 09   | 11   | 19   | 21   | 29   | 31   | 39   |
| [EDX]              |     | 010 | 02   | 0A   | 12   | 1A   | 22   | 2A   | 32   | 3A   |
| [EBX]              |     | 011 | 03   | 0B   | 13   | 1B   | 23   | 2B   | 33   | 3B   |
| [-][-]¹            |     | 100 | 04   | 0C   | 14   | 1C   | 24   | 2C   | 34   | 3C   |
| disp32²            |     | 101 | 05   | 0D   | 15   | 1D   | 25   | 2D   | 35   | 3D   |
| [ESI]              |     | 110 | 06   | 0E   | 16   | 1E   | 26   | 2E   | 36   | 3E   |
| [EDI]              |     | 111 | 07   | 0F   | 17   | 1F   | 27   | 2F   | 37   | 3F   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [EAX] + disp8³     | 01  | 000 | 40   | 48   | 50   | 58   | 60   | 68   | 70   | 78   |
| [ECX] + disp8      |     | 001 | 41   | 49   | 51   | 59   | 61   | 69   | 71   | 79   |
| [EDX] + disp8      |     | 010 | 42   | 4A   | 52   | 5A   | 62   | 6A   | 72   | 7A   |
| [EBX] + disp8      |     | 011 | 43   | 4B   | 53   | 5B   | 63   | 6B   | 73   | 7B   |
| [-][-] + disp8     |     | 100 | 44   | 4C   | 54   | 5C   | 64   | 6C   | 74   | 7C   |
| [EBP] + disp8      |     | 101 | 45   | 4D   | 55   | 5D   | 65   | 6D   | 75   | 7D   |
| [ESI] + disp8      |     | 110 | 46   | 4E   | 56   | 5E   | 66   | 6E   | 76   | 7E   |
| [EDI] + disp8      |     | 111 | 47   | 4F   | 57   | 5F   | 67   | 6F   | 77   | 7F   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| [EAX] + disp32     | 10  | 000 | 80   | 88   | 90   | 98   | A0   | A8   | B0   | B8   |
| [ECX] + disp32     |     | 001 | 81   | 89   | 91   | 99   | A1   | A9   | B1   | B9   |
| [EDX] + disp32     |     | 010 | 82   | 8A   | 92   | 9A   | A2   | AA   | B2   | BA   |
| [EBX] + disp32     |     | 011 | 83   | 8B   | 93   | 9B   | A3   | AB   | B3   | BB   |
| [-][-] + disp32    |     | 100 | 84   | 8C   | 94   | 9C   | A4   | AC   | B4   | BC   |
| [EBP] + disp32     |     | 101 | 85   | 8D   | 95   | 9D   | A5   | AD   | B5   | BD   |
| [ESI] + disp32     |     | 110 | 86   | 8E   | 96   | 9E   | A6   | AE   | B6   | BE   |
| [EDI] + disp32     |     | 111 | 87   | 8F   | 97   | 9F   | A7   | AF   | B7   | BF   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
| EAX/AX/AL/MM0/XMM0 | 11  | 000 | C0   | C8   | D0   | D8   | E0   | E8   | F0   | F8   |
| ECX/CX/CL/MM1/XMM1 |     | 001 | C1   | C9   | D1   | D9   | E1   | E9   | F1   | F9   |
| EDX/DX/DL/MM2/XMM2 |     | 010 | C2   | CA   | D2   | DA   | E2   | EA   | F2   | FA   |
| EBX/BX/BL/MM3/XMM3 |     | 011 | C3   | CB   | D3   | DB   | E3   | EB   | F3   | FB   |
| ESP/SP/AH/MM4/XMM4 |     | 100 | C4   | CC   | D4   | DC   | E4   | EC   | F4   | FC   |
| EBP/BP/CH/MM5/XMM5 |     | 101 | C5   | CD   | D5   | DD   | E5   | ED   | F5   | FD   |
| ESI/SI/DH/MM6/XMM6 |     | 110 | C6   | CE   | D6   | DE   | E6   | EE   | F6   | FE   |
| EDI/DI/BH/MM7/XMM7 |     | 111 | C7   | CF   | D7   | DF   | E7   | EF   | F7   | FF   |
+--------------------+-----+-----+------+------+------+------+------+------+------+------+
  2169. 1. The [-][-] nomenclature means a SIB follows the ModR/M byte.
  2170. 2. The disp32 nomenclature denotes a 32-bit displacement that
  2171. follows the ModR/M byte (or the SIB byte if one is present) and
  2172. that is added to the index.
  2173. 3. The disp8 nomenclature denotes an 8-bit displacement that
  2174. follows the ModR/M byte (or the SIB byte if one is present) and
  2175. that is sign-extended and added to the index.
  2176. <mod-rm-32>
How to read the table:
In an instruction, the ModR/M byte comes right after the opcode.
Look up the byte value in this table to get the corresponding
operands in the row and column.
  2181. -------------------------------------------
  2182. An instruction uses this addressing mode:
  2183. jmp [0x1234]
  2184. Then, the machine code is:
  2185. ff 26 34 12
  2186. 0xff is the opcode. Next to it, 0x26 is the ModR/M byte. Looking
  2187. up 0x26 in the 16-bit table [margin:
  2188. Remember, using bin format generates 16-bit code by default
  2189. ], its row gives the first operand: disp16, which means a 16-bit
  2190. offset. Since the instruction does not have a second operand, the
  2191. column can be ignored.
  2192. An instruction uses this addressing mode:
  2193. add ax, cx
  2194. Then the machine code is:
  2195. 01 c8
  2196. 0x01 is the opcode. Next to it, c8 is the ModR/M byte. Looking up
  2197. the value c8 in the 16-bit table, its row tells that the first
  2198. operand is ax [margin:
  2199. Remember, using bin format generates 16-bit code by default
  2200. ] and its column tells that the second operand is cx. Here the column
  2201. cannot be ignored, because the instruction has a second operand.
  2202. Why is the first operand in the row and the second in a column?
  2203. Let's break down the ModR/M byte, with an example value c8,
  2204. into bits:
  2205. +----------+---------------------+-------------+
  2206. | mod | reg/opcode | r/m |
  2207. +----------+---------------------+-------------+
  2208. +----+-----+----+----+-----------+----+----+---+
  2209. | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
  2210. +----+-----+----+----+-----------+----+----+---+
  2211. The mod field divides addressing modes into 4 different
  2212. categories. Combined with the r/m field, it selects exactly one
  2213. of the 32 rows (24 memory forms and 8 register forms); this row
  2214. is the first operand. The reg/opcode field then selects the
  2215. column, which provides the second operand if the instruction
  2216. requires one; otherwise the column can be ignored.
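The field breakdown above can be sketched in C with shifts and masks. This is only a minimal sketch; the names struct modrm and modrm_split are made up for illustration and do not come from the Intel manual.

```c
#include <stdint.h>

/* The three fields of a ModR/M byte, following the bit layout shown above. */
struct modrm {
    uint8_t mod; /* bits 7-6: addressing-mode category             */
    uint8_t reg; /* bits 5-3: register operand or opcode extension */
    uint8_t rm;  /* bits 2-0: register or memory operand           */
};

/* Split a ModR/M byte into its mod, reg/opcode and r/m fields. */
struct modrm modrm_split(uint8_t byte)
{
    struct modrm f;
    f.mod = (byte >> 6) & 0x3;
    f.reg = (byte >> 3) & 0x7;
    f.rm  = byte & 0x7;
    return f;
}
```

Splitting c8 this way gives mod = 3 (binary 11), reg = 1 and r/m = 0, while splitting 26 gives mod = 0, reg = 4 and r/m = 6.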
  2217. -------------------------------------------
  2218. SIB is the Scale-Index-Base byte. This byte encodes how to
  2219. calculate the memory address of an element in an array. The
  2220. name SIB comes from the formula for calculating an effective
  2221. address:
  2222. \mathtt{Effective\,address=scale*index+base}
  2223. • Index is an offset into an array.
  2224. • Scale is a factor of Index. Scale is one of the values 1, 2,
  2225. 4 or 8; any other value is invalid. To scale by a value
  2226. other than 2, 4 or 8, the scale factor must be set to 1, and
  2227. the offset must be calculated manually. For example, suppose
  2228. we want the address of the n-th element in an array whose elements are 12 bytes long. Because
  2229. 12 is not one of 1, 2, 4 or 8, Scale is set to 1 and a
  2230. compiler needs to calculate the offset itself:
  2231. \mathtt{Effective\,address=1*(12*n)+base}
  2232. Why do we bother with SIB when we can manually calculate the
  2233. offset? The answer is that in the above scenario, an
  2234. additional mul instruction must be executed to get the
  2235. offset, and the mul instruction consumes more than 1 byte,
  2236. while the SIB byte consumes only 1 byte. More importantly, if
  2237. the element is accessed repeatedly in a loop, e.g. millions
  2238. of times, then the extra mul instruction hurts performance,
  2239. as the CPU must spend time executing millions of these
  2240. additional mul instructions.
  2241. The values 2, 4 and 8 are not randomly chosen. They map to
  2242. 16-bit (or 2 bytes), 32-bit (or 4 bytes) and 64-bit (or 8
  2243. bytes) numbers that are often used for intensive numeric
  2244. calculations.
  2245. • Base is the starting address.
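To make the formula concrete, here is a minimal C sketch of the calculation. It only models the arithmetic, not what the hardware does internally; the function names effective_address, element_addr_4 and element_addr_12 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = scale * index + base; scale must be 1, 2, 4 or 8. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return scale * index + base;
}

/* n-th element of an array of 4-byte elements: the element size is a
 * legal scale, so the index is used directly. */
uint32_t element_addr_4(uint32_t base, uint32_t n)
{
    return effective_address(base, n, 4);
}

/* n-th element of an array of 12-byte elements: 12 is not a legal scale,
 * so scale stays 1 and the offset 12 * n is computed separately. */
uint32_t element_addr_12(uint32_t base, uint32_t n)
{
    uint32_t offset = 12 * n; /* the extra multiplication */
    return effective_address(base, offset, 1);
}
```

With a base of 0x1000 and n = 3, the 4-byte case gives 0x100c and the 12-byte case gives 0x1024.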
  2246. Below is the table listing all 256 values of SIB byte, with the
  2247. lookup rule similar to ModR/M tables:

  2249. | r32 | EAX | ECX | EDX | EBX | ESP | [*] | ESI | EDI |
  2250. | (In decimal) Base = | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
  2251. | (In binary) Base = | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
  2252. +---------------------------+-------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  2253. |        Scaled Index |   SS |   Index | Values of SIB Byte (In Hexadecimal) |

  2255. | [EAX] | 00 | 000 | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 |
  2256. | [ECX] | | 001 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
  2257. | [EDX] | | 010 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
  2258. | [EBX] | | 011 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F |
  2259. | none | | 100 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 |
  2260. | [EBP] | | 101 | 28 | 29 | 2A | 2B | 2C | 2D | 2E | 2F |
  2261. | [ESI] | | 110 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 |
  2262. | [EDI] | | 111 | 38 | 39 | 3A | 3B | 3C | 3D | 3E | 3F |

  2264. | [EAX*2] | 01 | 000 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 |
  2265. | [ECX*2] | | 001 | 48 | 49 | 4A | 4B | 4C | 4D | 4E | 4F |
  2266. | [EDX*2] | | 010 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 |
  2267. | [EBX*2] | | 011 | 58 | 59 | 5A | 5B | 5C | 5D | 5E | 5F |
  2268. | none | | 100 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 |
  2269. | [EBP*2] | | 101 | 68 | 69 | 6A | 6B | 6C | 6D | 6E | 6F |
  2270. | [ESI*2] | | 110 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 |
  2271. | [EDI*2] | | 111 | 78 | 79 | 7A | 7B | 7C | 7D | 7E | 7F |

  2273. | [EAX*4] | 10 | 000 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 |
  2274. | [ECX*4] | | 001 | 88 | 89 | 8A | 8B | 8C | 8D | 8E | 8F |
  2275. | [EDX*4] | | 010 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 |
  2276. | [EBX*4] | | 011 | 98 | 99 | 9A | 9B | 9C | 9D | 9E | 9F |
  2277. | none | | 100 | A0 | A1 | A2 | A3 | A4 | A5 | A6 | A7 |
  2278. | [EBP*4] | | 101 | A8 | A9 | AA | AB | AC | AD | AE | AF |
  2279. | [ESI*4] | | 110 | B0 | B1 | B2 | B3 | B4 | B5 | B6 | B7 |
  2280. | [EDI*4] | | 111 | B8 | B9 | BA | BB | BC | BD | BE | BF |

  2282. | [EAX*8] | 11 | 000 | C0 | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
  2283. | [ECX*8] | | 001 | C8 | C9 | CA | CB | CC | CD | CE | CF |
  2284. | [EDX*8] | | 010 | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 |
  2285. | [EBX*8] | | 011 | D8 | D9 | DA | DB | DC | DD | DE | DF |
  2286. | none | | 100 | E0 | E1 | E2 | E3 | E4 | E5 | E6 | E7 |
  2287. | [EBP*8] | | 101 | E8 | E9 | EA | EB | EC | ED | EE | EF |
  2288. | [ESI*8] | | 110 | F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 |
  2289. | [EDI*8] | | 111 | F8 | F9 | FA | FB | FC | FD | FE | FF |

  2291. 1. The [*] nomenclature means a disp32 with no base if the MOD is
  2292. 00B. Otherwise, [*] means disp8 or disp32 + [EBP]. This
  2293. provides the following address modes:
  2294. +-----------+---------------------------------+
  2295. | MOD bits | Effective Address |
  2296. +-----------+---------------------------------+
  2297. +-----------+---------------------------------+
  2298. | 00 | [scaled index] + disp32 |
  2299. +-----------+---------------------------------+
  2300. | 01 | [scaled index] + disp8 + [EBP] |
  2301. +-----------+---------------------------------+
  2302. | 10 | [scaled index] + disp32 + [EBP] |
  2303. +-----------+---------------------------------+
  2304. <sib>
  2305. This instruction:
  2306. jmp [eax*2 + ebx]
  2307. generates the following code:
  2308. 00000000 67 ff 24 43
  2309. First of all, the first byte, 0x67, is not an opcode but a
  2310. prefix: it is the predefined value of the address-size
  2311. override prefix. After the prefix come the opcode 0xff and
  2312. the ModR/M byte 0x24. The ModR/M value indicates that a SIB
  2313. byte follows. The SIB byte is 0x43. Looking it up in the SIB
  2314. table, its row tells that eax is scaled by 2, and its column
  2315. tells that the base to be added is in ebx.
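The same lookup can be done without the table by splitting the SIB byte into its bit fields. The following is a minimal sketch; struct sib and sib_split are illustrative names, and the register numbers follow the table header (0 = eax, 1 = ecx, 2 = edx, 3 = ebx, and so on).

```c
#include <stdint.h>

/* Decoded fields of a SIB byte. */
struct sib {
    uint8_t scale; /* 1, 2, 4 or 8, from the two SS bits (bits 7-6) */
    uint8_t index; /* bits 5-3: number of the index register         */
    uint8_t base;  /* bits 2-0: number of the base register          */
};

/* Split a SIB byte into scale, index and base. */
struct sib sib_split(uint8_t byte)
{
    struct sib f;
    f.scale = (uint8_t)(1u << ((byte >> 6) & 0x3)); /* 00,01,10,11 -> 1,2,4,8 */
    f.index = (byte >> 3) & 0x7;
    f.base  = byte & 0x7;
    return f;
}
```

For the byte 0x43 this yields scale = 2, index = 0 (eax) and base = 3 (ebx), matching the row and column found in the table.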
  2316. Displacement is the offset from the start of the base index.
  2317. This instruction:
  2318. jmp [0x1234]
  2319. generates the machine code:
  2320. ff 26 34 12
  2321. 0x1234, which is generated as 34 12 in raw machine code, is
  2322. the displacement and stands right next to 0x26, which is the
  2323. ModR/M byte.
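How many displacement bytes follow a ModR/M byte can be read off the 16-bit table. Below is a minimal sketch of that rule, based on our reading of the 16-bit ModR/M table in the Intel manual; disp_length_16 and read_disp16 are hypothetical helper names.

```c
#include <stdint.h>

/* Number of displacement bytes that follow a ModR/M byte when 16-bit
 * addressing is used. */
int disp_length_16(uint8_t modrm)
{
    uint8_t mod = (modrm >> 6) & 0x3;
    uint8_t rm  = modrm & 0x7;

    switch (mod) {
    case 0:  return rm == 6 ? 2 : 0; /* only the disp16 row has one  */
    case 1:  return 1;               /* disp8                        */
    case 2:  return 2;               /* disp16                       */
    default: return 0;               /* mod = 11: register operand   */
    }
}

/* Read a 16-bit displacement stored in little-endian order. */
uint16_t read_disp16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

For ff 26 34 12, the ModR/M byte 0x26 asks for 2 displacement bytes, and reading 34 12 gives back 0x1234.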
  2324. This instruction:
  2325. jmp [eax * 4 + 0x1234]
  2326. generates the machine code:
  2327. 67 ff 24 85 34 12 00 00
  2328. • 0x67 is an address-size override prefix. It makes an
  2329. instruction use the non-default address size. Since the
  2330. binary is 16-bit code, whose default address size is 16-bit,
  2331. 0x67 switches this instruction to 32-bit addressing.
  2332. • 0xff is the opcode.
  2333. • 0x24 is the ModR/M byte. The value suggests that a SIB byte
  2334. follows, according to table [mod-rm-32].
  2335. • 0x85 is the SIB byte: eax scaled by 4, plus a disp32 with
  2336. no base register, according to table [sib].
  2337. • 34 12 00 00 is the displacement. As can be seen, the
  2338. displacement is 4 bytes in size, which is equivalent to
  2339. 32-bit, due to the address-size override prefix.
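The effect of the 0x67 prefix can be sketched as a small C function. This is a simplified sketch under stated assumptions: it only covers 16-bit and 32-bit code, it only recognizes the two size-override prefixes 0x66 and 0x67, and the name address_size_bits is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address size, in bits, used by the instruction at code[0..len-1].
 * default_bits is 16 or 32, the default of the code being decoded. */
int address_size_bits(const uint8_t *code, size_t len, int default_bits)
{
    assert(default_bits == 16 || default_bits == 32);

    int bits = default_bits;
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x67)
            bits = (default_bits == 16) ? 32 : 16; /* non-default size */
        else if (code[i] != 0x66)
            break; /* first non-prefix byte: the opcode starts here */
    }
    return bits;
}
```

In 16-bit code, an instruction that starts with 67 ff is decoded with 32-bit addressing, while ff 26 34 12 keeps the default 16-bit addressing.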
  2340. Immediate When an instruction accepts a fixed value, e.g.
  2341. 0x1234, as an operand, this optional field holds the value.
  2342. Note that this field is different from displacement: the value
  2343. is not necessarily used as an offset; it can be an arbitrary
  2344. value of any kind.
  2345. This instruction:
  2346. mov eax, 0x1234
  2347. generates the code:
  2348. 66 b8 34 12 00 00
  2349. • 0x66 is the operand-size override prefix. Similar to the
  2350. address-size override prefix, this prefix enables the
  2351. operand size to be non-default.
  2352. • 0xb8 is one of the opcodes for the mov instruction.
  2353. • 0x1234 is the value to be stored in register eax, encoded
  2354. as the 4 bytes 34 12 00 00. It is just a value to store
  2355. directly into a register, and nothing more. A displacement,
  2356. on the other hand, is an offset for some address calculation.
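As an illustration, the bytes after the 0x66 prefix can be decoded by hand in C. The sketch relies on a detail from the mov entry in volume 2 that is not stated in the text above: the opcodes b8 to bf encode the destination register in their low three bits, with 0 meaning eax. The names struct mov_imm32 and decode_mov_imm32 are hypothetical.

```c
#include <stdint.h>

/* Decoded form of "mov r32, imm32" (opcodes b8..bf). */
struct mov_imm32 {
    uint8_t  reg; /* register number, 0 = eax   */
    uint32_t imm; /* the 32-bit immediate value */
};

/* Decode 5 bytes starting at the opcode (after any prefix).
 * Returns 0 on success, -1 if the opcode is not in b8..bf. */
int decode_mov_imm32(const uint8_t *p, struct mov_imm32 *out)
{
    if ((p[0] & 0xf8) != 0xb8)
        return -1;
    out->reg = p[0] & 0x7;
    /* The immediate is stored in little-endian order. */
    out->imm = (uint32_t)p[1]
             | (uint32_t)p[2] << 8
             | (uint32_t)p[3] << 16
             | (uint32_t)p[4] << 24;
    return 0;
}
```

Decoding b8 34 12 00 00 gives register 0 (eax) and the immediate 0x1234.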
  2357. Read section 2.1 in Volume 2 for even more details.
  2358. Skim through section 5.1 in volume 1. Read chapter 7 in volume
  2359. 1. If there are terms that you don't understand, e.g.
  2360. segmentation, don't worry: they will either be explained
  2361. in later chapters or can be ignored.
  2362. Understand an instruction in detail
  2363. In the instruction reference manual (Volume 2), from chapter 3
  2364. onward, every x86 instruction is documented in detail. Whenever
  2365. the precise behavior of an instruction is needed, we always
  2366. consult this document first. However, before using the document,
  2367. we must know the writing conventions first. Every instruction has
  2368. the following common structure for organizing information:
  2369. Opcode table lists all possible opcodes of an assembly
  2370. instruction.
  2371. Each table contains the following fields, and can have one or
  2372. more rows:
  2373. +--------+-------------+-------+----------------+--------------------+-------------+
  2374. | Opcode | Instruction | Op/En | 64/32-bit Mode | CPUID Feature flag | Description |
  2375. +--------+-------------+-------+----------------+--------------------+-------------+
  2377. Opcode shows a unique hexadecimal number assigned to an
  2378. instruction. There can be more than one opcode for an
  2379. instruction, each encodes a variant of the instruction. For
  2380. example, one variant requires one operand, but another
  2381. requires two. In this column, there can be other notations
  2382. aside from hexadecimal numbers. For example, /r indicates
  2383. that the ModR/M byte of the instruction contains a reg
  2384. operand and an r/m operand. The detailed listing is in
  2385. sections 3.1.1.1 and 3.1.1.2 of the Intel manual, volume 2.
  2386. Instruction gives the syntax of the assembly instruction that a
  2387. programmer can use for writing code. Aside from the mnemonic
  2388. representation of the opcode, e.g. jmp, other symbols
  2389. represent operands with specific properties in the
  2390. instruction. For example, rel8 represents a relative address
  2391. from 128 bytes before the end of the instruction to 127 bytes
  2392. after the end of instruction; similarly rel16/rel32 also
  2393. represents relative addresses, but with the operand size of
  2394. 16/32-bit instead of 8-bit like rel8. For a detailed listing,
  2395. please refer to section 3.1.1.3 of volume 2.
  2396. Op/En is short for Operand/Encoding. An operand encoding
  2397. specifies how a ModR/M byte encodes the operands that an
  2398. instruction requires. If a variant of an instruction requires
  2399. operands, then an additional table named “Instruction Operand
  2400. Encoding” is added for explaining the operand encoding, with
  2401. the following structure:
  2402. +--------+------------+------------+------------+-----------+
  2403. | Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
  2404. +--------+------------+------------+------------+-----------+
  2405. Most instructions require one to two operands. We make use of
  2406. these instructions for our OS and skip the instructions that
  2407. require three or four operands. The operands can be readable
  2408. or writable or both. The symbol (r) denotes a readable
  2409. operand, and (w) denotes a writable operand. For example,
  2410. when Operand 1 field contains ModRM:r/m (r), it means the
  2411. first operand is encoded in r/m field of ModR/M byte, and is
  2412. only readable.
  2413. 64/32-bit mode indicates whether the opcode sequence is
  2414. supported in a 64-bit mode and possibly 32-bit mode.
  2415. CPUID Feature Flag indicates that a particular CPU feature
  2416. must be available to enable the instruction. An instruction
  2417. is invalid if a CPU does not support the required feature.[margin:
  2418. In Linux, the command:
  2419. cat /proc/cpuinfo
  2420. lists information about the available CPUs, with their features
  2421. in the flags field.
  2422. ]
  2423. Compat/Leg Mode Many instructions do not have the CPUID
  2424. Feature Flag field; in its place is Compat/Leg Mode, which
  2425. stands for Compatibility or Legacy Mode. This field tells
  2426. whether a variant of an instruction is supported when the
  2427. CPU runs 16-bit or 32-bit code. [float MarginTable:
  2428. [MarginTable 2:
  2429. Notations in Compat/Leg Mode
  2430. ]
  2431. +-----------+----------------------------------------------------------------------------------+
  2432. | Notation | Description |
  2433. +-----------+----------------------------------------------------------------------------------+
  2434. +-----------+----------------------------------------------------------------------------------+
  2435. | Valid | Supported |
  2436. +-----------+----------------------------------------------------------------------------------+
  2437. | I | Not supported |
  2438. +-----------+----------------------------------------------------------------------------------+
  2439. | N.E. | The 64-bit opcode cannot be encoded as it overlaps with existing 32-bit opcode. |
  2441. +-----------+----------------------------------------------------------------------------------+
  2442. ]
  2443. Description briefly explains the variant of an instruction in
  2444. the current row.
  2445. Description specifies the purpose of the instructions and how
  2446. an instruction works in detail.
  2447. Operation is pseudo-code that implements an instruction. If a
  2448. description is vague, this section is the next best source to
  2449. understand an assembly instruction. The syntax is described in
  2450. section 3.1.1.9 in volume 2.
  2451. Flags affected lists the possible changes to system flags in
  2452. EFLAGS register.
  2453. Exceptions lists the possible errors that can occur when an
  2454. instruction cannot run correctly. This section is valuable for
  2455. OS debugging. Exceptions fall into one of the following
  2456. categories:
  2457. • Protected Mode Exceptions
  2458. • Real-Address Mode Exception
  2459. • Virtual-8086 Mode Exception
  2460. • Floating-Point Exception
  2461. • SIMD Floating-Point Exception
  2462. • Compatibility Mode Exception
  2463. • 64-bit Mode Exception
  2464. For our OS, we only use Protected Mode Exceptions and
  2465. Real-Address Mode Exceptions. The details are in section 3.1.1.13
  2466. and 3.1.1.14, volume 2.
  2467. Example: jmp instruction
  2468. Let's look at our good old jmp instruction. First, the opcode
  2469. table:
  2470. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2471. | Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description |
  2473. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2474. | EB cb | JMP rel8 | D | Valid | Valid | Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits |
  2476. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2477. | E9 cw | JMP rel16 | D | N.S. | Valid | Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode. |
  2479. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2480. | E9 cd | JMP rel32 | D | Valid | Valid | Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits |
  2482. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2483. | FF /4 | JMP r/m16 | M | N.S. | Valid | Jump near, absolute indirect, address = zero-extended r/m16. Not supported in 64-bit mode |
  2485. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2486. | FF /4 | JMP r/m32 | M | N.S. | Valid | Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode |
  2488. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2489. | FF /4 | JMP r/m64 | M | Valid | N.E. | Jump near, absolute indirect, RIP = 64-Bit offset from register or memory |
  2491. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2492. | EA cd | JMP ptr16:16 | D | Inv. | Valid | Jump far, absolute, address given in operand |
  2493. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2494. | EA cp | JMP ptr16:32 | D | Inv. | Valid | Jump far, absolute, address given in operand |
  2495. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2496. | FF /5 | JMP m16:16 | D | Valid | Valid | Jump far, absolute indirect, address given in m16:16 |
  2497. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2498. | FF /5 | JMP m16:32 | D | Valid | Valid | Jump far, absolute indirect, address given in m16:32 |
  2499. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2500. | REX.W + FF /5 | JMP m16:64 | D | Valid | N.E. | Jump far, absolute indirect, address given in m16:64 |
  2501. +-----------------+---------------+----------+--------------+------------------+------------------------------------------------------------------------------------------------+
  2502. <jmp-instruction>
  2503. Each row lists a variant of the jmp instruction. The first row has
  2504. the opcode EB cb, with an equivalent symbolic form jmp rel8.
  2505. Here, rel8 means an 8-bit signed offset, from -128 to 127 bytes,
  2506. counting from the end of the instruction. The end of an
  2507. instruction is the next byte after its last byte. To make it
  2508. more concrete, consider this assembly code:
  2509. main:
  2510. jmp main
  2511. jmp main2
  2512. jmp main
  2513. main2:
  2514. jmp 0x1234
  2515. generates the machine code:
  2516. [float Table:
  2517. [Table 4:
  2518. Memory address of each opcode
  2519. ]
  2520. +---------+------+----+----+----+----+----+-------+----+----+----+
  2521. | Label   | main |    |    |    |    |    | main2 |    |    |    |
  2522. +---------+------+----+----+----+----+----+-------+----+----+----+
  2523. | Address | 00   | 01 | 02 | 03 | 04 | 05 | 06    | 07 | 08 | 09 |
  2524. +---------+------+----+----+----+----+----+-------+----+----+----+
  2525. | Opcode  | eb   | fe | eb | 02 | eb | fa | e9    | 2b | 12 | 00 |
  2526. +---------+------+----+----+----+----+----+-------+----+----+----+
  2531. ]
  2532. The first jmp main instruction is generated into eb fe and
  2533. occupies the addresses 00 and 01; the end of the first jmp main
  2534. is at address 02, past the last byte of the first jmp main which
  2535. is located at the address 01. The value fe is equivalent to -2,
  2536. since eb opcode uses only a byte (8 bits) for relative
  2537. addressing. The offset is -2, and the end address of the first
  2538. jmp main is 02, adding them together we get 00 which is the
  2539. destination address for jumping to.
  2540. Similarly, the jmp main2 instruction is generated into eb 02,
  2541. which means the offset is +2; the end address of jmp main2 is at
  2542. 04, and adding together with the offset we get the destination
  2543. address is 06, which is the start instruction marked by the label
  2544. main2.
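The arithmetic in the two paragraphs above can be written as a short C function. This is a minimal sketch; the name short_jmp_target is hypothetical, and addresses are kept to 16 bits because the example is 16-bit code.

```c
#include <stdint.h>

/* Destination of a 2-byte short jump (eb rel8) located at address addr. */
uint16_t short_jmp_target(uint16_t addr, uint8_t rel8)
{
    /* The end of the instruction is the byte after its last byte. */
    uint16_t end = (uint16_t)(addr + 2);

    /* rel8 is a signed 8-bit value: fe means -2, 02 means +2. */
    int offset = rel8 >= 0x80 ? (int)rel8 - 0x100 : (int)rel8;

    return (uint16_t)(end + offset);
}
```

Calling it with the three short jumps of the example, (0x00, 0xfe), (0x02, 0x02) and (0x04, 0xfa), gives the destinations 0x00, 0x06 and 0x00, i.e. main, main2 and main.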
  2545. The same rule can be applied to rel16 and rel32 encoding. In the
  2546. example code, jmp 0x1234 uses rel16 (which means 2-byte offset)
  2547. and is generated into e9 2b 12. As the table [jmp-instruction]
  2548. shows, e9 opcode takes a cw operand, which is a 2-byte offset
  2549. (section 3.1.1.1, volume 2). Notice one strange issue here: the
  2550. offset value is 2b 12, while it is supposed to be 34 12. There is
  2551. nothing wrong. Remember, rel8/rel16/rel32 is an offset, not an
  2552. address. An offset is a distance from a point. Since a number is
  2553. given instead of a label, the number is taken as the destination
  2554. address, counted from the start of the program at address 00. The
  2555. end of jmp 0x1234 is the address 09[footnote:
  2556. which means 9 bytes were consumed, starting from address 0.
  2557. ], so the offset is calculated as 0x1234 - 0x9 = 0x122b. That
  2558. solved the mystery!
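Going the other way, from a destination address to the encoded bytes, can be sketched as follows. The function names near_jmp_rel16 and encode_near_jmp are made up for illustration, and the sketch assumes the 3-byte e9 rel16 form used in 16-bit code.

```c
#include <stdint.h>

/* rel16 operand of a 3-byte near jump (e9 rel16) located at addr
 * whose destination is target. */
uint16_t near_jmp_rel16(uint16_t addr, uint16_t target)
{
    uint16_t end = (uint16_t)(addr + 3); /* address right after the jump */
    return (uint16_t)(target - end);
}

/* Emit the opcode e9 followed by rel16 in little-endian order. */
void encode_near_jmp(uint16_t addr, uint16_t target, uint8_t out[3])
{
    uint16_t rel = near_jmp_rel16(addr, target);
    out[0] = 0xe9;
    out[1] = (uint8_t)(rel & 0xff); /* low byte first  */
    out[2] = (uint8_t)(rel >> 8);   /* then high byte  */
}
```

For the jump at address 06 with destination 0x1234, the offset is 0x122b and the emitted bytes are e9 2b 12.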
  2559. The jmp instructions with opcode FF /4 enable jumping to a near,
  2560. absolute address stored in a general-purpose register or a memory
  2561. location; or in short, as written in the description, absolute
  2562. indirect. The symbol /4 is the column with digit 4 in table [mod-rm-16]
  2563. [footnote:
  2564. The column with the following fields:
  2565. AH
  2566. SP
  2567. ESP
  2568. MM4
  2569. XMM4
  2570. 4
  2571. 100
  2572. ]. For example:
  2573. jmp [0x1234]
  2574. is generated into:
  2575. ff 26 34 12
  2576. Since this is 16-bit code, we use table [mod-rm-16]. Looking up
  2577. the table, ModR/M value 26 means disp16, which means a 16-bit
  2578. offset from the start of current index[footnote:
  2579. Look at the note under the table.
  2580. ], which is the base address stored in DS register. In this case,
  2581. jmp [0x1234] is implicitly understood as jmp [ds:0x1234], which
  2582. means the destination address is 0x1234 bytes away from the start
  2583. of a data segment.
  2584. The jmp instruction with opcode FF /5 enables jumping to a far,
  2585. absolute address stored in a memory location (unlike /4, the
  2586. operand cannot be a register); in short, a far pointer. To
  2587. generate such an instruction, the keyword far is needed to tell
  2588. nasm we are using a far pointer:
  2589. jmp far [eax]
  2590. is generated into:
  2591. 67 ff 28
  2592. Since 28 is the value in the column with digit 5 of the table [mod-rm-32][footnote:
  2593. Remember the prefix 67 indicates the instruction is used as
  2594. 32-bit. The prefix is only added if the default environment is
  2595. assumed to be 16-bit when an assembler generates code.
  2596. ] that refers to [eax], we successfully generate an instruction
  2597. for a far jump. After the CPU runs the instruction, the program
  2598. counter eip and the code segment register cs are set to the
  2599. far address stored in the memory location that eax points to,
  2600. and the CPU starts fetching code from the new address in cs and
  2601. eip. To make it more concrete, here is an example:
  2602. <Graphics file: images/04/far_jmp_ex.pdf>
  2603. The far address takes 6 bytes in total: a 16-bit segment and a
  2604. 32-bit address, which is encoded as m16:32 in the table
  2605. [jmp-instruction]. As can be seen from the figure above, the
  2606. blue part is a segment address, loaded into the cs register with
  2607. the value 0x5678; the red part is the memory address within that
  2608. segment, loaded into the eip register with the value 0x1234. The
  2609. CPU then starts executing from there.
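The layout of the 6-byte far pointer can be sketched in C. This sketch assumes that the memory eax points to holds the bytes 34 12 00 00 78 56, that is, the 32-bit address first and the 16-bit segment after it, both in little-endian order; struct far_ptr32 and read_far_ptr32 are illustrative names.

```c
#include <stdint.h>

/* An m16:32 far pointer as it is loaded by a far jump. */
struct far_ptr32 {
    uint32_t offset;  /* goes into eip */
    uint16_t segment; /* goes into cs  */
};

/* Read a far pointer from the 6 bytes at p. */
struct far_ptr32 read_far_ptr32(const uint8_t *p)
{
    struct far_ptr32 fp;
    fp.offset  = (uint32_t)p[0]
               | (uint32_t)p[1] << 8
               | (uint32_t)p[2] << 16
               | (uint32_t)p[3] << 24;
    fp.segment = (uint16_t)(p[4] | (p[5] << 8));
    return fp;
}
```

Reading the six bytes above gives the offset 0x1234 and the segment 0x5678, the values that end up in eip and cs.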
  2610. Finally, the jmp instructions with EA opcode jump to a direct
  2611. absolute address. For example, the instruction:
  2612. jmp 0x5678:0x1234
  2613. is generated into:
  2614. ea 34 12 78 56
  2615. The address 0x5678:0x1234 is right next to the opcode, unlike FF
  2616. /5 instruction that needs an indirect address in eax register.
  2617. We skip the jump instruction with REX prefix, as it is a 64-bit
  2618. instruction.
  2619. Examine compiled data
  2620. In this section, we will examine how data definition in C maps to
  2621. its assembly form. The generated code is extracted from the .data
  2622. section. That means the assembly code displayed has no meaning[footnote:
  2623. Actually, code is just a type of data, and is often used for
  2624. hijacking into a running program to execute such code. However,
  2625. we have no use for it in this book.
  2626. ], aside from showing that such a value has an equivalent
  2627. assembly opcode that represents an instruction.
  2628. The code-assembly listing is not random, but is based on Chapter
  2629. 4 of Volume 1, “Data Types”. The chapter lists the fundamental
  2630. data types that x86 hardware operates on. By studying the
  2631. generated assembly code, we can see how closely C maps its
  2632. syntax to hardware, and why C is appropriate for OS
  2633. programming. The specific objdump command used in this section
  2634. will be:
  2635. $ objdump -z -M intel -S -D <object file> | less
  2636. Note: by default, zero bytes are hidden with three dot symbols: ...
  2637. To show all the zero bytes, we add the -z option.
  2638. Fundamental data types
  2639. The most basic types that x86 architecture works with are based
  2640. on sizes, each is twice as large as the previous one: 1 byte (8
  2641. bits), 2 bytes (16 bits), 4 bytes (32 bits), 8 bytes (64 bits)
  2642. and 16 bytes (128 bits).
  2643. <Graphics file: images/04/fundamental_data_types.pdf>
  2644. These types are the simplest: they are just chunks of memory of
  2645. different sizes that enable the CPU to access memory efficiently.
  2646. From the manual, section 4.1.1, volume 1:
  2647. Words, doublewords, and quadwords do not need to be aligned in
  2648. memory on natural boundaries. The natural boundaries for words,
  2649. double words, and quadwords are even-numbered addresses,
  2650. addresses evenly divisible by four, and addresses evenly
  2651. divisible by eight, respectively. However, to improve the
  2652. performance of programs, data structures (especially stacks)
  2653. should be aligned on natural boundaries whenever possible. The
  2654. reason for this is that the processor requires two memory
  2655. accesses to make an unaligned memory access; aligned accesses
  2656. require only one memory access. A word or doubleword operand that
  2657. crosses a 4-byte boundary or a quadword operand that crosses an
  2658. 8-byte boundary is considered unaligned and requires two separate
  2659. memory bus cycles for access.
  2660. Some instructions that operate on double quadwords require memory
  2661. operands to be aligned on a natural boundary. These instructions
  2662. generate a general-protection exception (#GP) if an unaligned
  2663. operand is specified. A natural boundary for a double quadword is
  2664. any address evenly divisible by 16. Other instructions that
  2665. operate on double quadwords permit unaligned access (without
  2666. generating a general-protection exception). However, additional
  2667. memory bus cycles are required to access unaligned data from
  2668. memory.
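The alignment rules quoted above come down to simple arithmetic on addresses. Here is a minimal sketch of the two checks; the function names is_naturally_aligned and crosses_boundary are hypothetical and do not come from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr lies on the natural boundary of an operand that is
 * size bytes long (2, 4, 8 or 16). */
bool is_naturally_aligned(uint32_t addr, uint32_t size)
{
    assert(size != 0);
    return addr % size == 0;
}

/* True if the operand occupying [addr, addr + size) crosses an address
 * that is a multiple of boundary, e.g. a 4-byte or 8-byte boundary. */
bool crosses_boundary(uint32_t addr, uint32_t size, uint32_t boundary)
{
    assert(size != 0 && boundary != 0);
    return addr / boundary != (addr + size - 1) / boundary;
}
```

For example, a doubleword at 0x0804a01c is naturally aligned, while a word at 0x1003 crosses a 4-byte boundary because its two bytes sit at 0x1003 and 0x1004.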
  2669. In C, the following primitive types (stdint.h must be included)
  2670. map to the fundamental types:
  2671. Source
  2672. #include <stdint.h>
  2673. uint8_t byte = 0x12;
  2674. uint16_t word = 0x1234;
  2675. uint32_t dword = 0x12345678;
  2676. uint64_t qword = 0x123456789abcdef;
  2677. unsigned __int128 dqword1 = (__int128) 0x123456789abcdef;
  2679. unsigned __int128 dqword2 = (__int128) 0x123456789abcdef << 64;
  2681. int main(int argc, char *argv[]) {
  2682.     return 0;
  2683. }
  2684. Assembly
  2685. 0804a018 <byte>:
  2686.  804a018:  12 00       adc    al,BYTE PTR [eax]
  2688. 0804a01a <word>:
  2689.  804a01a:  34 12       xor    al,0x12
  2690. 0804a01c <dword>:
  2691.  804a01c:  78 56       js     804a074 <_end+0x48>
  2693.  804a01e:  34 12       xor    al,0x12
  2694. 0804a020 <qword>:
  2695.  804a020:  ef          out    dx,eax
  2696.  804a021:  cd ab       int    0xab
  2697.  804a023:  89 67 45    mov    DWORD PTR [edi+0x45],esp
  2699.  804a026:  23 01       and    eax,DWORD PTR [ecx]
  2701. 0000000000601040 <dqword1>:
  2702.  601040:   ef          out    dx,eax
  2703.  601041:   cd ab       int    0xab
  2704.  601043:   89 67 45    mov    DWORD PTR [rdi+0x45],esp
  2706.  601046:   23 01       and    eax,DWORD PTR [rcx]
  2708.  601048:   00 00       add    BYTE PTR [rax],al
  2710.  60104a:   00 00       add    BYTE PTR [rax],al
  2712.  60104c:   00 00       add    BYTE PTR [rax],al
  2714.  60104e:   00 00       add    BYTE PTR [rax],al
  2716. 0000000000601050 <dqword2>:
  2717.  601050:   00 00       add    BYTE PTR [rax],al
  2719.  601052:   00 00       add    BYTE PTR [rax],al
  2721.  601054:   00 00       add    BYTE PTR [rax],al
  2723.  601056:   00 00       add    BYTE PTR [rax],al
  2725.  601058:   ef          out    dx,eax
  2726.  601059:   cd ab       int    0xab
  2727.  60105b:   89 67 45    mov    DWORD PTR [rdi+0x45],esp
  2729.  60105e:   23 01       and    eax,DWORD PTR [rcx]
  2731. gcc generates the variables byte, word, dword, qword, dqword1,
  2732. dqword2, written earlier, with their respective values highlighted
  2733. in the same colors; variables of the same type are also
  2734. highlighted in the same color. Since this is the data section,
  2735. the assembly listing carries no meaning. When byte is declared
  2736. with uint8_t, gcc guarantees that the size of byte is always 1
  2737. byte. But an alert reader might notice the 00 value next to the
  2738. 12 value in the byte variable. This is normal, as gcc avoids
  2739. memory misalignment by adding extra padding bytes. To make it
  2740. easier to see, we look at the readelf output of the .data section:
  2741. $ readelf -x .data hello
  2742. the output is (the colors mark which values belong to which
  2743. variables):
  2744. Hex dump of section '.data':
  2745. 0x00601020 00000000 00000000 00000000 00000000 ................
  2746. 0x00601030 12003412 78563412 efcdab89 67452301 ..4.xV4.....gE#.
  2747. 0x00601040 efcdab89 67452301 00000000 00000000 ....gE#.........
  2748. 0x00601050 00000000 00000000 efcdab89 67452301 ............gE#.
As can be seen in the readelf output, variables are allocated
storage space according to their types and in the order declared
by the programmer (the colors correspond to the variables).
Intel x86 is a little-endian architecture, which means the least
significant byte of a value is stored at the smallest address and
the most significant byte at the largest address. For example,
0x1234 is displayed as 34 12; that is, 34 appears first at address
0x601032, then 12 at 0x601033. The order of the hexadecimal digits
within a byte is unchanged, so we see 34 12 instead of 43 21. This
is quite confusing at first, but you will get used to it soon.
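To make the byte order concrete, here is a minimal sketch (not part of the original example) that stores and reloads a 16-bit value in little-endian order by hand. The helper names store_le16 and load_le16 are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value in little-endian order,
 * i.e. least significant byte at the lowest address. */
void store_le16(uint8_t buf[2], uint16_t value)
{
    buf[0] = (uint8_t)(value & 0xff);        /* low byte goes first   */
    buf[1] = (uint8_t)((value >> 8) & 0xff); /* high byte goes second */
}

/* Hypothetical helper: rebuild the value from two little-endian bytes. */
uint16_t load_le16(const uint8_t buf[2])
{
    return (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));
}
```

Storing 0x1234 this way leaves 34 in buf[0] and 12 in buf[1], the same order readelf shows for the word variable.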
Also, isn't it redundant to add int8_t when the char type is
already 1 byte? The truth is, a char is always 1 byte in C's own
terms, but that byte is not guaranteed to be 8 bits wide; it is
only guaranteed to be at least 8 bits. In C, a byte is defined to
be the size of a char, and a char is defined to be the smallest
addressable unit of the underlying hardware platform. There are
hardware devices in which the smallest addressable unit is 16 bits
or even bigger, which means a char is 16 bits wide and a “byte” on
such platforms is actually 2 units of 8-bit bytes. int8_t, in
contrast, is exactly 8 bits wide wherever it is available.
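The width of a C byte can be checked rather than assumed. The following is a small sketch using the standard CHAR_BIT macro from <limits.h>; the function names are hypothetical and exist only for this illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits in one C byte (one char) on this platform.
 * The C standard only guarantees that this is at least 8. */
unsigned bits_per_c_byte(void)
{
    return CHAR_BIT;
}

/* Width of uint32_t in bits, computed from its size in C bytes.
 * uint32_t has no padding bits, so this is 32 wherever the type exists. */
size_t uint32_width_in_bits(void)
{
    return sizeof(uint32_t) * CHAR_BIT;
}
```

On x86, bits_per_c_byte() is expected to return 8, so sizes reported by sizeof can be read directly as counts of 8-bit bytes.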
Not all architectures support the double quadword type. Still,
gcc provides support for 128-bit numbers and generates code for
them when the target CPU supports it (that is, the CPU must be
64-bit). By specifying a variable of type __int128 or unsigned
__int128, we get a 128-bit variable. If the target CPU does not
support 64-bit mode, gcc reports an error.
The data types in C that represent the fundamental data types
are also called unsigned numbers. Other than numerical
calculations, unsigned numbers are used as a tool for structuring
data in memory; we will see this application later in the book,
when various data structures are organized into bit groups.
In all the examples above, when the value of a variable with a
smaller size is assigned to a variable with a larger size, the
value easily fits in the larger variable. In contrast, when the
value of a variable with a larger size is assigned to a variable
with a smaller size, two scenarios occur:
• The value is greater than the maximum value of the smaller
variable, so it is truncated to the size of that variable,
producing an incorrect value.
• The value is not greater than the maximum value of the smaller
variable, so it fits.
However, the value might be unknown until runtime and can be any
value, so it is best not to leave such implicit conversions to the
compiler, but to have them explicitly controlled by the programmer.
Otherwise, they cause subtle bugs that are hard to catch, as the
erroneous values might occur too rarely to reproduce the bugs.
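One way to keep such a conversion under the programmer's control is to check the range before narrowing. The sketch below is only an illustration; the function names narrow_u32_to_u8 and truncate_u32_to_u8 are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: narrow a 32-bit value to 8 bits only when it fits.
 * Returns true and writes *out on success; returns false and leaves
 * *out untouched when the value would be truncated. */
bool narrow_u32_to_u8(uint32_t value, uint8_t *out)
{
    if (value > UINT8_MAX)
        return false;
    *out = (uint8_t)value;
    return true;
}

/* What an unchecked conversion does: only the low 8 bits survive. */
uint8_t truncate_u32_to_u8(uint32_t value)
{
    return (uint8_t)value;
}
```

With this pattern, assigning 0x1234 to an 8-bit variable is reported as a failure by narrow_u32_to_u8, instead of silently becoming 0x34.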
  2794. Pointer Data Types
  2795. Pointers are variables that hold memory addresses. x86 works with
  2796. 2 types of pointers:
  2797. Near pointer is a 16-bit/32-bit offset within a segment, also
  2798. called effective address.
  2799. Far pointer is also an offset like a near pointer, but with an
  2800. explicit segment selector.
  2801. <Graphics file: C:/Users/Tu Do/os01/book_src/images/04/pointer_data_type.pdf>
C only provides near pointers, since far pointers are platform
dependent (specific to architectures such as x86). In application
code, you can assume that the current segment starts at address 0,
so the offset is actually any memory address from 0 to the maximum
address.
  2806. Source
  2807. #include <stdint.h>
  2808. int8_t i = 0;
  2809. int8_t @|\color{red}\bfseries *p1|@ = (int8_t *) 0x1234;
  2810. int8_t @|\color{blue}\bfseries *p2|@ = &i;
  2811. int main(int argc, char *argv[]) {
  2812. return 0;
  2813. }
  2814. Assembly
  2815. 0000000000601030 <p1>:
  2816. 601030: 34 12 xor al,0x12
  2817. 601032: 00 00 add BYTE PTR
  2818. [rax],al
  2819. 601034: 00 00 add BYTE PTR
  2820. [rax],al
  2821. 601036: 00 00 add BYTE PTR
  2822. [rax],al
  2823. 0000000000601038 <p2>:
  2824. 601038: 41 10 60 00 adc BYTE PTR
  2825. [r8+0x0],spl
  2826. 60103c: 00 00 add BYTE PTR
  2827. [rax],al
  2828. 60103e: 00 00 add BYTE PTR
  2829. [rax],al
  2830. Disassembly of section .bss:
  2831. 0000000000601040 <__bss_start>:
  2832. 601040: 00 00 add BYTE PTR
  2833. [rax],al
  2834. 0000000000601041 <i>:
  2835. 601041: 00 00 add BYTE PTR
  2836. [rax],al
  2837. 601043: 00 00 add BYTE PTR
  2838. [rax],al
  2839. 601045: 00 00 add BYTE PTR
  2840. [rax],al
  2841. 601047: 00 .byte 0x0
The pointer p1 holds a direct address with the value 0x1234. The
pointer p2 holds the address of the variable i. Note that both
pointers are 8 bytes in size (or 4 bytes on a 32-bit target).
  2845. Bit Field Data Type
A bit field is a contiguous sequence of bits. Bit fields allow
data structuring at the bit level. For example, a 32-bit datum
can hold multiple bit fields that represent multiple different
pieces of information, such as bits 0-4 specifying the size of a
data structure, bits 5-6 specifying permissions, and so on. Data
structures at the bit level are common in low-level programming.
  2852. <Graphics file: C:/Users/Tu Do/os01/book_src/images/04/bit_field_data_type.pdf>
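Before looking at C's bit-field syntax, the same idea can be sketched by hand with shifts and masks. The layout below (bits 0-4 for a size, bits 5-6 for permissions) is taken from the example in the text; all macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout, for illustration only:
 * bits 0-4 hold a size, bits 5-6 hold permissions. */
#define BF_SIZE_SHIFT 0u
#define BF_SIZE_MASK  0x1fu /* 5 bits */
#define BF_PERM_SHIFT 5u
#define BF_PERM_MASK  0x3u  /* 2 bits */

/* Pack both fields into one 32-bit word. Values that are too wide
 * for their field are masked down to the field width. */
uint32_t pack_fields(uint32_t size, uint32_t perm)
{
    return ((size & BF_SIZE_MASK) << BF_SIZE_SHIFT) |
           ((perm & BF_PERM_MASK) << BF_PERM_SHIFT);
}

/* Extract each field again by shifting it down and masking. */
uint32_t unpack_size(uint32_t word)
{
    return (word >> BF_SIZE_SHIFT) & BF_SIZE_MASK;
}

uint32_t unpack_perm(uint32_t word)
{
    return (word >> BF_PERM_SHIFT) & BF_PERM_MASK;
}
```

Unlike compiler-generated bit fields, whose exact layout is left to the implementation, this manual packing fixes the position of every bit explicitly.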
  2853. Source
  2854. struct bit_field {
  2855. int data1:8;
  2856. int data2:8;
  2857. int data3:8;
  2858. int data4:8;
  2859. };
  2860. struct bit_field2 {
  2861. int data1:8;
  2862. int data2:8;
  2863. int data3:8;
  2864. int data4:8;
  2865. char data5:4;
  2866. };
  2867. struct normal_struct {
  2868. int data1;
  2869. int data2;
  2870. int data3;
  2871. int data4;
  2872. };
  2873. struct normal_struct @|\color{red}\bfseries ns|@ = {
  2874. .data1 = @|\color{red}\bfseries 0x12345678|@,
  2875. .data2 = @|\color{red}\bfseries 0x9abcdef0|@,
  2876. .data3 = @|\color{red}\bfseries 0x12345678|@,
  2877. .data4 = @|\color{red}\bfseries 0x9abcdef0|@,
  2878. };
  2879. int @|\color{blue}\bfseries i|@ = 0x12345678;
  2880. struct bit_field @|\color{magenta}\bfseries bf|@ = {
  2881. .data1 = @|\color{magenta}\bfseries 0x12|@,
  2882. .data2 = @|\color{magenta}\bfseries 0x34|@,
  2883. .data3 = @|\color{magenta}\bfseries 0x56|@,
  2884. .data4 = @|\color{magenta}\bfseries 0x78|@
  2885. };
  2886. struct bit_field2 @|\color{green}\bfseries bf2|@ = {
  2887. .data1 = @|\color{green}\bfseries 0x12|@,
  2888. .data2 = @|\color{green}\bfseries 0x34|@,
  2889. .data3 = @|\color{green}\bfseries 0x56|@,
  2890. .data4 = @|\color{green}\bfseries 0x78|@,
  2891. .data5 = @|\color{green}\bfseries 0xf|@
  2892. };
  2893. int main(int argc, char *argv[]) {
  2894. return 0;
  2895. }
  2896. Assembly
  2897. Each variable and its value are given a unique color in the
  2898. assembly listing below:
  2899. 0804a018 <ns>:
  2900. 804a018: 78 56 js 804a070 <_end+0x34>
  2901. 804a01a: 34 12 xor al,0x12
  2902. 804a01c: f0 de bc 9a 78 56 34 lock fidivr WORD PTR
  2903. [edx+ebx*4+0x12345678]
  2904. 804a023: 12
  2905. 804a024: f0 de bc 9a 78 56 34 lock fidivr WORD PTR
  2906. [edx+ebx*4+0x12345678]
  2907. 804a02b: 12
  2908. 0804a028 <i>:
  2909. 804a028: 78 56 js 804a080 <_end+0x44>
  2910. 804a02a: 34 12 xor al,0x12
  2911. 0804a02c <bf>:
  2912. 804a02c: 12 34 56 adc dh,BYTE PTR
  2913. [esi+edx*2]
  2914. 804a02f: 78 12 js 804a043 <_end+0x7>
  2915. 0804a030 <bf2>:
  2916. 804a030: 12 34 56 adc dh,BYTE PTR
  2917. [esi+edx*2]
  2918. 804a033: 78 0f js 804a044 <_end+0x8>
  2919. 804a035: 00 00 add BYTE PTR [eax],al
  2920. 804a037: 00 .byte 0x0
The sample code creates 4 variables: ns, i, bf and bf2. The
definitions of the normal_struct and bit_field structs both specify
4 integers. bit_field specifies additional information next to each
member name, separated by a colon, e.g. data1:8. This extra
information is the bit width of each bit group. It means that, even
though defined as an int, .data1 only consumes 8 bits of
information. If additional data members are specified after
.data1, two scenarios happen:
• If the new data members fit within the remaining bits after
.data1, which are 24 bits[footnote:
Since .data1 is declared as an int, 32 bits are still allocated,
but .data1 can only access 8 bits of information.
], then the total size of the bit_field struct is still 4 bytes, or
32 bits.
• If the new data members don't fit, then the remaining 24 bits
(3 bytes) are still allocated. However, the new data members
are allocated brand new storage, without using the previous 24
bits.
In the example, the 4 data members .data1, .data2, .data3 and
.data4 can each access 8 bits of information, and together they
can access all 4 bytes of the integer first declared by .data1. As
can be seen in the generated assembly code, the values of bf
follow the natural order as written in the C code: 12 34 56 78,
since each value is a separate member. In contrast, the value of i
is a number as a whole, so it is subject to the rule of little
endianness and thus is stored as 78 56 34 12. Note that 804a02f is
the address of the final byte in bf, but next to it is the number
12, even though 78 is the last byte of bf. This extra number 12
does not belong to the value of bf. objdump is just confused
because 78 is also an opcode: 78 corresponds to the js
instruction, which requires an operand. For that reason, objdump
grabs whatever byte comes after 78 and puts it there. objdump is a
tool for displaying assembly code, after all. A better tool to use
is gdb, which we will learn in the next chapter. But for this
chapter, objdump suffices.
Unlike bf, each data member in ns is allocated fully as an
integer, 4 bytes each, 16 bytes in total. As we can see, bit
fields and normal structs are different: bit fields structure data
at the bit level, while normal structs work at the byte level.
Finally, the struct of bf2[footnote:
bit_field2
] is the same as that of bf[footnote:
bit_field
], except that it contains one more data member: .data5, which is
declared as a 4-bit field of type char. Since the first 4 bytes
are already fully used, new storage is allocated for .data5 and
padded to the alignment of the struct, so another 4 bytes are
added even though .data5 can only access 4 bits of information,
and the final value of bf2 is: 12 34 56 78 0f 00 00 00. The
remaining 3 bytes must be accessed by means of a pointer, or by
casting to another data type that can fully access all 4 bytes.
  2971. What happens when the definition of bit_field struct and bf
  2972. variable are changed to:
  2973. struct bit_field {
  2974. int data1:8;
  2975. };
  2976. struct bit_field bf = {
  2977. .data1 = 0x1234,
  2978. };
  2979. What will be the value of .data1?
  2980. What happens when the definition of bit_field2 struct is
  2981. changed to:
  2982. struct bit_field2 {
  2983. int data1:8;
  2984. int data5:32;
  2985. };
  2986. What is layout of a variable of type bit_field2?
  2987. String Data Types
Although they share the same name, a string as defined by x86 is
different from a string in C. x86 defines strings as “continuous
sequences of bits, bytes, words, or doublewords”. On the other
hand, C defines a string as an array of 1-byte characters with a
zero as the last element of the array, making a null-terminated
string. This implies that strings in x86 are arrays, not C
strings. A programmer can define an array of bytes, words or
doublewords with char or uint8_t, short or uint16_t, and int or
uint32_t, but not an array of bits. However, such a feature can be
easily implemented, as an array of bits is essentially an array of
bytes, words or doublewords that is operated on at the bit level.
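As a rough sketch of how an array of bits could be layered on top of an array of bytes, consider the helpers below. The names bit_set, bit_clear and bit_test are hypothetical and not part of any standard library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: treat a uint8_t array as an array of bits.
 * Bit n lives in byte n / 8, at position n % 8 within that byte. */
void bit_set(uint8_t *bits, size_t n)
{
    bits[n / 8] |= (uint8_t)(1u << (n % 8));
}

void bit_clear(uint8_t *bits, size_t n)
{
    bits[n / 8] &= (uint8_t)~(1u << (n % 8));
}

/* Returns 1 if bit n is set, 0 otherwise. */
int bit_test(const uint8_t *bits, size_t n)
{
    return (bits[n / 8] >> (n % 8)) & 1u;
}
```

The caller is responsible for making the byte array large enough: an array of N bits needs at least (N + 7) / 8 bytes.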
  2999. The following code demonstrates how to define array (string) data
  3000. types:
  3001. Source
  3002. #include <stdint.h>
  3003. uint8_t @|\color{red}\bfseries a8[2]|@ = {0x12, 0x34};
  3004. uint16_t @|\color{blue}\bfseries a16[2]|@ = {0x1234, 0x5678};
  3005. uint32_t @|\color{magenta}\bfseries a32[2]|@ = {0x12345678,
  3006. 0x9abcdef0};
  3007. uint64_t @|\color{green}\bfseries a64[2]|@ = {0x123456789abcdef0,
  3008. 0x123456789abcdef0};
  3009. int main(int argc, char *argv[])
  3010. {
  3011. return 0;
  3012. }
  3013. Assembly
  3014. 0804a018 <a8>:
  3015. 804a018: 12 34 00 adc dh,BYTE PTR
  3016. [eax+eax*1]
  3017. 804a01b: 00 34 12 add BYTE PTR
  3018. [edx+edx*1],dh
  3019. 0804a01c <a16>:
  3020. 804a01c: 34 12 xor al,0x12
  3021. 804a01e: 78 56 js 804a076 <_end+0x3a>
  3022. 0804a020 <a32>:
  3023. 804a020: 78 56 js 804a078 <_end+0x3c>
  3024. 804a022: 34 12 xor al,0x12
  3025. 804a024: f0 de bc 9a f0 de bc lock fidivr WORD PTR
  3026. [edx+ebx*4-0x65432110]
  3027. 804a02b: 9a
  3028. 0804a028 <a64>:
  3029. 804a028: f0 de bc 9a 78 56 34 lock fidivr WORD PTR
  3030. [edx+ebx*4+0x12345678]
  3031. 804a02f: 12
  3032. 804a030: f0 de bc 9a 78 56 34 lock fidivr WORD PTR
  3033. [edx+ebx*4+0x12345678]
  3034. 804a037: 12
Although a8 is an array with 2 elements, each 1 byte long, it is
still allocated 4 bytes. Again, to ensure natural alignment for
best performance, gcc pads extra zero bytes. As shown in the
assembly listing, the actual value of a8 is 12 34 00 00, with
a8[0] equal to 12 and a8[1] equal to 34.
Then comes a16, with 2 elements, each 2 bytes long. Since the 2
elements are 4 bytes in total, which is in the natural alignment,
gcc pads no bytes. The value of a16 is 34 12 78 56, with a16[0]
equal to 34 12 and a16[1] equal to 78 56.
Next is a32, with 2 elements, 4 bytes each. Similar to the above
arrays, the value of a32[0] is 78 56 34 12 and the value of a32[1]
is f0 de bc 9a, exactly what is assigned in the C code. Note that
objdump is confused again, as de is the opcode for the instruction
fidivr (short for reverse divide), which requires another operand,
so objdump grabs whatever following bytes make sense to it for
creating “an operand”. Only the highlighted values belong to a32.
Finally comes a64, also with 2 elements, but 8 bytes each. The
total size of a64 is 16 bytes, which is in the natural alignment,
therefore no padding bytes are added. The values of both a64[0]
and a64[1] are the same: f0 de bc 9a 78 56 34 12, which gets
misinterpreted as a fidivr instruction.
  3056. [float Figure:
  3057. [Figure 0.13:
  3058. a8, a16, a32 and a64 memory layouts
  3059. ]
  3060. a8:  
  3061. +----------+
  3062. | 12 | 34 |
  3063. +----------+
  3064. a16:
  3065. +--------------------+
  3066. | 34 12   | 78 56    |
  3067. +--------------------+
  3068. a32:
  3069. +----------------------------------------+
  3070. | 78 56 34 12       | f0 de bc 9a        |
  3071. +----------------------------------------+
  3072. a64:
+---------------------------------------------------+
| f0 de bc 9a 78 56 34 12 | f0 de bc 9a 78 56 34 12 |
+---------------------------------------------------+
  3077. ]
  3078. However, beyond one-dimensional arrays that map directly to
  3079. hardware string type, C provides its own syntax for
  3080. multi-dimensional arrays:
  3081. Source
  3082. #include <stdint.h>
  3083. uint8_t @|\color{red}\bfseries a2[2][2]|@ = {
  3084. {0x12, 0x34},
  3085. {0x56, 0x78}
  3086. };
  3087. uint8_t @|\color{blue}\bfseries a3[2][2][2]|@ = {
  3088. {{0x12, 0x34},
  3089. {0x56, 0x78}},
  3090. {{0x9a, 0xbc},
  3091. {0xde, 0xff}},
  3092. };
  3093. int main(int argc, char *argv[]) {
  3094. return 0;
  3095. }
  3096. Assembly
  3097. 0804a018 <a2>:
  3098. 804a018: 12 34 56 adc dh,BYTE PTR
  3099. [esi+edx*2]
  3100. 804a01b: 78 12 js 804a02f <_end+0x7>
  3101. 0804a01c <a3>:
  3102. 804a01c: 12 34 56 adc dh,BYTE PTR
  3103. [esi+edx*2]
  3104. 804a01f: 78 9a js 8049fbb
  3105. <_DYNAMIC+0xa7>
  3106. 804a021: bc .byte 0xbc
  3107. 804a022: de ff fdivrp st(7),st
Technically, multi-dimensional arrays are like normal arrays: in
the end, the total size is translated into flat allocated bytes.
A 2 x 2 array is allocated 4 bytes; a 2 x 2 x 2 array is allocated
8 bytes, as can be seen in the assembly listing
of a2[footnote:
Again, objdump is confused and puts the number 12 next to 78 in
the a2 listing.
] and a3. In low-level assembly code, the representation is the
same between a[4] and a[2][2]. However, in high-level C code, the
difference is tremendous. The syntax of multi-dimensional arrays
enables a programmer to think in higher-level concepts, instead
of translating manually from high-level concepts to low-level
code while working with the high-level concepts in their head at
the same time.
  3123. The following two-dimensional array can hold a list of 2 names
  3124. with the length of 10:
  3125. char names[2][10] = {
  3126. "John Doe",
  3127. "Jane Doe"
  3128. };
To access a name, we simply adjust the row index[footnote:
The left index is called the row index since it selects a row,
which here is one whole name.
] e.g. names[0], names[1]. To access an individual character within
a name, we use the column index[footnote:
Likewise, the right index is called the column index since it
selects a column within a row, which here is one character of a name.
] e.g. names[0][0] gives the character “J”, names[0][1] gives the
character “o” and so on.
Without such syntax, we need to create a 20-byte array e.g.
names[20], and whenever we want to access a character e.g. to
check if a name contains a digit, we need to calculate the index
manually. It would be distracting, since we would constantly need
to switch our thinking between the actual problem and the
translation problem.
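The manual index calculation described above can be sketched as follows, next to the equivalent two-dimensional access. The names NAME_COUNT, NAME_LEN, flat_at and grid_at are assumptions made for this illustration only.

```c
#include <stddef.h>

#define NAME_COUNT 2  /* number of names stored        */
#define NAME_LEN   10 /* bytes reserved for each name  */

/* Manual translation: character char_idx of name name_idx in a flat
 * array. The caller must keep both indices within bounds. */
char flat_at(const char *flat, size_t name_idx, size_t char_idx)
{
    return flat[name_idx * NAME_LEN + char_idx];
}

/* The same access with the multi-dimensional syntax; the compiler
 * performs the index calculation for us. */
char grid_at(char names[NAME_COUNT][NAME_LEN], size_t name_idx,
             size_t char_idx)
{
    return names[name_idx][char_idx];
}
```

Both functions read the same byte when given the same data; the difference is only in who does the arithmetic, the programmer or the compiler.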
Since this is a repeating pattern, C abstracts away this
problem with the syntax for defining and manipulating
multi-dimensional arrays. Through this example, we can clearly
see the power that abstraction through language can give us. It
would be ideal if a programmer were equipped with such power to
define whatever syntax is suitable for the problem at hand. Not
many languages provide such a capability. Fortunately, through C
macros, we can partially achieve that goal.
  3152. In all cases, an array is guaranteed to generate contiguous bytes
  3153. of memory, regardless of the dimensions it has.
  3154. What is the difference between a multi-dimensional array and an
  3155. array of pointers, or even pointers of pointers?
  3156. Examine compiled code
This section explores how the compiler transforms high-level code
into assembly code that the CPU can execute, and how common
assembly patterns help to create higher-level syntax. The -S
option is added to objdump to better demonstrate the connection
between high- and low-level code.
In this section, the option --no-show-raw-insn is also added to
the objdump command to omit the opcodes for clarity:
  3164. $ objdump --no-show-raw-insn -M intel -S -D <object file> | less
  3165. Data Transfer
The previous section explored how various types of data are
created, and how they are laid out in memory. Once memory storage
is allocated for variables, it must be readable and writable.
Data transfer instructions move data (bytes, words, doublewords
or quadwords) between memory and registers, and between
registers, effectively reading from a storage source and writing
to a storage destination.
  3173. Source
  3174. #include <stdint.h>
  3175. int32_t i = 0x12345678;
  3176. int main(int argc, char *argv[]) {
  3177. int j = i;
  3178. int k = 0xabcdef;
  3179. return 0;
  3180. }
  3181. Assembly
  3182. 080483db <main>:
  3183. #include <stdint.h>
  3184. int32_t i = 0x12345678;
  3185. int main(int argc, char *argv[]) {
  3186. 80483db: push ebp
  3187. 80483dc: mov ebp,esp
  3188. 80483de: sub esp,0x10
  3189. int j = i;
  3190. 80483e1: mov eax,ds:0x804a018
  3191. 80483e6: mov DWORD PTR [ebp-0x8],eax
  3192. int k = 0xabcdef;
  3193. 80483e9: mov DWORD PTR [ebp-0x4],0xabcdef
  3194. return 0;
  3195. 80483f0: mov eax,0x0
  3196. }
  3197. 80483f5: leave
  3198. 80483f6: ret
  3199. 80483f7: xchg ax,ax
  3200. 80483f9: xchg ax,ax
  3201. 80483fb: xchg ax,ax
  3202. 80483fd: xchg ax,ax
  3203. 80483ff: nop
The general data movement is performed with the mov instruction.
Note that despite the instruction being called mov, it actually
copies data from a source to a destination.
The red instruction (mov ebp,esp) copies data from the register
esp to the register ebp. This mov instruction moves data between
registers and is assigned the opcode 89.
The blue instructions (mov eax,ds:0x804a018 and mov DWORD PTR
[ebp-0x8],eax) copy data from one memory location (the i
variable) to another (the j variable). There exists no data
movement from memory to memory; it requires two mov instructions,
one for copying the data from a memory location to a register,
and one for copying the data from the register to the destination
memory location.
The pink instruction (mov DWORD PTR [ebp-0x4],0xabcdef) copies an
immediate value into memory.
Finally, the green instruction (mov eax,0x0) copies immediate data
into a register.
  3219. Expressions
  3220. Source
  3221. int expr(int i, int j)
  3222. {
  3223. int add = i + j;
  3224. int sub = i - j;
  3225. int mul = i * j;
  3226. int div = i / j;
  3227. int mod = i % j;
  3228. int neg = -i;
  3229. int and = i & j;
  3230. int or = i | j;
  3231. int xor = i ^ j;
  3232. int not = ~i;
  3233. int shl = i << 8;
  3234. int shr = i >> 8;
  3235. char equal1 = (i == j);
  3236. int equal2 = (i == j);
  3237. char greater = (i > j);
  3238. char less = (i < j);
  3239. char greater_equal = (i >= j);
  3240. char less_equal = (i <= j);
  3241. int logical_and = i && j;
  3242. int logical_or = i || j;
  3243. ++i;
  3244. --i;
  3245. int i1 = i++;
  3246. int i2 = ++i;
  3247. int i3 = i--;
  3248. int i4 = --i;
  3249. return 0;
  3250. }
  3251. int main(int argc, char *argv[]) {
  3252. return 0;
  3253. }
  3254. Assembly
  3255. The full assembly listing is really long. For that reason, we
  3256. examine expression by expression.
  3257. Expression: int add = i + j;
  3258. 80483e1: mov edx,DWORD PTR [ebp+0x8]
  3259. 80483e4: mov eax,DWORD PTR [ebp+0xc]
  3260. 80483e7: add eax,edx
  3261. 80483e9: mov DWORD PTR [ebp-0x34],eax
The assembly code is straightforward: variables i and j are
loaded into edx and eax respectively, then added together with
the add instruction, and the final result is stored in eax.
Then, the result is saved into the local variable add, which
is at the location [ebp-0x34].
  3267. Expression: int sub = i - j;
  3268. 80483ec: mov eax,DWORD PTR [ebp+0x8]
  3269. 80483ef: sub eax,DWORD PTR [ebp+0xc]
  3270. 80483f2: mov DWORD PTR [ebp-0x30],eax
Similar to the add instruction, x86 provides a sub instruction
for subtraction. Hence, gcc translates a subtraction into a sub
instruction. eax is reloaded with i, as eax still carries the
result of the previous expression. Then, j is subtracted from i.
After the subtraction, the value is saved into the variable sub,
at location [ebp-0x30].
  3277. Expression: int mul = i * j;
  3278. 80483f5: mov eax,DWORD PTR [ebp+0x8]
  3279. 80483f8: imul eax,DWORD PTR [ebp+0xc]
  3280. 80483fc: mov DWORD PTR [ebp-0x34],eax
Similar to the sub example, only eax is reloaded, since it
carries the result of the previous calculation. imul performs
signed multiply[footnote:
Unsigned multiply is performed by the mul instruction.
]. eax is first loaded with i, then multiplied by j with the
result stored back into eax, which is then stored into the
variable mul at location [ebp-0x34].
  3288. Expression: int div = i / j;
  3289. 80483ff: mov eax,DWORD PTR [ebp+0x8]
  3290. 8048402: cdq
  3291. 8048403: idiv DWORD PTR [ebp+0xc]
  3292. 8048406: mov DWORD PTR [ebp-0x30],eax
Similar to imul, idiv performs signed division. But, unlike
imul above, idiv takes only one operand:
1. First, i is reloaded into eax.
2. Then, cdq converts the doubleword value in eax into a
quadword value stored in the pair of registers edx:eax, by
copying the sign bit (bit 31) of the value in eax into every bit
position in edx. The pair edx:eax is the dividend, which is the
variable i, and the operand to idiv is the divisor, which is the
variable j.
3. After the calculation, the result is stored in the pair of
registers edx:eax, with the quotient in eax and the remainder
in edx. The quotient is stored in the variable div, at
location [ebp-0x30].
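The three steps above can be modelled in C as a rough sketch. The struct and function names below are hypothetical; they only mimic what cdq and idiv do to the registers and are not how gcc implements division.

```c
#include <stdint.h>

/* Hypothetical model of the registers idiv writes its result to. */
struct idiv_result {
    int32_t eax; /* quotient  */
    int32_t edx; /* remainder */
};

/* cdq: sign-extend the 32-bit value in eax into the 64-bit pair
 * edx:eax, modelled here as a single 64-bit integer. */
int64_t model_cdq(int32_t eax)
{
    return (int64_t)eax;
}

/* idiv: divide the 64-bit dividend by a 32-bit divisor.
 * Assumes divisor != 0 and a quotient that fits in 32 bits; the real
 * instruction raises a divide error in those cases. */
struct idiv_result model_idiv(int64_t dividend, int32_t divisor)
{
    struct idiv_result r;
    r.eax = (int32_t)(dividend / divisor); /* truncates toward zero */
    r.edx = (int32_t)(dividend % divisor); /* sign follows dividend */
    return r;
}
```

For example, dividing -7 by 2 in this model leaves -3 in eax and -1 in edx, which are also the values C gives for -7 / 2 and -7 % 2.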
  3305. Expression: int mod = i % j;
  3306. 8048409: mov eax,DWORD PTR [ebp+0x8]
  3307. 804840c: cdq
  3308. 804840d: idiv DWORD PTR [ebp+0xc]
  3309. 8048410: mov DWORD PTR [ebp-0x2c],edx
The same idiv instruction also performs the modulo operation,
since it also calculates a remainder, which is left in edx and
then stored in the variable mod, at location [ebp-0x2c].
  3313. Expression: int neg = -i;
  3314. 8048413: mov eax,DWORD PTR [ebp+0x8]
  3315. 8048416: neg eax
  3316. 8048418: mov DWORD PTR [ebp-0x28],eax
neg replaces the value of its operand (the destination operand)
with its two's complement (this operation is equivalent to
subtracting the operand from 0). In this example, the value i
in eax is replaced with -i using the neg instruction.
Then, the new value is stored in the variable neg at
[ebp-0x28].
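Two's complement negation can be written out by hand as "invert every bit, then add 1". The following sketch uses unsigned arithmetic so that wrap-around is well defined in C; the function name neg32 is hypothetical.

```c
#include <stdint.h>

/* Two's complement negation of a 32-bit pattern: invert all bits,
 * then add 1. The result is the same bit pattern that subtracting
 * the operand from 0 would produce. */
uint32_t neg32(uint32_t x)
{
    return ~x + 1u;
}
```

For instance, the bit pattern of 5 becomes 0xfffffffb, which is how -5 is represented in a 32-bit register.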
  3323. Expression: int and = i & j;
  3324. 804841b: mov eax,DWORD PTR [ebp+0x8]
  3325. 804841e: and eax,DWORD PTR [ebp+0xc]
  3326. 8048421: mov DWORD PTR [ebp-0x24],eax
  3327. and performs a bitwise AND operation on two operands, and
  3328. stores the result in the destination operand, which is the
  3329. variable and at [ebp-0x24].
  3330. Expression: int or = i | j;
  3331. 8048424: mov eax,DWORD PTR [ebp+0x8]
  3332. 8048427: or eax,DWORD PTR [ebp+0xc]
  3333. 804842a: mov DWORD PTR [ebp-0x20],eax
  3334. Similar to and instruction, or performs a bitwise OR
  3335. operation on two operands, and stores the result in the
  3336. destination operand, which is the variable or at [ebp-0x20]
  3337. in this case.
  3338. Expression: int xor = i ^ j;
  3339. 804842d: mov eax,DWORD PTR [ebp+0x8]
  3340. 8048430: xor eax,DWORD PTR [ebp+0xc]
  3341. 8048433: mov DWORD PTR [ebp-0x1c],eax
  3342. Similar to and/or instruction, xor performs a bitwise XOR
  3343. operation on two operands, and stores the result in the
  3344. destination operand, which is the variable xor at [ebp-0x1c].
  3345. Expression: int not = ~i;
  3346. 8048436: mov eax,DWORD PTR [ebp+0x8]
  3347. 8048439: not eax
  3348. 804843b: mov DWORD PTR [ebp-0x18],eax
  3349. not performs a bitwise NOT operation (each 1 is set to 0, and
  3350. each 0 is set to 1) on the destination operand and stores the
  3351. result in the destination operand location, which is the
  3352. variable not at [ebp-0x18].
  3353. Expression: int shl = i << 8;
  3354. 804843e: mov eax,DWORD PTR [ebp+0x8]
  3355. 8048441: shl eax,0x8
  3356. 8048444: mov DWORD PTR [ebp-0x14],eax
shl (shift logical left) shifts the bits in the destination
operand to the left by the number of bits specified in the
source operand. In this case, eax stores i and shl shifts eax
by 8 bits to the left. A different name for shl is sal (shift
arithmetic left). The two names are synonymous. Finally, the
result is stored in the variable shl at [ebp-0x14].
Here is a visual demonstration of shl/sal and shr
instructions:
After shifting to the left, the last bit shifted out of the
leftmost (most significant) position is stored in the Carry Flag
in the EFLAGS register.
  3367. Expression: int shr = i >> 8;
  3368. 8048447: mov eax,DWORD PTR [ebp+0x8]
  3369. 804844a: sar eax,0x8
  3370. 804844d: mov DWORD PTR [ebp-0x10],eax
sar is similar to shl/sal, but shifts bits to the right and
extends the sign bit. For right shifts, shr and sar are two
different instructions: shr differs from sar in that it does
not extend the sign bit. Finally, the result is stored in the
variable shr at [ebp-0x10].
In figure (b), notice that initially the sign bit is 1,
but after the 1-bit and 10-bit shifts, the vacated bits
are filled with zeros.
  3379. [float Figure:
  3380. [Figure 0.14:
  3381. SAR Instruction Operation (Source: Figure 7-8, Volume 1)
  3382. ]
  3383. <Graphics file: C:/Users/Tu Do/os01/book_src/images/04/sar.pdf>
  3384. ]
  3385. With sar, the sign bit (the most significant bit) is
  3386. preserved. That is, if the sign bit is 0, the new bits always
  3387. get the value 0; if the sign bit is 1, the new bits always
  3388. get the value 1.
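The difference between the two right shifts can be sketched in C with unsigned operations only. This is a model of the instructions' effect on a 32-bit pattern, under the assumption that the shift count is between 0 and 31; the names shr32 and sar32 are hypothetical.

```c
#include <stdint.h>

/* shr: logical right shift; vacated high bits become 0.
 * Assumes 0 <= count <= 31. */
uint32_t shr32(uint32_t value, unsigned count)
{
    return value >> count;
}

/* sar: arithmetic right shift; vacated high bits copy the sign bit.
 * Written with unsigned operations because right-shifting a negative
 * signed value is implementation-defined in C.
 * Assumes 0 <= count <= 31. */
uint32_t sar32(uint32_t value, unsigned count)
{
    uint32_t shifted = value >> count;
    if (value & 0x80000000u)                /* sign bit set?           */
        shifted |= ~(0xffffffffu >> count); /* fill top bits with ones */
    return shifted;
}
```

With the pattern 0x80000000 and a count of 8, shr32 gives 0x00800000 while sar32 gives 0xff800000.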
  3389. Expression: char equal1 = (i == j);
  3390. 8048450: mov eax,DWORD PTR [ebp+0x8]
  3391. 8048453: cmp eax,DWORD PTR [ebp+0xc]
  3392. 8048456: sete al
  3393. 8048459: mov BYTE PTR [ebp-0x41],al
cmp and the variants of the set instruction make up all the
logical comparisons. In this expression, cmp compares variables i
and j; then sete stores the value 1 in the al register if the
comparison by cmp earlier found them equal, or stores 0
otherwise. The general name for the variants of the set
instruction is SETcc. The suffix cc denotes the condition being
tested for in the EFLAGS register. Appendix B in volume 1,
“EFLAGS Condition Codes”, lists the conditions it is possible
to test for with this instruction. Finally, the result is
stored in the variable equal1 at [ebp-0x41].
  3404. Expression: int equal2 = (i == j);
  3405. 804845c: mov eax,DWORD PTR [ebp+0x8]
  3406. 804845f: cmp eax,DWORD PTR [ebp+0xc]
  3407. 8048462: sete al
  3408. 8048465: movzx eax,al
  3409. 8048468: mov DWORD PTR [ebp-0xc],eax
Similar to the previous expression, this expression also compares
for equality, except that the result is stored in an int type.
For that reason, one more instruction is added: the movzx
instruction, a variant of mov that copies the source into a
larger destination operand and fills the remaining bytes with 0.
In this case, since eax is 4 bytes wide, after copying the first
byte in al, the remaining bytes of eax are filled with 0 to
ensure that eax carries the same value as al.
  3418. [float Figure:
  3419. [Figure 0.15:
  3420. movzx instruction
  3421. ] [float Figure:
  3422. [Sub-Figure a:
  3423. eax before movzx
  3424. ]
  3425. +-----+-----+-----+----+
  3426. | 12 | 34 | 56 | 78 |
  3427. +-----+-----+-----+----+
  3428. ] [float Figure:
  3429. [Sub-Figure b:
  3430. after movzx eax, al
  3431. ]
  3432. +-----+-----+-----+----+
  3433. | 00 | 00 | 00 | 78 |
  3434. +-----+-----+-----+----+
  3435. ]
  3436. ]
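The effect shown in the figure can be reproduced with a short C sketch. The function name movzx_eax_al is made up for illustration; it only models the register contents, not the instruction encoding.

```c
#include <stdint.h>

/* Models `movzx eax, al`: keep the lowest byte, clear the upper three. */
static uint32_t movzx_eax_al(uint32_t eax) {
    uint8_t al = (uint8_t)(eax & 0xffu);  /* al is the lowest byte of eax */
    return (uint32_t)al;                  /* zero-extend it to 32 bits */
}
```

With the value from the figure, movzx_eax_al(0x12345678) gives 0x00000078.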
  3437. Expression: char greater = (i > j);
  3438. 804846b: mov eax,DWORD PTR [ebp+0x8]
  3439. 804846e: cmp eax,DWORD PTR [ebp+0xc]
  3440. 8048471: setg al
  3441. 8048474: mov BYTE PTR [ebp-0x40],al
Similar to the equality comparison, but setg is used for the
greater-than comparison instead.
  3444. Expression: char less = (i < j);
  3445. 8048477: mov eax,DWORD PTR [ebp+0x8]
  3446. 804847a: cmp eax,DWORD PTR [ebp+0xc]
  3447. 804847d: setl al
  3448. 8048480: mov BYTE PTR [ebp-0x3f],al
setl is used for the less-than comparison.
  3450. Expression: char greater_equal = (i >= j);
  3451. 8048483: mov eax,DWORD PTR [ebp+0x8]
  3452. 8048486: cmp eax,DWORD PTR [ebp+0xc]
  3453. 8048489: setge al
  3454. 804848c: mov BYTE PTR [ebp-0x3e],al
setge is used for the greater-than-or-equal comparison.
  3456. Expression: char less_equal = (i <= j);
  3457. 804848f: mov eax,DWORD PTR [ebp+0x8]
  3458. 8048492: cmp eax,DWORD PTR [ebp+0xc]
  3459. 8048495: setle al
  3460. 8048498: mov BYTE PTR [ebp-0x3d],al
setle is used for the less-than-or-equal comparison.
  3462. Expression: int logical_and = (i && j);
  3463. 804849b: cmp DWORD PTR [ebp+0x8],0x0
  3464. 804849f: je 80484ae <expr+0xd3>
  3465. 80484a1: cmp DWORD PTR [ebp+0xc],0x0
  3466. 80484a5: je 80484ae <expr+0xd3>
  3467. 80484a7: mov eax,0x1
  3468. 80484ac: jmp 80484b3 <expr+0xd8>
  3469. 80484ae: mov eax,0x0
  3470. 80484b3: mov DWORD PTR [ebp-0x8],eax
The logical AND operator && is one of the syntaxes that is built
entirely in software[footnote:
That is, there is no equivalent assembly instruction implemented
in hardware.
] from simpler instructions. The algorithm in the assembly code
is simple:
1. First, check if i is 0 with the instruction at 0x804849b.
(a) If true, jump to 0x80484ae and set eax to 0.
(b) The next instruction, at 0x80484b3, then stores 0 in the
variable logical_and.
2. If i is not 0, check if j is 0 with the instruction at
0x80484a1.
(a) If true, jump to 0x80484ae and set eax to 0.
(b) The next instruction, at 0x80484b3, then stores 0 in the
variable logical_and.
3. If both i and j are not 0, the result is certainly 1, or
true.
(a) Set eax to 1 with the instruction at 0x80484a7.
(b) Then jump to the instruction at 0x80484b3 to store 1 in the
variable logical_and at [ebp-0x8].
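The same steps can be written in C with explicit jumps, which makes the correspondence with the listing easier to see. This is a sketch for illustration only; the function and label names are not from the original program.

```c
/* i && j, written with the compare-and-jump structure of the listing. */
static int logical_and(int i, int j) {
    int eax;
    if (i == 0) goto set_false;   /* cmp [ebp+0x8],0x0 ; je 80484ae */
    if (j == 0) goto set_false;   /* cmp [ebp+0xc],0x0 ; je 80484ae */
    eax = 1;                      /* mov eax,0x1 */
    goto store;                   /* jmp 80484b3 */
set_false:
    eax = 0;                      /* mov eax,0x0 */
store:
    return eax;                   /* mov [ebp-0x8],eax */
}
```

Note that j is never examined when i is 0, which is the short-circuit behaviour of && in C.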
  3491. Expression: int logical_or = (i || j);
  3492. 80484b6: cmp DWORD PTR [ebp+0x8],0x0
  3493. 80484ba: jne 80484c2 <expr+0xe7>
  3494. 80484bc: cmp DWORD PTR [ebp+0xc],0x0
  3495. 80484c0: je 80484c9 <expr+0xee>
  3496. 80484c2: mov eax,0x1
  3497. 80484c7: jmp 80484ce <expr+0xf3>
  3498. 80484c9: mov eax,0x0
  3499. 80484ce: mov DWORD PTR [ebp-0x4],eax
The logical OR operator || is similar to the logical AND above.
Understanding the algorithm is left as an exercise for readers.
  3502. Expression: ++i; and --i; (or i++ and i--)
  3503. 80484d1: add DWORD PTR [ebp+0x8],0x1
  3504. 80484d5: sub DWORD PTR [ebp+0x8],0x1
Increment and decrement are similar to logical AND and logical OR
in that they are built from existing instructions, namely add and
sub. The difference is that the CPU actually does have built-in
instructions for them, inc and dec, but gcc chose not to use them
because inc and dec cause a partial flag register stall, which
occurs when an instruction modifies a part of the flag register
and the following instruction depends on the outcome of the flags
(section 3.5.2.6, Intel Optimization Manual, 2016). The manual
even suggests that inc and dec should be replaced with add and sub
instructions (section 3.5.1.1, Intel Optimization Manual, 2016).
  3516. Expression: int i1 = i++;
  3517. 80484d9: mov eax,DWORD PTR [ebp+0x8]
  3518. 80484dc: lea edx,[eax+0x1]
  3519. 80484df: mov DWORD PTR [ebp+0x8],edx
  3520. 80484e2: mov DWORD PTR [ebp-0x10],eax
First, i is copied into eax at 80484d9. Then, the value of
eax + 0x1 is computed as an effective address and copied into edx
at 80484dc. The lea (load effective address) instruction copies
a memory address into a register, without accessing the memory at
that address. According to Volume 2, the source operand is a
memory address specified with one of the processor's addressing
modes. This means the source operand must be specified by the
addressing modes defined in the 16-bit/32-bit ModR/M Byte tables,
[mod-rm-16] and [mod-rm-32].
After loading the incremented value into edx, the value of i
is increased by 1 at 80484df. Finally, the previous value of i,
still held in eax, is stored in i1 at [ebp-0x10] by the
instruction at 80484e2.
  3534. Expression: int i2 = ++i;
  3535. 80484e5: add DWORD PTR [ebp+0x8],0x1
  3536. 80484e9: mov eax,DWORD PTR [ebp+0x8]
  3537. 80484ec: mov DWORD PTR [ebp-0xc],eax
  3538. The primary differences between this increment syntax and the
  3539. previous one are:
  3540. • add is used instead of lea to increase i directly.
  3541. • the newly incremented i is stored into i2 instead of the
  3542. old value.
  3543. • the expression only costs 3 instructions instead of 4.
In this unoptimized listing, the prefix increment is faster than
the postfix one used previously. It might not matter much which
version is used if the increment runs only once or a few hundred
times in a small loop, but it can matter when a loop runs millions
of times or more. Also, depending on the circumstances, one form
is more convenient than the other, e.g. if i is an index for
accessing an array, we use the old value to access the previous
array element and the newly incremented i for the current element.
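The difference between the two forms can be sketched in C by spelling out each step of the two listings. The helper names are hypothetical, and the temporaries are named after the registers the listings use.

```c
/* int i1 = i++;  the result is the old value of i */
static int post_increment(int *i) {
    int eax = *i;        /* mov eax,[ebp+0x8] */
    int edx = eax + 1;   /* lea edx,[eax+0x1] */
    *i = edx;            /* mov [ebp+0x8],edx */
    return eax;          /* mov [ebp-0x10],eax */
}

/* int i2 = ++i;  the result is the new value of i */
static int pre_increment(int *i) {
    *i = *i + 1;         /* add [ebp+0x8],0x1 */
    return *i;           /* mov eax,[ebp+0x8] ; mov [ebp-0xc],eax */
}
```

Starting from i = 5, post_increment returns 5 and leaves i at 6; a following pre_increment returns 7 and leaves i at 7.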
  3553. Expression: int i3 = i--;
  3554. 80484ef: mov eax,DWORD PTR [ebp+0x8]
  3555. 80484f2: lea edx,[eax-0x1]
  3556. 80484f5: mov DWORD PTR [ebp+0x8],edx
  3557. 80484f8: mov DWORD PTR [ebp-0x8],eax
  3558. Similar to i++ syntax, and is left as an exercise to readers.
  3559. Expression: int i4 = --i;
  3560. 80484fb: sub DWORD PTR [ebp+0x8],0x1
  3561. 80484ff: mov eax,DWORD PTR [ebp+0x8]
  3562. 8048502: mov DWORD PTR [ebp-0x4],eax
  3563. Similar to ++i syntax, and is left as an exercise to readers.
  3564. Read section 3.5.2.4, “Partial Register Stalls” to understand
  3565. register stalls in general.
  3566. Read the sections from 7.3.1 to 7.3.7 in volume 1.
  3567. Stack
A stack is a contiguous array of memory locations that holds a
collection of discrete data. When a new element is added, a stack
grows down in memory toward lower addresses, and it shrinks up
toward higher addresses when an element is removed. x86 uses the
esp register to point to the top of the stack, at the newest
element. A stack can be placed anywhere in main memory, as esp can
be set to any memory address. x86 provides two operations for
manipulating stacks:
• The push instruction and its variants add a new element on top
of the stack.
• The pop instruction and its variants remove the top-most element
from the stack.
+----------+----+
| 0x10000  | 00 |
+----------+----+
| 0x10001  | 00 |
+----------+----+
| 0x10002  | 00 |
+----------+----+
| 0x10003  | 00 |
+----------+----+
| 0x10004  | 12 | <- esp
+----------+----+

+----------+----+
| 0x10000  | 00 |
+----------+----+
| 0x10001  | 00 |
+----------+----+
| 0x10002  | 78 | <- esp
+----------+----+
| 0x10003  | 56 |
+----------+----+
| 0x10004  | 12 |
+----------+----+

+----------+----+
| 0x10000  | 00 |
+----------+----+
| 0x10001  | 00 |
+----------+----+
| 0x10002  | 00 |
+----------+----+
| 0x10003  | 00 |
+----------+----+
| 0x10004  | 12 | <- esp
+----------+----+
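The three diagrams can be read as an initial stack, the stack after pushing the 16-bit value 0x5678, and the stack after popping it again. That reading is an assumption based on the byte values shown. Under that assumption, a toy model in C might look like the sketch below; ToyStack, push16 and pop16 are hypothetical names, and bounds checking is left out for brevity.

```c
#include <stdint.h>

enum { STACK_BASE = 0x10000, STACK_SIZE = 5 };

typedef struct {
    uint8_t  mem[STACK_SIZE];  /* the bytes at 0x10000 .. 0x10004 */
    uint32_t esp;              /* address of the newest element */
} ToyStack;

/* push: move esp down by 2, then store the value in little-endian order */
static void push16(ToyStack *s, uint16_t value) {
    s->esp -= 2;
    s->mem[s->esp - STACK_BASE]     = (uint8_t)(value & 0xff);
    s->mem[s->esp - STACK_BASE + 1] = (uint8_t)(value >> 8);
}

/* pop: read the value at esp, then move esp back up by 2 */
static uint16_t pop16(ToyStack *s) {
    uint16_t lo = s->mem[s->esp - STACK_BASE];
    uint16_t hi = s->mem[s->esp - STACK_BASE + 1];
    s->esp += 2;
    return (uint16_t)(lo | (hi << 8));
}
```

Unlike the last diagram, this sketch does not reset the popped bytes to 00: popping only moves esp, and the old bytes stay in memory until they are overwritten.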
  3617. Automatic variables
Local variables are variables that exist within a scope. A scope
is delimited by a pair of braces: {..}. The most common scope in
which to define local variables is function scope. However, a
scope can also be unnamed, and variables created inside an unnamed
scope exist only within that scope and its inner scopes.
  3623. Function scope:
void foo() {
    int a;
    int b;
}
  3628. a and b are variables local to the function foo.
  3629. Unnamed scope:
int foo() {
    int i;
    {
        int a = 1;
        int b = 2;
        {
            return i = a + b;
        }
    }
}
a and b are local to the scope where they are defined and are
visible in its inner child scope, the one that returns i = a + b.
However, they do not exist at the function scope that creates i.
When a local variable is created, it is pushed on the stack; when
a local variable goes out of scope, it is popped off the stack and
thus destroyed. When an argument is passed from a caller to a
callee, it is pushed on the stack; when the callee returns to the
caller, the arguments are popped off the stack. Local variables
and arguments are automatically allocated upon entering a function
and destroyed after exiting it, which is why they are called
automatic variables.
  3651. A base frame pointer points to the start of the current function
  3652. frame, and is kept in ebp register. Whenever a function is
  3653. called, it is allocated with its own dedicated storage on stack,
  3654. called stack frame. A stack frame is where all local variables
  3655. and arguments of a function are placed on a stack[footnote:
  3656. Data and only data are exclusively allocated on stack for every
  3657. stack frame. No code resides here.
  3658. ].
When a function needs a local variable or an argument, it uses
ebp to access it:
• All local variables are allocated after ebp, at lower addresses.
Thus, to access a local variable, a number is subtracted from ebp
to reach the location of the variable.
• All arguments are allocated before ebp, at higher addresses. To
access an argument, a number is added to ebp to reach the location
of the argument.
• ebp itself points to the saved ebp of the caller (Old ebp in the
figure below); the return address sits right above it, at ebp+0x4.
  3669. +--------------------------------------+---------------------------------------------------------------------------+
  3670. | Previous Frame | Current Frame |
  3671. +--------------------------------------+-----------------------------+----------+----------------------------------+
  3672. | Function Arguments | | ebp | Local variables |
  3673. +-----+-----+-----+-----------+--------+-----------------------------+----------+-----+-----+-----+-----------+----+
  3674. | A1 | A2 | A3 | ........ | An | Return Address | Old ebp | L1 | L2 | L3 | ........ | Ln |
  3675. +-----+-----+-----+-----------+--------+-----------------------------+----------+-----+-----+-----+-----------+----+
  3676. A = Argument
  3677. L = Local Variable
  3678. Here is an example to make it more concrete:
  3679. Source
int add(int a, int b) {
    int i = a + b;
    return i;
}
  3686. Assembly
  3687. 080483db <add>:
  3688. #include <stdint.h>
  3689. int add(int a, int b) {
  3690. 80483db: push ebp
  3691. 80483dc: mov ebp,esp
  3692. 80483de: sub esp,0x10
  3693. int i = a + b;
  3694. 80483e1: mov edx,DWORD PTR [ebp+0x8]
  3695. 80483e4: mov eax,DWORD PTR [ebp+0xc]
  3696. 80483e7: add eax,edx
  3697. 80483e9: mov DWORD PTR [ebp-0x4],eax
  3698. return i;
  3699. 80483ec: mov eax,DWORD PTR [ebp-0x4]
  3700. }
  3701. 80483ef: leave
  3702. 80483f0: ret
In the assembly listing, [ebp-0x4] is the local variable i, since
it is allocated after ebp, with a length of 4 bytes (an int).
On the other hand, a and b are arguments and can be accessed with
ebp:
• [ebp+0x8] accesses a.
• [ebp+0xc] accesses b.
For accessing arguments, the rule is that the closer an argument
is to ebp on the stack, the closer it is to the function name in
the argument list.
Higher addresses
+------------+------------------+
| Location   | Content          |
+------------+------------------+
| [ebp+0xc]  | b                |
+------------+------------------+
| [ebp+0x8]  | a                |
+------------+------------------+
| [ebp+0x4]  | Return Address   |
+------------+------------------+
| [ebp]      | Old ebp          |
+------------+------------------+
| [ebp-0x4]  | i                |
+------------+------------------+
| [ebp-0x8]  | N                |
+------------+------------------+
Lower addresses
N = Next local variable starts here
From the figure, we can see that a and b are laid out in memory
in the same order as they are written in C, relative to the return
address.
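As a rough illustration of ebp-relative addressing, the sketch below lays out one frame of add in a small array and reads it back through offsets from ebp. Everything here is hypothetical scaffolding: the memory base, the slot chosen for ebp and the placeholder return address are arbitrary values picked for the example.

```c
#include <stdint.h>

/* Toy stack memory: word w lives at byte address BASE + 4*w. */
enum { BASE = 0x1000, WORDS = 8 };
enum { FAKE_RETURN_ADDRESS = 0x1234 };   /* placeholder, not a real address */

static uint32_t mem[WORDS];

/* Slot index of the 32-bit word at [ebp + offset]. */
static uint32_t slot(uint32_t ebp, int32_t offset) {
    return (ebp + (uint32_t)offset - BASE) / 4;
}

/* Read the word at [ebp + offset], like DWORD PTR [ebp+offset]. */
static uint32_t load(uint32_t ebp, int32_t offset) {
    return mem[slot(ebp, offset)];
}

/* Lay out one frame of add(a, b) around ebp and return ebp. */
static uint32_t build_add_frame(uint32_t a, uint32_t b) {
    uint32_t ebp = BASE + 16;                    /* arbitrary frame base */
    mem[slot(ebp, 12)] = b;                      /* [ebp+0xc]: second argument */
    mem[slot(ebp, 8)]  = a;                      /* [ebp+0x8]: first argument */
    mem[slot(ebp, 4)]  = FAKE_RETURN_ADDRESS;    /* [ebp+0x4]: return address */
    mem[slot(ebp, 0)]  = 0;                      /* [ebp]: old ebp (placeholder) */
    mem[slot(ebp, -4)] = a + b;                  /* [ebp-0x4]: local variable i */
    return ebp;
}
```

After ebp = build_add_frame(1, 2), load(ebp, 8) reads a, load(ebp, 12) reads b, and load(ebp, -4) reads i.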
Function Call and Return
  3740. Source
  3741. #include <stdio.h>
  3742. int add(int a, int b) {
  3743. int local = 0x12345;
  3744. return a + b;
  3745. }
  3746. int main(int argc, char *argv[]) {
add(1,2);
  3748. return 0;
  3749. }
  3750. Assembly
For every function call, gcc pushes the arguments on the stack in
reverse order with push instructions. That is, the arguments are
pushed in the reverse of the order in which they are written in
the high-level C code, to preserve the relative order between
arguments described in the previous section on how function
arguments and local variables are laid out. Then, gcc generates
a call instruction, which implicitly pushes a return address
before transferring control to the add function:
  3759. 080483f2 <main>:
  3760. int main(int argc, char *argv[]) {
  3761. 80483f2: push ebp
  3762. 80483f3: mov ebp,esp
  3763. add(1,2);
  3764. 80483f5: push 0x2
  3765. 80483f7: push 0x1
  3766. 80483f9: call 80483db <add>
  3767. 80483fe: add esp,0x8
  3768. return 0;
  3769. 8048401: mov eax,0x0
  3770. }
  3771. 8048406: leave
  3772. 8048407: ret
Upon finishing the call to the add function, the stack is restored
by adding 0x8 to the stack pointer esp (which is equivalent to 2
pop instructions). Finally, a leave instruction is executed and
main returns with a ret instruction. A ret instruction transfers
program execution back to the caller, to the instruction right
after the call instruction; when add returns, that is the
add esp,0x8 instruction at 80483fe. ret can return to that
location because the call instruction implicitly pushed the return
address, which is the address right after the call instruction;
whenever the CPU executes a ret instruction, it retrieves the
return address that sits right after all the arguments on the
stack.
At the end of a function, gcc places a leave instruction to clean
up all the space allocated for local variables and restore the
frame pointer to the frame pointer of the caller.
  3787. 080483db <add>:
  3788. #include <stdio.h>
  3789. int add(int a, int b) {
  3790. 80483db: push ebp
  3791. 80483dc: mov ebp,esp
  3792. 80483de: sub esp,0x10
  3793. int local = 0x12345;
80483e1: mov DWORD PTR [ebp-0x4],0x12345
  3795. return a + b;
  3796. 80483e8: mov edx,DWORD PTR [ebp+0x8]
  3797. 80483eb: mov eax,DWORD PTR [ebp+0xc]
  3798. 80483ee: add eax,edx
  3799. }
  3800. 80483f0: leave
  3801. 80483f1: ret
The above code that gcc generated for function calling is
actually the standard method that x86 defines. Read chapter 6,
“Procedure Calls, Interrupts, and Exceptions”, Intel manual
volume 1.
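The whole call sequence can be simulated with a toy machine in C. This is a simplified sketch, not how a CPU is implemented: the Cpu struct and the helper names are hypothetical, the stack is modelled as an array of 32-bit slots (so a byte offset of 0x8 becomes 2 slots), and the return address is just a number passed around.

```c
#include <stdint.h>

enum { WORDS = 64 };

typedef struct {
    uint32_t mem[WORDS];   /* toy stack memory, one 32-bit word per slot */
    uint32_t esp;          /* slot index of the top of the stack */
    uint32_t ebp;          /* slot index of the current frame base */
} Cpu;

static void push(Cpu *c, uint32_t v) { c->mem[--c->esp] = v; }
static uint32_t pop(Cpu *c)          { return c->mem[c->esp++]; }

/* Body of add(): prologue, work, leave, ret.
 * Returns the return address that ret pops off the stack. */
static uint32_t callee_add(Cpu *c, uint32_t *eax) {
    push(c, c->ebp);                 /* push ebp */
    c->ebp = c->esp;                 /* mov ebp,esp */
    c->esp -= 4;                     /* sub esp,0x10 (4 slots for locals) */
    c->mem[c->ebp - 1] = 0x12345;    /* mov DWORD PTR [ebp-0x4],0x12345 */
    *eax = c->mem[c->ebp + 2]        /* [ebp+0x8] is a */
         + c->mem[c->ebp + 3];       /* [ebp+0xc] is b */
    c->esp = c->ebp;                 /* leave, part 1: mov esp,ebp */
    c->ebp = pop(c);                 /* leave, part 2: pop ebp */
    return pop(c);                   /* ret: pop the return address */
}

/* Caller side, as in main: push arguments, call, clean up. */
static uint32_t call_add(Cpu *c, uint32_t a, uint32_t b, uint32_t return_address) {
    uint32_t eax = 0;
    push(c, b);                      /* last argument is pushed first */
    push(c, a);
    push(c, return_address);         /* call pushes the return address */
    (void)callee_add(c, &eax);
    c->esp += 2;                     /* add esp,0x8: drop both arguments */
    return eax;
}
```

After call_add returns, esp and ebp hold the same values as before the call, which is the point of the cleanup done by leave, ret and add esp,0x8.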
  3806. Loop
A loop simply resets the instruction pointer to an already
executed instruction and starts from there all over again. A loop
is just one application of the jmp instruction. However, because
looping is a pervasive pattern, it earned its own syntax in C.
  3811. Source
  3812. #include <stdio.h>
  3813. int main(int argc, char *argv[]) {
  3814. for (int i = 0; i < 10; i++) {
  3815. }
  3816. return 0;
  3817. }
  3818. Assembly
  3819. 080483db <main>:
  3820. #include <stdio.h>
  3821. int main(int argc, char *argv[]) {
  3822. 80483db: push ebp
  3823. 80483dc: mov ebp,esp
  3824. 80483de: sub esp,0x10
  3825. for (int i = 0; i < 10; i++) {
  3826. 80483e1: mov DWORD PTR [ebp-0x4],0x0
  3827. 80483e8: jmp 80483ee <main+0x13>
  3828. 80483ea: add DWORD PTR [ebp-0x4],0x1
  3829. 80483ee: cmp DWORD PTR [ebp-0x4],0x9
  3830. 80483f2: jle 80483ea <main+0xf>
  3831. }
  3832. return 0;
80483f4: mov eax,0x0
}
80483f9: leave
80483fa: ret
80483fb: xchg ax,ax
80483fd: xchg ax,ax
80483ff: nop
The colors mark the correspondence between the high-level code and
the assembly code:
1. The red instruction, at 80483e1, initializes i to 0.
2. The green instructions, at 80483ee and 80483f2, test i < 10 by
comparing i to 9 with cmp and jumping with jle. If i is less than
or equal to 9, jle jumps to 80483ea for another iteration.
3. The blue instruction, at 80483ea, increases i by 1, making the
loop able to terminate once the termination condition is
satisfied.
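The control flow of the listing can be mirrored in C with goto. The function below is a sketch for illustration: its name and the iterations counter are additions that do not appear in the original program, which has an empty loop body.

```c
/* for (int i = 0; i < 10; i++) {}  with the jumps of the listing. */
static int count_loop_iterations(void) {
    int iterations = 0;   /* illustrative counter, not in the original */
    int i = 0;            /* mov DWORD PTR [ebp-0x4],0x0 */
    goto check;           /* jmp 80483ee */
body:
    iterations++;         /* the (empty) loop body would sit here */
    i += 1;               /* add DWORD PTR [ebp-0x4],0x1 */
check:
    if (i <= 9)           /* cmp DWORD PTR [ebp-0x4],0x9 */
        goto body;        /* jle 80483ea */
    return iterations;
}
```

The loop runs for i = 0 through 9, so the counter ends at 10.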
Why does the increment instruction (the blue instruction)
appear before the compare instructions (the green
instructions)?
What assembly code can be generated for while and do...while?
  3851. Conditional
  3852. Again, conditional in C with if...else... construct is just
  3853. another application of jmp instruction under the hood. It is also
  3854. a pervasive pattern that earned its own syntax in C.
  3855. Source
  3856. #include <stdio.h>
  3857. int main(int argc, char *argv[]) {
  3858. int i = 0;
  3859. if (argc) {
  3860. i = 1;
  3861. } else {
  3862. i = 0;
  3863. }
  3864. return 0;
  3865. }
  3866. Assembly
  3867. int main(int argc, char *argv[]) {
  3868. 80483db: push ebp
  3869. 80483dc: mov ebp,esp
  3870. 80483de: sub esp,0x10
  3871. int i = 0;
  3872. 80483e1: mov DWORD PTR [ebp-0x4],0x0
  3873. if (argc) {
  3874. 80483e8: cmp DWORD PTR [ebp+0x8],0x0
  3875. 80483ec: je 80483f7 <main+0x1c>
  3876. i = 1;
  3877. 80483ee: mov DWORD PTR [ebp-0x4],0x1
  3878. 80483f5: jmp 80483fe <main+0x23>
  3879. } else {
  3880. i = 0;
  3881. 80483f7: mov DWORD PTR [ebp-0x4],0x0
  3882. }
  3883. return 0;
  3884. 80483fe: mov eax,0x0
  3885. }
  3886. 8048403: leave
  3887. 8048404: ret
The generated assembly code follows the same order as the
corresponding high-level syntax:
• The red instructions, from 80483e8 to 80483f5, represent the if
branch.
• The blue instruction, at 80483f7, represents the else branch.
• The green instruction, at 80483fe, is the exit point for both
the if and else branches.
The if branch first checks whether argc is false (equal to 0)
with the cmp instruction. If it is, je jumps to the else branch at
80483f7. Otherwise, execution continues with the code of the if
branch, which is the next instruction at 80483ee, copying 1 to i.
Finally, jmp skips over the else branch and proceeds to 80483fe,
which is the first instruction past the if...else... construct.
The else branch is entered when the cmp instruction in the if
branch finds argc equal to 0. It starts at 80483f7, whose
instruction copies 0 to i, and execution then falls through to
the first instruction past the if...else... construct without
any jump.
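The same branch structure can be written in C with goto, one statement per instruction of the listing. This is only a sketch: the function name choose is hypothetical, and it returns i so that the result can be observed, whereas the original main returns 0.

```c
/* if (argc) { i = 1; } else { i = 0; }  with the jumps of the listing. */
static int choose(int argc) {
    int i = 0;                 /* mov DWORD PTR [ebp-0x4],0x0 */
    if (argc == 0)             /* cmp DWORD PTR [ebp+0x8],0x0 */
        goto else_branch;      /* je 80483f7 */
    i = 1;                     /* mov DWORD PTR [ebp-0x4],0x1 */
    goto done;                 /* jmp 80483fe */
else_branch:
    i = 0;                     /* mov DWORD PTR [ebp-0x4],0x0 */
done:
    return i;                  /* illustrative; the original returns 0 */
}
```

Only the if branch needs a jump to reach the exit point; the else branch falls through to it.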
The Anatomy of a Program
Every program consists of code and data, and only those two
components make up a program. However, if a program consisted
purely of its own code and data, an operating system (as well as a
human) could not tell which block of binary is code and which is
just raw data, where in the program to start execution, or which
region of memory should be protected and which is free to modify.
For that reason, each program carries extra metadata that tells
the operating system how to handle the program.
When a source file is compiled, the generated machine code is
stored in an object file, which is just a block of binary. One or
more object files can be combined to produce an executable binary,
which is a complete program runnable in an operating system.
readelf is a program that recognizes and displays the ELF
metadata of a binary file, be it an object file or an executable
binary. ELF, or Executable and Linkable Format, is a file format
whose metadata, starting at the very beginning of the file, gives
an operating system the information necessary to load the
executable into main memory and run it. ELF can be thought of as
similar to the table of contents of a book. In a book, a table of
contents lists the page numbers of the main sections, subsections,
and sometimes even figures and tables for easy lookup. Similarly,
ELF lists the various sections used for code and data, and the
memory addresses of each symbol along with other information.
An ELF binary is composed of:
• An ELF header: the very first part of the file, which describes
the file's organization.
• A program header table: an array of fixed-size structures that
describes the segments of an executable.
• A section header table: an array of fixed-size structures that
describes the sections of an executable.
• Segments and sections: the main content of an ELF binary, which
are the code and data, divided into chunks of different purposes.
A segment is a composition of zero or more sections and is
directly loaded by an operating system at runtime.
A section is a block of binary that is either:
– actual program code and data that is available in memory when
a program runs.
– metadata about other sections used only in the linking
process, which disappears from the final executable.
The linker uses sections to build segments.
  3961. [float Figure:
  3962. [Figure 0.16:
  3963. ELF - Linking View vs Executable View (Source: Wikipedia)
  3964. ]
  3965. <Graphics file: C:/Users/Tu Do/os01/book_src/images/05/Elf-layout--en.pdf>
  3966. ]
Later we will compile our kernel as an ELF executable with GCC,
and explicitly specify how segments are created and where they
are loaded in memory through the use of a linker script, a text
file that instructs a linker how to generate a binary. For now, we
will examine the anatomy of an ELF executable in detail.
Reference documents:
The ELF specification is bundled as a man page in Linux:
$ man elf
It is a useful resource for understanding and implementing ELF.
However, it will be much easier to use after you finish this
chapter, as the specification mixes implementation details in it.
The default specification is a generic one, which every ELF
implementation follows. However, each platform provides extra
features unique to it. The ELF specification for x86 is currently
maintained on GitHub by H.J. Lu:
https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
Platform-dependent details are referred to as “processor specific”
in the generic ELF specification. We will not explore these
details, but study the generic details, which are enough for
crafting an ELF binary image for our operating system.
  3989. ELF header
  3990. To see the information of an ELF header:
  3991. $ readelf -h hello
  3992. The output:
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400430
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6648 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 28
  4015. Let's go through each field:
  4016. Magic
Displays the raw bytes that identify a file as an ELF binary.
Each byte carries a brief piece of information.
In the example, we have the following magic bytes:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Examining them byte by byte:
Byte                  Description
--------------------  ---------------------------------------------
7f 45 4c 46           Predefined values. The first byte is always
                      7F; the remaining 3 bytes represent the
                      string “ELF”.
02                    See Class field below.
01                    See Data field below.
01                    See Version field below.
00                    See OS/ABI field below.
00                    ABI version, displayed as the ABI Version
                      field in the readelf output.
00 00 00 00 00 00 00  Padding bytes. These bytes are unused and are
                      always set to 0. Padding bytes are added for
                      proper alignment, and are reserved for future
                      use when more information is needed.
  4033. Class
  4034. A byte in Magic field. It specifies the class or capacity of a
  4035. file.
  4036. Possible values:
  4037. Value Description
  4038. ---------------------------
  4039. 0 Invalid class
  4040. 1 32-bit objects
  4041. 2 64-bit objects
  4042. Data
  4043. A byte in Magic field. It specifies the data encoding of the
  4044. processor-specific data in the object file.
  4045. Possible values:
  4046. Value Description
  4047. ------------------------------------------
  4048. 0 Invalid data encoding
  4049. 1 Little endian, 2's complement
  4050. 2 Big endian, 2's complement
  4051. Version
  4052. A byte in Magic. It specifies the ELF header version number.
  4053. Possible values:
  4054. Value Description
  4055. ----------------------------
  4056. 0 Invalid version
  4057. 1 Current version
  4058. OS/ABI
  4059. A byte in Magic field. It specifies the target operating system
  4060. ABI. Originally, it was a padding byte.
  4061. Possible values: Refer to the latest ABI document, as it is a
  4062. long list of different operating systems.
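A minimal sketch of how a program could check the identification bytes described so far is shown below. The helper names and the IDX_ constants are made up for this example (the byte positions follow the Magic table above); a real program on Linux would more likely use the definitions from the system's elf.h header.

```c
#include <stddef.h>
#include <stdint.h>

/* Position of each identification byte inside the 16 Magic bytes. */
enum { IDX_CLASS = 4, IDX_DATA = 5, IDX_VERSION = 6 };

/* Returns 1 if the buffer starts with 7f 'E' 'L' 'F'. */
static int is_elf(const uint8_t *ident, size_t len) {
    return len >= 16 && ident[0] == 0x7f && ident[1] == 'E' &&
           ident[2] == 'L' && ident[3] == 'F';
}

/* Text for the Class byte, following the Class table. */
static const char *class_name(const uint8_t *ident) {
    switch (ident[IDX_CLASS]) {
    case 1:  return "32-bit objects";
    case 2:  return "64-bit objects";
    default: return "Invalid class";
    }
}

/* Text for the Data byte, following the Data table. */
static const char *data_name(const uint8_t *ident) {
    switch (ident[IDX_DATA]) {
    case 1:  return "Little endian, 2's complement";
    case 2:  return "Big endian, 2's complement";
    default: return "Invalid data encoding";
    }
}

/* Returns 1 if the Version byte holds the current version. */
static int is_current_version(const uint8_t *ident) {
    return ident[IDX_VERSION] == 1;
}
```

Fed the 16 magic bytes of the example, these helpers report a 64-bit, little-endian file with the current version.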
  4063. Type
  4064. Identifies the object file type.
Value    Description
-------  -------------------------------
0        No file type
1        Relocatable file
2        Executable file
3        Shared object file
4        Core file
0xff00   Processor specific, lower bound
0xffff   Processor specific, upper bound
  4075. The values from 0xff00 to 0xffff are reserved for a processor
  4076. to define additional file types meaningful to it.
  4077. Machine
  4078. Specifies the required architecture value for an ELF file e.g.
  4079. x86_64, MIPS, SPARC, etc. In the example, the machine is of x86_64
  4080. architecture.
  4081. Possible values: Please refer to the latest ABI document, as it
  4082. is a long list of different architectures.
  4083. Version
  4084. Specifies the version number of the current object file (not
  4085. the version of the ELF header, as the above Version field
  4086. specified).
  4087. Entry point address
Specifies the memory address of the very first code to be
executed. In a normal application program linked with the default
settings, this is the address of the _start symbol, the startup
code that eventually calls main, but it can be any function by
explicitly specifying its name to the linker with the -e option.
For the operating system we are going to write, this is the
single most important field that we need to retrieve to bootstrap
our kernel, and everything else can be ignored.
  4095. Start of program headers
The offset of the program header table from the start of the
file, in bytes. In the example, this number is 64 bytes, which
means the 65th byte, or <start address> + 64, is the start address
of the program header table. That is, if the file is loaded at
address 0x10000 in memory, then the start address is 0x10000 (the
very first byte of the Magic field, where the value 0x7f resides)
and the start address of the program header table is
0x10000 + 0x40 = 0x10040.
  4104. Start of section headers
  4105. The offset of the section header table in bytes, similar to the
  4106. start of program headers. In the example, it is 6648 bytes into
  4107. file.
  4108. Flags
Holds processor-specific flags associated with the file. The x86
ABI defines no such flags, so the value is 0x0, as in the example.
This field is unrelated to the EFLAGS register of the CPU.
  4113. Size of this header
Specifies the size of the ELF header in bytes. In the example, it
is 64 bytes, which is equal to Start of program headers. Note that
these two numbers are not necessarily equal, as the program header
table might be placed far away from the ELF header. The only
component with a fixed position in an ELF binary is the ELF
header, which appears at the very beginning of the file.
  4121. Size of program headers
  4122. Specifies the size of each program header in bytes. In the
  4123. example, it is 64 bytes.
  4124. Number of program headers
  4125. Specifies the total number of program headers. In the example,
  4126. the file has a total of 9 program headers.
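The three fields above are enough to locate any program header. The sketch below assumes, as in the earlier example, that the whole file is loaded unchanged at some address; the struct phdr_table and the function phdr_address are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three ELF header fields that locate the program header table. */
struct phdr_table {
    uint64_t offset;      /* "Start of program headers", in bytes */
    uint16_t entry_size;  /* "Size of program headers", in bytes  */
    uint16_t count;       /* "Number of program headers"          */
};

/* Compute the memory address of program header number `index`, given
 * the address at which the first byte of the file was loaded.
 * Returns false if the index is out of range. */
bool phdr_address(const struct phdr_table *table, uint64_t load_address,
                  uint16_t index, uint64_t *address)
{
    assert(table != NULL && address != NULL);
    if (index >= table->count)
        return false;
    *address = load_address + table->offset
             + (uint64_t)index * table->entry_size;
    return true;
}
```

With the values quoted in the text (offset 64, entry size 64, 9 entries) and a load address of 0x10000, index 0 gives 0x10040 and index 8, the last valid one, gives 0x10240.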
  4127. Size of section headers
  4128. Specifies the size of each section header in bytes. In the
  4129. example, it is 64 bytes.
  4130. Number of section headers
  4131. Specifies the total number of section headers. In the example,
  4132. the file has a total of 31 section headers. In a section header
  4133. table, the first entry in the table is always an empty section.
  4134. Section header string table index
  4135. Specifies the index of the header in the section header table
  4136. that points to the section holding the section names as
  4137. null-terminated strings. In the example, the index is 28, which
  4138. means it is entry number 28 of the table, counting from 0.
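In the ELF format, each section header stores its name as a byte offset into this string table rather than as text. A minimal sketch of the lookup follows; section_name is a hypothetical helper, and the string table is assumed to have been read into memory already.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the name stored at `name_offset` in a section header string
 * table, or NULL if the offset is out of range or the string is not
 * terminated inside the table. */
const char *section_name(const char *strtab, size_t strtab_size,
                         uint32_t name_offset)
{
    if (strtab == NULL || name_offset >= strtab_size)
        return NULL;
    /* Make sure a terminating NUL exists before the end of the table. */
    if (memchr(strtab + name_offset, '\0',
               strtab_size - name_offset) == NULL)
        return NULL;
    return strtab + name_offset;
}
```

The bounds checks matter for a kernel loader: a corrupted file could otherwise make the loader read past the end of the table.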
  4139. Section header table
  4140. As we know already, code and data compose a program. However, not
  4141. all types of code and data have the same purpose. For that
  4142. reason, instead of one big chunk of code and data, they are divided
  4143. into smaller chunks called sections, and each section must satisfy
  4144. these conditions (according to gABI):
  4145. • Every section in an object file has exactly one section header
  4146. describing it. But, section headers may exist that do not have
  4147. a section.
  4148. • Each section occupies one contiguous (possibly empty) sequence
  4149. of bytes within a file. That is, a section is never split into
  4150. two separate regions of bytes.
  4151. • Sections in a file may not overlap. No byte in a file resides
  4152. in more than one section.
  4153. • An object file may have inactive space. The various headers and
  4154. the sections might not “cover” every byte in an object file.
  4155. The contents of the inactive data are unspecified.
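The non-overlap condition can be checked mechanically. The sketch below treats each section as a half-open byte range in the file; struct file_range and ranges_overlap are names invented for this illustration, and we assume a section that occupies no file space (such as .bss) is given size 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* A section's footprint in the file: the half-open byte range
 * [offset, offset + size). */
struct file_range {
    uint64_t offset;
    uint64_t size;
};

/* Two ranges overlap when each one starts before the other one ends.
 * An empty range (size 0) contains no byte, so it never overlaps. */
bool ranges_overlap(struct file_range a, struct file_range b)
{
    if (a.size == 0 || b.size == 0)
        return false;
    return a.offset < b.offset + b.size &&
           b.offset < a.offset + a.size;
}
```

For instance, in the sample readelf output of this chapter, .text (offset 0x3e0, size 0x192) ends at 0x572 and .fini starts at 0x574, so the two do not overlap.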
  4156. To list all the section headers of an executable binary, e.g.
  4157. hello, use the following command:
  4158. $ readelf -S hello
  4159. Here is a sample output (do not worry if you don't understand the
  4160. output. Just skim to get your eyes familiar with it. We will
  4161. dissect it soon enough):
  4162. There are 31 section headers, starting at offset 0x19c8:
  4163. Section Headers:
  4164. [Nr] Name Type Address
  4165. Offset
  4166. Size EntSize Flags Link Info
  4167. Align
  4168. [ 0] NULL 0000000000000000
  4169. 00000000
  4170. 0000000000000000 0000000000000000 0 0 0
  4171. [ 1] .interp PROGBITS 0000000000400238
  4172. 00000238
  4173. 000000000000001c 0000000000000000 A 0 0 1
  4174. [ 2] .note.ABI-tag NOTE 0000000000400254
  4175. 00000254
  4176. 0000000000000020 0000000000000000 A 0 0 4
  4177. [ 3] .note.gnu.build-i NOTE 0000000000400274
  4178. 00000274
  4179. 0000000000000024 0000000000000000 A 0 0 4
  4180. [ 4] .gnu.hash GNU_HASH 0000000000400298
  4181. 00000298
  4182. 000000000000001c 0000000000000000 A 5 0 8
  4183. [ 5] .dynsym DYNSYM 00000000004002b8
  4184. 000002b8
  4185. 0000000000000048 0000000000000018 A 6 1 8
  4186. [ 6] .dynstr STRTAB 0000000000400300
  4187. 00000300
  4188. 0000000000000038 0000000000000000 A 0 0 1
  4189. [ 7] .gnu.version VERSYM 0000000000400338
  4190. 00000338
  4191. 0000000000000006 0000000000000002 A 5 0 2
  4192. [ 8] .gnu.version_r VERNEED 0000000000400340
  4193. 00000340
  4194. 0000000000000020 0000000000000000 A 6 1 8
  4195. [ 9] .rela.dyn RELA 0000000000400360
  4196. 00000360
  4197. 0000000000000018 0000000000000018 A 5 0 8
  4198. [10] .rela.plt RELA 0000000000400378
  4199. 00000378
  4200. 0000000000000018 0000000000000018 AI 5 24 8
  4201. [11] .init PROGBITS 0000000000400390
  4202. 00000390
  4203. 000000000000001a 0000000000000000 AX 0 0 4
  4204. [12] .plt PROGBITS 00000000004003b0
  4205. 000003b0
  4206. 0000000000000020 0000000000000010 AX 0 0
  4207. 16
  4208. [13] .plt.got PROGBITS 00000000004003d0
  4209. 000003d0
  4210. 0000000000000008 0000000000000000 AX 0 0 8
  4211. [14] .text PROGBITS 00000000004003e0
  4212. 000003e0
  4213. 0000000000000192 0000000000000000 AX 0 0
  4214. 16
  4215. [15] .fini PROGBITS 0000000000400574
  4216. 00000574
  4217. 0000000000000009 0000000000000000 AX 0 0 4
  4218. [16] .rodata PROGBITS 0000000000400580
  4219. 00000580
  4220. 0000000000000004 0000000000000004 AM 0 0 4
  4221. [17] .eh_frame_hdr PROGBITS 0000000000400584
  4222. 00000584
  4223. 000000000000003c 0000000000000000 A 0 0 4
  4224. [18] .eh_frame PROGBITS 00000000004005c0
  4225. 000005c0
  4226. 0000000000000114 0000000000000000 A 0 0 8
  4227. [19] .init_array INIT_ARRAY 0000000000600e10
  4228. 00000e10
  4229. 0000000000000008 0000000000000000 WA 0 0 8
  4230. [20] .fini_array FINI_ARRAY 0000000000600e18
  4231. 00000e18
  4232. 0000000000000008 0000000000000000 WA 0 0 8
  4233. [21] .jcr PROGBITS 0000000000600e20
  4234. 00000e20
  4235. 0000000000000008 0000000000000000 WA 0 0 8
  4236. [22] .dynamic DYNAMIC 0000000000600e28
  4237. 00000e28
  4238. 00000000000001d0 0000000000000010 WA 6 0 8
  4239. [23] .got PROGBITS 0000000000600ff8
  4240. 00000ff8
  4241. 0000000000000008 0000000000000008 WA 0 0 8
  4242. [24] .got.plt PROGBITS 0000000000601000
  4243. 00001000
  4244. 0000000000000020 0000000000000008 WA 0 0 8
  4245. [25] .data PROGBITS 0000000000601020
  4246. 00001020
  4247. 0000000000000010 0000000000000000 WA 0 0 8
  4248. [26] .bss NOBITS 0000000000601030
  4249. 00001030
  4250. 0000000000000008 0000000000000000 WA 0 0 1
  4251. [27] .comment PROGBITS 0000000000000000
  4252. 00001030
  4253. 0000000000000034 0000000000000001 MS 0 0 1
  4254. [28] .shstrtab STRTAB 0000000000000000
  4255. 000018b6
  4256. 000000000000010c 0000000000000000 0 0 1
  4257. [29] .symtab SYMTAB 0000000000000000
  4258. 00001068
  4259. 0000000000000648 0000000000000018 30 47 8
  4260. [30] .strtab STRTAB 0000000000000000
  4261. 000016b0
  4262. 0000000000000206 0000000000000000 0 0 1
  4263. Key to Flags:
  4264. W (write), A (alloc), X (execute), M (merge), S (strings), l
  4265. (large)
  4266. I (info), L (link order), G (group), T (TLS), E (exclude), x
  4267. (unknown)
  4268. O (extra OS processing required) o (OS specific), p (processor
  4269. specific)
  4270. The first line:
  4271. There are 31 section headers, starting at offset 0x19c8
  4272. gives the total number of section headers in the file and the
  4273. file offset at which the section header table starts. Then comes
  4274. the listing, section by section, under the following header, which
  4275. is also the format of each section's output:
  4276. [Nr] Name Type Address Offset
  4277. Size EntSize Flags Link Info Align
  4278. Each section has two lines with different fields:
  4279. Nr The index of each section.
  4280. Name The name of each section.
  4281. Type This field (in a section header) identifies the type of
  4282. each section. Types classify sections, much as types in
  4283. programming languages are used by a compiler.
  4284. Address The starting virtual address of each section. Note that
  4285. the addresses are virtual only when a program runs in an OS
  4286. with support for virtual memory enabled. In our OS, since we
  4287. run on bare metal, the addresses will all be physical.
  4288. Offset The offset of each section into a file. An offset is a
  4289. distance in bytes, from the first byte of a file to the start
  4290. of an object, such as a section or a segment in the context of
  4291. an ELF binary file.
  4293. Size The size in bytes of each section.
  4294. EntSize Some sections hold a table of fixed-size entries, such
  4295. as a symbol table. For such a section, this member gives the
  4296. size in bytes of each entry. The member contains 0 if the
  4297. section does not hold a table of fixed-size entries.
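Size and EntSize together give the number of entries in a table section. A small sketch, with section_entry_count as a hypothetical helper name:

```c
#include <stdint.h>

/* Number of fixed-size entries in a section. A section whose EntSize
 * is 0 does not hold a table, so it has no entries to count. */
uint64_t section_entry_count(uint64_t size, uint64_t entsize)
{
    if (entsize == 0)
        return 0;
    return size / entsize;
}
```

For the .symtab section in the sample output, Size is 0x648 and EntSize is 0x18, so the table holds 0x648 / 0x18 = 67 entries.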
  4298. Flags describes attributes of a section. Flags together with a
  4299. type define the purpose of a section. Two sections can be of
  4300. the same type, but serve different purposes. For example, even
  4301. though .data and .text share the same type, .data holds the
  4302. initialized data of a program while .text holds executable
  4303. instructions of a program. For that reason, .data is given read
  4304. and write permissions, but not execute permission. Any attempt
  4305. to execute code in .data is denied by the running OS: in Linux,
  4306. such invalid section usage gives a segmentation fault.
  4307. ELF provides this information so that an OS can implement such
  4308. a protection mechanism. However, running on bare metal, nothing
  4309. prevents us from doing anything. Our OS can execute code in the
  4310. data section and, conversely, write to the code section.
  4311. [Table 5: Section Flags]
  4314. +-------+--------------------------------------------------------------------+
  4315. | Flag  | Description                                                        |
  4316. +-------+--------------------------------------------------------------------+
  4318. | W     | Bytes in this section are writable during execution.               |
  4319. +-------+--------------------------------------------------------------------+
  4320. | A     | Memory is allocated for this section during process execution.     |
  4321. |       | Some control sections do not reside in the memory image of an      |
  4322. |       | object file; this attribute is off for those sections.             |
  4323. +-------+--------------------------------------------------------------------+
  4324. | X     | The section contains executable instructions.                      |
  4325. +-------+--------------------------------------------------------------------+
  4326. | M     | The data in the section may be merged to eliminate duplication.    |
  4327. |       | Each element in the section is compared against other elements in  |
  4328. |       | sections with the same name, type and flags. Elements that would   |
  4329. |       | have identical values at program run-time may be merged.           |
  4330. +-------+--------------------------------------------------------------------+
  4331. | S     | The data elements in the section consist of null-terminated        |
  4332. |       | character strings. The size of each character is specified in the  |
  4333. |       | section header's EntSize field.                                    |
  4334. +-------+--------------------------------------------------------------------+
  4335. | l     | Specific large section for x86_64 architecture. This flag is not   |
  4336. |       | specified in the Generic ABI but in the x86_64 ABI.                |
  4337. +-------+--------------------------------------------------------------------+
  4338. | I     | The Info field of this section header holds an index of a section  |
  4339. |       | header. Otherwise, the number is the index of something else.      |
  4340. +-------+--------------------------------------------------------------------+
  4341. | L     | Preserve section ordering when linking. If this section is         |
  4342. |       | combined with other sections in the output file, it must appear    |
  4343. |       | in the same relative order with respect to those sections, as the  |
  4344. |       | linked-to section appears with respect to sections the linked-to   |
  4345. |       | section is combined with. This applies when the Link field of this |
  4346. |       | section's header references another section (the linked-to         |
  4347. |       | section).                                                          |
  4348. +-------+--------------------------------------------------------------------+
  4349. | G     | This section is a member (perhaps the only one) of a section       |
  4350. |       | group.                                                             |
  4351. +-------+--------------------------------------------------------------------+
  4352. | T     | This section holds Thread-Local Storage, meaning that each thread  |
  4353. |       | has its own distinct instance of this data. A thread is a          |
  4354. |       | distinct execution flow of code. A program can have multiple       |
  4355. |       | threads that pack different pieces of code and execute             |
  4356. |       | separately, at the same time. We will learn more about threads     |
  4357. |       | when writing our kernel.                                           |
  4358. +-------+--------------------------------------------------------------------+
  4359. | E     | The link editor excludes this section from the executable or       |
  4360. |       | shared library that it builds when that object is not to be        |
  4361. |       | further relocated.                                                 |
  4362. +-------+--------------------------------------------------------------------+
  4363. | x     | Unknown flag to readelf. It happens because the linking process    |
  4364. |       | can be done manually with a linker like GNU ld (we will learn      |
  4365. |       | later). That is, section flags can be specified manually, and      |
  4366. |       | some flags are for a customized ELF that the open-source readelf   |
  4367. |       | doesn't know of.                                                   |
  4368. +-------+--------------------------------------------------------------------+
  4369. | O     | This section requires special OS-specific processing (beyond the   |
  4370. |       | standard linking rules) to avoid incorrect behavior. If a link     |
  4371. |       | editor does not recognize the OS-specific Type or Flags values     |
  4372. |       | of such a section, it should reject the object file that           |
  4373. |       | contains it.                                                       |
  4374. +-------+--------------------------------------------------------------------+
  4375. | o     | All bits included in this flag are reserved for operating          |
  4376. |       | system-specific semantics.                                         |
  4377. +-------+--------------------------------------------------------------------+
  4378. | p     | All bits included in this flag are reserved for                    |
  4379. |       | processor-specific semantics. If meanings are specified, the       |
  4380. |       | processor supplement explains them.                                |
  4381. +-------+--------------------------------------------------------------------+
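Inside the file, the flags are not stored as letters but as bits of one integer in the section header; readelf translates the bits into the letters listed above. The sketch below shows such a translation for the flags of the generic ABI. The bit values are quoted from the generic ABI as we understand it, and the names flag_letter, FLAG_LETTERS and section_flags_to_string are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One flag bit and the letter readelf prints for it. */
struct flag_letter {
    uint64_t bit;
    char letter;
};

/* Generic ABI section flag bits, listed from the lowest bit up. */
static const struct flag_letter FLAG_LETTERS[] = {
    {0x001, 'W'}, {0x002, 'A'}, {0x004, 'X'}, {0x010, 'M'}, {0x020, 'S'},
    {0x040, 'I'}, {0x080, 'L'}, {0x100, 'O'}, {0x200, 'G'}, {0x400, 'T'},
};

/* Write the letters of all set flags into `out` as a NUL-terminated
 * string and return the number of letters written. Letters that do
 * not fit in `capacity` bytes are dropped. */
size_t section_flags_to_string(uint64_t flags, char *out, size_t capacity)
{
    assert(out != NULL && capacity > 0);
    size_t n = 0;
    size_t count = sizeof FLAG_LETTERS / sizeof FLAG_LETTERS[0];
    for (size_t i = 0; i < count; i++) {
        if ((flags & FLAG_LETTERS[i].bit) != 0 && n + 1 < capacity)
            out[n++] = FLAG_LETTERS[i].letter;
    }
    out[n] = '\0';
    return n;
}
```

Under these assumptions, a flags value of 0x6 yields "AX" as for .text, and 0x3 yields "WA" as for .data in the sample output.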
  4382. Link and Info are numbers that reference the indexes of
  4383. sections, symbol table entries, or hash table entries. The Link
  4384. field holds the index of a section, while the Info field holds
  4385. an index of a section, a symbol table entry or a hash table
  4386. entry, depending on the type of a section.
  4387. Later when writing our OS, we will handcraft the kernel image
  4388. by explicitly linking the object files (produced by gcc)
  4389. through a linker script. We will specify the memory layout of
  4390. sections by specifying at what addresses they will appear in
  4391. the final image. But we will not assign any section flag and
  4392. let the linker take care of it. Nevertheless, knowing which
  4393. flag does what is useful.
  4394. Align is a value by which the starting address of a section
  4395. must be divisible. Only 0 and positive integral powers of two
  4396. are allowed. Values 0 and 1 mean the section has no alignment
  4397. constraint.
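The alignment rule can be written down as two small checks. This is only a sketch; is_valid_alignment and satisfies_alignment are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Align must be 0 or a power of two. Such values have at most one bit
 * set, so clearing the lowest set bit leaves nothing. */
bool is_valid_alignment(uint64_t align)
{
    return (align & (align - 1)) == 0;
}

/* Values 0 and 1 impose no constraint; otherwise the address must be
 * a multiple of the alignment. */
bool satisfies_alignment(uint64_t address, uint64_t align)
{
    assert(is_valid_alignment(align));
    if (align <= 1)
        return true;
    return (address & (align - 1)) == 0;
}
```

In the sample output, .text has address 0x4003e0 and Align 16, and .interp has address 0x400238 and Align 1; both pass the check.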
  4398. Output of .interp section:
  4399. [Nr] Name Type Address
  4400. Offset
  4401. Size EntSize Flags Link Info
  4402. Align
  4403. [ 1] .interp PROGBITS 0000000000400238
  4404. 00000238
  4405. 000000000000001c 0000000000000000 A 0 0 1
  4406. Nr is 1.
  4407. Type is PROGBITS, which means this section is part of the
  4408. program.
  4409. Address is 0x0000000000400238, which means the section is
  4410. loaded at this virtual memory address at runtime.
  4411. Offset is 0x00000238 bytes into file.
  4412. Size is 0x000000000000001c in bytes.
  4413. EntSize is 0, which means this section does not have any
  4414. fixed-size entry.
  4415. Flags are A (Allocatable), which means this section consumes
  4416. memory at runtime.
  4417. Info and Link are 0 and 0, which means this section links to no
  4418. section or entry in any table.
  4419. Align is 1, which means no alignment.
  4420. Output of the .text section:
  4421. [14] .text PROGBITS 00000000004003e0
  4422. 000003e0
  4423. 0000000000000192 0000000000000000 AX 0 0
  4424. 16
  4425. Nr is 14.
  4426. Type is PROGBITS, which means this section is part of the
  4427. program.
  4428. Address is 0x00000000004003e0, which means the section is
  4429. loaded at this virtual memory address at runtime.
  4430. Offset is 0x000003e0 bytes into file.
  4431. Size is 0x0000000000000192 in bytes.
  4432. EntSize is 0, which means this section does not have any
  4433. fixed-size entry.
  4434. Flags are A (Allocatable) and X (Executable), which means this
  4435. section consumes memory and can be executed as code at runtime.
  4436. Info and Link are 0 and 0, which means this section links to no
  4437. section or entry in any table.
  4438. Align is 16, which means the starting address of the section
  4439. should be divisible by 16, or 0x10. Indeed, it is:
  4440. 0x4003e0 / 0x10 = 0x4003e, with no remainder.
  4441. Understand Section in-depth
  4442. In this section, we will learn different details of section types
  4443. and the purposes of special sections e.g. .bss, .text, .data...
  4444. by looking at each section one by one. We will also examine the
  4445. content of each section as a hexdump with the commands:
  4446. $ readelf -x <section name|section number> <file>
  4447. For example, if you want to examine the content of section with
  4448. index 25 (the .data section in the sample output) in the file
  4449. hello:
  4450. $ readelf -x 25 hello
  4451. Equivalently, using name instead of index works:
  4452. $ readelf -x .data hello
  4453. If a section contains strings e.g. string symbol table, the flag
  4454. -x can be replaced with -p.
  4455. NULL marks a section header as inactive, with no associated
  4456. section. The NULL section header is always the first entry of
  4457. the section header table. This means that any useful section
  4458. starts from index 1.
  4459. The sample output of NULL section:
  4460. [Nr] Name Type Address
  4461. Offset
  4462. Size EntSize Flags Link Info
  4463. Align
  4464. [ 0] NULL 0000000000000000
  4465. 00000000
  4466. 0000000000000000 0000000000000000 0 0
  4467. 0
  4468. Examining the content, the section is empty:
  4469. Section '' has no data to dump.
  4470. NOTE marks a section that holds special information, supplied
  4471. by a vendor or a system builder, that other programs check for
  4472. conformance, compatibility, and so on.
  4473. In the sample output, we have 2 NOTE sections:
  4474. [Nr] Name Type Address
  4475. Offset
  4476. Size EntSize Flags Link Info
  4477. Align
  4478. [ 2] .note.ABI-tag NOTE 0000000000400254
  4479. 00000254
  4480. 0000000000000020 0000000000000000 A 0 0
  4481. 4
  4482. [ 3] .note.gnu.build-i NOTE 0000000000400274
  4483. 00000274
  4484. 0000000000000024 0000000000000000 A 0 0
  4485. 4
  4486. Examine 2nd section with the command:
  4487. $ readelf -x 2 hello
  4488. we have:
  4489. Hex dump of section '.note.ABI-tag':
  4490. 0x00400254 04000000 10000000 01000000 474e5500 ............GNU.
  4491. 0x00400264 00000000 02000000 06000000 20000000 ............ ...
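The bytes in this dump follow the general layout of an ELF note: three 4-byte words (name size, descriptor size, type), then the name, then the descriptor. The sketch below decodes the dump under the assumption, based on the GNU convention, that a note named GNU with type 1 carries an OS identifier followed by a three-part minimum kernel version. The names abi_tag, read_le32 and parse_abi_tag are ours, and little-endian byte order is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded content of a .note.ABI-tag section (GNU convention). */
struct abi_tag {
    uint32_t os;      /* 0 is used for Linux */
    uint32_t major;
    uint32_t minor;
    uint32_t patch;
};

/* Read a little-endian 32-bit value byte by byte. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Parse a 32-byte ABI tag note. Returns false if the buffer is too
 * short or the header is not the expected "GNU" note of type 1. */
bool parse_abi_tag(const uint8_t *note, size_t size, struct abi_tag *out)
{
    assert(note != NULL && out != NULL);
    if (size < 32)
        return false;
    if (read_le32(note) != 4 ||        /* name size: "GNU" plus NUL */
        read_le32(note + 4) != 16 ||   /* descriptor size: 4 words  */
        read_le32(note + 8) != 1)      /* note type                 */
        return false;
    if (memcmp(note + 12, "GNU", 4) != 0)
        return false;
    out->os = read_le32(note + 16);
    out->major = read_le32(note + 20);
    out->minor = read_le32(note + 24);
    out->patch = read_le32(note + 28);
    return true;
}
```

Applied to the dump above, the descriptor words are 0, 2, 6 and 0x20, which under this reading means the binary targets Linux with a minimum kernel version of 2.6.32.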
  4494. PROGBITS indicates a section holding the main content of a
  4495. program, either code or data.
  4496. There are many PROGBITS sections:
  4497. [Nr] Name Type Address
  4498. Offset
  4499. Size EntSize Flags Link Info
  4500. Align
  4501. [ 1] .interp PROGBITS 0000000000400238
  4502. 00000238
  4503. 000000000000001c 0000000000000000 A 0 0
  4504. 1
  4505. ...
  4506. [11] .init PROGBITS 0000000000400390
  4507. 00000390
  4508. 000000000000001a 0000000000000000 AX 0 0
  4509. 4
  4510. [12] .plt PROGBITS 00000000004003b0
  4511. 000003b0
  4512. 0000000000000020 0000000000000010 AX 0 0
  4513. 16
  4514. [13] .plt.got PROGBITS 00000000004003d0
  4515. 000003d0
  4516. 0000000000000008 0000000000000000 AX 0 0
  4517. 8
  4518. [14] .text PROGBITS 00000000004003e0
  4519. 000003e0
  4520. 0000000000000192 0000000000000000 AX 0 0
  4521. 16
  4522. [15] .fini PROGBITS 0000000000400574
  4523. 00000574
  4524. 0000000000000009 0000000000000000 AX 0 0
  4525. 4
  4526. [16] .rodata PROGBITS 0000000000400580
  4527. 00000580
  4528. 0000000000000004 0000000000000004 AM 0 0
  4529. 4
  4530. [17] .eh_frame_hdr PROGBITS 0000000000400584
  4531. 00000584
  4532. 000000000000003c 0000000000000000 A 0 0
  4533. 4
  4534. [18] .eh_frame PROGBITS 00000000004005c0
  4535. 000005c0
  4536. 0000000000000114 0000000000000000 A 0 0
  4537. 8
  4538. ...
  4539. [23] .got PROGBITS 0000000000600ff8
  4540. 00000ff8
  4541. 0000000000000008 0000000000000008 WA 0 0
  4542. 8
  4543. [24] .got.plt PROGBITS 0000000000601000
  4544. 00001000
  4545. 0000000000000020 0000000000000008 WA 0 0
  4546. 8
  4547. [25] .data PROGBITS 0000000000601020
  4548. 00001020
  4549. 0000000000000010 0000000000000000 WA 0 0
  4550. 8
  4551. [27] .comment PROGBITS 0000000000000000
  4552. 00001030
  4553. 0000000000000034 0000000000000001 MS 0 0
  4554. 1
  4555. For our operating system, we only need the following sections:
  4556. .text
  4557. This section holds all the compiled code of a program.
  4558. .data
  4559. This section holds the initialized data of a program. Since
  4560. the data are initialized with actual values, gcc stores the
  4561. section as actual bytes in the executable binary.
  4562. .rodata
  4563. This section holds read-only data, such as fixed-size strings
  4564. in a program, e.g. “Hello World”, and others.
  4565. .bss
  4566. This section, short for Block Started by Symbol, holds
  4567. uninitialized data of a program. Unlike other sections, no
  4568. space is allocated for this section in the image of the
  4569. executable binary on disk. The section is allocated only when
  4570. the program is loaded into main memory.
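To connect these four sections with C source code, here is a small sketch. The placement noted in the comments is what gcc on Linux typically does, not something the C language guarantees; it can be confirmed on a real build with readelf -S and readelf -s. All identifiers are made up for this example.

```c
/* Typical placement by gcc on Linux; the toolchain decides, so check
 * your own build with readelf before relying on it. */

int initialized_counter = 42;           /* .data: has an initial value */
const char greeting[] = "Hello World";  /* .rodata: read-only data     */
int uninitialized_counter;              /* .bss: zero at startup       */
static int zeroed_buffer[256];          /* .bss: no bytes in the file  */

/* The compiled instructions of this function go to .text. */
int next_value(void)
{
    uninitialized_counter += 1;
    zeroed_buffer[0] = uninitialized_counter;
    return initialized_counter + zeroed_buffer[0];
}
```

Because zeroed_buffer lives in .bss, its 1024 bytes (assuming a 4-byte int) should add nothing to the size of the file on disk, only to the memory image.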
  4571. Other sections are mainly needed for dynamic linking, that is,
  4572. linking code at runtime so that it can be shared between many
  4573. programs. To enable such a feature, an OS must be present as a
  4574. runtime environment. Since we run our OS on bare metal, we are
  4575. effectively creating such an environment. For simplicity, we
  4576. won't add dynamic linking to our OS.
  4577. SYMTAB and DYNSYM These sections hold symbol tables. A symbol
  4578. table is an array of entries that describe symbols in a
  4579. program. A symbol is a name assigned to an entity in a program.
  4580. The types of these entities are also the types of symbols; the
  4581. possible types are listed under the Type field below.
  4582. In the sample output, sections 5 and 29 are symbol tables:
  4583. [Nr] Name Type Address
  4584. Offset
  4585. Size EntSize Flags Link Info
  4586. Align
  4587. [ 5] .dynsym DYNSYM 00000000004002b8
  4588. 000002b8
  4589. 0000000000000048 0000000000000018 A 6 1
  4590. 8
  4591. ...
  4592. [29] .symtab SYMTAB 0000000000000000
  4593. 00001068
  4594. 0000000000000648 0000000000000018 30 47
  4595. 8
  4596. To show the symbol table:
  4597. $ readelf -s hello
  4598. Output consists of 2 symbol tables, corresponding to the two
  4599. sections above, .dynsym and .symtab:
  4600. Symbol table '.dynsym' contains 4 entries:
  4601. Num: Value Size Type Bind Vis Ndx
  4602. Name
  4603. 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
  4604. 1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND
  4605. puts@GLIBC_2.2.5 (2)
  4606. 2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND
  4607. __libc_start_main@GLIBC_2.2.5 (2)
  4608. 3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND
  4609. __gmon_start__
  4610. Symbol table '.symtab' contains 67 entries:
  4611. Num: Value Size Type Bind Vis Ndx
  4612. Name
  4613. ..........................................
  4614. 59: 0000000000601040 0 NOTYPE GLOBAL DEFAULT 26
  4615. _end
  4616. 60: 0000000000400430 42 FUNC GLOBAL DEFAULT 14
  4617. _start
  4618. 61: 0000000000601038 0 NOTYPE GLOBAL DEFAULT 26
  4619. __bss_start
  4620. 62: 0000000000400526 32 FUNC GLOBAL DEFAULT 14
  4621. main
  4622. 63: 0000000000000000 0 NOTYPE WEAK DEFAULT UND
  4623. _Jv_RegisterClasses
  4624. 64: 0000000000601038 0 OBJECT GLOBAL HIDDEN 25
  4625. __TMC_END__
  4626. 65: 0000000000000000 0 NOTYPE WEAK DEFAULT UND
  4627. _ITM_registerTMCloneTable
  4628. 66: 00000000004003c8 0 FUNC GLOBAL DEFAULT 11
  4629. _init
  4630. Num is the index of an entry in a table.
  4631. Value is the virtual memory address where the symbol is
  4632. located.
  4633. Size is the size of the entity associated with a symbol.
  4634. Type is the type of a symbol, one of the following:
  4635. NOTYPE The type of a symbol is not specified.
  4636. OBJECT The symbol is associated with a data object. In C, any
  4637. variable definition is of OBJECT type.
  4638. FUNC The symbol is associated with a function or other
  4639. executable code.
  4640. SECTION The symbol is associated with a section, and exists
  4641. primarily for relocation.
  4642. FILE The symbol is the name of a source file associated with
  4643. an executable binary.
  4644. COMMON The symbol labels an uninitialized variable, that is,
  4645. a variable in C defined as a global variable without an
  4646. initial value, or as an external variable using the
  4647. extern keyword. In other words, these variables stay in
  4648. the .bss section.
  4649. TLS The symbol is associated with a Thread-Local Storage
  4650. entity.
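In the file itself, a symbol table entry does not store the words FUNC or OBJECT. In the ELF symbol entry, the type and the Bind field (described next) share one byte: the low 4 bits hold the type and the high 4 bits hold the binding. The sketch below decodes that byte; the numeric values are quoted from the generic ABI as we understand it, and all identifiers are hypothetical.

```c
#include <stdint.h>

/* Symbol types and bindings, numbered as in the generic ABI. */
enum symbol_type {
    SYM_NOTYPE, SYM_OBJECT, SYM_FUNC, SYM_SECTION,
    SYM_FILE, SYM_COMMON, SYM_TLS
};
enum symbol_bind { BIND_LOCAL, BIND_GLOBAL, BIND_WEAK };

/* The low 4 bits of the info byte hold the symbol type. */
unsigned symbol_type_of(uint8_t info)
{
    return info & 0xf;
}

/* The high 4 bits of the info byte hold the binding. */
unsigned symbol_bind_of(uint8_t info)
{
    return info >> 4;
}

/* Map the type to the name readelf prints in the Type column. */
const char *symbol_type_name(uint8_t info)
{
    static const char *const names[] = {
        "NOTYPE", "OBJECT", "FUNC", "SECTION", "FILE", "COMMON", "TLS"
    };
    unsigned type = symbol_type_of(info);
    if (type >= sizeof names / sizeof names[0])
        return "UNKNOWN";
    return names[type];
}
```

Under these assumptions, the entry for main in the sample output (FUNC, GLOBAL) would carry the info byte 0x12.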
  4651. Bind is the scope of a symbol.
  4652. LOCAL are symbols that are only visible in the object files
  4653. that defined them. In C, the static modifier marks a symbol
  4654. (e.g. a variable/function) as local to only the file that
  4655. defines it.
  4656. If we define variables and functions with the static modifier:
  4657. static int global_static_var = 0;
  4658. static void local_func() {
  4659. }
  4660. int main(int argc, char *argv[])
  4661. {
  4662.     static int local_static_var = 0;
  4663.     return 0;
  4664. }
  4665. Then we get the static variables listed as local symbols
  4666. after compiling:
  4667. $ gcc -m32 hello.c -o hello
  4668. $ readelf -s hello
  4669. Symbol table '.dynsym' contains 5 entries:
  4670. Num: Value Size Type Bind Vis Ndx Name
  4671. 0: 00000000 0 NOTYPE LOCAL DEFAULT UND
  4672. 1: 00000000 0 FUNC GLOBAL DEFAULT UND
  4673. puts@GLIBC_2.0 (2)
  4674. 2: 00000000 0 NOTYPE WEAK DEFAULT UND
  4675. __gmon_start__
  4676. 3: 00000000 0 FUNC GLOBAL DEFAULT UND
  4677. __libc_start_main@GLIBC_2.0 (2)
  4678. 4: 080484bc 4 OBJECT GLOBAL DEFAULT 16
  4679. _IO_stdin_used
  4680. Symbol table '.symtab' contains 72 entries:
  4681. Num: Value Size Type Bind Vis Ndx Name
  4682. 0: 00000000 0 NOTYPE LOCAL DEFAULT UND
  4683. ......... output omitted .........
  4684. 38: 0804a020 4 OBJECT LOCAL DEFAULT 26
  4685. global_static_var
  4686. 39: 0804840b 6 FUNC LOCAL DEFAULT 14
  4687. local_func
  4688. 40: 0804a024 4 OBJECT LOCAL DEFAULT 26
  4689. local_static_var.1938
  4690. ......... output omitted .........
  4691. GLOBAL are symbols that are accessible by other object files
  4692. when linked together. These symbols are primarily
  4693. non-static functions and non-static global data. The extern
  4694. modifier marks a symbol as defined elsewhere but
  4695. accessible in the final executable binary, so an extern
  4696. variable is also considered GLOBAL.
  4697. Similar to the LOCAL example above, the output lists many
  4698. GLOBAL symbols such as main:
  4699. Num: Value Size Type Bind Vis Ndx Name
  4700. ......... output omitted .........
  4701. 66: 080483e1 10 FUNC GLOBAL DEFAULT 14 main
  4702. ......... output omitted .........
  4703. WEAK are symbols whose definitions can be redefined.
  4704. Normally, a symbol with multiple definitions is reported
  4705. as an error by the linker. However, this constraint is relaxed
  4706. when a definition is explicitly marked as weak, which means
  4707. the default implementation can be replaced by a different
  4708. definition at link time.
  4709. Suppose we have a default implementation of the function
  4710. add:
  4711. #include <stdio.h>
  4712. __attribute__((weak)) int add(int a, int b) {
  4713.     printf("warning: function is not implemented.\n");
  4714.     return 0;
  4715. }
  4716. int main(int argc, char *argv[])
  4717. {
  4718.     printf("add(1,2) is %d\n", add(1,2));
  4719.     return 0;
  4720. }
  4721. __attribute__((weak)) is a function attribute. A function
  4722. attribute is extra information that tells a compiler to
  4723. handle a function differently from a normal function.
  4724. In this example, the weak attribute makes the function
  4725. add a weak function, which means the default
  4726. implementation can be replaced by a different
  4727. definition at link time.
  4728. Function attributes are a feature of a compiler, not
  4729. of standard C.
  4730. If we do not supply a different function definition in a
  4731. different file (it must be in a different file, otherwise
  4732. gcc reports an error), then the default implementation
  4733. is used. When the function add is called, it only
  4734. prints the message "warning: function is not
  4735. implemented." and returns 0:
  4736. $ ./hello
  4737. warning: function is not implemented.
  4738. add(1,2) is 0
  4739. However, if we supply a different definition in another
  4740. file e.g. math.c:
  4741. int add(int a, int b) {
  4742.     return a + b;
  4743. }
  4744. and compile the two files together:
  4745. $ gcc math.c hello.c -o hello
  4746. Then, when running hello, no warning message is printed
  4747. and the correct value is returned.
  4748. A weak symbol is a mechanism to provide a default
  4749. implementation that can be replaced when a better
  4750. implementation (e.g. more specialized and optimized)
  4751. is available at link time.
  4752. Vis is the visibility of a symbol. The following values are
  4753. available:
  4754. [Table 6:
  4755. Symbol Visibility
  4756. ]
  4757. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4758. | Value | Description |
  4759. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4760. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4761. | DEFAULT | The visibility is specified by the binding type of a symbol.
  4762. • Global and weak symbols are visible outside of their defining
  4763. component (executable file or shared object).
  4764. • Local symbols are hidden. See HIDDEN below. |
  4765. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4766. | HIDDEN | A symbol is hidden when the name is not visible to any other
  4767. program outside of its running program. |
  4768. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4769. | PROTECTED | A symbol is protected when it is shared outside of its running
  4770. program or shared library and cannot be overridden. That is, there
  4771. can only be one definition for this symbol across running
  4772. programs that use it. No program can define its own definition of
  4773. the same symbol. |
  4774. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4775. | INTERNAL | Visibility is processor-specific and is defined by
  4776. processor-specific ABI. |
  4777. +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4778. Ndx is the index of a section that the symbol is in. Aside from
  4779. fixed index numbers that represent section indexes, index has
  4780. these special values:
  4781. [Table 7:
  4782. Symbol Index
  4783. ]
  4784. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4785. | Value | Description |
  4786. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4787. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4788. | ABS | The index will not be changed by any symbol relocation. |
  4789. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4790. | COM | The index refers to an unallocated common block. |
  4791. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4792. | UND | The symbol is undefined in the current object file, which means
  4793. the symbol depends on the actual definition in another file.
  4794. Undefined symbols appear when the object file refers to symbols
  4795. that are available at runtime, from a shared library. |
  4796. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4797. | LORESERVE
  4798. HIRESERVE | LORESERVE is the lower boundary of the reserve indexes. Its value
  4799. is 0xff00.
  4800. HIRESERVE is the upper boundary of the reserve indexes. Its value
  4801. is 0xffff.
  4802. The operating system reserves exclusive indexes between LORESERVE
  4803. and HIRESERVE, which do not map to any actual section header. |
  4804. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4805. | XINDEX | The index is larger than LORESERVE. The actual value will be
  4806. contained in the section SYMTAB_SHNDX, where each entry is a
  4807. mapping between a symbol, whose Ndx field is an XINDEX value, and
  4808. the actual index value. |
  4809. +-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  4810. | Others | Sometimes, values such as ANSI_COM, LARGE_COM, SCOM, S