Let's try to understand the memory addressing system and segmentation in Linux [Note: This article is written with 32-bit 80x86 (x86) microprocessors in mind].

1. Memory Addresses

There are three types of addresses:

Logical addresses: Logical addresses are included in machine language instructions to specify the address of an operand or of an instruction. Because the 80x86 architecture divides programs into segments, a logical address names a segment and an offset within that segment where the operand or instruction is stored.

Linear addresses: A linear address is derived from a logical address. It is a single 32-bit unsigned integer, so its value ranges from 0x00000000 to 0xffffffff. Linear addresses are also known as virtual addresses.

Physical addresses: These correspond to locations in RAM. A physical address is the value that the processor places on its address lines in order to access a value in chip-based memory.

The Memory Management Unit (MMU) transforms a logical address into a linear address by means of a hardware circuit called the segmentation unit; a second hardware circuit, the paging unit, then transforms the linear address into a physical address. Refer to the flow below:

Logical Address -----> [SEGMENTATION UNIT] -----> Linear Address -----> [PAGING UNIT] -----> Physical Address

In this article we will study how the segmentation unit works. Before jumping into it, let's look at the structure of a logical address.

A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment. The segment identifier is a 16-bit field also known as the Segment Selector. To make it easy to retrieve Segment Selectors quickly, the processor provides segmentation registers whose only purpose is to hold Segment Selectors; these registers are called cs, ss, ds, es, fs, and gs.
Three of the six segmentation registers have specific purposes:

cs: The code segment register, which points to a segment containing program instructions
ss: The stack segment register, which points to a segment containing the current program stack
ds: The data segment register, which points to a segment containing global and static data

The remaining three segmentation registers (es, fs, and gs) are general purpose and may refer to arbitrary data segments.

2. Segment Descriptor

Each segment is represented by an 8-byte Segment Descriptor that describes the segment's characteristics. Segment Descriptors are stored either in the Global Descriptor Table (GDT) or in the Local Descriptor Table (LDT). Usually only one GDT is defined, while each process is permitted to have its own LDT if it needs to create additional segments besides those stored in the GDT. The address and size of the GDT in main memory are contained in the gdtr register, while the address and size of the currently used LDT are contained in the ldtr register.

As discussed above, a logical address consists of a Segment Selector and an offset that specifies the relative address within the segment. A Segment Selector has three parts:

Index (13 bits): The position of the Segment Descriptor within the GDT or the LDT
Table Indicator (TI, 1 bit): Tells whether the Segment Descriptor is in the GDT (TI = 0) or in the LDT (TI = 1)
Requestor Privilege Level (RPL, 2 bits): Allows a program to request a resource at a lower privilege level than it would otherwise use

3. Segmentation Unit

The segmentation unit examines the Table Indicator (TI) field of the Segment Selector to determine whether the Segment Descriptor is stored in the GDT or in the LDT. The base address of the relevant table is then read from the gdtr or the ldtr register.

Because a Segment Descriptor is 8 bytes long, its relative address inside the GDT or the LDT is obtained by multiplying the 13-bit index field of the Segment Selector by 8.
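The selector fields and the multiply-by-8 rule described above can be sketched in a few lines of Python. This is only an illustration, not processor or kernel code: the bit positions follow the standard 80x86 selector layout (index in bits 15..3, TI in bit 2, RPL in bits 1..0), while the names `SelectorFields`, `decode_selector`, and `descriptor_offset` and the sample value 0x0010 are made up for this sketch.

```python
from typing import NamedTuple

DESCRIPTOR_SIZE = 8  # bytes per Segment Descriptor


class SelectorFields(NamedTuple):
    index: int  # 13-bit position of the descriptor in its table
    ti: int     # Table Indicator: 0 = GDT, 1 = LDT
    rpl: int    # Requestor Privilege Level, 0 to 3


def decode_selector(selector: int) -> SelectorFields:
    """Split a 16-bit Segment Selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a Segment Selector is a 16-bit value")
    return SelectorFields(
        index=selector >> 3,       # bits 15..3
        ti=(selector >> 2) & 0x1,  # bit 2
        rpl=selector & 0x3,        # bits 1..0
    )


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor from the start of the GDT or LDT."""
    return decode_selector(selector).index * DESCRIPTOR_SIZE


# 0x0010 is an arbitrary sample selector (hypothetical value).
print(decode_selector(0x0010), descriptor_offset(0x0010))
```

Adding the returned byte offset to the table's base address (taken from gdtr or ldtr) gives the address of the Segment Descriptor.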
For instance, if the GDT is at 0x00020000 (the value stored in the gdtr register) and the index specified by the Segment Selector is 2, the address of the corresponding Segment Descriptor is 0x00020000 + (2 x 8), or 0x00020010.

Once the Segment Descriptor has been located, its 'base' field gives the base address of the segment itself.

Finally, the offset part of the logical address is added to this base address, which yields the linear address of the operand or instruction we are looking for.

So far, the explanation has covered how segmentation works in hardware. Now let's study how Linux uses it.

4. How Linux Uses Segmentation

Segmentation was included in 80x86 processors so that programs could be divided into logically related entities such as global variables, local variables, and subroutines. Linux uses segmentation in a very limited way. It prefers paging because the two mechanisms are largely redundant: segmentation can assign a different linear address space to each process, while paging can map the same linear address space onto different physical addresses. Paging makes memory management simpler, since all programs share the same set of linear addresses.

All Linux programs running in User Mode share the same user code segment and user data segment, while all code running in Kernel Mode shares the same kernel code segment and kernel data segment.

Since all of these segments start at linear address 0x00000000, all processes, whether in User Mode or in Kernel Mode, can use the same logical addresses. For the same reason, logical addresses and linear addresses coincide in Linux: the value of the offset field of a logical address is always the same as the value of the corresponding linear address.

The Current Privilege Level (CPL) of the CPU indicates whether the processor is in User Mode or Kernel Mode, and is specified by the RPL field of the Segment Selector stored in the cs register.
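As a small illustration of the last point, the CPL can be read from the two low-order bits of the selector held in cs. The sketch below assumes the standard selector layout; the function names and the two sample selector values are hypothetical and chosen only for the example.

```python
def current_privilege_level(cs_selector: int) -> int:
    """Return the CPL, i.e. the RPL field (bits 1..0) of the cs selector."""
    return cs_selector & 0x3


def mode_name(cs_selector: int) -> str:
    """Map the CPL to the two modes discussed in the article."""
    cpl = current_privilege_level(cs_selector)
    if cpl == 0:
        return "Kernel Mode"
    if cpl == 3:
        return "User Mode"
    return f"privilege level {cpl}"  # levels 1 and 2 are not covered here


# Hypothetical selectors: same table, different index and RPL fields.
sample_kernel_cs = (12 << 3) | 0  # index 12, TI 0, RPL 0
sample_user_cs = (14 << 3) | 3    # index 14, TI 0, RPL 3

print(mode_name(sample_kernel_cs), mode_name(sample_user_cs))
```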
Whenever the CPL is changed, some segmentation registers must be updated accordingly. For instance, when the CPL is equal to 3 (User Mode), the data segment register must contain the Segment Selector of the user data segment, but when the CPL is equal to 0 (Kernel Mode), the data segment register must contain the Segment Selector of the kernel data segment.

How paging takes place in hardware and in Linux is the subject of the next article.
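As a recap of sections 3 and 4, here is a toy model of the segmentation unit in Python. It is a sketch under simplifying assumptions, not how the hardware or the kernel is implemented: a descriptor is reduced to its 'base' field, the tables are plain Python lists, and `Descriptor`, `logical_to_linear`, and `toy_gdt` are hypothetical names. It shows why a segment with base 0, like the ones Linux uses, makes the linear address equal to the offset.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Descriptor:
    base: int  # linear address at which the segment starts


def logical_to_linear(
    selector: int,
    offset: int,
    gdt: Sequence[Optional[Descriptor]],
    ldt: Sequence[Optional[Descriptor]] = (),
) -> int:
    """Mimic the segmentation unit: (selector, offset) -> linear address."""
    index = selector >> 3            # 13-bit index field
    use_ldt = (selector >> 2) & 0x1  # Table Indicator bit
    table = ldt if use_ldt else gdt
    if index >= len(table) or table[index] is None:
        raise ValueError(f"no descriptor at index {index}")
    # Add the offset to the segment base; keep the result within 32 bits.
    return (table[index].base + offset) & 0xFFFFFFFF


# Hypothetical table: entry 1 is a flat segment (base 0), entry 2 is not.
toy_gdt = [None, Descriptor(base=0x00000000), Descriptor(base=0x00100000)]

flat = logical_to_linear(0x08, 0x0804A010, toy_gdt)     # selector index 1
shifted = logical_to_linear(0x10, 0x00000020, toy_gdt)  # selector index 2
print(hex(flat), hex(shifted))
```

With the flat segment the offset passes through unchanged, while the segment based at 0x00100000 shifts the same kind of offset by its base.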