
ARM32: Physical/Virtual Address Translation

Chapter 1: Virtual Memory Layout and Common Macro Definitions

1.1 Memory Layout

 

The ARM Linux virtual memory layout is described in the kernel documentation, and it differs somewhat from x86's.

Some of the address ranges below were found to be unused on the 9850K project, and were greyed out in the original table.

Kernel/documentation/arm/memory.txt

Start      End        Use
--------------------------------------------------------------------------
ffff8000   ffffffff   copy_user_page / clear_user_page use.

ffff4000   ffffffff   cache aliasing on ARMv6 and later CPUs.

ffff1000   ffff7fff   Reserved.
                      Platforms must not use this address range.

ffff0000   ffff0fff   CPU vector page.
                      The CPU vectors are mapped here if the CPU supports
                      vector relocation (control register V bit).

fffe0000   fffeffff   XScale cache flush area.  This is used in
                      proc-xscale.S to flush the whole data cache.
                      (XScale does not have TCM.)

fffe8000   fffeffff   DTCM mapping area for platforms with DTCM mounted
                      inside the CPU.

fffe0000   fffe7fff   ITCM mapping area for platforms with ITCM mounted
                      inside the CPU.

ffc00000   ffefffff   Fixmap mapping region.  Addresses provided by
                      fix_to_virt() will be located here.

fee00000   feffffff   Mapping of PCI I/O space.  This is a static mapping
                      within the vmalloc space.

VMALLOC_START  VMALLOC_END-1
                      vmalloc() / ioremap() space.
                      Memory returned by vmalloc/ioremap will be
                      dynamically placed in this region.  Machine-specific
                      static mappings are also located here through
                      iotable_init().  VMALLOC_START is based upon the
                      value of the high_memory variable, and VMALLOC_END
                      is equal to 0xff800000.

PAGE_OFFSET    high_memory-1
                      Kernel direct-mapped RAM region.
                      This maps the platform's RAM, and typically maps all
                      platform RAM in a 1:1 relationship.

PKMAP_BASE     PAGE_OFFSET-1
                      Permanent kernel mappings.
                      One way of mapping HIGHMEM pages into kernel space.

MODULES_VADDR  MODULES_END-1
                      Kernel module space.
                      Kernel modules inserted via insmod are placed here
                      using dynamic mappings.

00001000   TASK_SIZE-1
                      User space mappings.
                      Per-thread mappings are placed here via the mmap()
                      system call.

00000000   00000fff   CPU vector page / null pointer trap.
                      CPUs which do not support vector remapping place
                      their vector page here.  NULL pointer dereferences by
                      both the kernel and user space are also caught via
                      this mapping.

 

 

The table above can also be rendered graphically (figure omitted from this copy).

 

 

This is confirmed by the boot log:

[    0.000000] c0 Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (    4 kB)
    fixmap  : 0xffc00000 - 0xfff00000   ( 3072 kB)
    vmalloc : 0xf0800000 - 0xff800000   (  240 MB)
    lowmem  : 0xc0000000 - 0xf0000000   (  768 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (    2 MB)
    modules : 0xbf000000 - 0xbfe00000   (   14 MB)
    .text   : 0xc0008000 - 0xc0b00000   (11232 kB)
    .init   : 0xc1000000 - 0xc1400000   ( 4096 kB)
    .data   : 0xc1400000 - 0xc14c76a4   (  798 kB)
    .bss    : 0xc14c76a4 - 0xc1cc0434   ( 8164 kB)

 

 

1.2 Basic Memory-Address Concepts

 

User virtual addresses

These are the regular addresses seen by user-space programs. User addresses are either 32 or 64 bits long, depending on the hardware architecture, and each process has its own virtual address space. My understanding is that on the 9850K, user virtual addresses span:

00001000         ~       TASK_SIZE-1

 

Physical addresses

These addresses are used between the processor and system memory. Physical addresses are also 32 or 64 bits long; in some cases even 32-bit systems can use 64-bit physical memory.

 

Kernel logical addresses

Kernel logical addresses make up the kernel's regular address space. They map some portion (possibly all) of memory and are often treated as if they were physical addresses. On most architectures, a logical address and its associated physical address differ only by a constant offset. Logical addresses use the hardware's native pointer size, so on 32-bit systems with large amounts of memory installed they cannot address all of physical memory. Logical addresses are usually stored in variables of type unsigned long or void *. Memory returned by kmalloc is a kernel logical address.

 

Kernel virtual addresses

Kernel virtual addresses are similar to logical addresses in that both map kernel-space addresses onto physical addresses. The mapping from a kernel virtual address to a physical address need not be linear and one-to-one, however, which is precisely the defining property of logical addresses. All logical addresses are kernel virtual addresses, but many kernel virtual addresses are not logical addresses. For example, memory allocated by vmalloc has a virtual address (but no direct physical mapping); the kmap function also returns virtual addresses. Virtual addresses are usually stored in pointer variables.

 

 

Linear addresses

As I understand it, a linear address is a virtual address that corresponds linearly to a physical address, though not necessarily via a one-to-one mapping. In that sense the whole 0-4 GB virtual range can be regarded as linear addresses.

 

Low memory

Memory in kernel space for which logical addresses exist.

High memory

Memory for which no logical address exists, because it lies beyond the range covered by kernel virtual addresses.

My understanding of high memory on the 9850K: physical addresses above 768 MB, i.e. above 0xb0000000, have the top two bits of page.flags set to 01 and are allocated from the highmem zone. If the phone's physical RAM is under 768 MB, high memory is generally not configured.

 

 

1.3 Common Macro Definitions in Linux Memory Management

The Linux kernel is conventionally split into hardware-dependent and hardware-independent layers. For page-table management, up to and including 2.6.10 the hardware-independent layer used a three-level page-directory scheme, regardless of whether the underlying hardware also implemented three levels. Starting with 2.6.11, to accommodate 64-bit CPU architectures, the hardware-independent layer switched to a four-level scheme.

The 9850K is a 32-bit system using two-level mapping, so the Linux hardware-independent layer effectively performs a two-level walk, skipping pud and pmd. The macros below are the implementations from the sp9850ka_1h10 project on the sprdroid7.0_trunk_k44_17b branch (kernel 4.4):

kernel/arch/arm/include/asm/pgtable.h

/*
 * Just any arbitrary offset to the start of the vmalloc VM area: the
 * current 8MB value just means that there will be a 8MB "hole" after the
 * physical memory until the kernel virtual memory starts.  That means that
 * any out-of-bounds memory accesses will hopefully be caught.
 * The vmalloc() routines leaves a hole of 4kB between each vmalloced
 * area for the same reason. ;)
 */
#define VMALLOC_OFFSET  (8*1024*1024)
#define VMALLOC_START   (((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))  /* 4034920448 = 0xF0800000, with high_memory = 0xf0000000 */
#define VMALLOC_END     0xff800000UL
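As a sanity check, the VMALLOC_START arithmetic can be reproduced in plain C. This is a minimal sketch, not kernel code: `vmalloc_start` is a hypothetical helper, and the high_memory value 0xf0000000 is the one reported in the boot log above.

```c
#include <stdint.h>

/* VMALLOC_OFFSET is 8 MB, as in the kernel header. */
#define VMALLOC_OFFSET (8u * 1024 * 1024)

/* vmalloc_start() is a hypothetical helper reproducing the macro: step past
 * the 8 MB guard hole above high_memory, then align down to an 8 MB boundary. */
static uint32_t vmalloc_start(uint32_t high_memory)
{
    return (high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1u);
}
```

With high_memory = 0xf0000000 this yields 0xf0800000, matching the vmalloc line in the boot log.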

 

#define LIBRARY_TEXT_START  0x0c000000

 

#include <asm-generic/pgtable-nopud.h>
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
#include <asm-generic/pgtable.h>

/* to find an entry in a page-table-directory */
#define pgd_index(addr)       ((addr) >> PGDIR_SHIFT)

#define pgd_offset(mm, addr)  ((mm)->pgd + pgd_index(addr))

 

/* to find an entry in a kernel page-table-directory */
#define pgd_offset_k(addr)  pgd_offset(&init_mm, addr)

#define pmd_none(pmd)       (!pmd_val(pmd))

 

static inline pte_t *pmd_page_vaddr(pmd_t pmd)
{
	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
}

#define pmd_page(pmd)  pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))  /* PHYS_MASK is 0xffffffff */

 

#define __pte_map(pmd)   (pte_t *)kmap_atomic(pmd_page(*(pmd)))

#define pte_index(addr)  (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))

#define pte_offset_kernel(pmd,addr)  (pmd_page_vaddr(*(pmd)) + pte_index(addr))  /* In my view direct-mapped addresses have no pte; this macro serves user-space addresses whose pte tables live in the direct-mapped region, or highmem addresses. */

#define pte_offset_map(pmd,addr)     (__pte_map(pmd) + pte_index(addr))
#define pte_unmap(pte)               __pte_unmap(pte)

#define pte_pfn(pte)       ((pte_val(pte) & PHYS_MASK) >> PAGE_SHIFT)
#define pfn_pte(pfn,prot)  __pte(__pfn_to_phys(pfn) | pgprot_val(prot))

#define pte_page(pte)      pfn_to_page(pte_pfn(pte))
#define mk_pte(page,prot)  pfn_pte(page_to_pfn(page), prot)
#define pte_none(pte)      (!pte_val(pte))                   /* true means no mapping has been established for this entry yet */
#define pte_present(pte)   (pte_isset((pte), L_PTE_PRESENT)) /* tests whether the page is resident in memory; true means it is */

 

 

 

kernel/arch/arm/include/asm/pgtable-2level.h

#define PTRS_PER_PTE  512
#define PTRS_PER_PMD  1
#define PTRS_PER_PGD  2048

#define PTE_HWTABLE_PTRS  (PTRS_PER_PTE)
#define PTE_HWTABLE_OFF   (PTE_HWTABLE_PTRS * sizeof(pte_t))
#define PTE_HWTABLE_SIZE  (PTRS_PER_PTE * sizeof(u32))

 

/*
 * PMD_SHIFT determines the size of the area a second-level page table can map
 * PGDIR_SHIFT determines what a third-level page table entry can map
 */
#define PMD_SHIFT    21
#define PGDIR_SHIFT  21

#define PMD_SIZE     (1UL << PMD_SHIFT)
#define PMD_MASK     (~(PMD_SIZE-1))
#define PGDIR_SIZE   (1UL << PGDIR_SHIFT)  /* 2^21 = 2MB */
#define PGDIR_MASK   (~(PGDIR_SIZE-1))
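The shift values above fix how a 32-bit virtual address decomposes under the Linux two-level scheme. The following is a standalone sketch of that split (the helper names mirror the kernel macros but are defined here from scratch): bits [31:21] index the pgd, bits [20:12] index the pte table, bits [11:0] are the page offset.

```c
#include <stdint.h>

/* Standalone model of the Linux/ARM32 split: PGDIR_SHIFT = 21,
 * PAGE_SHIFT = 12, PTRS_PER_PTE = 512. */
static uint32_t pgd_index(uint32_t va) { return va >> 21; }          /* top 11 bits  */
static uint32_t pte_index(uint32_t va) { return (va >> 12) & 511u; } /* middle 9 bits */
static uint32_t page_off(uint32_t va)  { return va & 0xfffu; }       /* low 12 bits  */
```

For example, the kernel .text start 0xc0008123 lands in pgd slot 0x600, pte slot 8, offset 0x123.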

 

/*
 * section address mask and size definitions.
 */
#define SECTION_SHIFT  20
#define SECTION_SIZE   (1UL << SECTION_SHIFT)
#define SECTION_MASK   (~(SECTION_SIZE-1))

#define USER_PTRS_PER_PGD  (TASK_SIZE / PGDIR_SIZE)  /* 0xbf000000 / 0x200000 = 0x5f8 entries; times 8 bytes each gives 0x2fc0 */

/*
 * The "pud_xxx()" functions here are trivial when the pmd is folded into
 * the pud: the pud entry is never bad, always exists, and can't be set or
 * cleared.
 */
#define pud_none(pud)       (0)
#define pud_bad(pud)        (0)
#define pud_present(pud)    (1)
#define pud_clear(pudp)     do { } while (0)
#define set_pud(pud,pudp)   do { } while (0)

 

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
{
	return (pmd_t *)pud;
}

 

 

 

Kernel/arch/arm/include/asm/page.h      // with an MMU, asm-generic/page.h is not used
/* PAGE_SHIFT determines the page size */
#define PAGE_SHIFT   12
#define PAGE_SIZE    (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK    (~((1 << PAGE_SHIFT) - 1))  /* 0xfffff000 */
#define PAGE_OFFSET  UL(CONFIG_PAGE_OFFSET)      /* 0xc0000000 */

 

#ifndef CONFIG_MMU
#include <asm/page-nommu.h>
#else
#ifdef CONFIG_ARM_LPAE
#include <asm/pgtable-3level-types.h>
#else
#include <asm/pgtable-2level-types.h>
#endif
#endif /* CONFIG_MMU */
#include <asm/memory.h>

 

 

 

kernel/arch/arm/include/asm/fixmap.h

#define FIXADDR_START  0xffc00000UL
#define FIXADDR_END    0xfff00000UL
#define FIXADDR_TOP    (FIXADDR_END - PAGE_SIZE)

 

 

 

kernel/arch/arm/include/asm/highmem.h

/* start after fixmap area */
#define PKMAP_BASE  (PAGE_OFFSET - PMD_SIZE)  /* 0xBFE00000 */
/* PKMAP size is LAST_PKMAP * PAGE_SIZE, i.e. 512 * (1 << 12) = 2MB.
   x86 32-bit defines it differently, in kernel/arch/x86/include/asm/pgtable_32_types.h:
   #define PKMAP_BASE ((FIXADDR_START - PAGE_SIZE * (LAST_PKMAP + 1)) & PMD_MASK)
   Hence most diagrams on the web place pkmap after vmalloc_end, but on ARM it sits just below PAGE_OFFSET. */
#define LAST_PKMAP       PTRS_PER_PTE  /* 512 */
#define LAST_PKMAP_MASK  (LAST_PKMAP - 1)
#define PKMAP_NR(virt)   (((virt) - PKMAP_BASE) >> PAGE_SHIFT)
#define PKMAP_ADDR(nr)   (PKMAP_BASE + ((nr) << PAGE_SHIFT))

#define kmap_prot        PAGE_KERNEL

 

 

 

kernel/arch/arm/include/asm/pgtable-2level-types.h
typedef u32 pteval_t;
typedef u32 pmdval_t;

typedef pteval_t pte_t;
typedef pmdval_t pmd_t;
typedef pmdval_t pgd_t[2];
typedef pteval_t pgprot_t;

#define pte_val(x)     (x)
#define pmd_val(x)     (x)
#define pgd_val(x)     ((x)[0])
#define pgprot_val(x)  (x)

#define __pte(x)       (x)
#define __pmd(x)       (x)
#define __pgprot(x)    (x)

 

 

 

kernel/include/asm-generic/memory_model.h
/* struct page <-> pfn conversion; note that struct page and physical pages correspond one-to-one */
/*
 * Convert a physical address to a Page Frame Number and back
 */
#define __phys_to_pfn(paddr)  ((unsigned long)((paddr) >> PAGE_SHIFT))
#define __pfn_to_phys(pfn)    PFN_PHYS(pfn)

#define page_to_pfn  __page_to_pfn
#define pfn_to_page  __pfn_to_page

#define __pfn_to_page(pfn)    (mem_map + ((pfn) - ARCH_PFN_OFFSET))
#define __page_to_pfn(page)   ((unsigned long)((page) - mem_map) + \
                               ARCH_PFN_OFFSET)   /* pointer subtraction yields the number of elements between the two pointers */
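The pointer arithmetic behind these two macros can be modeled in a standalone toy: `struct page` here has a single field, `mem_map` is a tiny local array, and ARCH_PFN_OFFSET uses the 0x80000 value this platform reports. The point is only that page_to_pfn is element-wise pointer subtraction plus an offset.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy stand-ins for the real kernel objects. */
struct page { uint32_t flags; };

#define ARCH_PFN_OFFSET 0x80000u   /* PHYS_PFN_OFFSET on this platform */

static struct page mem_map[16];    /* tiny stand-in for the real array */

static struct page *pfn_to_page(uint32_t pfn)
{
    return mem_map + (pfn - ARCH_PFN_OFFSET);
}

static uint32_t page_to_pfn(const struct page *p)
{
    /* pointer subtraction counts struct page elements, not bytes */
    return (uint32_t)(p - mem_map) + ARCH_PFN_OFFSET;
}
```

The round trip pfn -> page -> pfn is the identity for any pfn inside the array.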

 

 

 

Kernel/include/asm-generic/pgtable-nopud.h
typedef struct { pgd_t pgd; } pud_t;
#define PUD_SHIFT     PGDIR_SHIFT
#define PTRS_PER_PUD  1
#define PUD_SIZE      (1UL << PUD_SHIFT)
#define PUD_MASK      (~(PUD_SIZE-1))
/*
 * The "pgd_xxx()" functions here are trivial for a folded two-level
 * setup: the pud is never bad, and a pud always exists (as it's folded
 * into the pgd entry)
 */
static inline int pgd_none(pgd_t pgd)     { return 0; }
static inline int pgd_bad(pgd_t pgd)      { return 0; }
static inline int pgd_present(pgd_t pgd)  { return 1; }
static inline void pgd_clear(pgd_t *pgd)  { }
#define pud_ERROR(pud)  (pgd_ERROR((pud).pgd))

#define pgd_populate(mm, pgd, pud)  do { } while (0)
/*
 * (puds are folded into pgds so this doesn't get actually called,
 * but the define is needed for a generic inline function.)
 */
#define set_pgd(pgdptr, pgdval)  set_pud((pud_t *)(pgdptr), (pud_t) { pgdval })

static inline pud_t *pud_offset(pgd_t *pgd, unsigned long address)
{
	return (pud_t *)pgd;
}

#define pud_val(x)  (pgd_val((x).pgd))
#define __pud(x)    ((pud_t) { __pgd(x) })
#define pgd_page(pgd)        (pud_page((pud_t){ pgd }))       /* these two macros look wrong to me -- likely an upstream bug that should be removed */
#define pgd_page_vaddr(pgd)  (pud_page_vaddr((pud_t){ pgd }))

 

 

 

kernel/arch/arm/include/asm/memory.h
/*
 * TASK_SIZE - the maximum size of a user space task.
 * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
 */
#define TASK_SIZE            (UL(CONFIG_PAGE_OFFSET) - UL(SZ_16M))
#define TASK_UNMAPPED_BASE   ALIGN(TASK_SIZE / 3, SZ_16M)

#define ARCH_PFN_OFFSET      PHYS_PFN_OFFSET

/*
 * Convert a page to/from a physical address
 */
#define page_to_phys(page)   (__pfn_to_phys(page_to_pfn(page)))
#define phys_to_page(phys)   (pfn_to_page(__phys_to_pfn(phys)))

 

#elif defined(CONFIG_ARM_PATCH_PHYS_VIRT)
/*
 * Constants used to force the right instruction encodings and shifts
 * so that all we need to do is modify the 8-bit constant field.
 */
#define __PV_BITS_31_24  0x81000000
#define __PV_BITS_7_0    0x81

extern unsigned long __pv_phys_pfn_offset;
extern u64 __pv_offset;
extern void fixup_pv_table(const void *, unsigned long);
extern const void *__pv_table_begin, *__pv_table_end;

#define PHYS_OFFSET      ((phys_addr_t)__pv_phys_pfn_offset << PAGE_SHIFT)  /* 0x80000000 */
#define PHYS_PFN_OFFSET  (__pv_phys_pfn_offset)                             /* 0x80000 */

#define virt_to_pfn(kaddr) \
	((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
	 PHYS_PFN_OFFSET)
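With the concrete values quoted above (PAGE_OFFSET = 0xc0000000, PHYS_PFN_OFFSET = 0x80000), virt_to_pfn reduces to simple arithmetic that can be checked in user space. A sketch, with the kernel variables replaced by fixed constants:

```c
#include <stdint.h>

/* Values taken from the text; on real hardware PHYS_PFN_OFFSET is a
 * runtime-patched variable, fixed here for illustration. */
#define PAGE_OFFSET     0xc0000000u
#define PAGE_SHIFT      12
#define PHYS_PFN_OFFSET 0x80000u

/* Same arithmetic as the virt_to_pfn macro, for a lowmem kernel address. */
static uint32_t virt_to_pfn(uint32_t kaddr)
{
    return ((kaddr - PAGE_OFFSET) >> PAGE_SHIFT) + PHYS_PFN_OFFSET;
}
```

So the very first lowmem page, 0xc0000000, maps to pfn 0x80000, consistent with PHYS_OFFSET = 0x80000000.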

 

#define __pv_stub(from,to,instr,type)			\
	__asm__("@ __pv_stub\n"				\
	"1:	" instr "	%0, %1, %2\n"		\
	"	.pushsection .pv_table,\"a\"\n"		\
	"	.long 1b\n"				\
	"	.popsection\n"				\
	: "=r" (to)					\
	: "r" (from), "I" (type))

#define __pv_stub_mov_hi(t)				\
	__asm__ volatile("@ __pv_stub_mov\n"		\
	"1:	mov	%R0, %1\n"			\
	"	.pushsection .pv_table,\"a\"\n"		\
	"	.long 1b\n"				\
	"	.popsection\n"				\
	: "=r" (t)					\
	: "I" (__PV_BITS_7_0))

#define __pv_add_carry_stub(x, y)			\
	__asm__ volatile("@ __pv_add_carry_stub\n"	\
	"1:	adds	%Q0, %1, %2\n"			\
	"	adc	%R0, %R0, #0\n"			\
	"	.pushsection .pv_table,\"a\"\n"		\
	"	.long 1b\n"				\
	"	.popsection\n"				\
	: "+r" (y)					\
	: "r" (x), "I" (__PV_BITS_31_24)		\
	: "cc")

 

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	phys_addr_t t;

	if (sizeof(phys_addr_t) == 4) {
		__pv_stub(x, t, "add", __PV_BITS_31_24);
	} else {
		__pv_stub_mov_hi(t);
		__pv_add_carry_stub(x, t);
	}
	return t;
}

static inline unsigned long __phys_to_virt(phys_addr_t x)
{
	unsigned long t;

	/*
	 * 'unsigned long' cast discard upper word when
	 * phys_addr_t is 64 bit, and makes sure that inline
	 * assembler expression receives 32 bit argument
	 * in place where 'r' 32 bit operand is expected.
	 */
	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
	return t;
}
#else

 

/*
 * These are *only* valid on the kernel direct mapped RAM memory.
 * Note: Drivers should NOT use these.  They are the wrong
 * translation for translating DMA addresses.  Use the driver
 * DMA support - see dma-mapping.h.
 */
#define virt_to_phys virt_to_phys
static inline phys_addr_t virt_to_phys(const volatile void *x)
{
	return __virt_to_phys((unsigned long)(x));
}

#define phys_to_virt phys_to_virt
static inline void *phys_to_virt(phys_addr_t x)
{
	return (void *)__phys_to_virt(x);
}

 

/*
 * Drivers should NOT use these either.
 */
#define __pa(x)            __virt_to_phys((unsigned long)(x))
#define __va(x)            ((void *)__phys_to_virt((phys_addr_t)(x)))
#define pfn_to_kaddr(pfn)  __va((phys_addr_t)(pfn) << PAGE_SHIFT)

/*
 * Conversion between a struct page and a physical address.
 *
 * page_to_pfn(page)	convert a struct page * to a PFN number
 * pfn_to_page(pfn)	convert a _valid_ PFN number to struct page *
 *
 * virt_to_page(k)	convert a _valid_ virtual address to struct page *
 * virt_addr_valid(k)	indicates whether a virtual address is valid
 */
#define ARCH_PFN_OFFSET  PHYS_PFN_OFFSET

#define virt_to_page(kaddr)  pfn_to_page(virt_to_pfn(kaddr))
#define virt_addr_valid(kaddr)  (((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
				 && pfn_valid(virt_to_pfn(kaddr)))

 

 

 

 

 

 

 

 

 

Chapter 2: Virtual-to-Physical Address Translation

2.1 The MMU's VA-to-PA Translation

The 9850K's CPU is a Cortex-A7.

The Cortex-A7 MPCore processor implements the Extended VMSAv7 MMU, which includes the ARMv7-A Virtual Memory System Architecture (VMSA), the Security Extensions, the Large Physical Address Extensions (LPAE), and the Virtualization Extensions.

 

VMSAv7 defines two alternative translation table formats:

Short-descriptor format
This is the original format defined in issue A of this Architecture Reference Manual, and is the only format supported on implementations that do not include the Large Physical Address Extension. It uses 32-bit descriptor entries in the translation tables, and provides:
• Up to two levels of address lookup.
• 32-bit input addresses.
• Output addresses of up to 40 bits.
• Support for PAs of more than 32 bits by use of supersections, with 16MB granularity.
• Support for No access, Client, and Manager domains.
• 32-bit table entries.

Long-descriptor format
The Large Physical Address Extension adds support for this format. It uses 64-bit descriptor entries in the translation tables, and provides:
• Up to three levels of address lookup.
• Input addresses of up to 40 bits, when used for stage 2 translations.
• Output addresses of up to 40 bits.
• 4KB assignment granularity across the entire PA range.
• No support for domains, all memory regions are treated as in a Client domain.
• 64-bit table entries.
• Fixed 4KB table size, unless truncated by the size of the input address space.

 

The 9850K project is a 32-bit system, so it uses the short-descriptor format: two-level MMU mapping, without LPAE. The short-descriptor translation table format supports both section mappings and page mappings.

 

 

The first-level page-table descriptor format (figure omitted):

 

 

The bottom two bits identify the descriptor type:

0b00, Invalid
The associated VA is unmapped, and any attempt to access it generates a Translation fault. Software can use bits[31:2] of the descriptor for its own purposes, because the hardware ignores these bits.

0b01, Page table
The descriptor gives the address of a second-level translation table, that specifies the mapping of the associated 1MByte VA range.

0b10, Section or Supersection
The descriptor gives the base address of the Section or Supersection. Bit[18] determines whether the entry describes a Section or a Supersection.
If the implementation supports the PXN attribute, this encoding also defines the PXN bit as 0.

0b11, Section or Supersection, if the implementation supports the PXN attribute
If an implementation supports the PXN attribute, this encoding is identical to 0b10, except that it defines the PXN bit as 1.

0b11, Reserved, UNK/SBZP, if the implementation does not support the PXN attribute
An attempt to access the associated VA generates a Translation fault. On an implementation that does not support the PXN attribute, this encoding must not be used.

 

 

The section-mapping translation flow (figure omitted), which in the code corresponds to translating kernel logical addresses:

 

 

 

 

 

The second-level (small page) translation flow is shown in Figure B3-11, "Small page address translation" (figure omitted).

 

 

Note that the Translation table base register in the figure above holds a physical address:

#define cpu_switch_mm(pgd,mm)  cpu_do_switch_mm(virt_to_phys(pgd), mm)    // converted to a physical address here

Also, in the hardware view the top 22 bits of a first-level descriptor are address bits: one hardware first-level entry covers 256 ptes, so only the low 10 bits remain for indexing the second-level table. Linux differs here. It rearranges things so that the page tables belonging to one pgd entry fill exactly one page (512 Linux entries plus 512 hardware entries), which requires the low 12 bits for second-level indexing; hence the valid address portion of *pgd is the top 20 (or 21) bits.

 

 

Observe that whether section translation or page translation is used, the first level always resolves 1MB of address space, while the second-level entries of a page translation resolve 4K pages. Besides the physical address, both first- and second-level descriptors have spare bits that control access permissions and caching attributes for the mapped region, chiefly the AP (access permission) bits and the cache attribute bits. See the MMU manual or pgtable-2level-hwdef.h for the details.

 

pgtable-2level-hwdef.h
/*
 * + Level 1 descriptor (PMD)
 * These are the hardware-defined first-level descriptors. When the crash tool does vtop,
 * it takes the top 12 bits of vaddr plus base_pgd and reads that entry -- not the top
 * 11 bits our software pgd_offset uses. The kernel code seems not to use these macros
 * directly, but the crash tool does, and the pmd_bad check appears related.
 */
#define PMD_TYPE_MASK   (_AT(pmdval_t, 3) << 0)
#define PMD_TYPE_FAULT  (_AT(pmdval_t, 0) << 0)  /* the VA range is unmapped; accesses raise a Translation fault */
#define PMD_TYPE_TABLE  (_AT(pmdval_t, 1) << 0)  /* page-table mapping */
#define PMD_TYPE_SECT   (_AT(pmdval_t, 2) << 0)  /* section mapping, i.e. no pte level */

 

The small-page translation flow also shows that the bottom two bits of a second-level descriptor distinguish large and small pages. The code defines them as well:

pgtable-2level-hwdef.h
/*
 * + Level 2 descriptor (PTE)
 * These are the h/w ptes: after computing the Linux pte, offset by 2048 bytes to reach
 * the hardware pte, whose value is this second-level descriptor; in code that is
 * (long long)pte_val(pte[PTE_HWTABLE_PTRS]).
 */
#define PTE_TYPE_MASK   (_AT(pteval_t, 3) << 0)
#define PTE_TYPE_FAULT  (_AT(pteval_t, 0) << 0)  /* no mapping; access raises a Translation fault */
#define PTE_TYPE_LARGE  (_AT(pteval_t, 1) << 0)  /* 64KB large page */
#define PTE_TYPE_SMALL  (_AT(pteval_t, 2) << 0)  /* 4KB small page */
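Decoding these type bits is a plain mask-and-compare. A standalone sketch using the same numeric values as the defines above (the descriptor value here is invented; `small_page_base` assumes the small-page base lives in bits [31:12]):

```c
#include <stdint.h>

/* Values mirror PTE_TYPE_* in pgtable-2level-hwdef.h. */
enum { TYPE_FAULT = 0, TYPE_LARGE = 1, TYPE_SMALL = 2 };

static unsigned pte_type(uint32_t desc)
{
    return desc & 3u;               /* PTE_TYPE_MASK */
}

/* For a 4 KB small page, bits [31:12] hold the physical page base. */
static uint32_t small_page_base(uint32_t desc)
{
    return desc & 0xfffff000u;
}
```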

 

 

In summary, MMU-level address translation can be understood via the summary diagram (figure omitted).

 

 

 

 

 

 

 

 

2.2 Linux Kernel Virtual Address Translation

The (official) Linux page tables on ARM use first-level section mapping combined with second-level small-page tables to cover the 4GB space. To fit 64-bit CPU architectures, Linux has used a four-level paging model since 2.6.11, so that one model suits both 32- and 64-bit systems:

 PGD (Page Global Directory)
 PUD (Page Upper Directory)
 PMD (Page Middle Directory)
 PT  (Page Table)

Linux's software levels differ slightly from the MMU levels described above. In Linux, two-level mapping uses the 4K small page as its smallest unit. The 4KB page size dictates that the low 12 bits of a virtual address serve as the page offset, and that the low 12 bits of a second-level descriptor are available as flag bits; it also means the virtual address space holds at most 4GB/4KB = 1024×1024 pages. The page size is defined by:

Kernel/arch/arm/include/asm/page.h

#define PAGE_SHIFT 12

#define PAGE_SIZE (1UL << PAGE_SHIFT)

 

Linux then adjusts the MMU hardware layout slightly; the adjusted Linux page-table diagram (omitted) fixes the following macros:

kernel/arch/arm/include/asm/pgtable-2level.h
#define PTRS_PER_PTE 512   /* each last-level page table (PT) holds 512 entries (9 bits) */
#define PTRS_PER_PMD 1     /* the middle-level PMD is folded into the PT */
#define PTRS_PER_PGD 2048  /* the global page directory holds 2048 entries (11 bits) */

In summary: under ARM, physical and virtual memory are paged at 4KB granularity with a two-level index. The first-level PGD holds 2048 entries, each pointing at the physical base of a second-level table. There are at most 2048 second-level tables (PMD or PT), each holding 512 Page Table Entries (PTEs), and each PTE points at the physical base of one page.

The Linux virtual-address split is shown below; remember this is the software-defined view, and the actual MMU view differs slightly.

 

 

2.3 Translating Virtual to Physical Addresses in Code

2.3.1 Resolving addresses with the crash tool

In every translation I have watched the crash tool perform, it computes the Linux pte: the top 20 bits of *pgd plus the pte index.

1) Kernel virtual addresses (above MODULES_VADDR, 0xbf000000), including first-level section mappings.

Section mapping:
    pgd   = 0xc0004000 + (vaddr >> 20) * 4
    paddr = (*pgd & SECTION_MASK) + (vaddr & ~SECTION_MASK)

Two-level page-table mapping:
    pgd   = 0xc0004000 + (vaddr >> 20) * 4
    pte   = (*pgd & PAGE_MASK) + pte_offset(vaddr)    // usually take the top 20 bits of *pgd and add the offset to find the Linux pte; taking the top 21 bits also works. pte_offset is the middle 9 bits of vaddr, bits [20:12]; adding 2048 = 0x800 to the Linux pte and reading it yields the ARM (hw) pte.
    paddr = (value read via `rd -p pte`, masked with PAGE_MASK) + (vaddr & ~PAGE_MASK)

2) User-space addresses (below MODULES_VADDR, 0xbf000000), which are two-level mapped:
    pgd   = mm.pgd + (vaddr >> 20) * 4
    pte   = (*pgd & PAGE_MASK) + pte_offset(vaddr)    // same note as above
    paddr = (value read via `rd -p pte`, masked with PAGE_MASK) + (vaddr & ~PAGE_MASK)
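The two-step walk above can be simulated in user space with toy tables. This is a minimal sketch, not the crash tool's code: table contents and the "physical" base 0x81234000 are invented, the first level is indexed Linux-style by the top 11 bits (2048 slots of 2MB each), and real descriptors carry permission bits that are ignored here.

```c
#include <stdint.h>
#include <stddef.h>

/* fake_pgd: one slot per top-11-bit index; each slot points at a
 * 512-entry pte table (middle 9 bits); low 12 bits are the page offset. */
static uint32_t *fake_pgd[2048];
static uint32_t table0[512];

static void setup(void)
{
    fake_pgd[0x5] = table0;            /* covers VA 0x00a00000..0x00bfffff */
    table0[3]     = 0x81234000u | 2u;  /* small-page descriptor, invented base */
}

static uint32_t walk(uint32_t va)
{
    uint32_t *pt = fake_pgd[va >> 21];
    if (pt == NULL)
        return 0;                       /* unmapped at the first level */
    uint32_t pte = pt[(va >> 12) & 511u];
    if ((pte & 3u) == 0u)
        return 0;                       /* fault-type entry */
    return (pte & 0xfffff000u) | (va & 0xfffu);
}
```

After setup(), walking VA 0x00a03456 lands in pgd slot 5, pte slot 3, and yields the invented physical address 0x81234456.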

 

 

2.3.2 Kernel code flow

1) Kernel lowmem addresses:

For kernel logical addresses in the direct-mapped region (allocated with kmalloc() or __get_free_pages()), which effectively use first-level section mapping, virt_to_phys() and phys_to_virt() convert between physical addresses and kernel logical addresses.

In the code this is implemented in assembly, so there is no directly visible formula, but it reduces to:

Paddr = vaddr - 0x40000000
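That constant is just PAGE_OFFSET - PHYS_OFFSET = 0xc0000000 - 0x80000000. A minimal sketch of the linear conversion, with both constants fixed to the values quoted in this document:

```c
#include <stdint.h>

/* PAGE_OFFSET and PHYS_OFFSET values from the text; lowmem translation
 * is a fixed linear shift of PAGE_OFFSET - PHYS_OFFSET = 0x40000000. */
#define PAGE_OFFSET 0xc0000000u
#define PHYS_OFFSET 0x80000000u

static uint32_t lowmem_virt_to_phys(uint32_t va)
{
    return va - (PAGE_OFFSET - PHYS_OFFSET);
}

static uint32_t lowmem_phys_to_virt(uint32_t pa)
{
    return pa + (PAGE_OFFSET - PHYS_OFFSET);
}
```

For instance the kernel text start 0xc0008000 translates to physical 0x80008000, and back.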

 

 

 

#define virt_to_phys virt_to_phys

static inline phys_addr_t virt_to_phys(const volatile void *x)

{

	return __virt_to_phys((unsigned long)(x));

}

 

#define phys_to_virt phys_to_virt

static inline void *phys_to_virt(phys_addr_t x)

{

     return (void*)__phys_to_virt(x);

}

 

 

/*

 * Drivers should NOT use these either.

 */

#define __pa(x)  __virt_to_phys((unsigned long)(x))
#define __va(x)  ((void *)__phys_to_virt((phys_addr_t)(x)))

 

2) User-space addresses and kernel highmem addresses

These are the two-level-mapped addresses; the translation reduces to:

	pgd      = mm.pgd + (vaddr >> 21) * 8          // for kernel highmem addresses, mm.pgd is 0xc0004000
	page     = (unsigned int)((*pgd >> PAGE_SHIFT) - 0x80000) + mem_map
	pt_vaddr = (pte_t *)kmap_atomic(page);
	pte      = pt_vaddr + pte_offset(vaddr)        // pte_offset takes the middle 9 bits, bits [20:12]; adding 2048 = 0x800 to the computed pte and reading it yields the ARM pte
	paddr    = (*pte & PAGE_MASK) + (vaddr & ~PAGE_MASK)

/*
 * This is useful to dump out the page tables associated with
 * 'addr' in mm 'mm'.
 */
void show_pte(struct mm_struct *mm, unsigned long addr)
{
	pgd_t *pgd;

	if (!mm)
		mm = &init_mm;

	pr_alert("pgd = %p\n", mm->pgd);
	pgd = pgd_offset(mm, addr);            /* pgd itself is still a virtual address; the value *pgd holds a physical address */
	pr_alert("[%08lx] *pgd=%08llx", addr, (long long)pgd_val(*pgd));

	do {
		pud_t *pud;
		pmd_t *pmd;
		pte_t *pte;

		if (pgd_none(*pgd))
			break;

		if (pgd_bad(*pgd)) {
			pr_cont("(bad)");
			break;
		}

		pud = pud_offset(pgd, addr);
		if (PTRS_PER_PUD != 1)
			pr_cont(", *pud=%08llx", (long long)pud_val(*pud));

		if (pud_none(*pud))
			break;

		if (pud_bad(*pud)) {
			pr_cont("(bad)");
			break;
		}

		pmd = pmd_offset(pud, addr);
		if (PTRS_PER_PMD != 1)
			pr_cont(", *pmd=%08llx", (long long)pmd_val(*pmd));

		if (pmd_none(*pmd))
			break;

		if (pmd_bad(*pmd)) {           /* kernel lowmem (section-mapped) addresses bail out here */
			pr_cont("(bad)");
			break;
		}

		/* We must not map this if we have highmem enabled */
		if (PageHighMem(pmd_page(*pmd)))
			break;

		pte = pte_offset_map(pmd, addr);
		pr_cont(", *pte=%08llx", (long long)pte_val(*pte));
#ifndef CONFIG_ARM_LPAE
		pr_cont(", *ppte=%08llx",
			(long long)pte_val(pte[PTE_HWTABLE_PTRS]));
#endif
		pte_unmap(pte);
	} while(0);

	pr_cont("\n");
}