Android emulator virtual device analysis, part 1: battery

1. Overview

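This article uses Android 5.1.0_r1, goldfish kernel 3.4, and an x86 Android image. Taking the battery as an example, it walks through how a virtual device is implemented and used from end to end.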


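Why does the Android emulator need virtual devices? Put simply, the Android system expects hardware that the host does not have, such as GPS, Bluetooth, battery, and GSM. Virtual devices also give the emulator and the guest OS a way to talk to each other: the emulator control panel can set the battery level and whether it is charging (Figure 1), set the current GPS coordinates, and so on. More importantly, the guest OS's drawing operations are executed on the host, which is what lets the emulator run the guest OS reasonably smoothly.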

Figure 1: emulator control panel


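The overall virtual device framework is shown in Figure 2. The top left is the guest OS; the bottom left covers both the virtual device drivers in the kernel and the device emulation in the emulator; the bottom right is the Android emulator.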

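1. The guest OS uses the virtual device drivers provided by the kernel, either through the HAL or directly. (Typically a virtual device driver exposes character device files and attribute files, and reading and writing those files is all that is needed.)

2-1. From the kernel's point of view it does not matter whether a device is real or virtual. What matters is the resources the device provides, such as its I/O resources and interrupt number, and how its registers are read and written, just as with an ordinary driver. Note that all virtual devices hang off the platform bus, which makes it easy to allocate I/O memory and interrupt numbers dynamically. The platform bus's own I/O memory and interrupt number are hard-coded, and match the values hard-coded in the emulator.

2-2. From the emulator's point of view, the platform bus is emulated first, using the hard-coded I/O memory and interrupt number that match the kernel. The other virtual devices then register their I/O memory and interrupt numbers dynamically on this platform bus. When the kernel reads or writes I/O memory, the emulator knows which guest physical address is being accessed and can find the corresponding guest physical page. The page descriptor yields an io_index, which identifies the virtual device that owns the page and that device's read and write functions. Those functions read and write the device's registers (each virtual device keeps its registers in a struct) and receive or return data according to the agreed meaning of each register. There is a lot to take in here, and much of it is hardware knowledge that can be hard going for pure software developers; it is explained in detail later.

3. The emulator provides an abstract virtual device called pipe, exposed as the device file /dev/qemu_pipe, which gives the guest OS and the emulator a generic way to send and receive data. On top of this generic transport, the emulator registers many qemud services, and the guest OS talks to them by reading and writing /dev/qemu_pipe.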

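PS:

1. The guest OS runs a qemud process that uses the virtual device ttyS1 as a channel between the guest OS and the emulator. This is the older mechanism; it is slow and has largely been replaced by pipe.

2. For the platform model, see: http://www.wowotech.net/device_model/platform_device.html

When a new device is registered, it is passed to the probe function of each matching driver to see which driver can handle it.

When a new driver is registered, its probe function is called for each matching device that is not yet bound, to see which of them the new driver can handle.

Devices and drivers are matched by name.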
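
The name-based matching described above can be sketched in plain user-space C. This is only an illustration under simplifying assumptions: struct sketch_driver, struct sketch_device, sketch_probe_ok and sketch_match_device are hypothetical names, and the real platform bus also deals with ids, id tables and unbinding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for platform_driver / platform_device. */
struct sketch_driver {
    const char *name;
    int (*probe)(const char *dev_name);   /* returns 0 when the device is accepted */
};

struct sketch_device {
    const char *name;
    const struct sketch_driver *bound;    /* NULL until a driver claims the device */
};

/* Probe callback used by the example drivers: always accepts the device. */
static int sketch_probe_ok(const char *dev_name)
{
    (void)dev_name;
    return 0;
}

/* Called when a new device appears: offer it to every driver with the same name. */
static int sketch_match_device(struct sketch_device *dev,
                               const struct sketch_driver *drivers, size_t n)
{
    assert(dev != NULL && drivers != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(dev->name, drivers[i].name) == 0 &&
            drivers[i].probe(dev->name) == 0) {
            dev->bound = &drivers[i];
            return 1;                      /* bound to drivers[i] */
        }
    }
    return 0;                              /* no driver claimed the device */
}
```

Registering a new driver would be the mirror image: walk the list of unbound devices and call the new driver's probe for each one whose name matches.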

Figure 2: virtual device architecture


2. Virtual device drivers in the kernel

2.1 The battery driver

Start with the documentation for the battery virtual device:

VII. Goldfish battery:
======================

Relevant files:
  $QEMU/hw/android/goldfish/battery.c
  $QEMU/hw/power_supply.h
  $KERNEL/drivers/power/goldfish_battery.c

Device properties:
  Name: goldfish_battery
  Id: -1
  IrqCount: 1
  I/O Registers:
    0x00 INT_STATUS   R: Read battery and A/C status change bits.
    0x04 INT_ENABLE   W: Enable or disable IRQ on status change.
    0x08 AC_ONLINE    R: Read 0 if AC power disconnected, 1 otherwise.
    0x0c STATUS       R: Read battery status (charging/full/... see below).
    0x10 HEALTH       R: Read battery health (good/overheat/... see below).
    0x14 PRESENT      R: Read 1 if battery is present, 0 otherwise.
    0x18 CAPACITY     R: Read battery charge percentage in [0..100] range.

A simple device used to report the state of the virtual device's battery, and
whether the device is powered through a USB or A/C adapter.

The device uses a single IRQ to notify the kernel that the battery or A/C status
changed. When this happens, the kernel should perform an IO_READ(INT_STATUS)
which returns a 2-bit value containing flags:

  bit 0: Set to 1 to indicate a change in battery status.
  bit 1: Set to 1 to indicate a change in A/C status.

Note that reading this register also lowers the IRQ level.

The A/C status can be read with IO_READ(AC_ONLINE), which returns 1 if the
device is powered, or 0 otherwise.

The battery status is spread over multiple I/O registers:

  IO_READ(PRESENT) returns 1 if the battery is present in the virtual device,
  or 0 otherwise.

  IO_READ(CAPACITY) returns the battery's charge percentage, as an integer
  between 0 and 100, inclusive. NOTE: This register is probably misnamed since
  it does not represent the battery's capacity, but its current charge level.

  IO_READ(STATUS) returns one of the following values:

    0x00  UNKNOWN      Battery state is unknown.
    0x01  CHARGING     Battery is charging.
    0x02  DISCHARGING  Battery is discharging.
    0x03  NOT_CHARGING Battery is not charging (e.g. full or dead).

  IO_READ(HEALTH) returns one of the following values:

    0x00  UNKNOWN         Battery health unknown.
    0x01  GOOD            Battery is in good condition.
    0x02  OVERHEATING     Battery is over-heating.
    0x03  DEAD            Battery is dead.
    0x04  OVERVOLTAGE     Battery generates too much voltage.
    0x05  UNSPEC_FAILURE  Battery has unspecified failure.

The kernel can use IO_WRITE(INT_ENABLE, <flags>) to select which condition
changes should trigger an IRQ. <flags> is a 2-bit value using the same format
as INT_STATUS.
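If you have worked with hardware, skimming the description above is enough to see what this chip does, and you can skip the next few paragraphs. If you come from pure software, read on.

Think of a device as a function whose registers are its inputs, return values, and status. An interrupt is a bit like a signal in Linux programming: when the device has data to read, can accept data, or changes state, it may (but need not) raise an interrupt, which interrupts the kernel (at the CPU hardware level, not through OS scheduling) and jumps to the interrupt handler. The handler plays the role of a signal handler: a signal maps to a signal handler, and an interrupt number maps to an interrupt handler.

The jump works as follows. When the kernel function request_irq is used to request an interrupt, it is given the interrupt number and the handler. A table (an array) in memory, the interrupt vector table, records the handlers with the interrupt number as key and the handler address as value. When an interrupt occurs, the CPU learns the interrupt number, looks up the handler in the interrupt vector table, and jumps to it. A true hardware interrupt handler has no parameters and no return value; the handlers the kernel exposes are wrapped, which is why they take the two parameters int irq and void *dev_id.

The register addresses of a virtual device are very small because they are really offsets. So where does the base come from? The driver first obtains the guest physical address of the I/O memory through the platform bus, then uses ioremap to map it into the kernel's virtual address space, where it can be used.

The mapped region must not be treated as ordinary memory. It has to be accessed with the dedicated readb, writeb, readw, writew, readl, and writel accessors, because every read of a hardware register may return different data. To send an array through a register, write to the same register in a loop without incrementing the register address. The order of reads and writes and the access width (8, 16, or 32 bits) are also strictly defined. If the region is accessed like ordinary memory, the compiler may keep values cached (for example in CPU registers) or drop accesses, the CPU may reorder them, and the width may be wrong, any of which can make the hardware misbehave. That is why a plain memory pointer cannot be used.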
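
To make the base-plus-offset access pattern concrete, here is a minimal user-space sketch. sketch_readl, sketch_writel and sketch_write_block are hypothetical stand-ins for the kernel accessors, and an ordinary array plays the role of the ioremapped register window; the real readl and writel additionally take care of memory barriers and architecture-specific details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets taken from the goldfish battery documentation above. */
enum {
    SKETCH_INT_STATUS = 0x00,
    SKETCH_INT_ENABLE = 0x04,
    SKETCH_CAPACITY   = 0x18,
};

/* Hypothetical user-space stand-in for readl(): one 32-bit volatile load. */
static uint32_t sketch_readl(const volatile void *base, size_t offset)
{
    assert(offset % 4 == 0);
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/* Hypothetical user-space stand-in for writel(): one 32-bit volatile store. */
static void sketch_writel(uint32_t value, volatile void *base, size_t offset)
{
    assert(offset % 4 == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
}

/* Send a buffer through a single data register: the offset never advances. */
static void sketch_write_block(volatile void *base, size_t offset,
                               const uint32_t *data, size_t count)
{
    for (size_t i = 0; i < count; i++)
        sketch_writel(data[i], base, offset);
}
```

The volatile qualifier is what stops the compiler from caching or merging the accesses, and the fixed uint32_t type pins the access width.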


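The battery driver lives in drivers/power/goldfish_battery.c in the goldfish kernel tree. To keep things simple, the unregister, shutdown, and cleanup code is not covered in detail.

The driver's initialization code is: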

static struct platform_driver goldfish_battery_device = {
    .probe      = goldfish_battery_probe,
    .remove     = goldfish_battery_remove,
    .driver = {
        .name = "goldfish-battery"
    }
};

static int __init goldfish_battery_init(void)
{
    return platform_driver_register(&goldfish_battery_device);
}
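This registers a platform driver named goldfish-battery. Its probe function, goldfish_battery_probe, is called when the battery driver is loaded or when a new device appears on the bus, once the driver and the device have been matched by name.


goldfish_battery_probe first initializes the goldfish_battery_data struct. It then calls platform_get_resource to obtain the device's I/O memory resource, ioremaps the IORESOURCE_MEM resource, and stores the base in data->reg_base. Next it calls platform_get_irq to obtain the interrupt number, stores it in data->irq, and registers the interrupt handler goldfish_battery_interrupt with request_irq.

data->battery and data->ac are both struct power_supply. For battery: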

    data->battery.properties = goldfish_battery_props;
    data->battery.num_properties = ARRAY_SIZE(goldfish_battery_props);
    data->battery.get_property = goldfish_battery_get_property;
    data->battery.name = "battery";
    data->battery.type = POWER_SUPPLY_TYPE_BATTERY;

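The struct carries the property list, the number of properties, the function that reads a property, and so on. After power_supply_register, files appear under /sys/class/power_supply/battery in the guest OS, named after the properties, such as capacity, health, and status. Their read function is the goldfish_battery_get_property assigned above; there is no write function. User-space programs in the guest OS read these attribute files directly, and their contents come from register reads, for example: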

static int goldfish_battery_get_property(struct power_supply *psy,
                 enum power_supply_property psp,
                 union power_supply_propval *val)
{
    struct goldfish_battery_data *data = container_of(psy,
        struct goldfish_battery_data, battery);
    int ret = 0;

    switch (psp) {
    case POWER_SUPPLY_PROP_STATUS:
        val->intval = GOLDFISH_BATTERY_READ(data, BATTERY_STATUS);
        break;
    case POWER_SUPPLY_PROP_HEALTH:
        val->intval = GOLDFISH_BATTERY_READ(data, BATTERY_HEALTH);
        break;
    case POWER_SUPPLY_PROP_PRESENT:
        val->intval = GOLDFISH_BATTERY_READ(data, BATTERY_PRESENT);
        break;
    case POWER_SUPPLY_PROP_TECHNOLOGY:
        val->intval = POWER_SUPPLY_TECHNOLOGY_LION;
        break;
    case POWER_SUPPLY_PROP_CAPACITY:
        val->intval = GOLDFISH_BATTERY_READ(data, BATTERY_CAPACITY);
        break;
    default:
        ret = -EINVAL;
        break;
    }

    return ret;
}

This is how the state of the virtual battery device is obtained.
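
From user space, reading one of these attribute files is ordinary file I/O. The helper below is a minimal sketch; sketch_read_int_attr is a hypothetical name, and it only handles attributes whose content is a plain integer.

```c
#include <assert.h>
#include <stdio.h>

/* Read one integer from a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed. */
static int sketch_read_int_attr(const char *path, int *out)
{
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Inside the guest it would be called as sketch_read_int_attr("/sys/class/power_supply/battery/capacity", &percent). Attributes such as status and health are exposed as text rather than numbers, so they would need a string read instead.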

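Finally, GOLDFISH_BATTERY_WRITE(data, BATTERY_INT_ENABLE, BATTERY_INT_MASK) writes BATTERY_INT_MASK to the BATTERY_INT_ENABLE register to enable interrupts. When the battery or AC state changes, the virtual device raises an interrupt (that code is in the emulator), and the handler goldfish_battery_interrupt is called.

The complete goldfish_battery_probe is: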

static int goldfish_battery_probe(struct platform_device *pdev)
{
    int ret;
    struct resource *r;
    struct goldfish_battery_data *data;

    data = kzalloc(sizeof(*data), GFP_KERNEL);
    if (data == NULL) {
        ret = -ENOMEM;
        goto err_data_alloc_failed;
    }
    spin_lock_init(&data->lock);

    data->battery.properties = goldfish_battery_props;
    data->battery.num_properties = ARRAY_SIZE(goldfish_battery_props);
    data->battery.get_property = goldfish_battery_get_property;
    data->battery.name = "battery";
    data->battery.type = POWER_SUPPLY_TYPE_BATTERY;

    data->ac.properties = goldfish_ac_props;
    data->ac.num_properties = ARRAY_SIZE(goldfish_ac_props);
    data->ac.get_property = goldfish_ac_get_property;
    data->ac.name = "ac";
    data->ac.type = POWER_SUPPLY_TYPE_MAINS;

    r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    if (r == NULL) {
        printk(KERN_ERR "%s: platform_get_resource failed\n", pdev->name);
        ret = -ENODEV;
        goto err_no_io_base;
    }
#if defined(CONFIG_ARM)
    data->reg_base = (void __iomem *)IO_ADDRESS(r->start - IO_START);
#elif defined(CONFIG_X86) || defined(CONFIG_MIPS)
    data->reg_base = ioremap(r->start, r->end - r->start + 1);
#else
#error NOT SUPPORTED
#endif

    data->irq = platform_get_irq(pdev, 0);
    if (data->irq < 0) {
        printk(KERN_ERR "%s: platform_get_irq failed\n", pdev->name);
        ret = -ENODEV;
        goto err_no_irq;
    }

    ret = request_irq(data->irq, goldfish_battery_interrupt, IRQF_SHARED, pdev->name, data);
    if (ret)
        goto err_request_irq_failed;

    ret = power_supply_register(&pdev->dev, &data->ac);
    if (ret)
        goto err_ac_failed;

    ret = power_supply_register(&pdev->dev, &data->battery);
    if (ret)
        goto err_battery_failed;

    platform_set_drvdata(pdev, data);
    battery_data = data;

    GOLDFISH_BATTERY_WRITE(data, BATTERY_INT_ENABLE, BATTERY_INT_MASK);
    return 0;

err_battery_failed:
    power_supply_unregister(&data->ac);
err_ac_failed:
    free_irq(data->irq, data);
err_request_irq_failed:
err_no_irq:
#if defined(CONFIG_ARM)
#elif defined(CONFIG_X86) || defined(CONFIG_MIPS)
    iounmap(data->reg_base);
#else
#error NOT SUPPORTED
#endif
err_no_io_base:
    kfree(data);
err_data_alloc_failed:
    return ret;
}


The interrupt handler goldfish_battery_interrupt first reads the INT_STATUS register to determine whether the event came from the battery or from AC:

    /* read status flags, which will clear the interrupt */
    status = GOLDFISH_BATTERY_READ(data, BATTERY_INT_STATUS);
    status &= BATTERY_INT_MASK;

It then calls power_supply_changed to notify the kernel.

The complete goldfish_battery_interrupt is:

static irqreturn_t goldfish_battery_interrupt(int irq, void *dev_id)
{
    unsigned long irq_flags;
    struct goldfish_battery_data *data = dev_id;
    uint32_t status;

    spin_lock_irqsave(&data->lock, irq_flags);

    /* read status flags, which will clear the interrupt */
    status = GOLDFISH_BATTERY_READ(data, BATTERY_INT_STATUS);
    status &= BATTERY_INT_MASK;

    if (status & BATTERY_STATUS_CHANGED)
        power_supply_changed(&data->battery);
    if (status & AC_STATUS_CHANGED)
        power_supply_changed(&data->ac);

    spin_unlock_irqrestore(&data->lock, irq_flags);
    return status ? IRQ_HANDLED : IRQ_NONE;
}

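Note how struct goldfish_battery_data reaches the interrupt handler and the platform_device: it is passed as the dev_id argument of request_irq and stored with platform_set_drvdata.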
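
The two ways the driver state travels, as an opaque dev_id pointer and through container_of on an embedded member, can be tried out in user space. The sketch below uses hypothetical stand-in types (struct sketch_supply, struct sketch_battery_data) and a simplified SKETCH_CONTAINER_OF macro without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of(), without type checking. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct power_supply and goldfish_battery_data. */
struct sketch_supply {
    const char *name;
};

struct sketch_battery_data {
    int irq;
    struct sketch_supply battery;
    struct sketch_supply ac;
};

/* A get_property-style callback only receives the embedded member ... */
static int sketch_irq_from_battery(struct sketch_supply *psy)
{
    /* ... and recovers the enclosing driver state from it. */
    struct sketch_battery_data *data =
        SKETCH_CONTAINER_OF(psy, struct sketch_battery_data, battery);
    return data->irq;
}

/* An interrupt-handler-style callback receives the state as an opaque pointer,
 * the same pointer that was handed over when the handler was registered. */
static int sketch_irq_from_dev_id(void *dev_id)
{
    struct sketch_battery_data *data = dev_id;
    assert(data != NULL);
    return data->irq;
}
```

Because battery and ac sit at different offsets inside the struct, each callback has to name the member it was registered with, which is why the battery and AC property functions are separate.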

2.2 The platform bus driver

Before looking at the virtual devices, it is worth reading the platform bus driver as well: arch/x86/mach-goldfish/pdev_bus.c

The platform bus documentation:

I. Goldfish platform bus:
=========================

The 'platform bus', in Linux kernel speak, is a special device that is capable
of enumerating other platform devices found on the system to the kernel. This
flexibility allows to customize which virtual devices are available when running
a given emulated system configuration.

Relevant files:
  $QEMU/hw/android/goldfish/device.c
  $KERNEL/arch/arm/mach-goldfish/pdev_bus.c
  $KERNEL/arch/x86/mach-goldfish/pdev_bus.c
  $KERNEL/arch/mips/goldfish/pdev_bus.c

Device properties:
  Name: goldfish_device_bus
  Id:   -1
  IrqCount: 1

  32-bit I/O registers (offset, name, abstract)

    0x00 BUS_OP      R: Iterate to next device in enumeration.
                     W: Start device enumeration.

    0x04 GET_NAME    W: Copy device name to kernel memory.
    0x08 NAME_LEN    R: Read length of current device's name.
    0x0c ID          R: Read id of current device.
    0x10 IO_BASE     R: Read I/O base address of current device.
    0x14 IO_SIZE     R: Read I/O base size of current device.
    0x18 IRQ_BASE    R: Read base IRQ of current device.
    0x1c IRQ_COUNT   R: Read IRQ count of current device.

    # For 64-bit guest architectures only:
    0x20 NAME_ADDR_HIGH  W: Write high 32-bit of kernel address of name
                            buffer used by GET_NAME. Must be written to
                            before the GET_NAME write.

The kernel iterates over the list of current devices with something like:

   IO_WRITE(BUS_OP, 0);    // Start iteration, any value other than 0 is invalid.
   for (;;) {
     int ret = IO_READ(BUS_OP);
     if (ret == 0 /* OP_DONE */) {
       // no more devices.
       break;
     }
     else if (ret == 8 /* OP_ADD_DEV */) {
       // Read device properties.
       Device dev;
       dev.name_len  = IO_READ(NAME_LEN);
       dev.id        = IO_READ(ID);
       dev.io_base   = IO_READ(IO_BASE);
       dev.io_size   = IO_READ(IO_SIZE);
       dev.irq_base  = IO_READ(IRQ_BASE);
       dev.irq_count = IO_READ(IRQ_COUNT);

       dev.name = kalloc(dev.name_len + 1);  // allocate room for device name.
    #if 64BIT_GUEST_CPU
       IO_WRITE(NAME_ADDR_HIGH, (uint32_t)(dev.name >> 32));
    #endif
       IO_WRITE(GET_NAME, (uint32_t)dev.name);  // copy to kernel memory.
       dev.name[dev.name_len] = 0;

       .. add device to kernel's list.
     }
     else {
       // Not returned by current goldfish implementation.
     }
   }

The device also uses a single IRQ, which it will raise to indicate to the kernel
that new devices are available, or that some of them have been removed. The
kernel will then start a new enumeration. The IRQ is lowered by the device only
when a IO_READ(BUS_OP) returns 0 (OP_DONE).

NOTE: The kernel hard-codes a platform_device definition with the name
      "goldfish_pdev_bus" for the platform bus (e.g. see
      $KERNEL/arch/arm/mach-goldfish/board-goldfish.c), however, the bus itself
      will appear during enumeration as a device named "goldfish_device_bus"

      The kernel driver for the platform bus only matches the "goldfish_pdev_bus"
      name, and will ignore any device named "goldfish_device_bus".
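Reading NAME_LEN returns the length of the name of a device on the bus, reading IO_BASE returns the start address of its I/O memory, and reading IO_SIZE returns the size of its I/O memory; these are all straightforward. Writing a pointer to the GET_NAME register makes the virtual bus copy the device name to that pointer, which is also simple enough. The one to watch is BUS_OP. Writing BUS_OP starts the enumeration. Each read of BUS_OP that returns PDEV_BUS_OP_ADD_DEV means there is a new device and also advances to it, so subsequent reads of NAME_LEN, IO_BASE, and IO_SIZE return that device's information. A read that returns PDEV_BUS_OP_DONE means the enumeration is complete and there are no more devices.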
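
The side effect of reading BUS_OP is easier to see with both ends in one place. The following user-space sketch models the bus as a device list plus a cursor and runs the kernel-style loop against it; struct sketch_bus and all sketch_ functions are hypothetical, and the real bus also tracks which devices have already been reported.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SKETCH_OP_DONE = 0, SKETCH_OP_ADD_DEV = 8 };   /* values from the doc above */

/* Hypothetical model of the bus: a fixed device list plus a cursor. */
struct sketch_bus {
    const char *const *names;   /* device names known to the bus */
    size_t count;
    size_t next;                /* index returned by the next BUS_OP read */
    const char *current;        /* device selected by the last BUS_OP read */
};

/* IO_WRITE(BUS_OP, 0): restart the enumeration. */
static void sketch_bus_start(struct sketch_bus *bus)
{
    bus->next = 0;
    bus->current = NULL;
}

/* IO_READ(BUS_OP): select the next device, or report that none are left. */
static int sketch_bus_read_op(struct sketch_bus *bus)
{
    if (bus->next >= bus->count) {
        bus->current = NULL;
        return SKETCH_OP_DONE;
    }
    bus->current = bus->names[bus->next++];
    return SKETCH_OP_ADD_DEV;
}

/* IO_READ(NAME_LEN): always describes the currently selected device. */
static size_t sketch_bus_read_name_len(const struct sketch_bus *bus)
{
    return bus->current ? strlen(bus->current) : 0;
}

/* Kernel-side loop: returns how many devices were enumerated and adds up
 * their name lengths to show that each BUS_OP read selects a new device. */
static size_t sketch_enumerate(struct sketch_bus *bus, size_t *total_name_len)
{
    size_t found = 0;
    assert(bus != NULL && total_name_len != NULL);
    *total_name_len = 0;
    sketch_bus_start(bus);
    while (sketch_bus_read_op(bus) == SKETCH_OP_ADD_DEV) {
        *total_name_len += sketch_bus_read_name_len(bus);
        found++;
    }
    return found;
}
```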


First, the I/O memory and interrupt number agreed with the emulator are handed to the kernel:

static struct resource goldfish_pdev_bus_resources[] = {
    {
        .start  = GOLDFISH_PDEV_BUS_BASE,
        .end    = GOLDFISH_PDEV_BUS_BASE + GOLDFISH_PDEV_BUS_END - 1,
        .flags  = IORESOURCE_IO,
    },
    {
        .start  = IRQ_PDEV_BUS,
        .end    = IRQ_PDEV_BUS,
        .flags  = IORESOURCE_IRQ,
    }
};

struct platform_device goldfish_pdev_bus_device = {
    .name = "goldfish_pdev_bus",
    .id = -1,
    .num_resources = ARRAY_SIZE(goldfish_pdev_bus_resources),
    .resource = goldfish_pdev_bus_resources
};
static int __init goldfish_init(void)
{
    return platform_device_register(&goldfish_pdev_bus_device);
}
device_initcall(goldfish_init);


Then the platform bus driver is registered:

static struct platform_driver goldfish_pdev_bus_driver = {
    .probe = goldfish_pdev_bus_probe,
    .remove = __devexit_p(goldfish_pdev_bus_remove),
    .driver = {
        .name = "goldfish_pdev_bus"
    }
};

static int __init goldfish_pdev_bus_init(void)
{
    return platform_driver_register(&goldfish_pdev_bus_driver);
}

static void __exit goldfish_pdev_bus_exit(void)
{
    platform_driver_unregister(&goldfish_pdev_bus_driver);
}

module_init(goldfish_pdev_bus_init);
module_exit(goldfish_pdev_bus_exit);

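goldfish_pdev_bus_probe is even simpler than the battery probe, so it is not explained in detail. Note the final write to PDEV_BUS_OP, which starts the device enumeration: after the write, the virtual device raises an interrupt, and the devices are enumerated in the interrupt handler.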

static int __devinit goldfish_pdev_bus_probe(struct platform_device *pdev)
{
    int ret;
    struct resource *r;
    r = platform_get_resource(pdev, IORESOURCE_IO, 0);
    if(r == NULL)
        return -EINVAL;
    pdev_bus_base = ioremap(GOLDFISH_IO_START + r->start, GOLDFISH_IO_SIZE);

    r = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
    if(r == NULL)
        return -EINVAL;
    pdev_bus_irq = r->start;

    ret = request_irq(pdev_bus_irq, goldfish_pdev_bus_interrupt, IRQF_SHARED, "goldfish_pdev_bus", pdev);
    if(ret)
        goto err_request_irq_failed;

    writel(PDEV_BUS_OP_INIT, pdev_bus_base + PDEV_BUS_OP);

err_request_irq_failed:
    return ret;
}

The interrupt handler goldfish_pdev_bus_interrupt keeps reading PDEV_BUS_OP. If the read returns PDEV_BUS_OP_ADD_DEV it calls goldfish_new_pdev to add a device; if it returns PDEV_BUS_OP_DONE it stops.

static irqreturn_t goldfish_pdev_bus_interrupt(int irq, void *dev_id)
{
    irqreturn_t ret = IRQ_NONE;
    while(1) {
        uint32_t op = readl(pdev_bus_base + PDEV_BUS_OP);
        switch(op) {
            case PDEV_BUS_OP_DONE:
                return IRQ_NONE;

            case PDEV_BUS_OP_REMOVE_DEV:
                goldfish_pdev_remove();
                break;

            case PDEV_BUS_OP_ADD_DEV:
                goldfish_new_pdev();
                break;
        }
        ret = IRQ_HANDLED;
    }
}
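goldfish_new_pdev reads the registers to obtain the new device's name, I/O memory, interrupt number, and so on, then adds the device struct to the pdev_bus_new_devices list. Once the device has been registered, the battery driver can obtain the I/O memory and interrupt number from the battery device struct with platform_get_resource.

Finally it calls schedule_work(&pdev_bus_worker). goldfish_pdev_worker is a work item, similar in spirit to a tasklet: once scheduled it runs at some later point rather than taking up time in interrupt context. It maintains three lists: newly added devices, removed devices, and registered devices.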

static int goldfish_new_pdev(void)
{
    struct pdev_bus_dev *dev;
    uint32_t name_len;
    uint32_t irq = -1, irq_count;
    int resource_count = 2;
    uint32_t base;
    char *name;

    base = readl(pdev_bus_base + PDEV_BUS_IO_BASE);

    irq_count = readl(pdev_bus_base + PDEV_BUS_IRQ_COUNT);
    name_len = readl(pdev_bus_base + PDEV_BUS_NAME_LEN);
    if(irq_count)
        resource_count++;

    dev = kzalloc(sizeof(*dev) + sizeof(struct resource) * resource_count + name_len + 1, GFP_ATOMIC);
    if(dev == NULL)
        return -ENOMEM;

    dev->pdev.num_resources = resource_count;
    dev->pdev.resource = (struct resource *)(dev + 1);
    dev->pdev.name = name = (char *)(dev->pdev.resource + resource_count);
    dev->pdev.dev.coherent_dma_mask = ~0;

    writel((unsigned long)name, pdev_bus_base + PDEV_BUS_GET_NAME);
    name[name_len] = '\0';
    dev->pdev.id = readl(pdev_bus_base + PDEV_BUS_ID);
    dev->pdev.resource[0].start = base;
    dev->pdev.resource[0].end = base + readl(pdev_bus_base + PDEV_BUS_IO_SIZE) - 1;
    dev->pdev.resource[0].flags = IORESOURCE_MEM;
    if(irq_count) {
        irq = readl(pdev_bus_base + PDEV_BUS_IRQ);
        dev->pdev.resource[1].start = irq;
        dev->pdev.resource[1].end = irq + irq_count - 1;
        dev->pdev.resource[1].flags = IORESOURCE_IRQ;
    }

    printk("goldfish_new_pdev %s at %x irq %d\n", name, base, irq);
    list_add_tail(&dev->list, &pdev_bus_new_devices);
    schedule_work(&pdev_bus_worker);

    return 0;
}
static void goldfish_pdev_worker(struct work_struct *work)
{
    int ret;
    struct pdev_bus_dev *pos, *n;

    list_for_each_entry_safe(pos, n, &pdev_bus_removed_devices, list) {
        list_del(&pos->list);
        platform_device_unregister(&pos->pdev);
        kfree(pos);
    }
    list_for_each_entry_safe(pos, n, &pdev_bus_new_devices, list) {
        list_del(&pos->list);
        ret = platform_device_register(&pos->pdev);
        if(ret) {
            printk("goldfish_pdev_worker failed to register device, %s\n", pos->pdev.name);
        }
        else {
            printk("goldfish_pdev_worker registered %s\n", pos->pdev.name);
        }
        list_add_tail(&pos->list, &pdev_bus_registered_devices);
    }
}

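That is the end of the kernel side. Everything that follows concerns the virtual devices inside the emulator, which is harder to follow and poorly documented.

3. Virtual devices in the emulator

3.1 The battery virtual device

The code for the battery virtual device is: http://androidxref.com/5.1.0_r1/xref/external/qemu/hw/android/goldfish/battery.c

First come the virtual device's registers. The register addresses are defined, and the struct goldfish_battery_state holds the register contents; reading and writing this struct is what emulates reading and writing the registers.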

enum {
  /* status register */
  BATTERY_INT_STATUS      = 0x00,
  /* set this to enable IRQ */
  BATTERY_INT_ENABLE      = 0x04,

  BATTERY_AC_ONLINE       = 0x08,
  BATTERY_STATUS          = 0x0C,
  BATTERY_HEALTH          = 0x10,
  BATTERY_PRESENT         = 0x14,
  BATTERY_CAPACITY        = 0x18,

  BATTERY_STATUS_CHANGED  = 1U << 0,
  AC_STATUS_CHANGED       = 1U << 1,
  BATTERY_INT_MASK        = BATTERY_STATUS_CHANGED | AC_STATUS_CHANGED,
};


struct goldfish_battery_state {
    struct goldfish_device dev;
    // IRQs
    uint32_t int_status;
    // irq enable mask for int_status
    uint32_t int_enable;

    int ac_online;
    int status;
    int health;
    int present;
    int capacity;

    // the fields below are part of the device configuration
    // and don't need to be saved to / restored from snapshots.
    int hw_has_battery;
};


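The following code implements something like serialization and has little to do with device emulation. The load and save functions that appear later are mostly the same thing and are not discussed again.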
/* update this each time you update the battery_state struct */
#define  BATTERY_STATE_SAVE_VERSION  1

#define  QFIELD_STRUCT  struct goldfish_battery_state
QFIELD_BEGIN(goldfish_battery_fields)
    QFIELD_INT32(int_status),
    QFIELD_INT32(int_enable),
    QFIELD_INT32(ac_online),
    QFIELD_INT32(status),
    QFIELD_INT32(health),
    QFIELD_INT32(present),
    QFIELD_INT32(capacity),
QFIELD_END

static void  goldfish_battery_save(QEMUFile*  f, void* opaque)
{
    struct goldfish_battery_state*  s = opaque;

    qemu_put_struct(f, goldfish_battery_fields, s);
}

static int   goldfish_battery_load(QEMUFile*  f, void*  opaque, int  version_id)
{
    struct goldfish_battery_state*  s = opaque;

    if (version_id != BATTERY_STATE_SAVE_VERSION)
        return -1;

    return qemu_get_struct(f, goldfish_battery_fields, s);
}
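The virtual device's initialization function is goldfish_battery_init. It fills the register struct with default values such as the name and the charge level, and finally calls goldfish_device_add to add the device to the bus. That function is critical: it dynamically allocates each device's I/O memory and interrupt number, and sets up the read and write function arrays and the register struct for that I/O memory. It is explained in detail later.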

void goldfish_battery_init(int has_battery)
{
    struct goldfish_battery_state *s;

    s = (struct goldfish_battery_state *)g_malloc0(sizeof(*s));
    s->dev.name = "goldfish-battery";
    s->dev.base = 0;    // will be allocated dynamically
    s->dev.size = 0x1000;
    s->dev.irq_count = 1;

    // default values for the battery
    s->ac_online = 1;
    s->hw_has_battery = has_battery;
    if (has_battery) {
        s->status = POWER_SUPPLY_STATUS_CHARGING;
        s->health = POWER_SUPPLY_HEALTH_GOOD;
        s->present = 1;     // battery is present
        s->capacity = 50;   // 50% charged
    } else {
        s->status = POWER_SUPPLY_STATUS_NOT_CHARGING;
        s->health = POWER_SUPPLY_HEALTH_DEAD;
        s->present = 0;
        s->capacity = 0;
    }

    battery_state = s;

    goldfish_device_add(&s->dev, goldfish_battery_readfn, goldfish_battery_writefn, s);

    register_savevm(NULL,
                    "battery_state",
                    0,
                    BATTERY_STATE_SAVE_VERSION,
                    goldfish_battery_save,
                    goldfish_battery_load,
                    s);
}
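goldfish_battery_read and goldfish_battery_write are the read and write functions for the virtual device's registers. Given the register struct and a register (that is, an offset), they emulate the register access.

Note that reading BATTERY_INT_STATUS clears any pending interrupt flags: the guest has already seen the new interrupt event, so there is no need to raise the interrupt again.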

static uint32_t goldfish_battery_read(void *opaque, hwaddr offset)
{
    uint32_t ret;
    struct goldfish_battery_state *s = opaque;

    switch(offset) {
        case BATTERY_INT_STATUS:
            // return current buffer status flags
            ret = s->int_status & s->int_enable;
            if (ret) {
                goldfish_device_set_irq(&s->dev, 0, 0);
                s->int_status = 0;
            }
            return ret;

        case BATTERY_INT_ENABLE:
            return s->int_enable;
        case BATTERY_AC_ONLINE:
            return s->ac_online;
        case BATTERY_STATUS:
            return s->status;
        case BATTERY_HEALTH:
            return s->health;
        case BATTERY_PRESENT:
            return s->present;
        case BATTERY_CAPACITY:
            return s->capacity;

        default:
            cpu_abort (cpu_single_env, "goldfish_battery_read: Bad offset %x\n", offset);
            return 0;
    }
}

static void goldfish_battery_write(void *opaque, hwaddr offset, uint32_t val)
{
    struct goldfish_battery_state *s = opaque;

    switch(offset) {
        case BATTERY_INT_ENABLE:
            /* enable interrupts */
            s->int_enable = val;
//            s->int_status = (AUDIO_INT_WRITE_BUFFER_1_EMPTY | AUDIO_INT_WRITE_BUFFER_2_EMPTY);
//            goldfish_device_set_irq(&s->dev, 0, (s->int_status & s->int_enable));
            break;

        default:
            cpu_abort (cpu_single_env, "goldfish_audio_write: Bad offset %x\n", offset);
    }
}
The read and write functions are each registered as an array of three entries, for 8-bit, 16-bit, and 32-bit accesses. Both arrays are passed to goldfish_device_add.

static CPUReadMemoryFunc *goldfish_battery_readfn[] = {
    goldfish_battery_read,
    goldfish_battery_read,
    goldfish_battery_read
};


static CPUWriteMemoryFunc *goldfish_battery_writefn[] = {
    goldfish_battery_write,
    goldfish_battery_write,
    goldfish_battery_write
};
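
The read-to-clear behaviour of BATTERY_INT_STATUS, together with the emulator-side step that raises the interrupt, can be modelled in a few lines. This is a sketch under assumptions: the setter sketch_set_capacity is hypothetical (the emulator code that changes battery properties is not quoted in this article), and irq_level simply stands in for the call to goldfish_device_set_irq.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SKETCH_BATTERY_CHANGED = 1u << 0,
    SKETCH_AC_CHANGED      = 1u << 1,
};

/* Hypothetical model of the interrupt-related part of the device state. */
struct sketch_battery {
    uint32_t int_status;   /* pending change flags */
    uint32_t int_enable;   /* which flags may raise the IRQ */
    int irq_level;         /* 1 while the IRQ line is asserted */
    int capacity;
};

/* Emulator side: a property changed, so record it and update the IRQ line. */
static void sketch_set_capacity(struct sketch_battery *s, int capacity)
{
    assert(capacity >= 0 && capacity <= 100);
    s->capacity = capacity;
    s->int_status |= SKETCH_BATTERY_CHANGED;
    s->irq_level = (s->int_status & s->int_enable) != 0;
}

/* Guest read of INT_STATUS: report enabled flags, then clear and lower the IRQ. */
static uint32_t sketch_read_int_status(struct sketch_battery *s)
{
    uint32_t ret = s->int_status & s->int_enable;
    if (ret) {
        s->irq_level = 0;
        s->int_status = 0;
    }
    return ret;
}
```

A second read right after the first returns 0, which is what lets a handler on a shared interrupt line report that the interrupt was not its own.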


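3.2 The platform bus virtual device

The code for the platform bus virtual device is: http://androidxref.com/5.1.0_r1/xref/external/qemu/hw/android/goldfish/device.c

Note that the platform bus is itself a device and is also in the device list.

In the initialization code, the base, size, irq, and irq_count given to goldfish_device_init and goldfish_device_bus_init are hard-coded and match the kernel code.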

static struct bus_state bus_state = {
    .dev = {
        .name = "goldfish_device_bus",
        .id = -1,
        .base = 0x10001000,
        .size = 0x1000,
        .irq = 1,
        .irq_count = 1,
    }
};

void goldfish_device_init(qemu_irq *pic, uint32_t base, uint32_t size, uint32_t irq, uint32_t irq_count)
{
    goldfish_pic = pic;
    goldfish_free_base = base;
    goldfish_free_irq = irq;
}

int goldfish_device_bus_init(uint32_t base, uint32_t irq)
{
    bus_state.dev.base = base;
    bus_state.dev.irq = irq;

    return goldfish_device_add(&bus_state.dev, goldfish_bus_readfn, goldfish_bus_writefn, &bus_state);
}
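The register write function is goldfish_bus_write. A write of PDEV_BUS_OP_INIT calls goldfish_bus_op_init, which raises an interrupt if the device list is not empty. The kernel's interrupt handler then runs and performs the device enumeration described in the platform bus driver section. The rest is unremarkable.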

static void goldfish_bus_write(void *opaque, hwaddr offset, uint32_t value)
{
    struct bus_state *s = (struct bus_state *)opaque;

    switch(offset) {
        case PDEV_BUS_OP:
            switch(value) {
                case PDEV_BUS_OP_INIT:
                    goldfish_bus_op_init(s);
                    break;
                default:
                    cpu_abort (cpu_single_env, "goldfish_bus_write: Bad PDEV_BUS_OP value %x\n", value);
            };
            break;
        case PDEV_BUS_GET_NAME:
            if(s->current) {
                target_ulong name = (target_ulong)(s->name_addr_high | value);
                safe_memory_rw_debug(current_cpu, name, (void*)s->current->name, strlen(s->current->name), 1);
            }
            break;
        case PDEV_BUS_NAME_ADDR_HIGH:
            s->name_addr_high = ((uint64_t)value << 32);
            goldfish_64bit_guest = 1;
            break;
        default:
            cpu_abort (cpu_single_env, "goldfish_bus_write: Bad offset %x\n", offset);
    }
}
static void goldfish_bus_op_init(struct bus_state *s)
{
    struct goldfish_device *dev = first_device;
    while(dev) {
        dev->reported_state = 0;
        dev = dev->next;
    }
    s->current = NULL;
    goldfish_device_set_irq(&s->dev, 0, first_device != NULL);
}

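The register read function is goldfish_bus_read. Each read of PDEV_BUS_OP advances to the next unreported device, and the return value says whether there is one. The rest is unremarkable.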

static uint32_t goldfish_bus_read(void *opaque, hwaddr offset)
{
    struct bus_state *s = (struct bus_state *)opaque;

    switch (offset) {
        case PDEV_BUS_OP:
            if(s->current) {
                s->current->reported_state = 1;
                s->current = s->current->next;
            }
            else {
                s->current = first_device;
            }
            while(s->current && s->current->reported_state == 1)
                s->current = s->current->next;
            if(s->current)
                return PDEV_BUS_OP_ADD_DEV;
            else {
                goldfish_device_set_irq(&s->dev, 0, 0);
                return PDEV_BUS_OP_DONE;
            }

        case PDEV_BUS_NAME_LEN:
            return s->current ? strlen(s->current->name) : 0;
        case PDEV_BUS_ID:
            return s->current ? s->current->id : 0;
        case PDEV_BUS_IO_BASE:
            return s->current ? s->current->base : 0;
        case PDEV_BUS_IO_SIZE:
            return s->current ? s->current->size : 0;
        case PDEV_BUS_IRQ:
            return s->current ? s->current->irq : 0;
        case PDEV_BUS_IRQ_COUNT:
            return s->current ? s->current->irq_count : 0;
        default:
            cpu_abort (cpu_single_env, "goldfish_bus_read: Bad offset %x\n", offset);
            return 0;
    }
}
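The function that raises interrupts, void goldfish_device_set_irq(struct goldfish_device *dev, int irq, int level), deserves a closer look.

The classic interrupt controller on x86 is the 8259A. The emulator uses a virtual 8259A rather than the one in the host computer, because the emulator has no way to raise interrupt requests on the hardware 8259A. The interrupt initialization code is at: http://androidxref.com/5.1.0_r1/xref/external/qemu/hw/i386/pc.c#1031
There are at most 15 virtual interrupts: two 8259A chips are cascaded, with the slave attached to IRQ2 of the master (each chip has IRQ 0 to 7).
dev is the virtual device. irq is the index of the interrupt within that device: if the device has a single interrupt, irq is 0; if it has two, irq can be 0 or 1. It is not the system-wide interrupt number. A level of 1 raises the interrupt and 0 lowers it (this withdraws the interrupt request; it does not disable the interrupt). goldfish_device_set_irq calls qemu_set_irq, which eventually sets the bit in the virtual 8259A's IRR (interrupt request register) that corresponds to the device's interrupt number (http://androidxref.com/5.1.0_r1/xref/external/qemu/hw/intc/i8259.c#83) to trigger the interrupt. The kernel's interrupt handler then runs: once the interrupt fires, the CPU obtains the interrupt number, looks it up in the interrupt vector table, and jumps to the handler.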

void goldfish_device_set_irq(struct goldfish_device *dev, int irq, int level)
{
    if(irq >= dev->irq_count)
        cpu_abort (cpu_single_env, "goldfish_device_set_irq: Bad irq %d >= %d\n", irq, dev->irq_count);
    else
        qemu_set_irq(goldfish_pic[dev->irq + irq], level);
}

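3.3 goldfish_device_add, the heart of the virtual devices

goldfish_device_add is left until last because it is the most important function. It answers the question of how, when the kernel reads or writes a virtual device register, the emulator knows which virtual device and which virtual register was accessed, and how to emulate that access.

For such an important function it is only a few lines long and delegates to other functions. Briefly: goldfish_add_device_no_io assigns I/O memory and an interrupt number to the new device from the currently free I/O memory address and interrupt number (a non-zero base or irq means it was assigned statically). cpu_register_io_memory maintains three arrays: the read functions (three per device), the write functions (three per device), and the virtual device register structs. They are indexed by io_index, which is allocated dynamically; note that a few io_index values are reserved. cpu_register_physical_memory allocates guest physical memory pages and stores io_index << 3 | subwidth in the phys_offset field of each page's PhysPageDesc struct.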

int goldfish_device_add(struct goldfish_device *dev,
                       CPUReadMemoryFunc **mem_read,
                       CPUWriteMemoryFunc **mem_write,
                       void *opaque)
{
    int iomemtype;
    goldfish_add_device_no_io(dev);
    iomemtype = cpu_register_io_memory(mem_read, mem_write, opaque);
    cpu_register_physical_memory(dev->base, dev->size, iomemtype);
    return 0;
}



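The function that dynamically allocates a virtual device's I/O memory and interrupt number is goldfish_add_device_no_io. Note that a few interrupt numbers are reserved on x86.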

int goldfish_add_device_no_io(struct goldfish_device *dev)
{
    if(dev->base == 0) {
        dev->base = goldfish_free_base;
        goldfish_free_base += dev->size;
    }
    if(dev->irq == 0 && dev->irq_count > 0) {
        dev->irq = goldfish_free_irq;
        goldfish_free_irq += dev->irq_count;
#ifdef TARGET_I386
        /* Make sure that we pass by the reserved IRQs. */
        while (goldfish_free_irq == GFD_KBD_IRQ ||
               goldfish_free_irq == GFD_RTC_IRQ ||
               goldfish_free_irq == GFD_MOUSE_IRQ ||
               goldfish_free_irq == GFD_ERR_IRQ) {
            goldfish_free_irq++;
        }
#endif
        if (goldfish_free_irq >= GFD_MAX_IRQ) {
            derror("Goldfish device has exceeded available IRQ number.");
            exit(1);
        }
    }
    //printf("goldfish_add_device: %s, base %x %x, irq %d %d\n",
    //       dev->name, dev->base, dev->size, dev->irq, dev->irq_count);
    dev->next = NULL;
    if(last_device) {
        last_device->next = dev;
    }
    else {
        first_device = dev;
    }
    last_device = dev;
    return 0;
}



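The function that manages the three arrays is cpu_register_io_memory. io_index is allocated dynamically, one per virtual device, and leads to that device's three read functions, three write functions, and register struct. Note that io_index must be less than IO_MEM_NB_ENTRIES: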

/* MMIO pages are identified by a combination of an IO device index and
   3 flags.  The ROMD code stores the page ram offset in iotlb entry,
   so only a limited number of ids are available.  */

#define IO_MEM_NB_ENTRIES  (1 << (TARGET_PAGE_BITS  - IO_MEM_SHIFT))

The function returns (io_index << 3) | subwidth, where subwidth is set to IO_MEM_SUBWIDTH when any of the three read or write functions is NULL.

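Once io_index and the register offset are known, the virtual device's own read and write functions can be called to access its register struct and emulate the device. How the emulator finds the io_index when the kernel writes a register is analyzed below.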

/* mem_read and mem_write are arrays of functions containing the
   function to access byte (index 0), word (index 1) and dword (index
   2). Functions can be omitted with a NULL function pointer.
   If io_index is non zero, the corresponding io zone is
   modified. If it is zero, a new io zone is allocated. The return
   value can be used with cpu_register_physical_memory(). (-1) is
   returned if error. */
static int cpu_register_io_memory_fixed(int io_index,
                                        CPUReadMemoryFunc * const *mem_read,
                                        CPUWriteMemoryFunc * const *mem_write,
                                        void *opaque)
{
    int i, subwidth = 0;

    if (io_index <= 0) {
        io_index = get_free_io_mem_idx();
        if (io_index == -1)
            return io_index;
    } else {
        io_index >>= IO_MEM_SHIFT;
        if (io_index >= IO_MEM_NB_ENTRIES)
            return -1;
    }

    for(i = 0;i < 3; i++) {
        if (!mem_read[i] || !mem_write[i])
            subwidth = IO_MEM_SUBWIDTH;
        _io_mem_read[io_index][i] = mem_read[i];
        _io_mem_write[io_index][i] = mem_write[i];
    }
    io_mem_opaque[io_index] = opaque;
    return (io_index << IO_MEM_SHIFT) | subwidth;
}

int cpu_register_io_memory(CPUReadMemoryFunc * const *mem_read,
                           CPUWriteMemoryFunc * const *mem_write,
                           void *opaque)
{
    return cpu_register_io_memory_fixed(0, mem_read, mem_write, opaque);
}
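
To make the three arrays concrete, here is a minimal standalone sketch of the same registration and dispatch idea. Every sk_ name, the simplified handler signatures, the tiny table size and the toy device are assumptions made for illustration; only the (io_index << 3) | subwidth encoding and the "slot 0/1/2 = byte/word/dword" convention come from the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_IO_MEM_SHIFT    3
#define SK_IO_MEM_SUBWIDTH 4
#define SK_NB_ENTRIES      8   /* hypothetical, far smaller than the real table */

typedef uint32_t (*sk_read_fn)(void *opaque, uint32_t offset);
typedef void (*sk_write_fn)(void *opaque, uint32_t offset, uint32_t val);

/* The three arrays, all indexed by io_index. */
static sk_read_fn  sk_read_tbl[SK_NB_ENTRIES][3];
static sk_write_fn sk_write_tbl[SK_NB_ENTRIES][3];
static void       *sk_opaque_tbl[SK_NB_ENTRIES];
static int         sk_next_index = 1;  /* index 0 left unused in this sketch */

/* Returns (io_index << 3) | subwidth, or -1 when the table is full. */
static int sk_register(sk_read_fn const *rd, sk_write_fn const *wr, void *opaque)
{
    int i, idx, subwidth = 0;
    if (sk_next_index >= SK_NB_ENTRIES)
        return -1;
    idx = sk_next_index++;
    for (i = 0; i < 3; i++) {
        if (!rd[i] || !wr[i])
            subwidth = SK_IO_MEM_SUBWIDTH;  /* some access width is missing */
        sk_read_tbl[idx][i] = rd[i];
        sk_write_tbl[idx][i] = wr[i];
    }
    sk_opaque_tbl[idx] = opaque;
    return (idx << SK_IO_MEM_SHIFT) | subwidth;
}

static int sk_index_of(int token)
{
    return (token >> SK_IO_MEM_SHIFT) & (SK_NB_ENTRIES - 1);
}

/* slot: 0 = byte, 1 = word, 2 = dword */
static uint32_t sk_io_read(int token, uint32_t offset, int slot)
{
    int idx = sk_index_of(token);
    assert(slot >= 0 && slot < 3 && sk_read_tbl[idx][slot] != NULL);
    return sk_read_tbl[idx][slot](sk_opaque_tbl[idx], offset);
}

static void sk_io_write(int token, uint32_t offset, uint32_t val, int slot)
{
    int idx = sk_index_of(token);
    assert(slot >= 0 && slot < 3 && sk_write_tbl[idx][slot] != NULL);
    sk_write_tbl[idx][slot](sk_opaque_tbl[idx], offset, val);
}

/* A toy device with two 32-bit registers at offsets 0x0 and 0x4. */
struct sk_toy { uint32_t regs[2]; };

static uint32_t sk_toy_read32(void *opaque, uint32_t offset)
{
    struct sk_toy *t = opaque;
    return t->regs[(offset >> 2) & 1];
}

static void sk_toy_write32(void *opaque, uint32_t offset, uint32_t val)
{
    struct sk_toy *t = opaque;
    t->regs[(offset >> 2) & 1] = val;
}
```

Registering the toy device with only the dword handlers filled in (the byte and word slots set to NULL) yields a token with the subwidth bit set. Shifting the token right by 3 recovers the index, and that index selects both the handler and the register struct passed to it.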



The third function, cpu_register_physical_memory, registers the guest-physical address range and stores (io_index << 3) | subwidth in the phys_offset field of the PhysPageDesc struct.

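The physical memory management code is complicated, but only a few points need to be understood. Ordinary RAM is registered page by page, and the low bits of its phys_offset (the bits below the page size) are 0, that is IO_MEM_RAM, which marks the page as ordinary RAM; the page-aligned upper bits hold the RAM offset. IO memory is also registered page by page, and its phys_offset is the (io_index << 3) | subwidth value from above. If the IO memory spans several pages, every page has the same phys_offset (only region_offset differs), so they all lead to the same io_index.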
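
The bit manipulation involved can be sketched in isolation. The helpers below are hypothetical (sk_ prefix), use uint32_t in place of hwaddr and ram_addr_t, and assume 4 KiB pages (a page shift of 12); the expressions themselves mirror the ones used in the emulator code quoted in this section.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_BITS    12                      /* assumed 4 KiB pages */
#define SK_PAGE_SIZE    (1u << SK_PAGE_BITS)
#define SK_PAGE_MASK    (~(SK_PAGE_SIZE - 1))
#define SK_IO_MEM_SHIFT 3
#define SK_IO_MEM_RAM   (0u << SK_IO_MEM_SHIFT)
#define SK_NB_ENTRIES   (1u << (SK_PAGE_BITS - SK_IO_MEM_SHIFT))

/* A page is ordinary RAM when the sub-page bits of phys_offset are 0. */
static int sk_is_ram(uint32_t phys_offset)
{
    return (phys_offset & ~SK_PAGE_MASK) == SK_IO_MEM_RAM;
}

/* For an IO page, the sub-page bits carry (io_index << 3) | flags. */
static uint32_t sk_io_index(uint32_t phys_offset)
{
    return (phys_offset >> SK_IO_MEM_SHIFT) & (SK_NB_ENTRIES - 1);
}

/* Offset handed to the device: offset inside the page plus the page's
   region_offset (0 for the first page, one page size for the second, ...). */
static uint32_t sk_device_offset(uint32_t addr, uint32_t region_offset)
{
    assert((region_offset & ~SK_PAGE_MASK) == 0);  /* always page aligned */
    return (addr & ~SK_PAGE_MASK) + region_offset;
}
```

Under these assumptions SK_NB_ENTRIES evaluates to 512. A phys_offset of 0x5000 is classified as RAM, while (5 << 3) | 4 is classified as IO with index 5. For a two-page device, an access at page offset 0x008 on the second page (region_offset 0x1000) is seen by the device as offset 0x1008.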

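Below are the relevant macro definitions. Note that IO_MEM_ROM, IO_MEM_UNASSIGNED and IO_MEM_NOTDIRTY are io_index values reserved in advance, so get_free_io_mem_idx never hands them out to a device.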

#define TARGET_PAGE_SIZE (1 << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK ~(TARGET_PAGE_SIZE - 1)
#define IO_MEM_SHIFT       3
#define IO_MEM_RAM         (0 << IO_MEM_SHIFT) /* hardcoded offset */
#define IO_MEM_ROM         (1 << IO_MEM_SHIFT) /* hardcoded offset */
#define IO_MEM_UNASSIGNED  (2 << IO_MEM_SHIFT)
#define IO_MEM_NOTDIRTY    (3 << IO_MEM_SHIFT)
/* Acts like a ROM when read and like a device when written.  */
#define IO_MEM_ROMD        (1)
#define IO_MEM_SUBPAGE     (2)
#define IO_MEM_SUBWIDTH    (4)

static inline void cpu_register_physical_memory(hwaddr start_addr,
                                                ram_addr_t size,
                                                ram_addr_t phys_offset)
{
    cpu_register_physical_memory_offset(start_addr, size, phys_offset, 0);
}
static inline void cpu_register_physical_memory_offset(hwaddr start_addr,
                                                       ram_addr_t size,
                                                       ram_addr_t phys_offset,
                                                       ram_addr_t region_offset)
{
    cpu_register_physical_memory_log(start_addr, size, phys_offset,
                                     region_offset, false);
}
void cpu_register_physical_memory_log(hwaddr start_addr,
                                         ram_addr_t size,
                                         ram_addr_t phys_offset,
                                         ram_addr_t region_offset,
                                         bool log_dirty)
{
    hwaddr addr, end_addr;
    PhysPageDesc *p;
    CPUState *cpu;
    ram_addr_t orig_size = size;
    subpage_t *subpage;

    if (kvm_enabled())
        kvm_set_phys_mem(start_addr, size, phys_offset);
#ifdef CONFIG_HAX
    if (hax_enabled())
        hax_set_phys_mem(start_addr, size, phys_offset);
#endif

    if (phys_offset == IO_MEM_UNASSIGNED) {
        region_offset = start_addr;
    }
    region_offset &= TARGET_PAGE_MASK;
    size = (size + TARGET_PAGE_SIZE - 1) & TARGET_PAGE_MASK;
    end_addr = start_addr + (hwaddr)size;

    addr = start_addr;
    do {
        p = phys_page_find(addr >> TARGET_PAGE_BITS);
        if (p && p->phys_offset != IO_MEM_UNASSIGNED) {
            ram_addr_t orig_memory = p->phys_offset;
            hwaddr start_addr2, end_addr2;
            int need_subpage = 0;

            CHECK_SUBPAGE(addr, start_addr, start_addr2, end_addr, end_addr2,
                          need_subpage);
            if (need_subpage) {
                if (!(orig_memory & IO_MEM_SUBPAGE)) {
                    subpage = subpage_init((addr & TARGET_PAGE_MASK),
                                           &p->phys_offset, orig_memory,
                                           p->region_offset);
                } else {
                    subpage = io_mem_opaque[(orig_memory & ~TARGET_PAGE_MASK)
                                            >> IO_MEM_SHIFT];
                }
                subpage_register(subpage, start_addr2, end_addr2, phys_offset,
                                 region_offset);
                p->region_offset = 0;
            } else {
                p->phys_offset = phys_offset;
                if ((phys_offset & ~TARGET_PAGE_MASK) <= IO_MEM_ROM ||
                    (phys_offset & IO_MEM_ROMD))
                    phys_offset += TARGET_PAGE_SIZE;
            }
        } else {
            p = phys_page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
            p->phys_offset = phys_offset;
            p->region_offset = region_offset;
            if ((phys_offset & ~TARGET_PAGE_MASK) <= IO_MEM_ROM ||
                (phys_offset & IO_MEM_ROMD)) {
                phys_offset += TARGET_PAGE_SIZE;
            } else {
                hwaddr start_addr2, end_addr2;
                int need_subpage = 0;

                CHECK_SUBPAGE(addr, start_addr, start_addr2, end_addr,
                              end_addr2, need_subpage);

                if (need_subpage) {
                    subpage = subpage_init((addr & TARGET_PAGE_MASK),
                                           &p->phys_offset, IO_MEM_UNASSIGNED,
                                           addr & TARGET_PAGE_MASK);
                    subpage_register(subpage, start_addr2, end_addr2,
                                     phys_offset, region_offset);
                    p->region_offset = 0;
                }
            }
        }
        region_offset += TARGET_PAGE_SIZE;
        addr += TARGET_PAGE_SIZE;
    } while (addr != end_addr);

    /* since each CPU stores ram addresses in its TLB cache, we must
       reset the modified entries */
    /* XXX: slow ! */
    CPU_FOREACH(cpu) {
        tlb_flush(cpu->env_ptr, 1);
    }
}


With KVM acceleration, a guest read or write to MMIO causes a VM exit with the reason KVM_EXIT_MMIO:

         case KVM_EXIT_MMIO:
             dprintf("handle_mmio\n");
             cpu_physical_memory_rw(run->mmio.phys_addr,
                                    run->mmio.data,
                                    run->mmio.len,
                                    run->mmio.is_write);
             ret = 1;
             break;

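cpu_physical_memory_rw is then called. It first checks whether the address is MMIO; if so, it extracts the io_index and, depending on the access width (8, 16 or 32 bits), calls io_mem_write(io_index, addr1, val, xxx) or io_mem_read(io_index, addr1, xxx). These two functions are wrappers around the three arrays maintained by cpu_register_io_memory. In this way the read/write functions and register struct of the virtual device that owns the register, together with the offset, are used to emulate the register access.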

HAXM and TCG work on a similar principle.

void cpu_physical_memory_rw(hwaddr addr, void *buf,
                            int len, int is_write)
{
    int l, io_index;
    uint8_t *ptr;
    uint32_t val;
    hwaddr page;
    ram_addr_t pd;
    uint8_t* buf8 = (uint8_t*)buf;
    PhysPageDesc *p;

    while (len > 0) {
        page = addr & TARGET_PAGE_MASK;
        l = (page + TARGET_PAGE_SIZE) - addr;
        if (l > len)
            l = len;
        p = phys_page_find(page >> TARGET_PAGE_BITS);
        if (!p) {
            pd = IO_MEM_UNASSIGNED;
        } else {
            pd = p->phys_offset;
        }

        if (is_write) {
            if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
                hwaddr addr1 = addr;
                io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                if (p)
                    addr1 = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
                /* XXX: could force cpu_single_env to NULL to avoid
                   potential bugs */
                if (l >= 4 && ((addr1 & 3) == 0)) {
                    /* 32 bit write access */
                    val = ldl_p(buf8);
                    io_mem_write(io_index, addr1, val, 4);
                    l = 4;
                } else if (l >= 2 && ((addr1 & 1) == 0)) {
                    /* 16 bit write access */
                    val = lduw_p(buf8);
                    io_mem_write(io_index, addr1, val, 2);
                    l = 2;
                } else {
                    /* 8 bit write access */
                    val = ldub_p(buf8);
                    io_mem_write(io_index, addr1, val, 1);
                    l = 1;
                }
            } else {
                ram_addr_t addr1;
                addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
                /* RAM case */
                ptr = qemu_get_ram_ptr(addr1);
                memcpy(ptr, buf8, l);
                invalidate_and_set_dirty(addr1, l);
            }
        } else {
            if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM &&
                !(pd & IO_MEM_ROMD)) {
                hwaddr addr1 = addr;
                /* I/O case */
                io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                if (p)
                    addr1 = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
                if (l >= 4 && ((addr1 & 3) == 0)) {
                    /* 32 bit read access */
                    val = io_mem_read(io_index, addr1, 4);
                    stl_p(buf8, val);
                    l = 4;
                } else if (l >= 2 && ((addr1 & 1) == 0)) {
                    /* 16 bit read access */
                    val = io_mem_read(io_index, addr1, 2);
                    stw_p(buf8, val);
                    l = 2;
                } else {
                    /* 8 bit read access */
                    val = io_mem_read(io_index, addr1, 1);
                    stb_p(buf8, val);
                    l = 1;
                }
            } else {
                /* RAM case */
                ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
                    (addr & ~TARGET_PAGE_MASK);
                memcpy(buf8, ptr, l);
            }
        }
        len -= l;
        buf8 += l;
        addr += l;
    }
}
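
The way the MMIO branch breaks one request into aligned 4-, 2- and 1-byte accesses can be isolated into a small sketch. The function name, the output array and the 4 KiB page size are assumptions for illustration; the sketch only records the chunk sizes and does not touch any device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u   /* assumed 4 KiB pages */

/* Records in widths[] the size of every access that a request of len bytes
   starting at addr is split into. Returns the number of accesses recorded. */
static size_t sk_split_mmio(uint32_t addr, int len, int *widths, size_t max)
{
    size_t n = 0;
    assert(widths != NULL);
    while (len > 0 && n < max) {
        /* Never cross a page boundary within one iteration. */
        uint32_t page = addr & ~(SK_PAGE_SIZE - 1);
        int l = (int)((page + SK_PAGE_SIZE) - addr);
        if (l > len)
            l = len;
        if (l >= 4 && (addr & 3) == 0)
            l = 4;              /* aligned 32-bit access */
        else if (l >= 2 && (addr & 1) == 0)
            l = 2;              /* aligned 16-bit access */
        else
            l = 1;              /* fall back to a byte access */
        widths[n++] = l;
        addr += (uint32_t)l;
        len -= l;
    }
    return n;
}
```

For example, a 7-byte request at 0x1001 becomes a byte access, then a 16-bit access at 0x1002, then a 32-bit access at 0x1004. A 4-byte request at 0x1ffe straddles a page boundary and becomes two 16-bit accesses, one on each page.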


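References:

For writing drivers, see Linux Device Drivers (3rd edition).

For hardware background, see Guo Tianxiang's video lectures on the 51 microcontroller.

Don't read Tan Haoqiang.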

