How to speed up program startup on Windows

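Startup speed matters. How can we improve it? One simple idea does most of the work: the principle of locality. Now that CPUs keep getting faster, the bottleneck is often I/O (a machine with an SSD is noticeably faster than one with a mechanical hard disk), so reducing the number of disk reads a program performs while it runs is an effective way to speed it up. In practice this means reducing hard page faults, the ones that have to go to disk, by getting most of the data the program will use into physical memory ahead of time. That is what the term "pre-reading" (prefetching) refers to.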

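I. What the system does to speed up startup

1. Prefetch

  Whenever a hard page fault occurs while an application is starting, the operating system records which file was accessed and at what position. This information is stored under \Windows\Prefetch; on my machine, for example, it was easy to find a file named "CHROME.EXE-D999B1BA.pf". The next time the program starts, the system first loads the data needed for startup into memory. Because it knows in advance every disk location that will be read, it can read them as sequentially as possible, which reduces the number of I/O operations and saves seek time.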
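
As a rough illustration of where these trace files live, the sketch below lists the .pf files that belong to a given executable. The helper names `ToUpperAscii`, `IsPrefetchFileFor` and `FindPrefetchFiles` are made up for this example, and the name pattern (executable name, a dash, some suffix, then ".pf") is an assumption inferred only from the single file name mentioned above. Reading \Windows\Prefetch may also require elevated rights, so treat this as a sketch rather than a finished tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Upper-cases an ASCII string so names can be compared case-insensitively.
std::string ToUpperAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

// True if `file_name` looks like "<EXE_NAME>-<something>.pf".
// The pattern is an assumption based on "CHROME.EXE-D999B1BA.pf".
bool IsPrefetchFileFor(const std::string& file_name, const std::string& exe_name) {
    assert(!exe_name.empty());
    const std::string name = ToUpperAscii(file_name);
    const std::string prefix = ToUpperAscii(exe_name) + "-";
    const std::string suffix = ".PF";
    return name.size() > prefix.size() + suffix.size() &&
           name.compare(0, prefix.size(), prefix) == 0 &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Lists matching trace files in `prefetch_dir` (hypothetical parameter; on a
// real machine this would be the Prefetch folder under the Windows directory).
// Returns an empty list if the directory is missing or cannot be read.
std::vector<std::string> FindPrefetchFiles(const fs::path& prefetch_dir,
                                           const std::string& exe_name) {
    std::vector<std::string> matches;
    std::error_code ec;  // error-code overloads: no exceptions on access errors
    for (fs::directory_iterator it(prefetch_dir, ec), end; !ec && it != end;
         it.increment(ec)) {
        const std::string file_name = it->path().filename().string();
        if (IsPrefetchFileFor(file_name, exe_name)) matches.push_back(file_name);
    }
    std::sort(matches.begin(), matches.end());
    return matches;
}
```

Calling `FindPrefetchFiles("C:/Windows/Prefetch", "chrome.exe")` from an elevated process would be the typical use; whether a trace exists depends on whether the program has been started on that machine before.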

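2. Spatial locality

  The smallest unit of paging I/O in the operating system is one page, 4 KB. When a page fault triggers disk I/O, the system usually does not fetch only the one page that was requested; it reads the neighbouring data into physical memory as well. If that neighbouring data is exactly what gets accessed next, further I/O is avoided.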
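
To make the effect of reading neighbouring pages concrete, here is a toy model that counts how many disk reads a sequence of page accesses would trigger for a given cluster size. It is only a simulation under simplifying assumptions: `CountDiskReads` is a hypothetical helper, memory is treated as unlimited so nothing is ever evicted, and the cluster sizes used are arbitrary values, not the ones Windows actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Counts how many disk reads a sequence of page accesses triggers when every
// miss brings in `cluster_pages` consecutive pages (aligned to the cluster).
// Toy model: memory is unlimited, so a page is never evicted once loaded.
std::size_t CountDiskReads(const std::vector<std::size_t>& page_accesses,
                           std::size_t cluster_pages) {
    assert(cluster_pages > 0);
    std::set<std::size_t> resident_clusters;  // clusters already in memory
    std::size_t reads = 0;
    for (std::size_t page : page_accesses) {
        const std::size_t cluster = page / cluster_pages;
        // insert() reports whether the cluster was newly added, i.e. a miss.
        if (resident_clusters.insert(cluster).second) {
            ++reads;  // miss: one I/O loads the whole cluster
        }
    }
    return reads;
}
```

With eight consecutive pages, a cluster size of 1 costs eight reads while a cluster size of 8 costs one. With four pages scattered far apart, the cluster size makes no difference: every access is still a separate read. That is the case the rest of this article tries to avoid.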

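3. Caching

  The system's virtual memory manager and cache manager help here as well. After a program exits, the physical memory holding its code and data is generally not all released immediately; it is kept around for a while. If the program is then started a second time, nothing needs to be read from disk again: less I/O, faster startup. This can be called "temporal locality": a user who opened a program and then closed it is likely to open it again.

  The system already does so much to speed up startup that there is little left for us to do. But little is not nothing.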

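II. Speeding up a cold start

  A cold start is the first launch of an application since the operating system booted; a warm start is any later launch of that application since boot.

  To minimise I/O time, ideally all data would already be in physical memory at launch, so that nothing has to be brought in from disk. A warm start can achieve this (though we cannot confirm it, because the system's cache management is transparent to the program). For a warm start there is nothing more we can do to reduce I/O; it is already in good shape. What can be optimised is the cold start, which inevitably triggers a large amount of I/O. How do we reduce the number of I/O operations and the time they take? Many scattered reads are clearly slower than one concentrated read, so the way to cut I/O time is to turn random, scattered reads into concentrated, sequential ones.

  It is actually simple: before the program really starts, read the dynamic libraries it uses once, as if they were ordinary data. After this concentrated read the system holds the file data in physical memory, and by temporal locality it stays there for a while. When the real startup happens, the system no longer has to read from disk on demand, so startup is faster.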
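
The pre-read idea described above can be sketched in portable C++. This is a minimal illustration, not Chromium's implementation: `PreReadFile` and its `chunk_bytes` parameter are hypothetical names, the 1 MB default chunk size is an arbitrary choice, and a production version on Windows would more likely use the native file APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Reads `path` from start to end in large sequential chunks and discards the
// data. The only goal is the side effect: the OS file cache ends up holding
// the file's contents. Returns the number of bytes read (0 if the file cannot
// be opened).
std::uint64_t PreReadFile(const std::string& path,
                          std::size_t chunk_bytes = 1024 * 1024) {
    assert(chunk_bytes > 0);
    std::ifstream file(path, std::ios::binary);
    if (!file) return 0;

    std::vector<char> buffer(chunk_bytes);
    std::uint64_t total = 0;
    while (file) {
        // The last read may be short; gcount() reports the bytes actually read.
        file.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        total += static_cast<std::uint64_t>(file.gcount());
    }
    return total;
}
```

A launcher would call something like `PreReadFile("chrome.dll")` just before loading the library. Whether this helps depends on how much free physical memory the system has, since the cached data can be evicted before it is used.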

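  Chromium does exactly this. The LoadChromeWithDirectory() function in \src\chrome\app\client_util.cc reads chrome.dll once before loading it. The pre-read code is in \src\chrome\app\image_pre_reader_win.cc; it behaves differently on Windows 7 and XP, presumably because support for caching and locality differs between system versions. This part of the Chromium code is well done, and we can use it directly without spending time studying the system's behaviour ourselves.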

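III. How well Chromium's startup acceleration works

  Using Process Monitor to count the ReadFile operations on chrome.dll, I found that pre-reading sometimes does not help: ReadFile is still triggered while the program runs, which is probably related to how much physical memory is currently available. In my tests the best case was opening the Chromium browser right after boot, when the only ReadFile operations on chrome.dll during startup were the pre-reads. The worst case is when system memory usage is high and the system cannot give the Chromium process enough physical memory: after the ReadFile calls finish, page faults may evict the pre-read data from physical memory again, and the pre-read has no effect. Also, the duration of each ReadFile shown in Process Monitor varies: one long read is often followed by several short ones, so the disk itself may also be caching based on locality.

  Chromium also pre-reads on a warm start. That probably has little benefit and could be removed, which might make warm starts faster. How can we tell a cold start from a warm start? One option is a global ATOM. As far as I know, this feature is available only to GUI applications and not to console programs; see MSDN for details. Example code: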

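#include <windows.h>

// Returns true if this is the first launch (a cold start). A global atom
// serves as the marker that the program has been started before; global
// atoms are not deleted automatically when the process exits.
bool IsColdStartUp()
{
    static const wchar_t kAtomName[] = L"cswuyg_test_cold_startup";
    static int nRet = -1;  // -1: not checked yet, 0: warm start, 1: cold start
    if (nRet != -1)
    {
        return nRet == 1;
    }
    nRet = 0;
    // Call the W variants explicitly so the wide string compiles without UNICODE.
    ATOM atom = ::GlobalFindAtomW(kAtomName);
    if (atom == 0)
    {
        // Marker not found: first launch. Leave the marker for later launches.
        nRet = 1;
        ::GlobalAddAtomW(kAtomName);
    }
    return nRet == 1;
}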

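IV. The Process Monitor tool

  Process Monitor showed the ReadFile operations, but the Fast I/O, Paging I/O and Non-cached I/O labels it displays confused me, so I looked up some material. Roughly:

1. Paging I/O means reading from disk on behalf of the virtual memory system.
2. Non-cached generally means the data is not in the cache and has to be read from disk, or that the cache is deliberately bypassed.
3. If the data is in the cache (cached), Fast I/O becomes possible.

  The detailed material follows: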

1. A diagram that makes the meaning fairly clear (the figure comes from the document below)

Source: http://i-web.i.u-tokyo.ac.jp/edu/training/ss/lecture/new-documents/Lectures/15-CacheManager/CacheManager.pdf

2. Terminology

Q25 What is the difference between cached I/O, user non-cached I/O, and paging I/O? 


In a file system or file system filter driver, read and write operations fall into several different categories. For the purpose of discussing them, we normally consider the following types: 

- Cached I/O. This includes normal user I/O, both via the Fast I/O path as well as via the IRP_MJ_READ and IRP_MJ_WRITE path. It also includes the MDL operations (where the caller requests the FSD return an MDL pointing to the data in the cache). 

- Non-cached user I/O. This includes all non-cached I/O operations that originate outside the virtual memory system. 

- Paging I/O. These are I/O operations initiated by the virtual memory system in order to satisfy the needs of the demand paging system. 

Cached I/O is any I/O that can be satisfied by the file system data cache. In such a case, the operation is normally to copy the data from the virtual cache buffer into the user buffer. If the virtual cache buffer contents are resident in memory, the copy is fast and the results returned to the application quickly. If the virtual cache buffer contents are not all resident in memory, then the copy process will trigger a page fault, which generates a second re-entrant I/O operation via the paging mechanism. 

Non-cached user I/O is I/O that must bypass the cache - even if the data is present in the cache. For read operations, the FSD can retrieve the data directly from the storage device without making any changes to the cache. For write operations, however, an FSD must ensure that the cached data is properly invalidated (if this is even possible, which it will not be if the file is also memory mapped). 

Paging I/O is I/O that must be satisfied from the storage device (whether local to the system or located on some "other" computer system) and it is being requested by the virtual memory system as part of the paging mechanism (and hence has special rules that apply to its behavior as well as its serialization).

Source: http://www.osronline.com/article.cfm?article=17#Q25

3. What "Fast I/O DISALLOWED" means

I noticed this "FAST IO DISALLOWED" againest createfile  API used in exe. What does this error mean for.?

It's benign but the explanation is a bit long.
 
Basically, for a few I/O operations there are two ways that a driver can service the request. The first is through a procedural interface where the driver is called with a set of parameters that describe the I/O operation. The other is an interface where the driver receives a packetized description of the I/O operation.
 
The former interface is called the "fast I/O" interface and is entirely optional, the latter interface is the IRP based interface and what most drivers use. A driver may choose to register for both interfaces and in the fast I/O path simply return a code that means, "sorry, can't do it via the fast path, please build me an IRP and call me at my IRP based entry point." This is what you're seeing in the Process Monitor output, someone is returning "no" to the fast I/O path and this results in an IRP being generated and the normal path being taken.

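  Fast I/O is optional: if a driver declines to handle a request on the fast path, the request is reported as DISALLOWED and goes through the normal IRP path instead. So this result cannot be used to judge whether the cache was hit.

Source: http://forum.sysinternals.com/what-is-fast-io-disallowed_topic23154.html

Chapter 5 of the book "Windows NT File System Internals" covers this material.

V. References

1. 《C++應用程序性能優化》 (C++ Application Performance Optimization), 2nd edition, chapters 9 and 10

2. The startup acceleration part of the Chromium source code

3. My earlier reading notes on 《C++應用程序性能優化》: http://www.cnblogs.com/cswuyg/archive/2010/08/27/1809808.html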

