Reposted from: https://www.cnblogs.com/zyly/p/16756273.html#_label0
Contents
- 1. The ONFI standard
- 2. MTD device drivers
- 3. MTD device registration
- 4. mtdblock.c
- 5. mtdchar.c
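1. The ONFI standard
NAND flash is a common storage device in the embedded world. For embedded development it falls into two broad categories, Serial NAND and Raw NAND, and the two differ considerably.
Raw NAND is defined in contrast to Serial NAND. Serial NAND is NAND flash with a serial interface, for example a part that speaks the SPI protocol, whereas Raw NAND is NAND flash with a parallel interface.
We introduce the ONFI protocol first because it comes up when analysing the NAND flash driver source code. The K9F2G08U0C chip used in this series does not support ONFI, which becomes apparent when the commands it supports are compared with the commands defined by ONFI 1.0.
1.1 The ONFI standard
Early Raw NAND had no unified standard. Although Toshiba had published the NAND flash architecture as early as 1989, each vendor designed its Raw NAND chips freely, so package sizes were inconsistent, memory organisation varied widely and interface commands were not interchangeable, all of which made the parts painful for customers to use.
To change this, in 2006 several major Raw NAND vendors (Hynix, Intel, Micron, Phison, Sony, ST) jointly set out to define a Raw NAND standard called the Open NAND Flash Interface, ONFI for short. ONFI 1.0 was released in December 2006. Since then many Raw NAND vendors have designed and produced their parts to the ONFI standard, so for the embedded designer Raw NAND from different ONFI vendors looks almost identical, at least at the driver-code level.
ONFI website: http://www.onfi.org/, where the ONFI specification can be downloaded.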
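1.2 Raw NAND classification
1.2.1 Bits per cell
NAND flash cells can be classified by the number of bits stored per cell:
- Single Level Cell (SLC): stores 1 bit per cell. It offers the most reliable reads and writes and the longest endurance, roughly 90,000 to 100,000 program/erase cycles. Because of its endurance, reliability and overall performance it is popular in the enterprise market, but its high cost per bit and relatively small capacity make it less attractive for consumer products.
- Multi Level Cell (MLC): the name comes from going from SLC's 1 bit per cell to 2 bits per cell. The main advantage is a much lower cost for high-capacity flash; endurance is roughly 3,000 to 10,000 program/erase cycles.
- Triple Level Cell (TLC): stores 3 bits per cell and is the cheapest of these three grades to produce. The high density gives cheap large capacities, but endurance drops sharply to only about 500 to 1,000 program/erase cycles and read/write speed is lower, so it suits ordinary consumer use rather than industrial use.
- Quad Level Cell (QLC): stores 4 bits per cell, a 33% increase in density over TLC (4 bits instead of 3). QLC is quoted at around 1,000 program/erase cycles (comparable to TLC or slightly better), with higher capacity and lower cost.
Conclusion: in endurance and speed, SLC > MLC > TLC.
Most USB flash drives currently use TLC chips: they are cheap, but speed is average and lifetime relatively short.
Among SSDs, MLC drives are currently mainstream, with moderate price and reasonably good speed and endurance, while low-cost SSDs generally use TLC chips. The flash type can be found in the product specifications when buying an SSD.
SLC SSDs appear mainly in high-end products, most of which cost over a thousand yuan or even more.
As for smartphones, most phone storage currently also uses TLC chips; some iPhone 6 units use TLC while others use MLC. Overall, MLC is the current mainstream choice, balanced in speed, endurance and price, and is the one to recommend.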
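1.2.2 Data bus width
The data bus width is either x8 or x16.
1.2.3 Data sampling mode
The data sampling mode is either SDR or DDR.
1.2.4 Interface command standard
The interface command set is either non-standard (vendor-specific) or ONFI.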
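1.3 Raw NAND memory model
ONFI divides a Raw NAND into the following units, from largest to smallest: Device, LUN (Die, Target), Plane, Block, Page, Cell.
- Device: a single NAND flash part, i.e. the packaged chip. One Device contains one or more LUNs.
- LUN (Die, Target): the basic unit that receives and executes flash commands. One LUN contains one or more Planes. (Strictly speaking, in ONFI a Target is the unit selected by one chip-enable signal and contains one or more LUNs; this article treats the terms as equivalent.)
- Plane: one Plane contains multiple Blocks.
- Block: the smallest unit that can be erased, normally made up of multiple Pages.
- Page: the smallest unit that can be programmed and read, typically 2KB in size.
- Cell: the smallest storage element inside a Page. It corresponds to one floating-gate transistor and stores one or more bits.
Page and Block are always present, because the Page is the smallest read/write unit and the Block is the smallest erase unit. LUN and Plane are optional (if absent, treat them as LUN=1, Plane=1) and generally only appear on large-capacity Raw NAND (8Gb and above).
A typical NAND flash has a single chip (LUN) with a single plane. More complex, higher-capacity parts contain several chips, each with several planes. Such a part is essentially several flash dies stacked together behind one controller.
Note: the "chip" here is what was called a LUN above. Any given NAND flash model can of course be called a chip, but what is meant here is the internal view, i.e. how many chips a given NAND flash model contains. For example:
- Samsung's 2GB K9WAG08U1A (the external part/model) contains two 1GB K9K8G08U0A dies, so the K9WAG08U1A is said to contain 2 chips;
- a single chip may in turn contain several planes; the K9K8G08U0A above contains four 2Gb Planes.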
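The hierarchy above maps directly onto a capacity calculation. The following is a minimal sketch, not kernel code: the struct `nand_geometry` and its field names are hypothetical, and the example values (2048-byte pages with 64 bytes of OOB, 64 pages per block, 2048 blocks, one plane, one LUN) are assumed figures for a 2Gbit x8 part and should be checked against the datasheet of the chip actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry descriptor; names are illustrative only. */
struct nand_geometry {
    uint32_t page_size;        /* main-area bytes per page */
    uint32_t oob_size;         /* spare (OOB) bytes per page */
    uint32_t pages_per_block;
    uint32_t blocks_per_plane;
    uint32_t planes_per_lun;
    uint32_t luns_per_device;
};

/* Main-area capacity in bytes (OOB excluded). */
static uint64_t nand_capacity_bytes(const struct nand_geometry *g)
{
    assert(g != NULL);
    return (uint64_t)g->page_size * g->pages_per_block *
           g->blocks_per_plane * g->planes_per_lun * g->luns_per_device;
}

/* Size of one erase block in bytes (OOB excluded). */
static uint32_t nand_block_bytes(const struct nand_geometry *g)
{
    assert(g != NULL);
    return g->page_size * g->pages_per_block;
}

/* Assumed example values for a 2Gbit x8 part. */
static const struct nand_geometry example_2gbit = {
    .page_size = 2048, .oob_size = 64, .pages_per_block = 64,
    .blocks_per_plane = 2048, .planes_per_lun = 1, .luns_per_device = 1,
};
```

With these assumed values the erase block is 128KiB and the device holds 256MiB, i.e. 2Gbit.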
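1.4 Raw NAND signals and packages
ONFI specifies the Raw NAND signal lines and packages. A typical x8 Raw NAND is organised internally as follows.
Besides the memory array there are two other major blocks, the I/O control unit and the control logic unit, and the signal lines attach mainly to these two. An x8 Raw NAND has 15 signal lines (13 of them are required; CE# and R/B# can be left unused).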
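Pin | Description |
CLE | Command latch enable. While CLE is high, the rising edge of WE# latches the I/O input into the command register |
ALE | Address latch enable. While ALE is high, the rising edge of WE# latches the I/O input into the address register |
CE# | Chip enable, active low |
RE# | Read enable, active low |
WE# | Write enable. The rising edge of WE# latches the I/O input into the command, address or data register |
WP# | Write protect |
R/B# | Ready/busy output (low means an operation is still in progress, high means it has completed) |
VCC | Power supply |
VSS | Ground |
NC | Not connected |
I/O0 ~ I/O7 | Data input/output (commands, addresses and data share this bus) |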
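The latching rule in the table (the rising edge of WE# routes the I/O byte to the command, address or data register depending on CLE and ALE) can be modelled in a few lines. This is a purely illustrative software model: `struct nand_bus_model` and both functions are hypothetical names, and real timing constraints are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pin-level model of an x8 Raw NAND input path. */
struct nand_bus_model {
    bool cle, ale;       /* current levels of CLE and ALE */
    bool we_n;           /* current level of WE# (idle high) */
    uint8_t io;          /* value driven on I/O0..I/O7 */
    uint8_t last_cmd;    /* command register */
    uint8_t addr[5];     /* address register, one byte per address cycle */
    int addr_cycles;
    uint8_t last_data;   /* data register */
    int data_cycles;
};

/* Change WE#; on a rising edge, latch the I/O byte as the table describes. */
static void nand_bus_set_we(struct nand_bus_model *b, bool level)
{
    bool rising = !b->we_n && level;

    b->we_n = level;
    if (!rising)
        return;
    assert(!(b->cle && b->ale));    /* CLE and ALE are never high together */
    if (b->cle) {
        b->last_cmd = b->io;
        b->addr_cycles = 0;         /* a new command starts a new address phase */
    } else if (b->ale) {
        if (b->addr_cycles < 5)
            b->addr[b->addr_cycles++] = b->io;
    } else {
        b->last_data = b->io;
        b->data_cycles++;
    }
}

/* One complete write cycle: set CLE/ALE, pull WE# low, drive I/O, release WE#. */
static void nand_bus_write_cycle(struct nand_bus_model *b, bool cle, bool ale,
                                 uint8_t value)
{
    b->cle = cle;
    b->ale = ale;
    nand_bus_set_we(b, false);
    b->io = value;
    nand_bus_set_we(b, true);
}
```

Issuing a cycle with CLE high and then a cycle with ALE high reproduces the command phase followed by one address cycle.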
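ONFI defines many package options, such as TSOP48, LGA52 and BGA63/100/132/152/272/316. For embedded development the most common is the flat TSOP-48 package, which is used for smaller Raw NAND parts (1/2/4/8/16/32Gb). A capacity of 1 to 32Gb is usually enough for an embedded design, and TSOP-48 is easy to lay out on a PCB, which is why it became popular.
1.5 Raw NAND interface commands
ONFI 1.0 defines the Raw NAND interface commands. Some are mandatory (M) and others optional (O). Among the mandatory commands the most frequently used are Read (Read Page), Page Program and Block Erase, which cover the three basic operations of reading, writing and erasing.
Also important are:
- Read Status: returns the execution status and result of a command.
- Read Parameter Page: returns the factory information stored in the chip (memory organisation, features, timing and other behavioural parameters). Its layout is defined by ONFI, so a NAND software driver can read the Parameter Page to keep its code generic.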
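As a hedged illustration of these commands, the sketch below lists the first-cycle/second-cycle opcodes as I understand them from ONFI 1.0, decodes the two most commonly checked status bits, and tests a parameter page for the "ONFI" signature in its first four bytes. The enum and function names are hypothetical; the opcode and bit values should be verified against the specification before use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode values as given in ONFI 1.0 (verify against the specification). */
enum onfi_opcode {
    ONFI_OP_READ            = 0x00,  /* second cycle: 0x30 */
    ONFI_OP_READ_CONFIRM    = 0x30,
    ONFI_OP_PAGE_PROGRAM    = 0x80,  /* second cycle: 0x10 */
    ONFI_OP_PROGRAM_CONFIRM = 0x10,
    ONFI_OP_BLOCK_ERASE     = 0x60,  /* second cycle: 0xD0 */
    ONFI_OP_ERASE_CONFIRM   = 0xD0,
    ONFI_OP_READ_STATUS     = 0x70,
    ONFI_OP_READ_ID         = 0x90,
    ONFI_OP_READ_PARAM_PAGE = 0xEC,
    ONFI_OP_RESET           = 0xFF,
};

#define ONFI_SR_FAIL 0x01   /* bit 0: last program/erase failed */
#define ONFI_SR_RDY  0x40   /* bit 6: device ready for a new command */

static bool onfi_status_ready(uint8_t sr)  { return (sr & ONFI_SR_RDY) != 0; }
static bool onfi_status_failed(uint8_t sr) { return (sr & ONFI_SR_FAIL) != 0; }

/* A valid parameter page starts with the ASCII signature "ONFI". */
static bool onfi_has_signature(const uint8_t *page, size_t len)
{
    assert(page != NULL);
    return len >= 4 && memcmp(page, "ONFI", 4) == 0;
}
```

A chip such as the K9F2G08U0C mentioned earlier would be expected to fail the signature check, which is one way a driver can fall back to ID-table based detection.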
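2. MTD device drivers
MTD (Memory Technology Device) is the Linux subsystem for accessing memory devices (ROM, flash). Its main purpose is to make drivers for new memory devices simpler, and to that end it provides an abstraction layer between the hardware and the upper layers.
2.1 MTD subsystem overview
Before introducing MTD, consider a question: why did the Linux kernel abstract out an MTD subsystem at all?
Recall the steps for writing a block device driver from the previous article:
- call register_blkdev to register the block device major number;
- call alloc_disk to allocate a generic disk object, gendisk;
- call blk_mq_init_sq_queue to initialise a request queue;
- set the gendisk members major, first_minor, disk_name and fops;
- set the gendisk member queue to the request queue initialised earlier;
- set the remaining gendisk members;
- call add_disk to register the gendisk.
If we wrote a block device driver for every model of flash device, we would have to repeat all of these steps each time. So how do the various flash models actually differ? Taking NAND flash as an example, the differences are mainly in the memory model (page size, block size, pages per block, OOB size and so on) and some small variations in timing parameters. Could the parts that are tightly bound to the NAND flash be separated out and supplied by a NAND flash driver layer, with the common parts factored out on their own? That is exactly what the MTD subsystem does.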
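2.2 MTD subsystem framework
The MTD framework can be divided into four layers. From top to bottom they are the device nodes, the MTD device layer, the MTD raw device layer and the flash driver layer.
- Device nodes: MTD block device nodes (major number 31) and MTD character device nodes (major number 90) are created under /dev with mknod. MTD character and block devices are accessed through these nodes.
- MTD device layer: on top of MTD raw devices, Linux defines MTD block devices (major number 31) and character devices (major number 90). Within this layer:
  - mtdchar.c: implements the MTD character device interface;
  - mtdblock.c: implements the MTD block device interface. This part handles device creation, data reads and writes, optimisation and so on. It resembles a traditional block device driver, except that requesting the block device major number, allocating and setting up the gendisk structure and initialising the queue are all done by the kernel automatically.
- MTD raw device layer: the data structure that describes an MTD raw device is mtd_info, which defines a large amount of MTD-related data and operations. Within this layer:
  - mtdcore.c: implements the MTD raw device interface;
  - mtdpart.c: implements the MTD partition interface.
- Flash driver layer: responsible for reading, writing and erasing the flash hardware. NAND flash and NOR flash have different protocols and hardware details; this layer knows what to send (which commands identify, read, write or erase the chip) and how the hardware sends it. NAND flash follows the NAND protocol and NOR flash the NOR protocol, each with its own functions, and the operating environment is built from the corresponding structures and functions. The driver author only needs to allocate, fill in and register the relevant structures of the flash driver layer and establish the mapping from the concrete device to an MTD raw device.
  - NAND flash chip drivers live under drivers/mtd/nand/ and use the nand_chip structure;
  - NOR flash chip drivers live under drivers/mtd/chips/ and use the map_info structure.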
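2.2.1 Flash driver layer
(1) NOR flash driver
The Linux kernel implements generic NOR flash drivers for interface standards such as CFI and JEDEC. Building on these, a chip-level driver is fairly simple: define the memory mapping structure map_info, then call do_map_probe with the interface type.
Taking scb2_flash.c (in drivers/mtd/maps/) as an example:
- define a map_info structure and initialise its members name, size, phys and bankwidth;
- map the member virt (the virtual address) with ioremap;
- call simple_map_init to initialise the map_info member functions read, write, copy_from and copy_to;
- call do_map_probe to probe for a CFI interface; it returns an mtd_info structure;
- register the MTD raw device with parse_mtd_partitions and add_mtd_partitions.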
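(2) NAND flash driver
The Linux kernel implements a generic NAND flash driver (drivers/mtd/nand/raw/nand_base.c); a chip-level driver needs to fill in the nand_chip structure.
MTD uses nand_chip to represent a NAND flash chip. The structure holds the NAND flash memory model, the read/write methods, the ECC mode, the hardware control hooks and other low-level mechanisms.
Taking s3c2410.c (in drivers/mtd/nand/raw) as an example:
- allocate memory for the nand_chip;
- initialise the nand_chip members according to the SoC NAND controller, for example chip->legacy (members write_buf, read_buf, select_chip, cmd_ctrl, dev_ready, IO_ADDR_R, IO_ADDR_W) and chip->controller;
- set chip->priv to the driver's private data;
- call nand_scan() on the nand_chip to probe the NAND flash. nand_scan() reads the NAND chip ID and then:
  - initialises chip->base.mtd (members writesize, oobsize, erasesize and so on);
  - initialises chip->base.memorg (members bits_per_cell, pagesize, oobsize, pages_per_eraseblock, planes_per_lun, luns_per_target, ntargets and so on);
  - initialises chip->options and chip->base.eccreq;
  - initialises the members of chip->ecc (the ECC mode and its handler functions);
  - points every function pointer in chip that is still unset at the default implementation in nand_base.c;
- call mtd_device_register() with the mtd_info and the mtd_partition table to register the MTD device.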
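The "fill in defaults for anything the driver left unset" step can be illustrated outside the kernel. The sketch below is a user-space model only: `struct demo_chip`, `struct demo_hooks` and all function names are hypothetical and merely mimic the pattern the list above attributes to nand_scan().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_chip;

/* Hypothetical hook table, loosely modelled on a subset of nand_legacy. */
struct demo_hooks {
    uint8_t (*read_byte)(struct demo_chip *chip);
    int (*dev_ready)(struct demo_chip *chip);
};

struct demo_chip {
    struct demo_hooks hooks;
    uint8_t fake_data_reg;   /* stands in for a controller data register */
};

/* Generic defaults, playing the role of the nand_base.c implementations. */
static uint8_t default_read_byte(struct demo_chip *chip)
{
    return chip->fake_data_reg;
}

static int default_dev_ready(struct demo_chip *chip)
{
    (void)chip;
    return 1;                /* assume the chip is always ready */
}

/* A board-specific hook that a controller driver might supply. */
static int board_dev_ready(struct demo_chip *chip)
{
    (void)chip;
    return 0;                /* pretend the R/B# line currently reads busy */
}

/* Keep hooks the driver provided; fill in defaults for the rest. */
static void demo_set_defaults(struct demo_chip *chip)
{
    assert(chip != NULL);
    if (!chip->hooks.read_byte)
        chip->hooks.read_byte = default_read_byte;
    if (!chip->hooks.dev_ready)
        chip->hooks.dev_ready = default_dev_ready;
}
```

The design choice this pattern reflects is that a controller driver only overrides what its hardware does differently and inherits everything else.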
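2.3 Core structures
2.3.1 struct mtd_info
The Linux kernel uses the mtd_info structure to represent an MTD raw device. It describes either a whole device or one partition of a multi-partition device, and it defines a large amount of MTD-related data and operations. Older kernels stored all mtd_info structures in the mtd_table array; the kernel version analysed here tracks them in the mtd_idr IDR instead.
mtd_info is defined in include/linux/mtd/mtd.h: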
    struct mtd_info {
        u_char type;                // MTD device type, e.g. MTD_NORFLASH, MTD_NANDFLASH
        uint32_t flags;             // flags, e.g. MTD_WRITEABLE, MTD_NO_ERASE
        uint32_t orig_flags;        /* Flags as before running mtd checks */
        uint64_t size;              // Total size of the MTD

        /* "Major" erase size for the device. Naïve users may take this
         * to be the only erase size available, or may use the more detailed
         * information below if they desire
         */
        uint32_t erasesize;         // erase unit size; for NAND flash this is the block size
        /* Minimal writable flash unit size. In case of NOR flash it is 1 (even
         * though individual bits can be cleared), in case of NAND flash it is
         * one NAND page (or half, or one-fourths of it), in case of ECC-ed NOR
         * it is of ECC block size, etc. It is illegal to have writesize = 0.
         * Any driver registering a struct mtd_info must ensure a writesize of
         * 1 or larger.
         */
        uint32_t writesize;         // minimum write size in bytes: 1 for NOR flash, one page for NAND flash

        /*
         * Size of the write buffer used by the MTD. MTD devices having a write
         * buffer can write multiple writesize chunks at a time. E.g. while
         * writing 4 * writesize bytes to a device with 2 * writesize bytes
         * buffer the MTD driver can (but doesn't have to) do 2 writesize
         * operations, but not 4. Currently, all NANDs have writebufsize
         * equivalent to writesize (NAND page size). Some NOR flashes do have
         * writebufsize greater than writesize.
         */
        uint32_t writebufsize;

        uint32_t oobsize;           // Amount of OOB data per block (e.g. 16)
        uint32_t oobavail;          // Available OOB bytes per block

        /*
         * If erasesize is a power of 2 then the shift is stored in
         * erasesize_shift otherwise erasesize_shift is zero. Ditto writesize.
         */
        unsigned int erasesize_shift;   // log2 of erasesize, derived from erasesize
        unsigned int writesize_shift;   // log2 of writesize, derived from writesize
        /* Masks based on erasesize_shift and writesize_shift */
        unsigned int erasesize_mask;    // erase size mask, derived from erasesize_shift
        unsigned int writesize_mask;    // write size mask, derived from writesize_shift

        /*
         * read ops return -EUCLEAN if max number of bitflips corrected on any
         * one region comprising an ecc step equals or exceeds this value.
         * Settable by driver, else defaults to ecc_strength. User can override
         * in sysfs. N.B. The meaning of the -EUCLEAN return code has changed;
         * see Documentation/ABI/testing/sysfs-class-mtd for more detail.
         */
        unsigned int bitflip_threshold;

        /* Kernel-only stuff starts here. */
        const char *name;           // MTD device name
        int index;                  // index

        /* OOB layout description */
        const struct mtd_ooblayout_ops *ooblayout;

        /* NAND pairing scheme, only provided for MLC/TLC NANDs */
        const struct mtd_pairing_scheme *pairing;

        /* the ecc step size. */
        unsigned int ecc_step_size;

        /* max number of correctible bit errors per ecc step */
        unsigned int ecc_strength;

        /* Data for variable erase regions. If numeraseregions is zero,
         * it means that the whole device has erasesize as given above.
         */
        int numeraseregions;        // number of variable erase regions (0 if the erase size is uniform)
        struct mtd_erase_region_info *eraseregions;    // variable erase regions

        /*
         * Do not call via these pointers, use corresponding mtd_*()
         * wrappers instead.
         */
        int (*_erase) (struct mtd_info *mtd, struct erase_info *instr);    // erase
        int (*_point) (struct mtd_info *mtd, loff_t from, size_t len,
                       size_t *retlen, void **virt, resource_size_t *phys);
        int (*_unpoint) (struct mtd_info *mtd, loff_t from, size_t len);
        int (*_read) (struct mtd_info *mtd, loff_t from, size_t len,       // read
                      size_t *retlen, u_char *buf);
        int (*_write) (struct mtd_info *mtd, loff_t to, size_t len,        // write
                       size_t *retlen, const u_char *buf);
        int (*_panic_write) (struct mtd_info *mtd, loff_t to, size_t len,
                             size_t *retlen, const u_char *buf);
        int (*_read_oob) (struct mtd_info *mtd, loff_t from,
                          struct mtd_oob_ops *ops);
        int (*_write_oob) (struct mtd_info *mtd, loff_t to,
                           struct mtd_oob_ops *ops);
        int (*_get_fact_prot_info) (struct mtd_info *mtd, size_t len,
                                    size_t *retlen, struct otp_info *buf);
        int (*_read_fact_prot_reg) (struct mtd_info *mtd, loff_t from,
                                    size_t len, size_t *retlen, u_char *buf);
        int (*_get_user_prot_info) (struct mtd_info *mtd, size_t len,
                                    size_t *retlen, struct otp_info *buf);
        int (*_read_user_prot_reg) (struct mtd_info *mtd, loff_t from,
                                    size_t len, size_t *retlen, u_char *buf);
        int (*_write_user_prot_reg) (struct mtd_info *mtd, loff_t to,
                                     size_t len, size_t *retlen, u_char *buf);
        int (*_lock_user_prot_reg) (struct mtd_info *mtd, loff_t from,
                                    size_t len);
        int (*_writev) (struct mtd_info *mtd, const struct kvec *vecs,
                        unsigned long count, loff_t to, size_t *retlen);
        void (*_sync) (struct mtd_info *mtd);
        int (*_lock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
        int (*_unlock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
        int (*_is_locked) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
        int (*_block_isreserved) (struct mtd_info *mtd, loff_t ofs);
        int (*_block_isbad) (struct mtd_info *mtd, loff_t ofs);
        int (*_block_markbad) (struct mtd_info *mtd, loff_t ofs);
        int (*_max_bad_blocks) (struct mtd_info *mtd, loff_t ofs, size_t len);
        int (*_suspend) (struct mtd_info *mtd);
        void (*_resume) (struct mtd_info *mtd);
        void (*_reboot) (struct mtd_info *mtd);
        /*
         * If the driver is something smart, like UBI, it may need to maintain
         * its own reference counting. The below functions are only for driver.
         */
        int (*_get_device) (struct mtd_info *mtd);
        void (*_put_device) (struct mtd_info *mtd);

        struct notifier_block reboot_notifier;    /* default mode before reboot */

        /* ECC status information */
        struct mtd_ecc_stats ecc_stats;
        /* Subpage shift (NAND) */
        int subpage_sft;

        void *priv;

        struct module *owner;
        struct device dev;
        int usecount;
        struct mtd_debug_info dbg;
        struct nvmem_device *nvmem;
    };
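The _read(), _write(), _read_oob(), _write_oob() and _erase() members of mtd_info are the main functions an MTD device driver has to implement; they form the interface between the MTD raw device layer and the flash driver layer. Linux already provides a set of mtd_info member functions that suits most flash devices.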
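The comment in the structure says callers should go through the mtd_*() wrappers rather than the raw pointers. The sketch below shows, in user space, what such a wrapper typically adds: range checking before dispatching to the driver hook. It is modelled loosely on the idea of mtd_read(), not copied from it; `struct demo_mtd`, `demo_mtd_read()` and the RAM-backed `ram_read()` driver are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct mtd_info. */
struct demo_mtd {
    uint64_t size;
    int (*_read)(struct demo_mtd *mtd, uint64_t from, size_t len,
                 size_t *retlen, uint8_t *buf);
    void *priv;              /* driver-private data */
};

/* Wrapper: validate the request, then call the driver hook. */
static int demo_mtd_read(struct demo_mtd *mtd, uint64_t from, size_t len,
                         size_t *retlen, uint8_t *buf)
{
    assert(mtd != NULL && retlen != NULL);
    *retlen = 0;
    if (from > mtd->size || len > mtd->size - from)
        return -EINVAL;      /* request runs past the end of the device */
    if (len == 0)
        return 0;
    if (!mtd->_read)
        return -EOPNOTSUPP;  /* driver does not implement plain reads */
    return mtd->_read(mtd, from, len, retlen, buf);
}

/* Toy "flash driver": the device is just a RAM buffer behind priv. */
static int ram_read(struct demo_mtd *mtd, uint64_t from, size_t len,
                    size_t *retlen, uint8_t *buf)
{
    memcpy(buf, (const uint8_t *)mtd->priv + from, len);
    *retlen = len;
    return 0;
}
```

Because the checks live in the wrapper, each driver hook can assume the offset and length are already within the device.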
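2.3.2 struct mtd_part
MTD uses mtd_part to represent a partition. It embeds an mtd_info, so every partition is treated as an MTD raw device in its own right and is registered like one. Most of the data in mtd_part.mtd is taken from the partition's parent device, mtd_part->parent. The master device itself is not registered as an MTD raw device unless CONFIG_MTD_PARTITIONED_MASTER is set.
mtd_part is defined in drivers/mtd/mtdpart.c: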
    /**
     * struct mtd_part - our partition node structure
     *
     * @mtd: struct holding partition details
     * @parent: parent mtd - flash device or another partition
     * @offset: partition offset relative to the *flash device*
     */
    struct mtd_part {
        struct mtd_info mtd;        // partition information
        struct mtd_info *parent;    // parent device of this partition
        uint64_t offset;            // offset of the partition
        struct list_head list;      // list node linking all mtd_part structures together
    };
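Because mtd_part embeds its mtd_info as a member, code that is handed the partition's mtd_info can recover the enclosing mtd_part and translate a partition-relative offset into an offset on the parent device. The following is a user-space sketch of that idea with hypothetical names (`demo_container_of`, `struct demo_part`); the kernel uses its own container_of() macro for the same purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_mtd {
    uint64_t size;
};

/* Hypothetical reduced version of struct mtd_part. */
struct demo_part {
    struct demo_mtd mtd;        /* embedded: the partition as an MTD device */
    struct demo_mtd *parent;    /* device (or partition) it is carved from */
    uint64_t offset;            /* start of the partition within the parent */
};

/* Recover the partition node from a pointer to its embedded mtd member. */
static struct demo_part *demo_mtd_to_part(struct demo_mtd *mtd)
{
    return demo_container_of(mtd, struct demo_part, mtd);
}

/* Translate a partition-relative offset into a parent-relative one. */
static int demo_part_to_parent_addr(struct demo_mtd *mtd, uint64_t ofs,
                                    uint64_t *parent_ofs)
{
    struct demo_part *part = demo_mtd_to_part(mtd);

    assert(parent_ofs != NULL);
    if (ofs >= mtd->size)
        return -1;              /* outside the partition */
    *parent_ofs = part->offset + ofs;
    return 0;
}
```

This is why a partition's read or erase hook only needs the partition's mtd_info pointer: everything else is reachable from it.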
2.3.3 struct mtd_partition
MTD uses mtd_partition to describe a partition. mtd_partition is defined in include/linux/mtd/partitions.h:
    /*
     * Partition definition structure:
     *
     * An array of struct partition is passed along with a MTD object to
     * mtd_device_register() to create them.
     *
     * For each partition, these fields are available:
     * name: string that will be used to label the partition's MTD device.
     * types: some partitions can be containers using specific format to describe
     *    embedded subpartitions / volumes. E.g. many home routers use "firmware"
     *    partition that contains at least kernel and rootfs. In such case an
     *    extra parser is needed that will detect these dynamic partitions and
     *    report them to the MTD subsystem. If set this property stores an array
     *    of parser names to use when looking for subpartitions.
     * size: the partition size; if defined as MTDPART_SIZ_FULL, the partition
     *    will extend to the end of the master MTD device.
     * offset: absolute starting position within the master MTD device; if
     *    defined as MTDPART_OFS_APPEND, the partition will start where the
     *    previous one ended; if MTDPART_OFS_NXTBLK, at the next erase block;
     *    if MTDPART_OFS_RETAIN, consume as much as possible, leaving size
     *    after the end of partition.
     * mask_flags: contains flags that have to be masked (removed) from the
     *    master MTD flag set for the corresponding MTD partition.
     *    For example, to force a read-only partition, simply adding
     *    MTD_WRITEABLE to the mask_flags will do the trick.
     *
     * Note: writeable partitions require their size and offset be
     * erasesize aligned (e.g. use MTDPART_OFS_NEXTBLK).
     */
    struct mtd_partition {
        const char *name;           /* identifier string (partition name) */
        const char *const *types;   /* names of parsers to use if any */
        uint64_t size;              /* partition size */
        uint64_t offset;            /* offset within the master MTD space */
        uint32_t mask_flags;        /* master MTD flags to mask out for this partition */
        struct device_node *of_node;
    };
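The size and offset rules in that comment can be made concrete with a small resolver. This is a simplified user-space sketch, not the kernel's allocate_partition(): only the "append" offset and "full" size cases are handled, the `DEMO_` constants mirror what I believe are the kernel values of MTDPART_OFS_APPEND (-1) and MTDPART_SIZ_FULL (0), the struct is hypothetical, and out-of-range entries are rejected with an error, which is an assumption of this sketch rather than kernel behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_OFS_APPEND ((uint64_t)-1)   /* start where the previous one ended */
#define DEMO_SIZ_FULL   ((uint64_t)0)    /* extend to the end of the master */

/* Hypothetical reduced version of struct mtd_partition. */
struct demo_partition {
    const char *name;
    uint64_t size;
    uint64_t offset;
};

/* Replace the symbolic size/offset values with concrete numbers, in place. */
static int demo_resolve_partitions(struct demo_partition *parts, int nbparts,
                                   uint64_t master_size)
{
    uint64_t cur_offset = 0;

    assert(parts != NULL);
    for (int i = 0; i < nbparts; i++) {
        if (parts[i].offset == DEMO_OFS_APPEND)
            parts[i].offset = cur_offset;
        if (parts[i].offset > master_size)
            return -1;          /* partition starts beyond the device */
        if (parts[i].size == DEMO_SIZ_FULL)
            parts[i].size = master_size - parts[i].offset;
        if (parts[i].size > master_size - parts[i].offset)
            return -1;          /* partition runs past the device */
        cur_offset = parts[i].offset + parts[i].size;
    }
    return 0;
}
```

For a hypothetical 256MiB device with the layout bootloader (256KiB), params (128KiB), kernel (5MiB) and rootfs (rest), the rootfs entry resolves to offset 0x560000 and size 0xFAA0000.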
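2.3.4 struct nand_chip
nand_chip is one of the more important data structures. MTD uses it to represent a chip inside a NAND flash; it holds the NAND flash memory model, the read/write methods, the ECC mode, the hardware control hooks and other low-level mechanisms. It is defined in include/linux/mtd/rawnand.h: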
    /**
     * struct nand_chip - NAND Private Flash Chip Data
     * @base:               Inherit from the generic NAND device
     * @legacy:             All legacy fields/hooks. If you develop a new driver,
     *                      don't even try to use any of these fields/hooks, and if
     *                      you're modifying an existing driver that is using those
     *                      fields/hooks, you should consider reworking the driver
     *                      avoid using them.
     * @setup_read_retry:   [FLASHSPECIFIC] flash (vendor) specific function for
     *                      setting the read-retry mode. Mostly needed for MLC NAND.
     * @ecc:                [BOARDSPECIFIC] ECC control structure
     * @buf_align:          minimum buffer alignment required by a platform
     * @oob_poi:            "poison value buffer," used for laying out OOB data
     *                      before writing
     * @page_shift:         [INTERN] number of address bits in a page (column
     *                      address bits).
     * @phys_erase_shift:   [INTERN] number of address bits in a physical eraseblock
     * @bbt_erase_shift:    [INTERN] number of address bits in a bbt entry
     * @chip_shift:         [INTERN] number of address bits in one chip
     * @options:            [BOARDSPECIFIC] various chip options. They can partly
     *                      be set to inform nand_scan about special functionality.
     *                      See the defines for further explanation.
     * @bbt_options:        [INTERN] bad block specific options. All options used
     *                      here must come from bbm.h. By default, these options
     *                      will be copied to the appropriate nand_bbt_descr's.
     * @badblockpos:        [INTERN] position of the bad block marker in the oob
     *                      area.
     * @badblockbits:       [INTERN] minimum number of set bits in a good block's
     *                      bad block marker position; i.e., BBM == 11110111b is
     *                      not bad when badblockbits == 7
     * @onfi_timing_mode_default: [INTERN] default ONFI timing mode. This field is
     *                      set to the actually used ONFI mode if the chip is
     *                      ONFI compliant or deduced from the datasheet if
     *                      the NAND chip is not ONFI compliant.
     * @pagemask:           [INTERN] page number mask = number of (pages / chip) - 1
     * @data_buf:           [INTERN] buffer for data, size is (page size + oobsize).
     * @pagecache:          Structure containing page cache related fields
     * @pagecache.bitflips: Number of bitflips of the cached page
     * @pagecache.page:     Page number currently in the cache. -1 means no page is
     *                      currently cached
     * @subpagesize:        [INTERN] holds the subpagesize
     * @id:                 [INTERN] holds NAND ID
     * @parameters:         [INTERN] holds generic parameters under an easily
     *                      readable form.
     * @data_interface:     [INTERN] NAND interface timing information
     * @cur_cs:             currently selected target. -1 means no target selected,
     *                      otherwise we should always have cur_cs >= 0 &&
     *                      cur_cs < nanddev_ntargets(). NAND Controller drivers
     *                      should not modify this value, but they're allowed to
     *                      read it.
     * @read_retries:       [INTERN] the number of read retry modes supported
     * @lock:               lock protecting the suspended field. Also used to
     *                      serialize accesses to the NAND device.
     * @suspended:          set to 1 when the device is suspended, 0 when it's not.
     * @bbt:                [INTERN] bad block table pointer
     * @bbt_td:             [REPLACEABLE] bad block table descriptor for flash
     *                      lookup.
     * @bbt_md:             [REPLACEABLE] bad block table mirror descriptor
     * @badblock_pattern:   [REPLACEABLE] bad block scan pattern used for initial
     *                      bad block scan.
     * @controller:         [REPLACEABLE] a pointer to a hardware controller
     *                      structure which is shared among multiple independent
     *                      devices.
     * @priv:               [OPTIONAL] pointer to private chip data
     * @manufacturer:       [INTERN] Contains manufacturer information
     * @manufacturer.desc:  [INTERN] Contains manufacturer's description
     * @manufacturer.priv:  [INTERN] Contains manufacturer private information
     */
    struct nand_chip {
        struct nand_device base;        // can be seen as a subclass of mtd_info
        struct nand_legacy legacy;      // hardware access functions

        int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);

        unsigned int options;           // chip-specific options such as NAND_BUSWIDTH_16
        unsigned int bbt_options;

        int page_shift;                 // log2 of the page size, e.g. 9 for a chip with 512-byte pages
        int phys_erase_shift;           // log2 of the erase size, e.g. 14 for a chip that erases 16KB at a time (normally one block)
        int bbt_erase_shift;            // log2 of the bad block table size; the bbt normally occupies one block, so this usually equals phys_erase_shift
        int chip_shift;                 // chip capacity expressed as a number of address bits
        int pagemask;                   // page mask: total capacity / bytes per page - 1
        u8 *data_buf;

        struct {
            unsigned int bitflips;
            int page;
        } pagecache;

        int subpagesize;
        int onfi_timing_mode_default;
        unsigned int badblockpos;
        int badblockbits;

        struct nand_id id;              // device ID read from the NAND: manufacturer ID, device ID, etc.
        struct nand_parameters parameters;

        struct nand_data_interface data_interface;

        int cur_cs;                     // currently selected target

        int read_retries;

        struct mutex lock;
        unsigned int suspended : 1;

        uint8_t *oob_poi;
        struct nand_controller *controller;    // NAND controller

        struct nand_ecc_ctrl ecc;       // ECC control structure with the ECC-related functions
        unsigned long buf_align;

        uint8_t *bbt;
        struct nand_bbt_descr *bbt_td;
        struct nand_bbt_descr *bbt_md;

        struct nand_bbt_descr *badblock_pattern;

        void *priv;

        struct {
            const struct nand_manufacturer *desc;
            void *priv;
        } manufacturer;                 // manufacturer information
    };
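The ecc member of nand_chip handles ECC-related operations such as read_page_raw and write_page_raw, and contains a large number of functions for ECC checking.
The read/write functions in the legacy member of nand_chip, such as read_buf and cmdfunc, depend on the specific NAND controller. They talk to the hardware, so they normally have to be implemented by the driver author for the SoC's NAND controller.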
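The shift and mask fields of nand_chip exist so that a byte address can be split into a page number and a column with shifts and masks instead of divisions. A minimal sketch, assuming power-of-two sizes; `struct demo_chip_shifts` and the function names are hypothetical and only mirror the fields described in the structure above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the shift/mask fields of struct nand_chip. */
struct demo_chip_shifts {
    int page_shift;          /* log2(page size) */
    int phys_erase_shift;    /* log2(erase block size) */
    int chip_shift;          /* log2(chip size) */
    uint32_t pagemask;       /* (pages per chip) - 1 */
};

/* log2 of a power-of-two value. */
static int demo_log2(uint64_t v)
{
    int shift = 0;

    assert(v != 0 && (v & (v - 1)) == 0);
    while (v >>= 1)
        shift++;
    return shift;
}

static struct demo_chip_shifts demo_compute_shifts(uint32_t writesize,
                                                   uint32_t erasesize,
                                                   uint64_t chipsize)
{
    struct demo_chip_shifts s;

    s.page_shift = demo_log2(writesize);
    s.phys_erase_shift = demo_log2(erasesize);
    s.chip_shift = demo_log2(chipsize);
    s.pagemask = (uint32_t)((chipsize >> s.page_shift) - 1);
    return s;
}

/* Page number (row address) of a byte offset within the chip. */
static uint32_t demo_addr_to_page(const struct demo_chip_shifts *s, uint64_t addr)
{
    return (uint32_t)(addr >> s->page_shift) & s->pagemask;
}

/* Column (byte offset inside the page) of a byte offset within the chip. */
static uint32_t demo_addr_to_column(const struct demo_chip_shifts *s, uint64_t addr)
{
    return (uint32_t)(addr & ((1u << s->page_shift) - 1));
}
```

For a chip with 2KiB pages, 128KiB blocks and 256MiB capacity this gives page_shift 11, phys_erase_shift 17, chip_shift 28 and pagemask 0x1FFFF.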
2.3.5 struct nand_legacy
The nand_legacy structure holds the functions that depend on the SoC NAND controller hardware:
    /**
     * struct nand_legacy - NAND chip legacy fields/hooks
     * @IO_ADDR_R: address to read the 8 I/O lines of the flash device
     * @IO_ADDR_W: address to write the 8 I/O lines of the flash device
     * @select_chip: select/deselect a specific target/die
     * @read_byte: read one byte from the chip
     * @write_byte: write a single byte to the chip on the low 8 I/O lines
     * @write_buf: write data from the buffer to the chip
     * @read_buf: read data from the chip into the buffer
     * @cmd_ctrl: hardware specific function for controlling ALE/CLE/nCE. Also used
     *            to write command and address
     * @cmdfunc: hardware specific function for writing commands to the chip.
     * @dev_ready: hardware specific function for accessing device ready/busy line.
     *             If set to NULL no access to ready/busy is available and the
     *             ready/busy information is read from the chip status register.
     * @waitfunc: hardware specific function for wait on ready.
     * @block_bad: check if a block is bad, using OOB markers
     * @block_markbad: mark a block bad
     * @set_features: set the NAND chip features
     * @get_features: get the NAND chip features
     * @chip_delay: chip dependent delay for transferring data from array to read
     *              regs (tR).
     * @dummy_controller: dummy controller implementation for drivers that can
     *                    only control a single chip
     *
     * If you look at this structure you're already wrong. These fields/hooks are
     * all deprecated.
     */
    struct nand_legacy {
        void __iomem *IO_ADDR_R;    // address for reading the 8 I/O lines; on the S3C2440 this is the data register NFDATA
        void __iomem *IO_ADDR_W;    // address for writing the 8 I/O lines; on the S3C2440 this is the data register NFDATA
        void (*select_chip)(struct nand_chip *chip, int cs);                     // select/deselect the chip
        u8 (*read_byte)(struct nand_chip *chip);                                 // read one byte
        void (*write_byte)(struct nand_chip *chip, u8 byte);                     // write one byte
        void (*write_buf)(struct nand_chip *chip, const u8 *buf, int len);       // write len bytes
        void (*read_buf)(struct nand_chip *chip, u8 *buf, int len);              // read len bytes
        void (*cmd_ctrl)(struct nand_chip *chip, int dat, unsigned int ctrl);    // hardware control: write a command or address byte
        void (*cmdfunc)(struct nand_chip *chip, unsigned command, int column,    // send a command with its column and page address
                        int page_addr);
        int (*dev_ready)(struct nand_chip *chip);                                // read the NAND state: busy/ready
        int (*waitfunc)(struct nand_chip *chip);                                 // wait for the NAND to become ready
        int (*block_bad)(struct nand_chip *chip, loff_t ofs);                    // check whether a block is bad
        int (*block_markbad)(struct nand_chip *chip, loff_t ofs);                // mark a block bad
        int (*set_features)(struct nand_chip *chip, int feature_addr,
                            u8 *subfeature_para);
        int (*get_features)(struct nand_chip *chip, int feature_addr,
                            u8 *subfeature_para);
        int chip_delay;                                                          // delay time
        struct nand_controller dummy_controller;
    };
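To show what a cmd_ctrl hook typically does, here is a user-space sketch with the controller registers replaced by a plain struct. The `DEMO_NAND_*` constants are assumed to mirror the kernel's NAND_CLE, NAND_ALE and NAND_CMD_NONE values; `struct demo_nfc_regs` and `demo_cmd_ctrl()` are hypothetical, and a real driver would use writeb() on ioremapped controller registers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NAND_CLE       0x02    /* byte is a command */
#define DEMO_NAND_ALE       0x04    /* byte is an address cycle */
#define DEMO_NAND_CMD_NONE  (-1)    /* nothing to send, only control lines change */

/* Stand-in for memory-mapped controller registers (hypothetical layout). */
struct demo_nfc_regs {
    uint8_t nfcmd;       /* command register */
    uint8_t nfaddr;      /* address register */
    int cmd_writes;      /* bookkeeping for the sketch only */
    int addr_writes;
};

/* Route the byte to the command or address register depending on ctrl. */
static void demo_cmd_ctrl(struct demo_nfc_regs *regs, int dat, unsigned int ctrl)
{
    assert(regs != NULL);
    if (dat == DEMO_NAND_CMD_NONE)
        return;
    if (ctrl & DEMO_NAND_CLE) {
        regs->nfcmd = (uint8_t)dat;
        regs->cmd_writes++;
    } else if (ctrl & DEMO_NAND_ALE) {
        regs->nfaddr = (uint8_t)dat;
        regs->addr_writes++;
    }
}
```

On controllers with dedicated command and address registers the hook needs no explicit CLE/ALE pin handling, because the controller drives those pins itself when the corresponding register is written.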
2.3.6 struct nand_ecc_ctrl
The read/write functions in nand_ecc_ctrl, such as read_page_raw and write_page_raw, handle ECC-related operations:
    /**
     * struct nand_ecc_ctrl - Control structure for ECC
     * @mode:       ECC mode
     * @algo:       ECC algorithm
     * @steps:      number of ECC steps per page
     * @size:       data bytes per ECC step
     * @bytes:      ECC bytes per step
     * @strength:   max number of correctible bits per ECC step
     * @total:      total number of ECC bytes per page
     * @prepad:     padding information for syndrome based ECC generators
     * @postpad:    padding information for syndrome based ECC generators
     * @options:    ECC specific options (see NAND_ECC_XXX flags defined above)
     * @priv:       pointer to private ECC control data
     * @calc_buf:   buffer for calculated ECC, size is oobsize.
     * @code_buf:   buffer for ECC read from flash, size is oobsize.
     * @hwctl:      function to control hardware ECC generator. Must only
     *              be provided if an hardware ECC is available
     * @calculate:  function for ECC calculation or readback from ECC hardware
     * @correct:    function for ECC correction, matching to ECC generator (sw/hw).
     *              Should return a positive number representing the number of
     *              corrected bitflips, -EBADMSG if the number of bitflips exceed
     *              ECC strength, or any other error code if the error is not
     *              directly related to correction.
     *              If -EBADMSG is returned the input buffers should be left
     *              untouched.
     * @read_page_raw:  function to read a raw page without ECC. This function
     *              should hide the specific layout used by the ECC
     *              controller and always return contiguous in-band and
     *              out-of-band data even if they're not stored
     *              contiguously on the NAND chip (e.g.
     *              NAND_ECC_HW_SYNDROME interleaves in-band and
     *              out-of-band data).
     * @write_page_raw: function to write a raw page without ECC. This function
     *              should hide the specific layout used by the ECC
     *              controller and consider the passed data as contiguous
     *              in-band and out-of-band data. ECC controller is
     *              responsible for doing the appropriate transformations
     *              to adapt to its specific layout (e.g.
     *              NAND_ECC_HW_SYNDROME interleaves in-band and
     *              out-of-band data).
     * @read_page:  function to read a page according to the ECC generator
     *              requirements; returns maximum number of bitflips corrected in
     *              any single ECC step, -EIO hw error
     * @read_subpage:   function to read parts of the page covered by ECC;
     *              returns same as read_page()
     * @write_subpage:  function to write parts of the page covered by ECC.
     * @write_page: function to write a page according to the ECC generator
     *              requirements.
     * @write_oob_raw:  function to write chip OOB data without ECC
     * @read_oob_raw:   function to read chip OOB data without ECC
     * @read_oob:   function to read chip OOB data
     * @write_oob:  function to write chip OOB data
     */
    struct nand_ecc_ctrl {
        nand_ecc_modes_t mode;
        enum nand_ecc_algo algo;
        int steps;
        int size;
        int bytes;
        int total;
        int strength;
        int prepad;
        int postpad;
        unsigned int options;
        void *priv;
        u8 *calc_buf;
        u8 *code_buf;
        void (*hwctl)(struct nand_chip *chip, int mode);
        int (*calculate)(struct nand_chip *chip, const uint8_t *dat,
                         uint8_t *ecc_code);
        int (*correct)(struct nand_chip *chip, uint8_t *dat, uint8_t *read_ecc,
                       uint8_t *calc_ecc);
        int (*read_page_raw)(struct nand_chip *chip, uint8_t *buf,
                             int oob_required, int page);
        int (*write_page_raw)(struct nand_chip *chip, const uint8_t *buf,
                              int oob_required, int page);
        int (*read_page)(struct nand_chip *chip, uint8_t *buf,
                         int oob_required, int page);
        int (*read_subpage)(struct nand_chip *chip, uint32_t offs,
                            uint32_t len, uint8_t *buf, int page);
        int (*write_subpage)(struct nand_chip *chip, uint32_t offset,
                             uint32_t data_len, const uint8_t *data_buf,
                             int oob_required, int page);
        int (*write_page)(struct nand_chip *chip, const uint8_t *buf,
                          int oob_required, int page);
        int (*write_oob_raw)(struct nand_chip *chip, int page);
        int (*read_oob_raw)(struct nand_chip *chip, int page);
        int (*read_oob)(struct nand_chip *chip, int page);
        int (*write_oob)(struct nand_chip *chip, int page);
    };
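The relationship between the size, bytes, steps and total fields follows from their descriptions: a page is processed in steps of size data bytes, each step produces bytes ECC bytes, so steps = page size / size and total = steps * bytes. The sketch below checks that arithmetic in user space; `struct demo_ecc_layout` and `demo_ecc_layout_init()` are hypothetical, and the error handling is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of struct nand_ecc_ctrl. */
struct demo_ecc_layout {
    int size;     /* data bytes per ECC step (set by the caller) */
    int bytes;    /* ECC bytes per step (set by the caller) */
    int steps;    /* computed: ECC steps per page */
    int total;    /* computed: ECC bytes per page */
};

/*
 * Derive steps and total from the page size. Returns 0 on success, -1 if the
 * page is not a whole number of ECC steps or the ECC bytes do not fit in the
 * available OOB area.
 */
static int demo_ecc_layout_init(struct demo_ecc_layout *ecc,
                                uint32_t writesize, uint32_t oobavail)
{
    assert(ecc != NULL && ecc->size > 0 && ecc->bytes > 0);
    ecc->steps = (int)(writesize / (uint32_t)ecc->size);
    if ((uint32_t)ecc->steps * (uint32_t)ecc->size != writesize)
        return -1;
    ecc->total = ecc->steps * ecc->bytes;
    if ((uint32_t)ecc->total > oobavail)
        return -1;
    return 0;
}
```

As an example, 3 ECC bytes per 256 data bytes on a 2048-byte page gives 8 steps and 24 ECC bytes per page.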
2.3.7 struct nand_manufacturer
nand_manufacturer holds the manufacturer information and is defined in drivers/mtd/nand/raw/internals.h:
    /*
     * NAND Flash Manufacturer ID Codes
     */
    #define NAND_MFR_AMD        0x01
    #define NAND_MFR_ATO        0x9b
    #define NAND_MFR_EON        0x92
    #define NAND_MFR_ESMT       0xc8
    #define NAND_MFR_FUJITSU    0x04
    #define NAND_MFR_HYNIX      0xad
    #define NAND_MFR_INTEL      0x89
    #define NAND_MFR_MACRONIX   0xc2
    #define NAND_MFR_MICRON     0x2c
    #define NAND_MFR_NATIONAL   0x8f
    #define NAND_MFR_RENESAS    0x07
    #define NAND_MFR_SAMSUNG    0xec    // Samsung
    #define NAND_MFR_SANDISK    0x45
    #define NAND_MFR_STMICRO    0x20
    #define NAND_MFR_TOSHIBA    0x98
    #define NAND_MFR_WINBOND    0xef

    /**
     * struct nand_manufacturer_ops - NAND Manufacturer operations
     * @detect: detect the NAND memory organization and capabilities
     * @init: initialize all vendor specific fields (like the ->read_retry()
     *        implementation) if any.
     * @cleanup: the ->init() function may have allocated resources, ->cleanup()
     *           is here to let vendor specific code release those resources.
     * @fixup_onfi_param_page: apply vendor specific fixups to the ONFI parameter
     *                         page. This is called after the checksum is verified.
     */
    struct nand_manufacturer_ops {
        void (*detect)(struct nand_chip *chip);
        int (*init)(struct nand_chip *chip);
        void (*cleanup)(struct nand_chip *chip);
        void (*fixup_onfi_param_page)(struct nand_chip *chip,
                                      struct nand_onfi_params *p);
    };

    /**
     * struct nand_manufacturer - NAND Flash Manufacturer structure
     * @name: Manufacturer name
     * @id: manufacturer ID code of device.
     * @ops: manufacturer operations
     */
    struct nand_manufacturer {
        int id;                                     // manufacturer ID
        char *name;                                 // manufacturer name
        const struct nand_manufacturer_ops *ops;    // operations
    };
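The first byte returned by a Read ID command is the manufacturer ID, which is then matched against a table of these codes. A minimal user-space sketch of such a lookup follows; the ID values come from the listing above, while `struct demo_manufacturer`, the table and `demo_find_manufacturer()` are hypothetical and not the kernel's own lookup function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced version of struct nand_manufacturer. */
struct demo_manufacturer {
    uint8_t id;
    const char *name;
};

/* A few entries taken from the ID codes listed above. */
static const struct demo_manufacturer demo_manufacturers[] = {
    { 0xec, "Samsung" },
    { 0x98, "Toshiba" },
    { 0x2c, "Micron" },
    { 0xad, "Hynix" },
    { 0xc2, "Macronix" },
    { 0xef, "Winbond" },
};

/* Return the table entry for a manufacturer ID, or NULL if it is unknown. */
static const struct demo_manufacturer *demo_find_manufacturer(uint8_t id)
{
    size_t count = sizeof(demo_manufacturers) / sizeof(demo_manufacturers[0]);

    for (size_t i = 0; i < count; i++) {
        if (demo_manufacturers[i].id == id)
            return &demo_manufacturers[i];
    }
    return NULL;
}
```

A Samsung part such as the K9F2G08U0C would be expected to report 0xEC as its first ID byte.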
2.3.8 struct nand_device
struct nand_device is defined in include/linux/mtd/nand.h:
    /**
     * struct nand_device - NAND device
     * @mtd: MTD instance attached to the NAND device
     * @memorg: memory layout
     * @eccreq: ECC requirements
     * @rowconv: position to row address converter
     * @bbt: bad block table info
     * @ops: NAND operations attached to the NAND device
     *
     * Generic NAND object. Specialized NAND layers (raw NAND, SPI NAND, OneNAND)
     * should declare their own NAND object embedding a nand_device struct (that's
     * how inheritance is done).
     * struct_nand_device->memorg and struct_nand_device->eccreq should be filled
     * at device detection time to reflect the NAND device
     * capabilities/requirements. Once this is done nanddev_init() can be called.
     * It will take care of converting NAND information into MTD ones, which means
     * the specialized NAND layers should never manually tweak
     * struct_nand_device->mtd except for the ->_read/write() hooks.
     */
    struct nand_device {
        struct mtd_info mtd;
        struct nand_memory_organization memorg;
        struct nand_ecc_req eccreq;
        struct nand_row_converter rowconv;
        struct nand_bbt bbt;
        const struct nand_ops *ops;
    };
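2.3.9 Structure relationship diagram
2.4 Core functions
If the MTD device has only one partition, it is registered and unregistered with the following two functions: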
    int add_mtd_device(struct mtd_info *mtd)
    int del_mtd_device (struct mtd_info *mtd)
If the MTD device has further partitions, it is registered and unregistered with the following two functions:
    int add_mtd_partitions(struct mtd_info *master, const struct mtd_partition *parts, int nbparts)
    int del_mtd_partitions(struct mtd_info *master)
3. MTD device registration
3.1 add_mtd_device
add_mtd_device is defined in drivers/mtd/mtdcore.c:
    /**
     * add_mtd_device - register an MTD device
     * @mtd: pointer to new MTD device info structure
     *
     * Add a device to the list of MTD devices present in the system, and
     * notify each currently active MTD 'user' of its arrival. Returns
     * zero on success or non-zero on failure.
     */
    int add_mtd_device(struct mtd_info *mtd)
    {
        struct mtd_notifier *not;
        int i, error;

        /*
         * May occur, for instance, on buggy drivers which call
         * mtd_device_parse_register() multiple times on the same master MTD,
         * especially with CONFIG_MTD_PARTITIONED_MASTER=y.
         */
        if (WARN_ONCE(mtd->dev.type, "MTD already registered\n"))
            return -EEXIST;

        BUG_ON(mtd->writesize == 0);

        /*
         * MTD drivers should implement ->_{write,read}() or
         * ->_{write,read}_oob(), but not both.
         */
        if (WARN_ON((mtd->_write && mtd->_write_oob) ||    // validate the function pointers
                    (mtd->_read && mtd->_read_oob)))
            return -EINVAL;

        if (WARN_ON((!mtd->erasesize || !mtd->_erase) &&
                    !(mtd->flags & MTD_NO_ERASE)))
            return -EINVAL;

        mutex_lock(&mtd_table_mutex);    // take the mutex

        i = idr_alloc(&mtd_idr, mtd, 0, 0, GFP_KERNEL);    // allocate an index for the mtd device
        if (i < 0) {
            error = i;
            goto fail_locked;
        }

        mtd->index = i;
        mtd->usecount = 0;

        /* default value if not set by driver */
        if (mtd->bitflip_threshold == 0)
            mtd->bitflip_threshold = mtd->ecc_strength;

        if (is_power_of_2(mtd->erasesize))    // compute the erase size shift
            mtd->erasesize_shift = ffs(mtd->erasesize) - 1;
        else
            mtd->erasesize_shift = 0;

        if (is_power_of_2(mtd->writesize))    // compute the write size shift
            mtd->writesize_shift = ffs(mtd->writesize) - 1;
        else
            mtd->writesize_shift = 0;

        mtd->erasesize_mask = (1 << mtd->erasesize_shift) - 1;    // compute the erase size mask
        mtd->writesize_mask = (1 << mtd->writesize_shift) - 1;    // compute the write size mask

        /* Some chips always power up locked. Unlock them now */
        if ((mtd->flags & MTD_WRITEABLE) && (mtd->flags & MTD_POWERUP_LOCK)) {    // most flash chips support locking, but drivers rarely use it
            error = mtd_unlock(mtd, 0, mtd->size);
            if (error && error != -EOPNOTSUPP)
                printk(KERN_WARNING
                       "%s: unlock failed, writes may not work\n",
                       mtd->name);
            /* Ignore unlock failures? */
            error = 0;
        }

        /* Caller should have set dev.parent to match the
         * physical device, if appropriate.
         */
        mtd->dev.type = &mtd_devtype;         // set the device type
        mtd->dev.class = &mtd_class;          // set the device class; creates the mtd class under /sys/class
        mtd->dev.devt = MTD_DEVT(i);          // set the device number; the number range is requested in the mtdchar.c module init function
        dev_set_name(&mtd->dev, "mtd%d", i);  // set the device name to mtd%d
        dev_set_drvdata(&mtd->dev, mtd);      // mtd->dev.driver_data = mtd;
        of_node_get(mtd_get_of_node(mtd));
        error = device_register(&mtd->dev);   // register the MTD character device: creates mtd%d under /sys/class/mtd, from which mdev creates the /dev/mtd%d character device node
        if (error)
            goto fail_added;

        /* Add the nvmem provider */
        error = mtd_nvmem_add(mtd);
        if (error)
            goto fail_nvmem_add;

        if (!IS_ERR_OR_NULL(dfs_dir_mtd)) {
            mtd->dbg.dfs_dir = debugfs_create_dir(dev_name(&mtd->dev), dfs_dir_mtd);
            if (IS_ERR_OR_NULL(mtd->dbg.dfs_dir)) {
                pr_debug("mtd device %s won't show data in debugfs\n",
                         dev_name(&mtd->dev));
            }
        }

        device_create(&mtd_class, mtd->dev.parent, MTD_DEVT(i) + 1, NULL,    // create the read-only MTD character device (calls device_register internally): creates mtd%dro under /sys/class/mtd, from which mdev creates the /dev/mtd%dro character device node
                      "mtd%dro", i);

        pr_debug("mtd: Giving out device %d to %s\n", i, mtd->name);
        /* No need to get a refcount on the module containing
           the notifier, since we hold the mtd_table_mutex */
        list_for_each_entry(not, &mtd_notifiers, list)    // MTD notifier mechanism: call the add hook of every registered notifier
            not->add(mtd);

        mutex_unlock(&mtd_table_mutex);    // release the mutex
        /* We _know_ we aren't being removed, because
           our caller is still holding us here. So none
           of this try_ nonsense, and no bitching about it
           either. :) */
        __module_get(THIS_MODULE);
        return 0;

    fail_nvmem_add:
        device_unregister(&mtd->dev);
    fail_added:
        of_node_put(mtd_get_of_node(mtd));
        idr_remove(&mtd_idr, i);
    fail_locked:
        mutex_unlock(&mtd_table_mutex);
        return error;
    }
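The function mainly performs the following operations:
(1) Validate the mandatory fields and function pointers of the raw MTD device;
(2) Allocate a node for the raw MTD device in the mtd_idr tree and return the ID of that node:
i = idr_alloc(&mtd_idr, mtd, 0, 0, GFP_KERNEL); // allocate an ID; mtd_idr is a radix tree, and mtd is associated with the newly allocated ID
idr_alloc adds a node to the mtd_idr tree. The node has an ID that is unique within the tree and is associated with mtd, so the mtd can later be looked up by its ID.
The third and fourth arguments give the start and the end of the allowed ID range. Passing 0 as the end means there is no explicit upper bound, so any ID up to the maximum the IDR supports may be returned.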
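The global variable mtd_idr is defined in drivers/mtd/mtdcore.c:
static DEFINE_IDR(mtd_idr);
The IDR itself is not covered here; its main job is to bind an integer ID to a data structure. For details, see the article "linux內核IDR機制詳解(一)".
The ID is needed later when the character and block devices are registered. For example, the device number of the struct device embedded in the MTD device is set to MTD_DEVT(i):
#define MTD_DEVT(index) MKDEV(MTD_CHAR_MAJOR, (index)*2)
The major number is MTD_CHAR_MAJOR, i.e. 90, and the minor number is index*2;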
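(3) Set the erasesize_shift, writesize_shift, erasesize_mask and writesize_mask fields of the raw MTD device;
(4) For chips that are flagged writeable and that power up locked, call the unlock interface to unlock them (most Flash chips support a lock mechanism, but drivers rarely use it);
(5) Set the class of the struct device embedded in the raw MTD device to mtd_class, and set its device number, type, name and driver_data;
mtd_class is defined as:
static struct class mtd_class = {
    .name = "mtd",
    .owner = THIS_MODULE,
    .pm = MTD_CLS_PM_OPS,
};
(6) Call device_register to register the MTD character device named mtd%d;
(7) Call device_create to create, initialise and register the MTD character device named mtd%dro;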
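(8) Invoke the MTD subsystem's notifier mechanism: the add hook of every registered notifier is called for the newly added device;
list_for_each_entry(not, &mtd_notifiers, list)
    not->add(mtd);
list_for_each_entry is a macro that takes three parameters, in order pos, head and member. It expands to a for loop that uses pos as the loop cursor, starts from the list head head, and advances pos entry by entry (in the next direction) until it arrives back at head.
The mtd_notifiers list is defined as:
static LIST_HEAD(mtd_notifiers);
In other words, the loop walks this list, obtains the current element not (of type struct mtd_notifier *), and calls not->add(mtd). It is inside this add method that the MTD block device named mtdblock%d gets registered.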
3.2 add_mtd_partitions
add_mtd_partitions is defined in drivers/mtd/mtdpart.c:
/*
 * This function, given a master MTD object and a partition table, creates
 * and registers slave MTD objects which are bound to the master according to
 * the partition definitions.
 *
 * For historical reasons, this function's caller only registers the master
 * if the MTD_PARTITIONED_MASTER config option is set.
 */
int add_mtd_partitions(struct mtd_info *master,           // MTD device info
                       const struct mtd_partition *parts, // partition table
                       int nbparts)                       // number of partitions
{
    struct mtd_part *slave;
    uint64_t cur_offset = 0;
    int i, ret;

    printk(KERN_NOTICE "Creating %d MTD partitions on \"%s\":\n", nbparts, master->name);

    for (i = 0; i < nbparts; i++) {                       // walk the partition table
        slave = allocate_partition(master, parts + i, i, cur_offset);    // allocate an mtd_part
        if (IS_ERR(slave)) {
            ret = PTR_ERR(slave);
            goto err_del_partitions;
        }

        mutex_lock(&mtd_partitions_mutex);
        list_add(&slave->list, &mtd_partitions);          // add slave to the mtd_partitions list
        mutex_unlock(&mtd_partitions_mutex);

        ret = add_mtd_device(&slave->mtd);                // register an MTD device for each partition; this eventually yields the /dev/mtdblock%d block device node
        if (ret) {
            mutex_lock(&mtd_partitions_mutex);
            list_del(&slave->list);
            mutex_unlock(&mtd_partitions_mutex);

            free_partition(slave);
            goto err_del_partitions;
        }

        mtd_add_partition_attrs(slave);
        /* Look for subpartitions */
        parse_mtd_partitions(&slave->mtd, parts[i].types, NULL);

        cur_offset = slave->offset + slave->mtd.size;
    }

    return 0;

err_del_partitions:
    del_mtd_partitions(master);

    return ret;
}
3.2.1 allocate_partition
allocate_partition is defined in drivers/mtd/mtdpart.c:
static struct mtd_part *allocate_partition(struct mtd_info *parent,
            const struct mtd_partition *part, int partno,
            uint64_t cur_offset)
{
    int wr_alignment = (parent->flags & MTD_NO_ERASE) ? parent->writesize :
                                                        parent->erasesize;
    struct mtd_part *slave;
    u32 remainder;
    char *name;
    u64 tmp;

    /* allocate the partition structure */
    slave = kzalloc(sizeof(*slave), GFP_KERNEL);
    name = kstrdup(part->name, GFP_KERNEL);
    if (!name || !slave) {
        printk(KERN_ERR"memory allocation error while creating partitions for \"%s\"\n",
               parent->name);
        kfree(name);
        kfree(slave);
        return ERR_PTR(-ENOMEM);
    }

    /* set up the MTD object for this partition */
    slave->mtd.type = parent->type;
    slave->mtd.flags = parent->orig_flags & ~part->mask_flags;
    slave->mtd.orig_flags = slave->mtd.flags;
    slave->mtd.size = part->size;
    slave->mtd.writesize = parent->writesize;
    slave->mtd.writebufsize = parent->writebufsize;
    slave->mtd.oobsize = parent->oobsize;
    slave->mtd.oobavail = parent->oobavail;
    slave->mtd.subpage_sft = parent->subpage_sft;
    slave->mtd.pairing = parent->pairing;

    slave->mtd.name = name;
    slave->mtd.owner = parent->owner;

    /* NOTE: Historically, we didn't arrange MTDs as a tree out of
     * concern for showing the same data in multiple partitions.
     * However, it is very useful to have the master node present,
     * so the MTD_PARTITIONED_MASTER option allows that. The master
     * will have device nodes etc only if this is set, so make the
     * parent conditional on that option. Note, this is a way to
     * distinguish between the master and the partition in sysfs.
     */
    slave->mtd.dev.parent = IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER) || mtd_is_partition(parent) ?
                            &parent->dev :
                            parent->dev.parent;
    slave->mtd.dev.of_node = part->of_node;

    if (parent->_read)
        slave->mtd._read = part_read;
    if (parent->_write)
        slave->mtd._write = part_write;

    if (parent->_panic_write)
        slave->mtd._panic_write = part_panic_write;

    if (parent->_point && parent->_unpoint) {
        slave->mtd._point = part_point;
        slave->mtd._unpoint = part_unpoint;
    }

    if (parent->_read_oob)
        slave->mtd._read_oob = part_read_oob;
    if (parent->_write_oob)
        slave->mtd._write_oob = part_write_oob;
    if (parent->_read_user_prot_reg)
        slave->mtd._read_user_prot_reg = part_read_user_prot_reg;
    if (parent->_read_fact_prot_reg)
        slave->mtd._read_fact_prot_reg = part_read_fact_prot_reg;
    if (parent->_write_user_prot_reg)
        slave->mtd._write_user_prot_reg = part_write_user_prot_reg;
    if (parent->_lock_user_prot_reg)
        slave->mtd._lock_user_prot_reg = part_lock_user_prot_reg;
    if (parent->_get_user_prot_info)
        slave->mtd._get_user_prot_info = part_get_user_prot_info;
    if (parent->_get_fact_prot_info)
        slave->mtd._get_fact_prot_info = part_get_fact_prot_info;
    if (parent->_sync)
        slave->mtd._sync = part_sync;
    if (!partno && !parent->dev.class && parent->_suspend &&
        parent->_resume) {
        slave->mtd._suspend = part_suspend;
        slave->mtd._resume = part_resume;
    }
    if (parent->_writev)
        slave->mtd._writev = part_writev;
    if (parent->_lock)
        slave->mtd._lock = part_lock;
    if (parent->_unlock)
        slave->mtd._unlock = part_unlock;
    if (parent->_is_locked)
        slave->mtd._is_locked = part_is_locked;
    if (parent->_block_isreserved)
        slave->mtd._block_isreserved = part_block_isreserved;
    if (parent->_block_isbad)
        slave->mtd._block_isbad = part_block_isbad;
    if (parent->_block_markbad)
        slave->mtd._block_markbad = part_block_markbad;
    if (parent->_max_bad_blocks)
        slave->mtd._max_bad_blocks = part_max_bad_blocks;

    if (parent->_get_device)
        slave->mtd._get_device = part_get_device;
    if (parent->_put_device)
        slave->mtd._put_device = part_put_device;

    slave->mtd._erase = part_erase;
    slave->parent = parent;
    slave->offset = part->offset;

    if (slave->offset == MTDPART_OFS_APPEND)
        slave->offset = cur_offset;
    if (slave->offset == MTDPART_OFS_NXTBLK) {
        tmp = cur_offset;
        slave->offset = cur_offset;
        remainder = do_div(tmp, wr_alignment);
        if (remainder) {
            slave->offset += wr_alignment - remainder;
            printk(KERN_NOTICE "Moving partition %d: "
                   "0x%012llx -> 0x%012llx\n", partno,
                   (unsigned long long)cur_offset, (unsigned long long)slave->offset);
        }
    }
    if (slave->offset == MTDPART_OFS_RETAIN) {
        slave->offset = cur_offset;
        if (parent->size - slave->offset >= slave->mtd.size) {
            slave->mtd.size = parent->size - slave->offset
                              - slave->mtd.size;
        } else {
            printk(KERN_ERR "mtd partition \"%s\" doesn't have enough space: %#llx < %#llx, disabled\n",
                   part->name, parent->size - slave->offset,
                   slave->mtd.size);
            /* register to preserve ordering */
            goto out_register;
        }
    }
    if (slave->mtd.size == MTDPART_SIZ_FULL)
        slave->mtd.size = parent->size - slave->offset;

    printk(KERN_NOTICE "0x%012llx-0x%012llx : \"%s\"\n", (unsigned long long)slave->offset,
           (unsigned long long)(slave->offset + slave->mtd.size), slave->mtd.name);

    /* let's do some sanity checks */
    if (slave->offset >= parent->size) {
        /* let's register it anyway to preserve ordering */
        slave->offset = 0;
        slave->mtd.size = 0;

        /* Initialize ->erasesize to make add_mtd_device() happy. */
        slave->mtd.erasesize = parent->erasesize;

        printk(KERN_ERR"mtd: partition \"%s\" is out of reach -- disabled\n",
               part->name);
        goto out_register;
    }
    if (slave->offset + slave->mtd.size > parent->size) {
        slave->mtd.size = parent->size - slave->offset;
        printk(KERN_WARNING"mtd: partition \"%s\" extends beyond the end of device \"%s\" -- size truncated to %#llx\n",
               part->name, parent->name, (unsigned long long)slave->mtd.size);
    }
    if (parent->numeraseregions > 1) {
        /* Deal with variable erase size stuff */
        int i, max = parent->numeraseregions;
        u64 end = slave->offset + slave->mtd.size;
        struct mtd_erase_region_info *regions = parent->eraseregions;

        /* Find the first erase regions which is part of this
         * partition. */
        for (i = 0; i < max && regions[i].offset <= slave->offset; i++)
            ;
        /* The loop searched for the region _behind_ the first one */
        if (i > 0)
            i--;

        /* Pick biggest erasesize */
        for (; i < max && regions[i].offset < end; i++) {
            if (slave->mtd.erasesize < regions[i].erasesize) {
                slave->mtd.erasesize = regions[i].erasesize;
            }
        }
        BUG_ON(slave->mtd.erasesize == 0);
    } else {
        /* Single erase size */
        slave->mtd.erasesize = parent->erasesize;
    }

    /*
     * Slave erasesize might differ from the master one if the master
     * exposes several regions with different erasesize. Adjust
     * wr_alignment accordingly.
     */
    if (!(slave->mtd.flags & MTD_NO_ERASE))
        wr_alignment = slave->mtd.erasesize;

    tmp = part_absolute_offset(parent) + slave->offset;
    remainder = do_div(tmp, wr_alignment);
    if ((slave->mtd.flags & MTD_WRITEABLE) && remainder) {
        /* Doesn't start on a boundary of major erase size */
        /* FIXME: Let it be writable if it is on a boundary of
         * _minor_ erase size though */
        slave->mtd.flags &= ~MTD_WRITEABLE;
        printk(KERN_WARNING"mtd: partition \"%s\" doesn't start on an erase/write block boundary -- force read-only\n",
               part->name);
    }

    tmp = part_absolute_offset(parent) + slave->mtd.size;
    remainder = do_div(tmp, wr_alignment);
    if ((slave->mtd.flags & MTD_WRITEABLE) && remainder) {
        slave->mtd.flags &= ~MTD_WRITEABLE;
        printk(KERN_WARNING"mtd: partition \"%s\" doesn't end on an erase/write block -- force read-only\n",
               part->name);
    }

    mtd_set_ooblayout(&slave->mtd, &part_ooblayout_ops);
    slave->mtd.ecc_step_size = parent->ecc_step_size;
    slave->mtd.ecc_strength = parent->ecc_strength;
    slave->mtd.bitflip_threshold = parent->bitflip_threshold;

    if (parent->_block_isbad) {
        uint64_t offs = 0;

        while (offs < slave->mtd.size) {
            if (mtd_block_isreserved(parent, offs + slave->offset))
                slave->mtd.ecc_stats.bbtblocks++;
            else if (mtd_block_isbad(parent, offs + slave->offset))
                slave->mtd.ecc_stats.badblocks++;
            offs += slave->mtd.erasesize;
        }
    }

out_register:
    return slave;
}
3.2.2 mtd_partitions
The mtd_partitions list is defined in drivers/mtd/mtdpart.c:
static LIST_HEAD(mtd_partitions);
3.3 mtd_device_register
The mtd_device_register macro is defined in include/linux/mtd/mtd.h:
#define mtd_device_register(master, parts, nr_parts) \
    mtd_device_parse_register(master, NULL, NULL, parts, nr_parts)
The function mtd_device_parse_register is defined in drivers/mtd/mtdcore.c:
/**
 * mtd_device_parse_register - parse partitions and register an MTD device.
 *
 * @mtd: the MTD device to register
 * @types: the list of MTD partition probes to try, see
 *         'parse_mtd_partitions()' for more information
 * @parser_data: MTD partition parser-specific data
 * @parts: fallback partition information to register, if parsing fails;
 *         only valid if %nr_parts > %0
 * @nr_parts: the number of partitions in parts, if zero then the full
 *            MTD device is registered if no partition info is found
 *
 * This function aggregates MTD partitions parsing (done by
 * 'parse_mtd_partitions()') and MTD device and partitions registering. It
 * basically follows the most common pattern found in many MTD drivers:
 *
 * * If the MTD_PARTITIONED_MASTER option is set, then the device as a whole is
 *   registered first.
 * * Then It tries to probe partitions on MTD device @mtd using parsers
 *   specified in @types (if @types is %NULL, then the default list of parsers
 *   is used, see 'parse_mtd_partitions()' for more information). If none are
 *   found this functions tries to fallback to information specified in
 *   @parts/@nr_parts.
 * * If no partitions were found this function just registers the MTD device
 *   @mtd and exits.
 *
 * Returns zero in case of success and a negative error code in case of failure.
 */
int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types,
                              struct mtd_part_parser_data *parser_data,
                              const struct mtd_partition *parts,    // partition table
                              int nr_parts)                         // number of partitions
{
    int ret;

    mtd_set_dev_defaults(mtd);

    if (IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER)) {    // register the whole Nand Flash with the kernel as a single MTD device
        ret = add_mtd_device(mtd);                      // register the MTD device
        if (ret)
            return ret;
    }

    /* Prefer parsed partitions over driver-provided fallback */
    ret = parse_mtd_partitions(mtd, types, parser_data);
    if (ret > 0)
        ret = 0;
    else if (nr_parts)                                  // register the driver-provided fallback partitions
        ret = add_mtd_partitions(mtd, parts, nr_parts);
    else if (!device_is_registered(&mtd->dev))
        ret = add_mtd_device(mtd);
    else
        ret = 0;

    if (ret)
        goto out;

    /*
     * FIXME: some drivers unfortunately call this function more than once.
     * So we have to check if we've already assigned the reboot notifier.
     *
     * Generally, we can make multiple calls work for most cases, but it
     * does cause problems with parse_mtd_partitions() above (e.g.,
     * cmdlineparts will register partitions more than once).
     */
    WARN_ONCE(mtd->_reboot && mtd->reboot_notifier.notifier_call,
              "MTD already registered\n");
    if (mtd->_reboot && !mtd->reboot_notifier.notifier_call) {
        mtd->reboot_notifier.notifier_call = mtd_reboot_notifier;
        register_reboot_notifier(&mtd->reboot_notifier);
    }

out:
    if (ret && device_is_registered(&mtd->dev))
        del_mtd_device(mtd);                            // unregister the MTD device

    return ret;
}
4. mtdblock.c
We introduced the mtdblock.c file earlier; it implements the interfaces of the MTD block device. Here we go straight to drivers/mtd/mtdblock.c and walk through the source.
4.1 Module entry function
We start at the entry function of the MTD block device module:
static struct mtd_blktrans_ops mtdblock_tr = {    // describes the MTD block device and its operations
    .name       = "mtdblock",
    .major      = MTD_BLOCK_MAJOR,    // major number of the MTD block device, 31
    .part_bits  = 0,                  // number of partition bits of the disk: 0 means no partitions, 1 means 2 partitions, 2 means 4 partitions...
    .blksize    = 512,                // sector size
    .open       = mtdblock_open,
    .flush      = mtdblock_flush,
    .release    = mtdblock_release,
    .readsect   = mtdblock_readsect,
    .writesect  = mtdblock_writesect,
    .add_mtd    = mtdblock_add_mtd,
    .remove_dev = mtdblock_remove_dev,
    .owner      = THIS_MODULE,
};

static int __init init_mtdblock(void)
{
    return register_mtd_blktrans(&mtdblock_tr);
}
4.2 register_mtd_blktrans
Next, the register_mtd_blktrans function, located in drivers/mtd/mtd_blkdevs.c:
int register_mtd_blktrans(struct mtd_blktrans_ops *tr)
{
    struct mtd_info *mtd;
    int ret;

    /* Register the notifier if/when the first device type is
       registered, to prevent the link/init ordering from fucking
       us over. */
    if (!blktrans_notifier.list.next)             // next is still NULL, so this branch is taken
        register_mtd_user(&blktrans_notifier);    // add blktrans_notifier to the mtd_notifiers list

    mutex_lock(&mtd_table_mutex);

    ret = register_blkdev(tr->major, tr->name);   // register the block device; the major number is MTD_BLOCK_MAJOR, defined as 31
    if (ret < 0) {
        printk(KERN_WARNING "Unable to register %s block device on major %d: %d\n",
               tr->name, tr->major, ret);
        mutex_unlock(&mtd_table_mutex);
        return ret;
    }

    if (ret)
        tr->major = ret;

    tr->blkshift = ffs(tr->blksize) - 1;

    INIT_LIST_HEAD(&tr->devs);
    list_add(&tr->list, &blktrans_majors);        // add tr to the blktrans_majors list

    mtd_for_each_device(mtd)
        if (mtd->type != MTD_ABSENT)
            tr->add_mtd(tr, mtd);

    mutex_unlock(&mtd_table_mutex);
    return 0;
}
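The function consists of four steps:
- If blktrans_notifier has not been registered yet, call register_mtd_user, which adds blktrans_notifier to the mtd_notifiers list and then walks the global mtd_idr, calling blktrans_notify_add(mtd) for every mtd found;
- Call register_blkdev to register the block device, with major number 31 and name mtdblock;
- Add mtdblock_tr to the blktrans_majors list, which is defined as static LIST_HEAD(blktrans_majors);
- Walk the global mtd_idr again and call mtdblock_add_mtd(&mtdblock_tr, mtd) for every mtd found.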
4.2.1 mtd_notifier
mtd_notifier is defined in include/linux/mtd/mtd.h:
struct mtd_notifier {
    void (*add)(struct mtd_info *mtd);
    void (*remove)(struct mtd_info *mtd);
    struct list_head list;
};
4.2.2 blktrans_notifier
Here we look at the call register_mtd_user(&blktrans_notifier). The variable blktrans_notifier is defined in drivers/mtd/mtd_blkdevs.c:
static struct mtd_notifier blktrans_notifier = {
    .add = blktrans_notify_add,
    .remove = blktrans_notify_remove,
};
4.2.3 register_mtd_user
The register_mtd_user function adds new->list to the mtd_notifiers list:
/**
 * register_mtd_user - register a 'user' of MTD devices.
 * @new: pointer to notifier info structure
 *
 * Registers a pair of callbacks function to be called upon addition
 * or removal of MTD devices. Causes the 'add' callback to be immediately
 * invoked for each MTD device currently present in the system.
 */
void register_mtd_user (struct mtd_notifier *new)
{
    struct mtd_info *mtd;

    mutex_lock(&mtd_table_mutex);            // take the mutex

    list_add(&new->list, &mtd_notifiers);    // add to the list

    __module_get(THIS_MODULE);

    mtd_for_each_device(mtd)                 // walk mtd_idr to obtain each mtd
        new->add(mtd);                       // ends up calling blktrans_notify_add(mtd)

    mutex_unlock(&mtd_table_mutex);          // release the mutex
}
4.2.4 mtd_for_each_device
The mtd_for_each_device macro is defined in drivers/mtd/mtdcore.h:
#define mtd_for_each_device(mtd)                        \
    for ((mtd) = __mtd_next_device(0);                  \
         (mtd) != NULL;                                 \
         (mtd) = __mtd_next_device(mtd->index + 1))
__mtd_next_device is defined in drivers/mtd/mtdcore.c:
struct mtd_info *__mtd_next_device(int i)
{
    return idr_get_next(&mtd_idr, &i);
}
This simply visits every node in the mtd_idr radix tree and yields the mtd associated with each node.
4.2.5 blktrans_notify_add
Next comes the blktrans_notify_add() function referenced by blktrans_notifier.
static void blktrans_notify_add(struct mtd_info *mtd)
{
    struct mtd_blktrans_ops *tr;

    if (mtd->type == MTD_ABSENT)
        return;

    list_for_each_entry(tr, &blktrans_majors, list)    // walk the blktrans_majors list
        tr->add_mtd(tr, mtd);                          // call add_mtd of the mtd_blktrans_ops structure
}
The entry function of the MTD block device driver adds mtdblock_tr to the blktrans_majors list, so the tr obtained by walking blktrans_majors here is in fact &mtdblock_tr, and the call made is mtdblock_tr.add_mtd(&mtdblock_tr, mtd).
The add_mtd member of mtdblock_tr is the mtdblock_add_mtd function.
4.2.6 mtdblock_add_mtd
static void mtdblock_add_mtd(struct mtd_blktrans_ops *tr, struct mtd_info *mtd)
{
    struct mtdblk_dev *dev = kzalloc(sizeof(*dev), GFP_KERNEL);

    if (!dev)
        return;

    dev->mbd.mtd = mtd;               // set the raw MTD device
    dev->mbd.devnum = mtd->index;     // set the device number used to derive the first minor number
    dev->mbd.size = mtd->size >> 9;   // total number of sectors
    dev->mbd.tr = tr;

    if (!(mtd->flags & MTD_WRITEABLE))
        dev->mbd.readonly = 1;

    if (add_mtd_blktrans_dev(&dev->mbd))
        kfree(dev);
}
The mtdblock_add_mtd function:
- allocates a struct mtdblk_dev variable dev;
- initialises the members of dev;
- calls add_mtd_blktrans_dev(&dev->mbd).
The mtdblk_dev structure describes one MTD block device and contains the raw MTD device. It is defined in drivers/mtd/mtdblock.c:
struct mtdblk_dev {
    struct mtd_blktrans_dev mbd;
    int count;
    struct mutex cache_mutex;
    unsigned char *cache_data;
    unsigned long cache_offset;
    unsigned int cache_size;
    enum { STATE_EMPTY, STATE_CLEAN, STATE_DIRTY } cache_state;
};
The embedded mtd_blktrans_dev structure is defined in include/linux/mtd/blktrans.h:
struct mtd_blktrans_dev {
    struct mtd_blktrans_ops *tr;    // MTD device information and operations
    struct list_head list;
    struct mtd_info *mtd;           // raw MTD device
    struct mutex lock;
    int devnum;                     // used to compute the first minor number (devnum << tr->part_bits, a shift by 0 here); one MTD block device may carry several partitions, and with 2 partitions their minors would be first_minor+1 and first_minor+2, first_minor itself denoting the whole disk
    bool bg_stop;
    unsigned long size;             // number of sectors
    int readonly;
    int open;
    struct kref ref;
    struct gendisk *disk;           // disk device
    struct attribute_group *disk_attributes;
    struct request_queue *rq;       // request queue
    struct list_head rq_list;
    struct blk_mq_tag_set *tag_set; // tag set
    spinlock_t queue_lock;
    void *priv;
    fmode_t file_mode;
};
4.2.7 add_mtd_blktrans_dev
add_mtd_blktrans_dev is defined in drivers/mtd/mtd_blkdevs.c:
int add_mtd_blktrans_dev(struct mtd_blktrans_dev *new)
{
    struct mtd_blktrans_ops *tr = new->tr;
    struct mtd_blktrans_dev *d;
    int last_devnum = -1;
    struct gendisk *gd;
    int ret;

    if (mutex_trylock(&mtd_table_mutex)) {
        mutex_unlock(&mtd_table_mutex);
        BUG();
    }

    mutex_lock(&blktrans_ref_mutex);
    list_for_each_entry(d, &tr->devs, list) {    // tr->devs is a list; walking it yields each mtd_blktrans_dev
        if (new->devnum == -1) {                 // new has no devnum yet: pick a free one, starting from 0 and counting up
            /* Use first free number */
            if (d->devnum != last_devnum+1) {
                /* Found a free devnum. Plug it in here */
                new->devnum = last_devnum+1;             // the new devnum
                list_add_tail(&new->list, &d->list);     // insert new into the list just before d
                goto added;
            }
        } else if (d->devnum == new->devnum) {   // the devnum requested by new is already taken
            /* Required number taken */
            mutex_unlock(&blktrans_ref_mutex);
            return -EBUSY;
        } else if (d->devnum > new->devnum) {
            /* Required number was free */
            list_add_tail(&new->list, &d->list);
            goto added;
        }
        last_devnum = d->devnum;                 // remember the devnum of the entry just visited
    }

    ret = -EBUSY;
    if (new->devnum == -1)
        new->devnum = last_devnum+1;

    /* Check that the device and any partitions will get valid
     * minor numbers and that the disk naming code below can cope
     * with this number. */
    if (new->devnum > (MINORMASK >> tr->part_bits) ||
        (tr->part_bits && new->devnum >= 27 * 26)) {
        mutex_unlock(&blktrans_ref_mutex);
        goto error1;
    }

    list_add_tail(&new->list, &tr->devs);
added:
    mutex_unlock(&blktrans_ref_mutex);

    mutex_init(&new->lock);
    kref_init(&new->ref);
    if (!tr->writesect)
        new->readonly = 1;

    /* Create gendisk */
    ret = -ENOMEM;
    gd = alloc_disk(1 << tr->part_bits);    // allocate a gendisk with 1 << part_bits minors

    if (!gd)
        goto error2;

    new->disk = gd;
    gd->private_data = new;                 // private data
    gd->major = tr->major;                  // set the major number
    gd->first_minor = (new->devnum) << tr->part_bits;    // set the first minor number
    gd->fops = &mtd_block_ops;              // set the block device operations

    if (tr->part_bits)                      // 0 for mtdblock
        if (new->devnum < 26)
            snprintf(gd->disk_name, sizeof(gd->disk_name),
                     "%s%c", tr->name, 'a' + new->devnum);
        else
            snprintf(gd->disk_name, sizeof(gd->disk_name),
                     "%s%c%c", tr->name,
                     'a' - 1 + new->devnum / 26,
                     'a' + new->devnum % 26);
    else                                    // set the disk name, i.e. /dev/mtdblock%d
        snprintf(gd->disk_name, sizeof(gd->disk_name),
                 "%s%d", tr->name, new->devnum);

    set_capacity(gd, ((u64)new->size * tr->blksize) >> 9);    // set the capacity, in 512-byte sectors

    /* Create the request queue */
    spin_lock_init(&new->queue_lock);
    INIT_LIST_HEAD(&new->rq_list);

    new->tag_set = kzalloc(sizeof(*new->tag_set), GFP_KERNEL);
    if (!new->tag_set)
        goto error3;

    new->rq = blk_mq_init_sq_queue(new->tag_set, &mtd_mq_ops, 2,
                BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING);    // set up the request queue, with mtd_mq_ops as the callbacks that define the block driver's behaviour
    if (IS_ERR(new->rq)) {
        ret = PTR_ERR(new->rq);
        new->rq = NULL;
        goto error4;
    }

    if (tr->flush)
        blk_queue_write_cache(new->rq, true, false);

    new->rq->queuedata = new;
    blk_queue_logical_block_size(new->rq, tr->blksize);

    blk_queue_flag_set(QUEUE_FLAG_NONROT, new->rq);
    blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, new->rq);

    if (tr->discard) {
        blk_queue_flag_set(QUEUE_FLAG_DISCARD, new->rq);
        blk_queue_max_discard_sectors(new->rq, UINT_MAX);
    }

    gd->queue = new->rq;                    // attach the request queue

    if (new->readonly)
        set_disk_ro(gd, 1);

    device_add_disk(&new->mtd->dev, gd, NULL);    // register the gendisk with the kernel

    if (new->disk_attributes) {
        ret = sysfs_create_group(&disk_to_dev(gd)->kobj,
                                 new->disk_attributes);
        WARN_ON(ret);
    }
    return 0;
error4:
    kfree(new->tag_set);
error3:
    put_disk(new->disk);
error2:
    list_del(&new->list);
error1:
    return ret;
}
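This function shows that however many MTD block devices are registered, they all share major number 31 and differ only in their minor numbers. The major number identifies a particular driver, and the minor number identifies the individual devices handled by that driver.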
4.2.8 mtd_block_ops
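Here we look at the MTD block device operations mtd_block_ops, defined in drivers/mtd/mtd_blkdevs.c.
static const struct block_device_operations mtd_block_ops = {
    .owner   = THIS_MODULE,
    .open    = blktrans_open,
    .release = blktrans_release,
    .ioctl   = blktrans_ioctl,
    .getgeo  = blktrans_getgeo,
};
The meaning of the function pointers used here:
- open: called when an MTD block device is opened;
- release: called when an MTD block device is closed;
- getgeo: retrieves the geometry of the drive; the result is stored in an hd_geometry structure;
- ioctl: called to perform special operations on an MTD block device;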
4.2.9 blktrans_open
static int blktrans_open(struct block_device *bdev, fmode_t mode)
{
    struct mtd_blktrans_dev *dev = blktrans_dev_get(bdev->bd_disk);
    int ret = 0;

    if (!dev)
        return -ERESTARTSYS; /* FIXME: busy loop! -arnd*/

    mutex_lock(&mtd_table_mutex);
    mutex_lock(&dev->lock);

    if (dev->open)
        goto unlock;

    kref_get(&dev->ref);
    __module_get(dev->tr->owner);

    if (!dev->mtd)
        goto unlock;

    if (dev->tr->open) {
        ret = dev->tr->open(dev);    // this calls the open function of mtd_blktrans_ops
        if (ret)
            goto error_put;
    }

    ret = __get_mtd_device(dev->mtd);
    if (ret)
        goto error_release;
    dev->file_mode = mode;

unlock:
    dev->open++;
    mutex_unlock(&dev->lock);
    mutex_unlock(&mtd_table_mutex);
    blktrans_dev_put(dev);
    return ret;

error_release:
    if (dev->tr->release)
        dev->tr->release(dev);
error_put:
    module_put(dev->tr->owner);
    kref_put(&dev->ref, blktrans_dev_release);
    mutex_unlock(&dev->lock);
    mutex_unlock(&mtd_table_mutex);
    blktrans_dev_put(dev);
    return ret;
}
4.2.10 blktrans_ioctl
static int blktrans_ioctl(struct block_device *bdev, fmode_t mode,
                          unsigned int cmd, unsigned long arg)
{
    struct mtd_blktrans_dev *dev = blktrans_dev_get(bdev->bd_disk);
    int ret = -ENXIO;

    if (!dev)
        return ret;

    mutex_lock(&dev->lock);

    if (!dev->mtd)
        goto unlock;

    switch (cmd) {
    case BLKFLSBUF:
        ret = dev->tr->flush ? dev->tr->flush(dev) : 0;
        break;
    default:
        ret = -ENOTTY;
    }
unlock:
    mutex_unlock(&dev->lock);
    blktrans_dev_put(dev);
    return ret;
}
4.2.11 mtd_mq_ops
Here we look at the blk-mq operations of the MTD block device driver, defined in drivers/mtd/mtd_blkdevs.c.
static const struct blk_mq_ops mtd_mq_ops = {
    .queue_rq = mtd_queue_rq,
};
As the previous section showed, queue_rq is called when a request is dispatched to the block device driver. Its job is to move data between the disk and memory, for example writing data from memory to the disk or reading data from the disk into memory.
static blk_status_t mtd_queue_rq(struct blk_mq_hw_ctx *hctx,
                                 const struct blk_mq_queue_data *bd)
{
    struct mtd_blktrans_dev *dev;

    dev = hctx->queue->queuedata;
    if (!dev) {
        blk_mq_start_request(bd->rq);
        return BLK_STS_IOERR;
    }

    spin_lock_irq(&dev->queue_lock);
    list_add_tail(&bd->rq->queuelist, &dev->rq_list);
    mtd_blktrans_work(dev);    // not examined in detail here: reads end up calling mtdblock_tr.readsect and writes end up calling mtdblock_tr.writesect
    spin_unlock_irq(&dev->queue_lock);

    return BLK_STS_OK;
}
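4.3 MTD block device flow chart
The figure at this point shows the execution flow of the register_mtd_blktrans function.
The entry function of the MTD block device:
- adds blktrans_notifier to the mtd_notifiers list;
- at this point the mtd_idr tree contains only its root node, i.e. no MTD device has been registered yet, so the first loop in the figure above is never entered and the code inside it does not run;
- then registers the block device major number, which is 31, under the name mtdblock;
- then reaches the second loop further down, which for the same reason is not entered either.
Later, in the add_mtd_device(mtd) function:
- a node is allocated for the raw MTD device;
- the erasesize_shift, writesize_shift, erasesize_mask and writesize_mask fields of the raw MTD device are set;
- the class of the struct device embedded in the raw MTD device is set to mtd_class, along with its device number, type, name and driver_data; device_register is called to register the MTD character device named mtd%d;
- device_create is called to create, initialise and register the MTD character device named mtd%dro;
- the mtd_notifiers list is walked, and when blktrans_notifier is found, blktrans_notifier.add(mtd) is called, which:
  - allocates a gendisk structure and sets its members:
    - private_data;
    - the major number major (MTD_BLOCK_MAJOR, value 31);
    - the first minor number first_minor (which increases as more MTD devices are registered);
    - the disk name disk_name, set to mtdblock%d, for which a file is created under /dev;
    - the block device operations fops;
  - initialises the request queue;
  - finally registers the gendisk.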
For example, after the development board has booted and the Nand Flash driver has been loaded, the following can be observed:
[root@zy:/]# ls /sys/class/mtd/ -l
total 0
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd0 -> ../../devices/virtual/mtd/mtd0
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd0ro -> ../../devices/virtual/mtd/mtd0ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd1 -> ../../devices/virtual/mtd/mtd1
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd1ro -> ../../devices/virtual/mtd/mtd1ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd2 -> ../../devices/virtual/mtd/mtd2
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd2ro -> ../../devices/virtual/mtd/mtd2ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd3 -> ../../devices/virtual/mtd/mtd3
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd3ro -> ../../devices/virtual/mtd/mtd3ro
[root@zy:/]# ls -l /dev/mtd*
crw-rw---- 1 0 0 90, 0 Jan 1 00:00 /dev/mtd0
crw-rw---- 1 0 0 90, 1 Jan 1 00:00 /dev/mtd0ro
crw-rw---- 1 0 0 90, 2 Jan 1 00:00 /dev/mtd1
crw-rw---- 1 0 0 90, 3 Jan 1 00:00 /dev/mtd1ro
crw-rw---- 1 0 0 90, 4 Jan 1 00:00 /dev/mtd2
crw-rw---- 1 0 0 90, 5 Jan 1 00:00 /dev/mtd2ro
crw-rw---- 1 0 0 90, 6 Jan 1 00:00 /dev/mtd3
crw-rw---- 1 0 0 90, 7 Jan 1 00:00 /dev/mtd3ro
brw-rw---- 1 0 0 31, 0 Jan 1 00:00 /dev/mtdblock0
brw-rw---- 1 0 0 31, 1 Jan 1 00:00 /dev/mtdblock1
brw-rw---- 1 0 0 31, 2 Jan 1 00:00 /dev/mtdblock2
brw-rw---- 1 0 0 31, 3 Jan 1 00:00 /dev/mtdblock3
5. mtdchar.c
We introduced the mtdchar.c file earlier; it implements the interfaces of the MTD character device. Here we go straight to drivers/mtd/mtdchar.c and walk through the source.
5.1 Module entry function
static const struct file_operations mtd_fops = {    // character device operations
    .owner          = THIS_MODULE,
    .llseek         = mtdchar_lseek,
    .read           = mtdchar_read,
    .write          = mtdchar_write,
    .unlocked_ioctl = mtdchar_unlocked_ioctl,
#ifdef CONFIG_COMPAT
    .compat_ioctl   = mtdchar_compat_ioctl,
#endif
    .open           = mtdchar_open,
    .release        = mtdchar_close,
    .mmap           = mtdchar_mmap,
#ifndef CONFIG_MMU
    .get_unmapped_area = mtdchar_get_unmapped_area,
    .mmap_capabilities = mtdchar_mmap_capabilities,
#endif
};

int __init init_mtdchar(void)
{
    int ret;

    ret = __register_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS,    // MTD character device major number 90, MINORBITS=20
                            "mtd", &mtd_fops);                    // the range is registered under the name "mtd"
    if (ret < 0) {
        pr_err("Can't allocate major number %d for MTD\n",
               MTD_CHAR_MAJOR);
        return ret;
    }

    return ret;
}
5.2 __register_chrdev
Next, the __register_chrdev function, located in fs/char_dev.c:
/**
 * __register_chrdev() - create and register a cdev occupying a range of minors
 * @major: major device number or 0 for dynamic allocation
 * @baseminor: first of the requested range of minor numbers
 * @count: the number of minor numbers required
 * @name: name of this range of devices
 * @fops: file operations associated with this devices
 *
 * If @major == 0 this functions will dynamically allocate a major and return
 * its number.
 *
 * If @major > 0 this function will attempt to reserve a device with the given
 * major number and will return zero on success.
 *
 * Returns a -ve errno on failure.
 *
 * The name of this device has nothing to do with the name of the device in
 * /dev. It only helps to keep track of the different owners of devices. If
 * your module name has only one type of devices it's ok to use e.g. the name
 * of the module here.
 */
int __register_chrdev(unsigned int major, unsigned int baseminor,
                      unsigned int count, const char *name,
                      const struct file_operations *fops)
{
    struct char_device_struct *cd;
    struct cdev *cdev;
    int err = -ENOMEM;

    cd = __register_chrdev_region(major, baseminor, count, name);    // reserve a range of character device numbers
    if (IS_ERR(cd))
        return PTR_ERR(cd);

    cdev = cdev_alloc();                 // allocate a character device dynamically
    if (!cdev)
        goto out2;

    cdev->owner = fops->owner;           // initialise the character device
    cdev->ops = fops;
    kobject_set_name(&cdev->kobj, "%s", name);

    err = cdev_add(cdev, MKDEV(cd->major, baseminor), count);    // register the character device with the system
    if (err)
        goto out;

    cd->cdev = cdev;

    return major ? 0 : cd->major;
out:
    kobject_put(&cdev->kobj);
out2:
    kfree(__unregister_chrdev_region(cd->major, baseminor, count));
    return err;
}
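So the module entry function mainly does the following:
- reserves the character device numbers, with major number 90 and 1<<20 minor numbers;
- dynamically allocates the character device;
- registers the character device;
It does not, however, create anything under the device class. That part is handled in add_mtd_device:
- device_register and device_create are called there to create the device entries under the mtd class in /sys/class (the class itself is the mtd_class structure shown earlier), which the mdev program scans in order to create the nodes under /dev;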