Author: Liu Haoyu (劉昊昱)
Blog: http://blog.csdn.net/liuhaoyutz
Kernel version: 3.10.1
1. Definition of the kset structure
The kset structure is defined in include/linux/kobject.h as follows:
142 /**
143  * struct kset - a set of kobjects of a specific type, belonging to a specific subsystem.
144  *
145  * A kset defines a group of kobjects. They can be individually
146  * different "types" but overall these kobjects all want to be grouped
147  * together and operated on in the same manner. ksets are used to
148  * define the attribute callbacks and other common events that happen to
149  * a kobject.
150  *
151  * @list: the list of all kobjects for this kset
152  * @list_lock: a lock for iterating over the kobjects
153  * @kobj: the embedded kobject for this kset (recursion, isn't it fun...)
154  * @uevent_ops: the set of uevent operations for this kset. These are
155  * called whenever a kobject has something happen to it so that the kset
156  * can add new environment variables, or filter out the uevents if so
157  * desired.
158  */
159 struct kset {
160     struct list_head list;
161     spinlock_t list_lock;
162     struct kobject kobj;
163     const struct kset_uevent_ops *uevent_ops;
164 };
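As the comment says, a kset is a collection of kobjects, and those kobjects may be of different "types". The members of kset are:
list links all kobjects that belong to this kset into a list.
list_lock is a spinlock that is taken while iterating over the kobjects of this kset.
kobj is the kobject that represents the kset itself.
uevent_ops points to a set of function pointers. When the state of a kobject in the kset changes and userspace has to be notified, the notification goes through these functions. uevent_ops is a pointer to struct kset_uevent_ops, which is also defined in include/linux/kobject.h: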
123 struct kset_uevent_ops {
124     int (* const filter)(struct kset *kset, struct kobject *kobj);
125     const char *(* const name)(struct kset *kset, struct kobject *kobj);
126     int (* const uevent)(struct kset *kset, struct kobject *kobj,
127                   struct kobj_uevent_env *env);
128 };
We will look at what the member functions of kset_uevent_ops do later on.
2. Creating and registering a kset
A kset is created and registered with the kset_create_and_add function, which is defined in lib/kobject.c:
827 /**
828  * kset_create_and_add - create a struct kset dynamically and add it to sysfs
829  *
830  * @name: the name for the kset
831  * @uevent_ops: a struct kset_uevent_ops for the kset
832  * @parent_kobj: the parent kobject of this kset, if any.
833  *
834  * This function creates a kset structure dynamically and registers it
835  * with sysfs. When you are finished with this structure, call
836  * kset_unregister() and the structure will be dynamically freed when it
837  * is no longer being used.
838  *
839  * If the kset was not able to be created, NULL will be returned.
840  */
841 struct kset *kset_create_and_add(const char *name,
842                  const struct kset_uevent_ops *uevent_ops,
843                  struct kobject *parent_kobj)
844 {
845     struct kset *kset;
846     int error;
847
848     kset = kset_create(name, uevent_ops, parent_kobj);
849     if (!kset)
850         return NULL;
851     error = kset_register(kset);
852     if (error) {
853         kfree(kset);
854         return NULL;
855     }
856     return kset;
857 }
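Line 828: as the comment says, kset_create_and_add dynamically creates a kset structure and registers it with the sysfs filesystem. Note its three parameters:
name is the name of the kset; it is stored in kset.kobj.name.
uevent_ops is a pointer to a struct kset_uevent_ops; it is stored in kset.uevent_ops.
parent_kobj is the parent kobject of the kset; it is stored in kset.kobj.parent.
Line 848 calls kset_create to allocate the kset structure dynamically and initialize it. That function is defined in lib/kobject.c: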
783 /**
784  * kset_create - create a struct kset dynamically
785  *
786  * @name: the name for the kset
787  * @uevent_ops: a struct kset_uevent_ops for the kset
788  * @parent_kobj: the parent kobject of this kset, if any.
789  *
790  * This function creates a kset structure dynamically. This structure can
791  * then be registered with the system and show up in sysfs with a call to
792  * kset_register(). When you are finished with this structure, if
793  * kset_register() has been called, call kset_unregister() and the
794  * structure will be dynamically freed when it is no longer being used.
795  *
796  * If the kset was not able to be created, NULL will be returned.
797  */
798 static struct kset *kset_create(const char *name,
799                 const struct kset_uevent_ops *uevent_ops,
800                 struct kobject *parent_kobj)
801 {
802     struct kset *kset;
803     int retval;
804
805     kset = kzalloc(sizeof(*kset), GFP_KERNEL);
806     if (!kset)
807         return NULL;
808     retval = kobject_set_name(&kset->kobj, name);
809     if (retval) {
810         kfree(kset);
811         return NULL;
812     }
813     kset->uevent_ops = uevent_ops;
814     kset->kobj.parent = parent_kobj;
815
816     /*
817      * The kobject of this kset will have a type of kset_ktype and belong to
818      * no kset itself. That way we can properly free it when it is
819      * finished being used.
820      */
821     kset->kobj.ktype = &kset_ktype;
822     kset->kobj.kset = NULL;
823
824     return kset;
825 }
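Line 805 allocates zero-filled memory for the kset structure.
Line 808 sets kset.kobj.name from the name parameter. This is the name of the kset's directory in the sysfs filesystem.
Line 813 assigns uevent_ops to kset->uevent_ops.
Line 814 assigns parent_kobj to kset->kobj.parent.
Lines 816-822: as the comment explains, kset.kobj.ktype is set to &kset_ktype and kset.kobj.kset is set to NULL, i.e. the kset does not itself belong to any other kset. This guarantees that the kset can be freed correctly once it is no longer used. kset_ktype is defined in lib/kobject.c: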
778 static struct kobj_type kset_ktype = {
779     .sysfs_ops = &kobj_sysfs_ops,
780     .release = kset_release,
781 };
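To make the role of the .release hook concrete, here is a minimal userspace sketch of the idea, not the kernel code itself. It assumes a plain integer reference count where the kernel uses a struct kref, and every demo_* name is hypothetical. When the last reference is dropped, the release callback of the ktype recovers the enclosing kset with container_of and frees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; simplified for illustration. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct kobject;

struct kobj_type {
    void (*release)(struct kobject *kobj);
};

struct kobject {
    int refcount;                   /* assumption: the kernel uses a struct kref */
    const struct kobj_type *ktype;
};

struct kset {
    int placeholder;                /* stands in for list, list_lock, ... */
    struct kobject kobj;            /* embedded kobject, not at offset 0 here */
};

static int demo_release_calls;      /* hypothetical counter, for the demo only */

/* Plays the role of a release hook: recover the kset and free it. */
static void demo_kset_release(struct kobject *kobj)
{
    struct kset *kset = container_of(kobj, struct kset, kobj);

    free(kset);
    demo_release_calls++;
}

static const struct kobj_type demo_kset_ktype = {
    .release = demo_kset_release,
};

/* Allocate a kset that starts with one reference held by the caller. */
struct kset *demo_kset_create(void)
{
    struct kset *kset = calloc(1, sizeof(*kset));

    if (!kset)
        return NULL;
    kset->kobj.refcount = 1;
    kset->kobj.ktype = &demo_kset_ktype;
    return kset;
}

void demo_kobject_get(struct kobject *kobj)
{
    kobj->refcount++;
}

/* Drop one reference; the last put triggers ktype->release. */
void demo_kobject_put(struct kobject *kobj)
{
    assert(kobj->refcount > 0);
    if (--kobj->refcount == 0 && kobj->ktype && kobj->ktype->release)
        kobj->ktype->release(kobj);
}

int demo_release_count(void)
{
    return demo_release_calls;
}
```

With one extra demo_kobject_get, the first demo_kobject_put leaves the kset alive and only the second one frees it, which is why the owner of a kset never frees it directly once it is registered.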
kobj_sysfs_ops is defined in lib/kobject.c:
708 const struct sysfs_ops kobj_sysfs_ops = {
709     .show = kobj_attr_show,
710     .store = kobj_attr_store,
711 };
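Combined with the analysis of kobject in the previous article, we can draw the following conclusions:
When a userspace program reads an attribute file of the kset in the sysfs filesystem, kobj_attr_show is called.
When a userspace program writes an attribute file of the kset in the sysfs filesystem, kobj_attr_store is called.
Let us look at kobj_attr_show first. It is defined in lib/kobject.c: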
683 /* default kobject attribute operations */
684 static ssize_t kobj_attr_show(struct kobject *kobj, struct attribute *attr,
685                   char *buf)
686 {
687     struct kobj_attribute *kattr;
688     ssize_t ret = -EIO;
689
690     kattr = container_of(attr, struct kobj_attribute, attr);
691     if (kattr->show)
692         ret = kattr->show(kobj, kattr, buf);
693     return ret;
694 }
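Note the comment on line 683: these are the default kobject attribute operations. The function uses container_of to obtain kattr, the struct kobj_attribute that contains attr, and then calls kattr->show(). If no show function is set, it returns -EIO.
kobj_attr_store is analogous to kobj_attr_show and is also defined in lib/kobject.c: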
696 static ssize_t kobj_attr_store(struct kobject *kobj, struct attribute *attr,
697                    const char *buf, size_t count)
698 {
699     struct kobj_attribute *kattr;
700     ssize_t ret = -EIO;
701
702     kattr = container_of(attr, struct kobj_attribute, attr);
703     if (kattr->store)
704         ret = kattr->store(kobj, kattr, buf, count);
705     return ret;
706 }
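This function likewise uses container_of to obtain kattr, the struct kobj_attribute that contains attr, and then calls kattr->store().
So when a userspace program reads or writes an attribute file of the kset in the sysfs filesystem, the call is forwarded to the show/store function of the kobj_attribute structure that contains the corresponding attribute. In practice this is used together with the __ATTR macro, which we will analyze later.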
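The container_of dispatch described above can be tried out in userspace. The following is a minimal sketch under stated assumptions: the structures are cut down to the fields needed here, and demo_show, demo_attr and model_attr_show are hypothetical names that only mirror the shape of kobj_attr_show.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Userspace stand-ins for the kernel types; heavily simplified. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct kobject {
    const char *name;
};

struct attribute {
    const char *name;
};

struct kobj_attribute {
    struct attribute attr;
    ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr,
                    char *buf);
};

/* Hypothetical per-attribute handler: prints "<kobject>/<attribute>". */
static ssize_t demo_show(struct kobject *kobj, struct kobj_attribute *attr,
                         char *buf)
{
    return sprintf(buf, "%s/%s\n", kobj->name, attr->attr.name);
}

struct kobj_attribute demo_attr = {
    .attr = { .name = "demo" },
    .show = demo_show,
};

/*
 * Same shape as kobj_attr_show: the generic layer only knows the embedded
 * struct attribute, so it steps back to the enclosing kobj_attribute and
 * calls its show function, or returns -EIO when none is set.
 */
ssize_t model_attr_show(struct kobject *kobj, struct attribute *attr,
                        char *buf)
{
    struct kobj_attribute *kattr;
    ssize_t ret = -EIO;

    kattr = container_of(attr, struct kobj_attribute, attr);
    if (kattr->show)
        ret = kattr->show(kobj, kattr, buf);
    return ret;
}
```

Calling model_attr_show with &demo_attr.attr reaches demo_show even though the caller only passed the embedded struct attribute, which is exactly the indirection the generic show/store operations rely on.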
That completes the analysis of kset_create. Back in kset_create_and_add:
Line 851 calls kset_register(kset) to register the kset. That function is defined in lib/kobject.c:
713 /**
714  * kset_register - initialize and add a kset.
715  * @k: kset.
716  */
717 int kset_register(struct kset *k)
718 {
719     int err;
720
721     if (!k)
722         return -EINVAL;
723
724     kset_init(k);
725     err = kobject_add_internal(&k->kobj);
726     if (err)
727         return err;
728     kobject_uevent(&k->kobj, KOBJ_ADD);
729     return 0;
730 }
Line 724 first initializes the kset. This is done by kset_init, which is defined in lib/kobject.c:
672 /**
673  * kset_init - initialize a kset for use
674  * @k: kset
675  */
676 void kset_init(struct kset *k)
677 {
678     kobject_init_internal(&k->kobj);
679     INIT_LIST_HEAD(&k->list);
680     spin_lock_init(&k->list_lock);
681 }
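As you can see, it simply initializes kset.kobj, kset.list and kset.list_lock.
Line 725 adds kset.kobj to the kobject hierarchy and to the sysfs filesystem.
Line 728 calls kobject_uevent(&k->kobj, KOBJ_ADD) to notify userspace that a kobject, namely kset.kobj, has been added. kobject_uevent is defined in lib/kobject_uevent.c: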
322 /**
323  * kobject_uevent - notify userspace by sending an uevent
324  *
325  * @action: action that is happening
326  * @kobj: struct kobject that the action is happening to
327  *
328  * Returns 0 if kobject_uevent() is completed with success or the
329  * corresponding error when it fails.
330  */
331 int kobject_uevent(struct kobject *kobj, enum kobject_action action)
332 {
333     return kobject_uevent_env(kobj, action, NULL);
334 }
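As the comment says, kobject_uevent notifies userspace, by sending a uevent, that something has happened in the kernel. What exactly happened is given by the second parameter, action, which is of type enum kobject_action. That type is defined in include/linux/kobject.h: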
40 /*
41  * The actions here must match the index to the string array
42  * in lib/kobject_uevent.c
43  *
44  * Do not add new actions here without checking with the driver-core
45  * maintainers. Action strings are not meant to express subsystem
46  * or device specific properties. In most cases you want to send a
47  * kobject_uevent_env(kobj, KOBJ_CHANGE, env) with additional event
48  * specific variables added to the event environment.
49  */
50 enum kobject_action {
51     KOBJ_ADD,
52     KOBJ_REMOVE,
53     KOBJ_CHANGE,
54     KOBJ_MOVE,
55     KOBJ_ONLINE,
56     KOBJ_OFFLINE,
57     KOBJ_MAX
58 };
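So six kinds of events can be reported to userspace. The seventh enumerator, KOBJ_MAX, is not an event; it only marks the end of the list.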
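The comment on lines 41-42 says that the enumerators must match the indices of a string array in lib/kobject_uevent.c. A small userspace sketch of that pairing is shown below. The function names action_to_string and action_from_string are hypothetical, while the six strings are the ones that end up in the ACTION= variable of a uevent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kobject_action {
    KOBJ_ADD,
    KOBJ_REMOVE,
    KOBJ_CHANGE,
    KOBJ_MOVE,
    KOBJ_ONLINE,
    KOBJ_OFFLINE,
    KOBJ_MAX
};

/* One string per action; designated initializers keep the two in step. */
static const char *const action_strings[KOBJ_MAX] = {
    [KOBJ_ADD]     = "add",
    [KOBJ_REMOVE]  = "remove",
    [KOBJ_CHANGE]  = "change",
    [KOBJ_MOVE]    = "move",
    [KOBJ_ONLINE]  = "online",
    [KOBJ_OFFLINE] = "offline",
};

/* Map an action to its string, or NULL for anything out of range. */
const char *action_to_string(enum kobject_action action)
{
    if ((unsigned int)action >= KOBJ_MAX)
        return NULL;
    return action_strings[action];
}

/* Reverse lookup; returns 0 and fills *out on success, -1 otherwise. */
int action_from_string(const char *s, enum kobject_action *out)
{
    int i;

    for (i = 0; i < KOBJ_MAX; i++) {
        if (strcmp(s, action_strings[i]) == 0) {
            *out = (enum kobject_action)i;
            return 0;
        }
    }
    return -1;
}
```

Sizing the array with KOBJ_MAX is what makes the sentinel useful: adding an enumerator without adding a string leaves a NULL slot instead of reading past the end of the array.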
Back in kobject_uevent: line 333 calls kobject_uevent_env to send the uevent. That function is defined in lib/kobject_uevent.c:
121 /**
122  * kobject_uevent_env - send an uevent with environmental data
123  *
124  * @action: action that is happening
125  * @kobj: struct kobject that the action is happening to
126  * @envp_ext: pointer to environmental data
127  *
128  * Returns 0 if kobject_uevent_env() is completed with success or the
129  * corresponding error when it fails.
130  */
131 int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
132                char *envp_ext[])
133 {
134     struct kobj_uevent_env *env;
135     const char *action_string = kobject_actions[action];
136     const char *devpath = NULL;
137     const char *subsystem;
138     struct kobject *top_kobj;
139     struct kset *kset;
140     const struct kset_uevent_ops *uevent_ops;
141     int i = 0;
142     int retval = 0;
143 #ifdef CONFIG_NET
144     struct uevent_sock *ue_sk;
145 #endif
146
147     pr_debug("kobject: '%s' (%p): %s\n",
148          kobject_name(kobj), kobj, __func__);
149
150     /* search the kset we belong to */
151     top_kobj = kobj;
152     while (!top_kobj->kset && top_kobj->parent)
153         top_kobj = top_kobj->parent;
154
155     if (!top_kobj->kset) {
156         pr_debug("kobject: '%s' (%p): %s: attempted to send uevent "
157              "without kset!\n", kobject_name(kobj), kobj,
158              __func__);
159         return -EINVAL;
160     }
161
162     kset = top_kobj->kset;
163     uevent_ops = kset->uevent_ops;
164
165     /* skip the event, if uevent_suppress is set*/
166     if (kobj->uevent_suppress) {
167         pr_debug("kobject: '%s' (%p): %s: uevent_suppress "
168              "caused the event to drop!\n",
169              kobject_name(kobj), kobj, __func__);
170         return 0;
171     }
172     /* skip the event, if the filter returns zero. */
173     if (uevent_ops && uevent_ops->filter)
174         if (!uevent_ops->filter(kset, kobj)) {
175             pr_debug("kobject: '%s' (%p): %s: filter function "
176                  "caused the event to drop!\n",
177                  kobject_name(kobj), kobj, __func__);
178             return 0;
179         }
180
181     /* originating subsystem */
182     if (uevent_ops && uevent_ops->name)
183         subsystem = uevent_ops->name(kset, kobj);
184     else
185         subsystem = kobject_name(&kset->kobj);
186     if (!subsystem) {
187         pr_debug("kobject: '%s' (%p): %s: unset subsystem caused the "
188              "event to drop!\n", kobject_name(kobj), kobj,
189              __func__);
190         return 0;
191     }
192
193     /* environment buffer */
194     env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);
195     if (!env)
196         return -ENOMEM;
197
198     /* complete object path */
199     devpath = kobject_get_path(kobj, GFP_KERNEL);
200     if (!devpath) {
201         retval = -ENOENT;
202         goto exit;
203     }
204
205     /* default keys */
206     retval = add_uevent_var(env, "ACTION=%s", action_string);
207     if (retval)
208         goto exit;
209     retval = add_uevent_var(env, "DEVPATH=%s", devpath);
210     if (retval)
211         goto exit;
212     retval = add_uevent_var(env, "SUBSYSTEM=%s", subsystem);
213     if (retval)
214         goto exit;
215
216     /* keys passed in from the caller */
217     if (envp_ext) {
218         for (i = 0; envp_ext[i]; i++) {
219             retval = add_uevent_var(env, "%s", envp_ext[i]);
220             if (retval)
221                 goto exit;
222         }
223     }
224
225     /* let the kset specific function add its stuff */
226     if (uevent_ops && uevent_ops->uevent) {
227         retval = uevent_ops->uevent(kset, kobj, env);
228         if (retval) {
229             pr_debug("kobject: '%s' (%p): %s: uevent() returned "
230                  "%d\n", kobject_name(kobj), kobj,
231                  __func__, retval);
232             goto exit;
233         }
234     }
235
236     /*
237      * Mark "add" and "remove" events in the object to ensure proper
238      * events to userspace during automatic cleanup. If the object did
239      * send an "add" event, "remove" will automatically generated by
240      * the core, if not already done by the caller.
241      */
242     if (action == KOBJ_ADD)
243         kobj->state_add_uevent_sent = 1;
244     else if (action == KOBJ_REMOVE)
245         kobj->state_remove_uevent_sent = 1;
246
247     mutex_lock(&uevent_sock_mutex);
248     /* we will send an event, so request a new sequence number */
249     retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum);
250     if (retval) {
251         mutex_unlock(&uevent_sock_mutex);
252         goto exit;
253     }
254
255 #if defined(CONFIG_NET)
256     /* send netlink message */
257     list_for_each_entry(ue_sk, &uevent_sock_list, list) {
258         struct sock *uevent_sock = ue_sk->sk;
259         struct sk_buff *skb;
260         size_t len;
261
262         if (!netlink_has_listeners(uevent_sock, 1))
263             continue;
264
265         /* allocate message with the maximum possible size */
266         len = strlen(action_string) + strlen(devpath) + 2;
267         skb = alloc_skb(len + env->buflen, GFP_KERNEL);
268         if (skb) {
269             char *scratch;
270
271             /* add header */
272             scratch = skb_put(skb, len);
273             sprintf(scratch, "%s@%s", action_string, devpath);
274
275             /* copy keys to our continuous event payload buffer */
276             for (i = 0; i < env->envp_idx; i++) {
277                 len = strlen(env->envp[i]) + 1;
278                 scratch = skb_put(skb, len);
279                 strcpy(scratch, env->envp[i]);
280             }
281
282             NETLINK_CB(skb).dst_group = 1;
283             retval = netlink_broadcast_filtered(uevent_sock, skb,
284                                 0, 1, GFP_KERNEL,
285                                 kobj_bcast_filter,
286                                 kobj);
287             /* ENOBUFS should be handled in userspace */
288             if (retval == -ENOBUFS || retval == -ESRCH)
289                 retval = 0;
290         } else
291             retval = -ENOMEM;
292     }
293 #endif
294     mutex_unlock(&uevent_sock_mutex);
295
296     /* call uevent_helper, usually only enabled during early boot */
297     if (uevent_helper[0] && !kobj_usermode_filter(kobj)) {
298         char *argv [3];
299
300         argv [0] = uevent_helper;
301         argv [1] = (char *)subsystem;
302         argv [2] = NULL;
303         retval = add_uevent_var(env, "HOME=/");
304         if (retval)
305             goto exit;
306         retval = add_uevent_var(env,
307                     "PATH=/sbin:/bin:/usr/sbin:/usr/bin");
308         if (retval)
309             goto exit;
310
311         retval = call_usermodehelper(argv[0], argv,
312                          env->envp, UMH_WAIT_EXEC);
313     }
314
315 exit:
316     kfree(devpath);
317     kfree(env);
318     return retval;
319 }
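Line 122: as the comment says, kobject_uevent_env sends a uevent that carries environment data.
Lines 150-160 look for the kset that the kobject belongs to. If the kobject has no kset, its parent is checked; if the parent has none either, the search continues up the kobject hierarchy until an ancestor that belongs to a kset is found. If none is found, the function fails with -EINVAL. So somewhere on the path from the current kobject up to the root there must be a kobject that belongs to a kset: the event handlers live in kobject.kset.uevent_ops, so handling an event requires a non-NULL kset further up.
Note that when a kset is created (kset_create_and_add -> kset_create), kset_create sets kset.kobj.kset to NULL, so kset.kobj itself belongs to no kset. kset_create also sets kset.kobj.parent to parent_kobj, so for kset.kobj the kset can only be found through its ancestors. If parent_kobj is NULL, or no ancestor belongs to a kset, no uevent is sent for the new kset.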
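The upward search of lines 150-160 is easy to model in userspace. The sketch below assumes cut-down versions of struct kobject and struct kset, and find_uevent_kset is a hypothetical helper name; the loop itself is the one from the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins, reduced to the fields the search needs. */
struct kset;

struct kobject {
    const char *name;
    struct kobject *parent;
    struct kset *kset;
};

struct kset {
    struct kobject kobj;
};

/*
 * Walk up the hierarchy until a kobject that belongs to a kset is found.
 * Returns that kset, or NULL when the root is reached without finding
 * one (the case in which the kernel function returns -EINVAL).
 */
struct kset *find_uevent_kset(struct kobject *kobj)
{
    struct kobject *top_kobj = kobj;

    assert(kobj != NULL);
    while (!top_kobj->kset && top_kobj->parent)
        top_kobj = top_kobj->parent;

    return top_kobj->kset;
}
```

A kset created with a parent that is a member of some other kset resolves to that other kset, while a kset created with a NULL parent resolves to nothing, so its KOBJ_ADD event is never delivered.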
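Line 162 takes the kset that was found.
Line 163 stores kset.uevent_ops in the local variable uevent_ops.
Lines 165-171: if kobj->uevent_suppress is set, no uevent is sent and the function returns 0.
Lines 172-179: if uevent_ops->filter(kset, kobj) returns 0, the uevent that kobj wanted to send has been filtered out by that kset and is not sent.
Lines 181-191 obtain the subsystem name from uevent_ops->name. If uevent_ops->name is NULL, kset.kobj.name is used as the subsystem name instead. In effect, a kset is what is called a "subsystem".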
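The interplay of the filter and name callbacks in lines 172-191 can be sketched in userspace as follows. This is an illustration under assumptions, not kernel code: the structures are reduced to the relevant fields, and demo_filter, demo_name, demo_uevent_ops and uevent_subsystem are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins, reduced to the fields used here. */
struct kset;

struct kobject {
    const char *name;
    struct kset *kset;
};

struct kset_uevent_ops {
    int (*filter)(struct kset *kset, struct kobject *kobj);
    const char *(*name)(struct kset *kset, struct kobject *kobj);
};

struct kset {
    struct kobject kobj;
    const struct kset_uevent_ops *uevent_ops;
};

/* Hypothetical policy: drop events for kobjects named "_something". */
static int demo_filter(struct kset *kset, struct kobject *kobj)
{
    (void)kset;
    return strncmp(kobj->name, "_", 1) != 0;
}

/* Hypothetical policy: report a fixed subsystem name. */
static const char *demo_name(struct kset *kset, struct kobject *kobj)
{
    (void)kset;
    (void)kobj;
    return "demo_bus";
}

const struct kset_uevent_ops demo_uevent_ops = {
    .filter = demo_filter,
    .name = demo_name,
};

/*
 * Returns the subsystem string for the event, or NULL when the event is
 * dropped because the filter returned 0.
 */
const char *uevent_subsystem(struct kset *kset, struct kobject *kobj)
{
    const struct kset_uevent_ops *ops = kset->uevent_ops;

    if (ops && ops->filter && !ops->filter(kset, kobj))
        return NULL;
    if (ops && ops->name)
        return ops->name(kset, kobj);
    return kset->kobj.name;     /* fallback: the kset's own name */
}
```

A kset without any uevent_ops therefore lets every event through and reports its own name as the subsystem.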
Line 194 allocates a struct kobj_uevent_env for env. This structure holds the environment variables and is defined in include/linux/kobject.h:
116 struct kobj_uevent_env {
117     char *envp[UEVENT_NUM_ENVP];
118     int envp_idx;
119     char buf[UEVENT_BUFFER_SIZE];
120     int buflen;
121 };
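Line 199 calls kobject_get_path to get the full path of the kobject.
Lines 205-214 call add_uevent_var to add the three default environment variables ACTION, DEVPATH and SUBSYSTEM to env. add_uevent_var is defined in lib/kobject_uevent.c; in its own words, it will "add key value string to the environment buffer".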
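How the four fields of kobj_uevent_env work together becomes clearer with a small userspace model. The sketch below is written under assumptions: the demo_ names are hypothetical, and the two size constants are deliberately tiny demo values rather than the kernel's UEVENT_NUM_ENVP and UEVENT_BUFFER_SIZE. Each variable is formatted into the shared buffer directly behind the previous one, and envp records where it starts.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NUM_ENVP     4     /* demo value, far smaller than the kernel's */
#define DEMO_BUFFER_SIZE  64    /* demo value, far smaller than the kernel's */

struct demo_uevent_env {
    char *envp[DEMO_NUM_ENVP];      /* pointers to the start of each string */
    int envp_idx;                   /* number of strings stored so far */
    char buf[DEMO_BUFFER_SIZE];     /* the strings, back to back */
    int buflen;                     /* bytes of buf already in use */
};

/* Append one formatted "KEY=value" string; 0 on success, -ENOMEM if full. */
int demo_add_uevent_var(struct demo_uevent_env *env, const char *format, ...)
{
    va_list args;
    size_t room;
    int len;

    if (env->envp_idx >= DEMO_NUM_ENVP)
        return -ENOMEM;             /* too many keys */

    room = sizeof(env->buf) - (size_t)env->buflen;
    va_start(args, format);
    len = vsnprintf(&env->buf[env->buflen], room, format, args);
    va_end(args);

    if (len < 0 || (size_t)len >= room)
        return -ENOMEM;             /* string did not fit */

    env->envp[env->envp_idx++] = &env->buf[env->buflen];
    env->buflen += len + 1;         /* keep the terminating NUL */
    return 0;
}
```

Because every string keeps its terminating NUL, buf already has the layout needed later: it can be copied out string by string, and envp can be handed to a userspace helper as its environment.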
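Lines 217-223: if the caller of kobject_uevent_env passed additional environment variables in the third parameter, envp_ext, they are added to env with add_uevent_var as well.
Lines 225-234: if uevent_ops->uevent is not NULL, it is called. Through this function the kset can do its own specific work, such as adding environment variables of its own.
Lines 236-246: if action is KOBJ_ADD, kobj->state_add_uevent_sent is set to 1; if action is KOBJ_REMOVE, kobj->state_remove_uevent_sent is set to 1. The comment states the purpose clearly: "Mark "add" and "remove" events in the object to ensure proper events to userspace during automatic cleanup. If the object did send an "add" event, "remove" will automatically generated by the core, if not already done by the caller."
Line 249 adds the SEQNUM environment variable to env, incrementing the global sequence number uevent_seqnum while uevent_sock_mutex is held.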
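The rest of kobject_uevent_env interacts with userspace processes (or starts a userspace program from kernel space). Linux has two ways to do this. The first is the part guarded by CONFIG_NET, lines 255-293, which broadcasts the uevent of the current kset object to userspace over a netlink socket; this is the path that udev listens on. The second is to start a userspace helper from kernel space, traditionally /sbin/hotplug (the path is held in uevent_helper), and to tell userspace about the uevent through the environment variables that the kernel passes to that process; this is lines 296-313.
Hotplug means that when a device is plugged into or removed from the system, the kernel detects the change and notifies userspace so that the driver module for the device can be loaded or removed. The kernel has two mechanisms for notifying userspace: udev and /sbin/hotplug. Early in the history of Linux there was only /sbin/hotplug. It is driven by the kernel function call_usermodehelper, which starts a userspace program from kernel space. As the kernel evolved, the udev mechanism appeared and gradually replaced /sbin/hotplug. udev builds on the networking code in the kernel: it opens a standard socket, listens for the broadcast messages sent by the kernel, and parses and handles each message it receives.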
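Lines 265-280 of the listing fix the layout of the broadcast message: a header of the form action@devpath, followed by every environment string, each terminated by a NUL byte. The following userspace sketch builds and parses a buffer with that layout. It is only a model under assumptions: it uses a plain char array where the kernel uses an sk_buff, it expects a well-formed payload in which every string is NUL-terminated, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "action@devpath\0KEY=val\0KEY=val\0..." in out.
 * Returns the number of bytes used, or 0 if the buffer is too small.
 */
size_t build_uevent_payload(char *out, size_t cap, const char *action,
                            const char *devpath,
                            const char *const envp[], int envc)
{
    size_t used;
    int len, i;

    /* header, including its terminating NUL */
    len = snprintf(out, cap, "%s@%s", action, devpath);
    if (len < 0 || (size_t)len + 1 > cap)
        return 0;
    used = (size_t)len + 1;

    /* environment strings, copied back to back */
    for (i = 0; i < envc; i++) {
        size_t n = strlen(envp[i]) + 1;

        if (used + n > cap)
            return 0;
        memcpy(out + used, envp[i], n);
        used += n;
    }
    return used;
}

/*
 * Look up KEY in a payload of the layout above. Returns a pointer to the
 * value, or NULL if the key is absent. Assumes a well-formed payload.
 */
const char *uevent_payload_get(const char *payload, size_t len,
                               const char *key)
{
    size_t klen = strlen(key);
    size_t pos = strlen(payload) + 1;   /* skip the action@devpath header */

    while (pos < len) {
        const char *s = payload + pos;

        if (strncmp(s, key, klen) == 0 && s[klen] == '=')
            return s + klen + 1;
        pos += strlen(s) + 1;
    }
    return NULL;
}
```

A listener such as udev receives one such buffer per event and can split it at the NUL bytes, which is why the kernel keeps the terminator of every string when it copies the environment.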
That completes the analysis of kobject_uevent_env, and with it the analysis of kobject_uevent, kset_register and kset_create_and_add. We now understand how a kset is created and registered.