When a device receives a packet, the protocol type (e.g. IP, 802.3, ARP, IPv6) is read from the type field, and a different handler function is then invoked according to that type. This resembles object-oriented dispatch, and is implemented as follows:
- A structure is defined that ties a packet type to its handler function
[ include/linux/netdevice.h ]
struct packet_type {
	__be16			type;	/* This is really htons(ether_type). The packet type */
	struct net_device	*dev;	/* NULL is wildcarded here. The associated network device */
	int			(*func)(struct sk_buff *,
					struct net_device *,
					struct packet_type *,
					struct net_device *);
	bool			(*id_match)(struct packet_type *ptype,
					    struct sock *sk);
	void			*af_packet_priv;
	struct list_head	list;
};
- A global list is defined; every packet_type whose type is ETH_P_ALL (receive packets of all types) is linked onto this list
[ net/core/dev.c ]
struct list_head ptype_all __read_mostly;	/* Taps */
- A hash table is defined, keyed by the packet type
[ net/core/dev.c ]
struct list_head ptype_base[PTYPE_HASH_SIZE] __read_mostly;
[ include/linux/netdevice.h ]
/*
 *	The list of packet types we will receive (as opposed to discard)
 *	and the routines to invoke.
 *
 *	Why 16. Because with 16 the only overlap we get on a hash of the
 *	low nibble of the protocol value is RARP/SNAP/X.25.
 *
 *	NOTE:  That is no longer true with the addition of VLAN tags.  Not
 *		sure which should go first, but I bet it won't make much
 *		difference if we are running VLANs.  The good news is that
 *		this protocol won't be in the list unless compiled in, so
 *		the average user (w/out VLANs) will not be adversely affected.
 *		--BLG
 *
 *		0800	IP
 *		8100	802.1Q VLAN
 *		0001	802.3
 *		0002	AX.25
 *		0004	802.2
 *		8035	RARP
 *		0005	SNAP
 *		0805	X.25
 *		0806	ARP
 *		8137	IPX
 *		0009	Localtalk
 *		86DD	IPv6
 */

#define PTYPE_HASH_SIZE	(16)
#define PTYPE_HASH_MASK	(PTYPE_HASH_SIZE - 1)
The following code registers a packet_type onto the appropriate global list:
[ net/core/dev.c ]
/*
* Add a protocol ID to the list. Now that the input handler is
* smarter we can dispense with all the messy stuff that used to be
* here.
*
* BEWARE!!! Protocol handlers, mangling input packets,
* MUST BE last in hash buckets and checking protocol handlers
* MUST start from promiscuous ptype_all chain in net_bh.
* It is true now, do not change it.
* Explanation follows: if protocol handler, mangling packet, will
* be the first on list, it is not able to sense, that packet
* is cloned and should be copied-on-write, so that it will
* change it and subsequent readers will get broken packet.
* --ANK (980803)
*/
static inline struct list_head *ptype_head(const struct packet_type *pt)
{
	if (pt->type == htons(ETH_P_ALL))	// receives packets of all types
		return &ptype_all;
	else
		return &ptype_base[ntohs(pt->type) & PTYPE_HASH_MASK];
}
/**
* dev_add_pack - add packet handler
* @pt: packet type declaration
*
* Add a protocol handler to the networking stack. The passed &packet_type
* is linked into kernel lists and may not be freed until it has been
* removed from the kernel lists.
*
* This call does not sleep therefore it can not
* guarantee all CPU's that are in middle of receiving packets
* will see the new packet type (until the next received packet).
*/
void dev_add_pack(struct packet_type *pt)
{
	struct list_head *head = ptype_head(pt);	// find the list to link onto

	spin_lock(&ptype_lock);
	list_add_rcu(&pt->list, head);	// link pt onto that list
	spin_unlock(&ptype_lock);
}
EXPORT_SYMBOL(dev_add_pack);
To improve receive and transmit efficiency, especially under heavy load, the kernel applies special handling known as offload. It follows the same implementation pattern as above:
- Every received packet has a type (IP, 802.3, ARP, IPv6, ...), and each type has a corresponding packet_offload
[ include/linux/netdevice.h ]
struct packet_offload {
	__be16			 type;	/* This is really htons(ether_type). */
	struct offload_callbacks callbacks;
	struct list_head	 list;
};

struct offload_callbacks {
	struct sk_buff		*(*gso_segment)(struct sk_buff *skb,
						netdev_features_t features);
	int			(*gso_send_check)(struct sk_buff *skb);
	struct sk_buff		**(*gro_receive)(struct sk_buff **head,
						 struct sk_buff *skb);
	int			(*gro_complete)(struct sk_buff *skb, int nhoff);
};
- The kernel declares a global list, offload_base (a single list, not a hash table)
[ net/core/dev.c ]
static struct list_head offload_base __read_mostly;
A packet_offload is registered onto this list by the following function:
[ net/core/dev.c ]
/**
* dev_add_offload - register offload handlers
* @po: protocol offload declaration
*
* Add protocol offload handlers to the networking stack. The passed
* &proto_offload is linked into kernel lists and may not be freed until
* it has been removed from the kernel lists.
*
* This call does not sleep therefore it can not
* guarantee all CPU's that are in middle of receiving packets
* will see the new offload handlers (until the next received packet).
*/
void dev_add_offload(struct packet_offload *po)
{
	struct list_head *head = &offload_base;	// the global offload list

	spin_lock(&offload_lock);
	list_add_rcu(&po->list, head);
	spin_unlock(&offload_lock);
}
EXPORT_SYMBOL(dev_add_offload);