Byte and Bit Order Dissection (Repost)

Byte and Bit Order Dissection

September 2nd, 2003 by Kevin Kaichuan He

Discussing the differences between big and little endianness, bit and byte order and what it all means.

Editors' Note: This article has been updated since its original posting.

Software and hardware engineers who have to deal with byte and bit order issues know the process is like walking a maze. Though we usually come out of it, we consume a handful of our brain cells each time. This article tries to summarize the various areas in which the business of byte and bit order plays a role, including CPU, buses, devices and networking protocols. We dive into the details and hope to provide a good reference on this topic. The article also tries to suggest some guidelines and rules of thumb developed from practice.

Byte Order: the Endianness

We probably are familiar with the word endianness. First introduced by Danny Cohen in 1980, it describes the method a computer system uses to represent multi-byte integers.

Two types of endianness exist, big endian and little endian. Big endian refers to the method that stores the most significant byte of an integer at the lowest byte address. Little endian is the opposite; it refers to the method of storing the most significant byte of an integer at the highest byte address.
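
The two definitions can be made concrete with a minimal sketch that writes a 32-bit value into a byte array in each order. The function names store_be32 and store_le32 are illustrative and do not come from the article; out[0] stands for the lowest byte address.

```c
#include <stdint.h>

/* Store value most significant byte first (big endian layout). */
void store_be32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);  /* lowest address: most significant byte */
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;          /* highest address: least significant byte */
}

/* Store value least significant byte first (little endian layout). */
void store_le32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)value;          /* lowest address: least significant byte */
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);  /* highest address: most significant byte */
}
```

Because the functions use only shifts, they produce the same byte layout regardless of the endianness of the CPU that runs them.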

Bit order usually follows the same endianness as the byte order for a given computer system. That is, in a big endian system the most significant bit is stored at the lowest bit address; in a little endian system, the least significant bit is stored at the lowest bit address.

Every effort is made to avoid bit swapping in software when designing a system, because bit swapping is both expensive and tedious. Later sections describe how hardware takes care of it.
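
To give a feel for the cost, here is a rough sketch of reversing the bits of a single byte in software. The function name reverse_bits8 is made up for illustration. A loop like this would have to run for every byte transferred, which is why the job is better left to hardware wiring.

```c
#include <stdint.h>

/* Return b with its bit order reversed: bit 0 becomes bit 7, and so on. */
uint8_t reverse_bits8(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (uint8_t)((r << 1) | (b & 1u));  /* append the current lowest bit of b */
        b >>= 1;                             /* move on to the next bit */
    }
    return r;
}
```

Faster table-driven or bit-trick variants exist, but each still adds work to every byte, whereas crossed wires cost nothing at run time.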

Documentation Guideline

Just as most people write a number from left to right, the layout of a multi-byte integer should flow from left to right, that is, from the most significant to the least significant byte. This is the clearest way to write integers, as we can see in the following examples.

Here is how we would write the integer 0x0a0b0c0d for both big endian and little endian systems, according to the rule above:

Write Integer for Big Endian System

byte addr      0        1        2        3
bit offset  01234567 01234567 01234567 01234567
binary      00001010 00001011 00001100 00001101
hex            0a       0b       0c       0d

Write Integer for Little Endian System

byte addr      3        2        1        0
bit offset  76543210 76543210 76543210 76543210
binary      00001010 00001011 00001100 00001101
hex            0a       0b       0c       0d

In both cases above, we can read from left to right and the number is 0x0a0b0c0d.

If we do not follow the rule, we might write the number in the following way:

byte addr      0        1        2        3
bit offset  01234567 01234567 01234567 01234567
binary      10110000 00110000 11010000 01010000

As you can see, it's hard to make out what number we're trying to represent.
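
One way to read the row above, offered as an interpretation rather than something the article states, is as the little endian layout of 0x0a0b0c0d listed by ascending byte address and ascending bit offset. The sketch below builds such a row; the function name ascending_bit_row is hypothetical.

```c
#include <stdint.h>

/*
 * Fill out with the 32 bits of value, listed byte by byte from the least
 * significant byte to the most significant byte, and within each byte from
 * the least significant bit to the most significant bit.
 * out must hold 36 characters: 4 groups of 8 bits, 3 spaces and a terminator.
 */
void ascending_bit_row(uint32_t value, char out[36])
{
    int pos = 0;
    for (int byte = 0; byte < 4; ++byte) {
        uint8_t b = (uint8_t)(value >> (8 * byte));
        for (int bit = 0; bit < 8; ++bit)
            out[pos++] = ((b >> bit) & 1) ? '1' : '0';
        if (byte < 3)
            out[pos++] = ' ';
    }
    out[pos] = '\0';
}
```

Calling it with 0x0a0b0c0d yields the four bit groups of the hard-to-read row, which shows why listing addresses in ascending order is a poor fit for a little endian value.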

Simplified Computer System Used in this Article

Without loss of generality, a simplified view of the computer system discussed in this article is drawn below.

The CPU, local bus and internal memory/cache are all treated as part of the CPU, because they usually share the same endianness. The discussion of bus endianness, however, covers only the external bus. The CPU register width, memory word width and bus width are assumed to be 32 bits for this article.

Endianness of CPU

The CPU endianness is the byte and bit order in which it interprets multi-byte integers from on-chip registers, local bus, in-line cache, memory and so on.

Little endian CPUs include Intel and DEC. Big endian CPUs include Motorola 680x0, Sun SPARC and IBM (e.g., PowerPC). MIPS and ARM can be configured either way.

The CPU endianness affects the CPU's instruction set. C code must be compiled with a GNU C toolchain that matches the endianness of the target CPU. For example, mips-linux-gcc and mipsel-linux-gcc are used to compile MIPS code for big endian and little endian, respectively.
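
Since the toolchain fixes the target's endianness, code can also ask the compiler about it at build time. The sketch below assumes GCC or Clang, which predefine the macros __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__; other compilers may not provide them, in which case the function reports that it does not know. The function name target_is_big_endian is made up for this example.

```c
/* Return 1 for a big endian target, 0 for little endian, -1 if unknown. */
int target_is_big_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return 1;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return 0;
#else
    return -1;   /* compiler does not expose its byte order this way */
#endif
}
```

With the two MIPS toolchains named above, the same source file would be expected to return 1 when built by the big endian compiler and 0 when built by the little endian one.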

The CPU endianness also has an impact on software programs if we need to access part of a multi-byte integer. The following program illustrates that situation. If one accesses the whole 32-bit integer, the CPU endianness is invisible to software programs.

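#include <stdint.h>
#include <stdio.h>

typedef union {
    uint32_t my_int;
    uint8_t  my_bytes[4];
} endian_tester;

int main(void)
{
    endian_tester et;
    et.my_int = 0x0a0b0c0d;
    /* my_bytes[0] is the byte stored at the lowest address */
    if (et.my_bytes[0] == 0x0a)
        printf("I'm on a big-endian system\n");
    else
        printf("I'm on a little-endian system\n");
    return 0;
}

Endianness of Bus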

The bus we refer to here is the external bus we showed in the figure above. We use PCI as an example below. The bus, as we know, is an intermediary component that interconnects CPUs, devices and various other components on the system. The endianness of a bus is the byte/bit order standard that the bus protocol defines and with which the other components comply.

Take PCI, which is little endian, as an example. Among its 32 address/data lines AD[31:0], it expects a 32-bit device to connect its most significant data line to AD31 and its least significant data line to AD0. A big endian bus protocol would be the opposite.

For a partial-word device connected to the bus, for example an 8-bit device, a little endian bus such as PCI specifies that the eight data lines of the device be connected to AD[7:0]. For a big endian bus protocol, they would be connected to AD[24:31].

In addition, for PCI bus the protocol requires each PCI device to implement a configuration space. This is a set of configuration registers that have the same byte order as the bus.

Just as all the devices need to follow bus's rules regarding byte/bit endianness, so does the CPU. If a CPU operates in an endianness different from the bus, the bus controller/bridge usually is the place where the conversion is performed.

An alert reader may now ask, "So what happens if the endianness of the device is different from the endianness of the bus?" In this case, we need to do some extra work for communication to occur, which is covered in the next section.

Endianness of Devices

Kevin's Theory #1: When a multi-byte data unit travels across the boundary of two reverse endian systems, the conversion is made such that the memory contiguity of the unit is preserved.

We assume CPU and bus share the same endianness in the following discussion. If the endianness of a device is the same as that of CPU/bus, then no conversion is needed.

In the case of different endianness between the device and the CPU/bus, we offer two solutions here from a hardware wiring point of view. We assume CPU/bus is little endian and the device is big endian in the following discussion.

Word Consistent Approach

In this approach, we swap the entire 32-bit word of the device data lines. We represent the data lines of the device as D[0:31], where D(0) carries the most significant bit, and the bus lines as AD[31:0]. This approach suggests wiring D(i) to AD(31-i), where i = 0, ..., 31. Word Consistent means the semantics of the whole word are preserved.
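
The wiring rule can be checked with a small software model. This is only a sketch: it treats each data line as one bit of a 32-bit value, and the function name word_consistent_wire is hypothetical.

```c
#include <stdint.h>

/*
 * Model the Word Consistent wiring. The device numbers its data lines
 * D(0)..D(31) with D(0) carrying the most significant bit; the bus numbers
 * its lines AD(31)..AD(0) with AD(31) carrying the most significant bit.
 */
uint32_t word_consistent_wire(uint32_t device_word)
{
    uint32_t bus_word = 0;
    for (int i = 0; i < 32; ++i) {
        /* Level on device line D(i): the bit of weight 2^(31-i). */
        uint32_t level = (device_word >> (31 - i)) & 1u;
        /* D(i) is wired to AD(31-i), which also has weight 2^(31-i). */
        bus_word |= level << (31 - i);
    }
    return bus_word;
}
```

The model returns its input unchanged for every value: only the line numbering differs between the two sides, so the 32-bit word keeps its numeric meaning.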

To illustrate, the following code represents a 32-bit descriptor register in a big endian NIC card:

After applying the Word Consistent swap (wiring D[0:31] to AD[31:0]), the result in the CPU/bus is:

Notice that it automatically is little endian for CPU/bus. No software byte or bit swapping is needed.

The above example is for those simple cases where data does not cross a 32-bit memory boundary. Now, let's take a look at a case where it does. In the following code, vlan[0:23] has a value of 0xabcdef and crosses a 32-bit memory boundary.

After the Word Consistent swap, the result is:

Do you see what happened? The vlan field has been broken into two noncontiguous memory spaces: bytes[1:0] and byte(7). It violates Kevin's Theory #1, and we are not able to define a nice C structure to access the noncontiguous vlan field.

Therefore, the Word Consistent solution works only for data within word boundaries and does not work for data that may cross a word boundary. The second approach solves this problem for us.
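
The scattering can be reproduced with a small address calculation. The sketch assumes, purely for illustration, that the three vlan bytes sit at device byte offsets 2, 3 and 4 of a big endian device; the function name word_consistent_addr is made up as well.

```c
#include <stddef.h>

/*
 * Map a byte offset on the big endian device to the byte address it gets
 * on the little endian CPU side after a Word Consistent conversion.
 */
size_t word_consistent_addr(size_t device_offset)
{
    size_t word = device_offset / 4;  /* which 32-bit word the byte belongs to */
    size_t lane = device_offset % 4;  /* 0 = most significant byte of that word */
    /* Each word keeps its value, so on a little endian CPU its most
       significant byte lands at the highest address of the word. */
    return word * 4 + (3 - lane);
}
```

Under that assumption the three bytes end up at CPU addresses 1, 0 and 7, which are the bytes[1:0] and byte(7) pieces described above.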

Byte Consistent Approach

In this approach, we do not swap bytes, but we do swap the bits within each byte lane in hardware wiring (the bit at device bit-offset i goes to bus bit-offset (7-i), where i = 0, ..., 7). Byte Consistent means the semantics of each byte are preserved.

After applying this method, the big endian NIC device value above results in this CPU/bus value:

Now, the three bytes of the vlan field are in contiguous memory space, and the content of each byte reads correctly. The byte order of the result still looks messy, but because the data now occupies a contiguous memory space, we can let the software do a byte swap for this 5-byte data structure. We get the following result:

We see that software byte swapping needs to be performed as the second procedure in this approach. Byte swapping is affordable in software, unlike bit swapping.
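
The software half of this approach is a plain byte reversal. Below is a minimal sketch with hypothetical helper names (swap_bytes, load_le): it reverses a buffer in place, then reads the result as a little endian integer to confirm that the value is now usable by a little endian CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the order of the n bytes in buf, in place. */
void swap_bytes(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i < n / 2; ++i) {
        uint8_t tmp = buf[i];
        buf[i] = buf[n - 1 - i];
        buf[n - 1 - i] = tmp;
    }
}

/* Read n bytes (at most 8) as a little endian unsigned integer. */
uint64_t load_le(const uint8_t *buf, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i)
        v |= (uint64_t)buf[i] << (8 * i);
    return v;
}
```

For a 5-byte descriptor that arrives as 01 02 03 04 05, swap_bytes followed by load_le gives the integer 0x0102030405, the value a big endian reader would have seen in the original bytes.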

Kevin's Theory #2: In a C structure that contains bit fields, if field A is defined in front of field B, then field A always occupies a lower bit address than field B.

Now that everything is sorted out nicely, we can define the C structure as the following to access the descriptor in the NIC:

struct nic_tag_reg {
        uint64_t vlan:24 __attribute__((packed));
        uint64_t rx  :6  __attribute__((packed));
        uint64_t tag :10 __attribute__((packed));
};
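
Bit-field layout is left to the compiler, so a shift-and-mask decoder is a more portable alternative to the structure above. The sketch assumes the layout that Kevin's Theory #2 implies on a little endian host: once the swapped descriptor is read as a 40-bit integer, vlan occupies bits 0 to 23, rx bits 24 to 29 and tag bits 30 to 39. The names nic_tag_fields and decode_nic_tag are illustrative only.

```c
#include <stdint.h>

/* Decoded descriptor fields, each held in an ordinary integer. */
struct nic_tag_fields {
    uint32_t vlan;  /* 24 bits */
    uint8_t  rx;    /*  6 bits */
    uint16_t tag;   /* 10 bits */
};

/* Split a 40-bit descriptor value into its fields (assumed layout, see above). */
struct nic_tag_fields decode_nic_tag(uint64_t reg)
{
    struct nic_tag_fields f;
    f.vlan = (uint32_t)(reg & 0xffffffu);        /* bits 0..23  */
    f.rx   = (uint8_t)((reg >> 24) & 0x3fu);     /* bits 24..29 */
    f.tag  = (uint16_t)((reg >> 30) & 0x3ffu);   /* bits 30..39 */
    return f;
}
```

Unlike the bit-field version, this decoder does not depend on how a particular compiler orders bit fields, only on the assumed bit positions.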

Endianness of Network Protocols

The endianness of network protocols defines the order in which the bits and bytes of an integer field of a network protocol header are sent and received. We also introduce a term called wire address here. A lower wire address bit or byte always is transmitted and received in front of a higher wire address bit or byte.

In fact, for network endianness, it is a little different than what we have seen so far. Another factor is in the picture: the bit transmission/reception order on the physical wire. Lower layer protocols, such as Ethernet, have specifications for bit transmission/reception order, and sometimes it can be the reverse of the upper layer protocol endianness. We look at this situation in our examples.

The endianness of NIC devices usually follows the endianness of the network protocols they support, so it could be different from the endianness of the CPU on the system. Most network protocols are big endian; here we take Ethernet and IP as examples.

Endianness of Ethernet

Ethernet is big endian. This means the most significant byte of an integer field is placed at a lower wire byte address and transmitted/received in front of the least significant byte. For example, the protocol field with a value of 0x0806 (ARP) in the Ethernet header has a wire layout like this:

wire byte offset:     0       1
hex             :    08      06

Notice that the MAC address field of the Ethernet header is treated as a string of bytes, in which case the byte order does not matter. For example, the MAC address 12:34:56:78:9a:bc has a layout on the wire like that shown below, and byte 12 is transmitted first.

wire byte offset:     0       1       2       3       4       5
hex             :    12      34      56      78      9a      bc

Bit Transmission/Reception Order

The bit transmission/reception order specifies how the bits within a byte are transmitted/received on the wire. For Ethernet, the order is from the least significant bit (lower wire address offset) to the most significant bit (higher wire address offset). This evidently is little endian. The byte order remains big endian, as described in the earlier section. Therefore, here we see a situation where the byte order and the bit transmission/reception order are the reverse of each other.

The following is an illustration of Ethernet bit transmission/reception order:

We see from this that the group (multicast) bit, the least significant bit of the first byte, appeared as the first bit on the wire. Ethernet and 802.3 hardware behave consistently with the bit transmission/reception order above.
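
The order can be modeled in a few lines. This is a sketch only: the function name ethernet_wire_bits is hypothetical, and the model covers just the data bits, ignoring everything else a PHY does (preamble, line coding and so on).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand n bytes into the sequence of bits as they would appear on the wire.
 * bits must have room for 8 * n entries; each entry is 0 or 1, and entry 0
 * is the first bit transmitted.
 */
void ethernet_wire_bits(const uint8_t *bytes, size_t n, uint8_t *bits)
{
    for (size_t i = 0; i < n; ++i)       /* bytes go out in ascending wire offset */
        for (int b = 0; b < 8; ++b)      /* within a byte, least significant bit first */
            bits[8 * i + (size_t)b] = (uint8_t)((bytes[i] >> b) & 1u);
}
```

For the MAC address 12:34:56:78:9a:bc, the first byte 0x12 (binary 00010010) goes out as 0, 1, 0, 0, 1, 0, 0, 0. For a group address whose first byte is 0x01, the very first bit on the wire is 1.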

In this case, where the protocol byte order and the bit transmission/reception order are different, the NIC must convert the bit transmission/reception order from/to the host (CPU) bit order. By doing so, the upper layers do not have to worry about bit order and need only sort out the byte order. In fact, this is another form of the Byte Consistent approach, where byte semantics are preserved when data travels across different endian domains.

The bit transmission/reception order generally is invisible to the CPU and software, but is important to hardware considerations such as the serdes (serializer/deserializer) of PHY and the wiring of NIC device data lines to the bus.

Parsing Ethernet Header in Software

For either endianness, the Ethernet header can be parsed by software with the C structure below:

struct ethhdr
{
        unsigned char   h_dest[ETH_ALEN];       
        unsigned char   h_source[ETH_ALEN];     
        unsigned short  h_proto;                
};

The h_dest and h_source fields are byte arrays, so no conversion is needed. The h_proto field is an integer, so ntohs() is needed before the host reads this field, and htons() is needed before the host fills it in.
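
Reading the protocol field might look like the sketch below. It assumes a POSIX system, where ntohs() is declared in <arpa/inet.h>, and a buffer holding at least the 14 header bytes of a received frame; the function name frame_ethertype is made up for the example.

```c
#include <arpa/inet.h>   /* ntohs() on POSIX systems */
#include <stdint.h>
#include <string.h>

/* Return the protocol field of an Ethernet frame in host byte order. */
uint16_t frame_ethertype(const uint8_t *frame)
{
    uint16_t proto_be;
    /* The field sits at bytes 12 and 13, after the two 6-byte addresses.
       memcpy avoids assuming that the buffer is suitably aligned. */
    memcpy(&proto_be, frame + 12, sizeof proto_be);
    return ntohs(proto_be);   /* no-op on big endian hosts, byte swap on little endian */
}
```

For an ARP frame whose bytes 12 and 13 are 08 and 06, the function returns 0x0806 on hosts of either endianness.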

Endianness of IP

IP's byte order also is big endian. The bit endianness of IP inherits that of the CPU, and the NIC takes care of converting it from/to the bit transmission/reception order on the wire.

For big endian hosts, IP header fields can be accessed directly. For little endian hosts, which are most PCs in the world (x86), a byte swap needs to be performed in software for the integer fields in the IP header.

Below is the structure of iphdr from the Linux kernel. We use ntohs() or ntohl() before reading 16-bit or 32-bit integer fields, and htons() or htonl() before writing them. Essentially, these functions do nothing on big endian hosts and perform byte swapping on little endian hosts.

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
        __u8    ihl:4,
                version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
        __u8    version:4,
                ihl:4;
#else
#error  "Please fix <asm/byteorder.h>"
#endif
        __u8    tos;
        __u16   tot_len;
        __u16   id;
        __u16   frag_off;
        __u8    ttl;
        __u8    protocol;
        __u16   check;
        __u32   saddr;
        __u32   daddr;
        /*The options start here. */
};
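
The same conversions can be written without ntohs() or ntohl() by assembling each field from individual bytes, which gives the host value on either kind of host. This is a minimal sketch, assuming hdr points at the first byte of a received IP header; the function names are illustrative and are not part of the kernel. The offsets 2 and 12 follow from the field sizes in the structure above.

```c
#include <stdint.h>

/* Total length field: header bytes 2 and 3, most significant byte first. */
uint16_t ip_total_length(const uint8_t *hdr)
{
    return (uint16_t)((hdr[2] << 8) | hdr[3]);
}

/* Source address field: header bytes 12 to 15, returned as a host integer. */
uint32_t ip_source_address(const uint8_t *hdr)
{
    return ((uint32_t)hdr[12] << 24) | ((uint32_t)hdr[13] << 16) |
           ((uint32_t)hdr[14] << 8)  |  (uint32_t)hdr[15];
}
```

Because the shifts work on values rather than on memory layout, no endianness test or byte swap is needed.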

Take a look at some interesting fields in the IP header:

version and ihl fields: According to IP standard, version is the most significant four bits of the first byte of an IP header. ihl is the least significant four bits of the first byte of the IP header.

There are two methods to access these fields. Method 1 directly extracts them from the data. If ver_ihl holds the first byte of the IP header, then (ver_ihl & 0x0f) gives the ihl field and (ver_ihl >> 4) gives the version field. This applies for hosts with either endianness.
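
Method 1 can be sketched as two small helpers. The function names are made up for illustration; the only outside fact used is that ihl counts the header length in 32-bit words, hence the multiplication by 4.

```c
#include <stdint.h>

/* Version: the most significant four bits of the first header byte. */
unsigned ip_version(uint8_t ver_ihl)
{
    return ver_ihl >> 4;
}

/* Header length in bytes: ihl is the low four bits, counted in 32-bit words. */
unsigned ip_header_bytes(uint8_t ver_ihl)
{
    return (ver_ihl & 0x0fu) * 4u;
}
```

A typical IPv4 header without options starts with the byte 0x45, for which these helpers give version 4 and a header length of 20 bytes.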

Method 2 is to define the structure as above, then access these fields from the structure itself. In the above structure, if the host is little endian, then we define ihl before version; if the host is big endian, we define version before ihl. If we apply Kevin's Theory #2 here that an earlier defined field always occupies a lower memory address, we find that the above definition in C structure fits the IP standard pretty well.

saddr and daddr fields: these two fields can be treated as either byte arrays or integers. If they are treated as byte arrays, there is no need to do endianness conversion. If they are treated as integers, they need to be converted between host and network byte order as needed. Below is a function with integer interpretation:

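#include <stdint.h>
#include <stdlib.h>   /* atoi() */
#include <string.h>   /* strchr() */

#define IP_ALEN 4     /* number of octets in an IPv4 address */

/*  dot2ip - convert a dotted decimal string into an
 *           IP address held as a host-order integer
 */
uint32_t dot2ip(const char *pdot)
{
    uint32_t my_ip = 0;
    int i;

    for (i = 0; i < IP_ALEN; ++i) {
        /* shift in the next octet; the first octet ends up most significant */
        my_ip = my_ip * 256 + (uint32_t)atoi(pdot);
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;    /* skip the dot */
    }
    return my_ip;
}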

And here is the function with byte array interpretation:

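/*  dot2ip2 - same conversion, but the octets are stored as a byte array,
 *            so the result is already in network byte order.
 *            Uses IP_ALEN (the number of octets, 4), atoi() from
 *            <stdlib.h>, and strchr() and memcpy() from <string.h>.
 */
uint32_t dot2ip2(const char *pdot)
{
    uint8_t ip[IP_ALEN] = {0};
    uint32_t result;
    int i;

    for (i = 0; i < IP_ALEN; ++i) {
        ip[i] = (uint8_t)atoi(pdot);
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;    /* skip the dot */
    }
    /* memcpy avoids the alignment and aliasing problems of a pointer cast */
    memcpy(&result, ip, sizeof result);
    return result;
}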
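
For comparison, the POSIX library routine inet_pton() performs the byte array style of conversion and reports malformed input, and ntohl() then yields the integer interpretation. The sketch below assumes a POSIX system; the wrapper name parse_ipv4 is hypothetical and is not part of the article's code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/*
 * Parse a dotted decimal IPv4 address.
 * Returns 0 on success and -1 if text is not a valid address.
 */
int parse_ipv4(const char *text, uint32_t *net_order, uint32_t *host_order)
{
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    *net_order  = addr.s_addr;          /* network byte order, byte array view */
    *host_order = ntohl(addr.s_addr);   /* host integer view */
    return 0;
}
```

Parsing "10.11.12.13" gives the host-order integer 0x0a0b0c0d, while the first byte in memory of the network-order result is 0x0a on any host.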

Summary

The topic of byte and bit endianness can go even further than what we discussed here. Hopefully this article has covered the main aspects of it. See you next time in the maze.

Kevin Kaichuan He is a senior system software engineer at Solustek Corp. He currently is working on board bring-up, embedded Linux and networking stacks projects. His previous work experience includes being a software engineer at Cisco Systems and a research assistant in Computer Science at Purdue University. In his spare time, he enjoys digital photography, PS2 games and movies.

 
Author: Kevin He, 2003-09-02
Original article: http://www.linuxjournal.com/article/6788

Translator: Love. Katherine, 2007-04-14
Translation: http://blog.csdn.net/lovekatherine/archive/2007/04/14/1564731.aspx

When reposting, be sure to credit the original source, the author and the translator with hyperlinks.

 