ENDIANNESS
In computing, 'endianness' is the byte (and sometimes bit) ordering in memory used to represent some kind of data. Typical cases are the order in which integer values are stored as bytes in computer memory (relative to a given memory addressing scheme) and the transmission order over a network or other medium. When specifically talking about bytes, endianness is also referred to simply as 'byte order'.[1]
Generally speaking, endianness is an attribute of a representation format: for example, it determines which byte of a UCS-2 character is stored at the lower address. Byte order is an important consideration in network programming, since two computers with different byte orders may be communicating. Failure to account for varying endianness is notorious among computer programmers as a source of bugs.
Endianness and hardware
Most modern computer processors agree on bit ordering "inside" individual bytes (this was not always the case). This means that any single-byte value will be read the same on almost any computer one may send it to.
Integers are usually stored as sequences of bytes, so that the encoded value can be obtained by simple concatenation. The two most common byte orderings are:
★ increasing numeric significance with increasing memory addresses, known as ''little-endian'', and
★ its opposite, called ''big-endian''.[2]
Note that big-endian does not mean "ending big", but "big end first".
Intel's x86 processors use the little-endian format (sometimes called the ''Intel format''). Motorola processors have generally used big-endian. PowerPC (which includes Apple's Macintosh line prior to the Intel switch) and System/370 also adopt big-endian. SPARC historically used big-endian, though version 9 is bi-endian (see below).
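The two orderings can be observed directly from a program. The following is a minimal sketch in Python; the helper name `byte_layout` is hypothetical and chosen only for illustration. It prints the native byte order of the machine running the interpreter, then the byte sequence of a 16-bit value in each ordering, lowest address first.

```python
import sys

def byte_layout(value: int, width: int, order: str) -> str:
    """Return the bytes of `value` as hex text, lowest address first."""
    return " ".join(f"{b:02X}" for b in value.to_bytes(width, byteorder=order))

# sys.byteorder reports the native order of the machine running the interpreter.
print("host byte order:", sys.byteorder)   # 'little' or 'big'

print(byte_layout(0x1234, 2, "little"))    # 34 12
print(byte_layout(0x1234, 2, "big"))       # 12 34
```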
Bi-endian hardware
Some architectures (including ARM, PowerPC except the PPC970/G5, DEC Alpha, SPARC V9, MIPS, PA-RISC and IA64) feature switchable endianness. This can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', said of hardware, denotes the capability to compute or pass data in either of two different endian formats (usually big-endian and little-endian).
Many of these architectures can be switched via software to default to a specific endian format (usually done when the computer starts up); however, on some systems the default endianness is selected by hardware on the motherboard and cannot be changed via software (e.g., the DEC Alpha, which runs only in big-endian mode on the Cray T3E).
Note that "bi-endian" refers primarily to how a processor treats ''data'' accesses. ''Instruction'' accesses (fetches of instruction words) on a given processor may still assume a fixed endianness, even if ''data'' accesses are fully bi-endian.
Note, too, that some nominally bi-endian CPUs may actually employ internal "magic" (as opposed to really switching to a different endianness) in one of their operating modes. For instance, some PowerPC processors in little-endian mode act as little-endian from the point of view of the executing programs but they do not actually store data in memory in little-endian format (multi-byte values are swapped during memory load/store operations). This can cause problems when memory is transferred to an external device if some part of the software, e.g. a device driver, does not account for the situation.
Floating-point and endianness
On some machines, while integers were represented in little-endian form, floating-point numbers were represented in big-endian form. [3] Because there are many floating-point formats, and no standard "network" representation for them, no standard for transferring floating-point values has been established. This means that floating-point data written on one machine may not be readable on another, even if both use IEEE 754 floating-point arithmetic, because the endianness of the memory representation is not part of the IEEE specification. [4]
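As an illustration of the problem, the sketch below uses Python's standard `struct` module to serialise the same IEEE 754 double in both byte orders. The variable names are illustrative only. Reading the bytes back with the wrong order assumed produces a different number without raising any error.

```python
import struct

x = 1.0

big = struct.pack(">d", x)     # IEEE 754 binary64, most significant byte first
little = struct.pack("<d", x)  # same value, least significant byte first

print(big.hex())     # 3ff0000000000000
print(little.hex())  # 000000000000f03f

# The two encodings are byte-wise reversals of each other.
assert little == big[::-1]

# Decoding with the wrong byte order silently yields a different value.
misread = struct.unpack("<d", big)[0]
print(misread == x)  # False
```

A format that exchanges floating-point data therefore has to state the byte order explicitly, as the `>` and `<` prefixes do here.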
Discussion, background, etymology
The choice of big-endian vs. little-endian has been the subject of flame wars. The very term ''big-endian'' comes from Jonathan Swift's satiric novel ''Gulliver’s Travels'', where tensions are described in Lilliput and Blefuscu because a faction called the ''Big-endians'' prefer to crack open their soft-boiled eggs from the big end, contrary to Lilliputian royal edict.[3] The terms ''little-endian'' and ''endianness'' have a similar ironic intent.[4]
An often cited argument in favour of big-endian is that it is consistent with the ordering used in natural languages. But that is far from being universal, both in spoken and written form:
★ spoken: though most spoken languages express most numbers, especially those larger than a hundred, in a "big-endian manner"[5] (in modern English, for example, one says "twenty-four", not "four-and-twenty"), there are notable exceptions such as German, Danish, Dutch and Slovenian, which use "little-endian" for numbers up to 99 and "mixed endianness" for larger numbers, e.g. ''vierundzwanzig''/''vierentwintig'' (24, literally "four-and-twenty") and ''hundertvierundzwanzig'' (124, literally "hundred four-and-twenty"). Sanskrit is another exception, which uses "little-endian" for small numbers, e.g. ''chaturvinsh'' (24, literally "four-and-twenty"), as well as for large ones, e.g. ''chaturvinshatyadhikashatatam'' (124, literally "four-and-twenty-over-hundred").
★ written: the Hindu-Arabic numeral system is used worldwide and is such that the most significant digits are always written to the left of the less significant ones. Writing left to right, this system is therefore "big-endian" (big end first). Writing right to left, this numeral system is "little-endian". In quite a few languages the spoken order of numerals is inconsistent with how they appear written, and in some languages, such as Persian and Hebrew, it is common to interrupt the writing of text (right-to-left) to write a number in the opposite order (left-to-right).
Little-endian has the property that, in the absence of alignment restrictions, values can be read from memory at different widths without using different addresses. For example, a 32-bit memory location with content 4A 00 00 00 can be read at the same address as either 8-bit (value = 4A), 16-bit (004A), or 32-bit (0000004A). (This example works only if the value makes sense in all three sizes, which means the value fits in just 8 bits.) This little-endian property is rarely used, and doesn't imply that little-endian has any performance advantage in variable-width data access.
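The property can be checked with a small simulation. The sketch below models memory as a Python `bytes` object; the helper `read` is hypothetical and simply interprets `width` bytes starting at `address` in the given order. It is an illustration of the arithmetic, not a model of any particular processor.

```python
def read(memory: bytes, address: int, width: int, order: str) -> int:
    """Interpret `width` bytes starting at `address` as an unsigned integer."""
    return int.from_bytes(memory[address:address + width], byteorder=order)

little = bytes([0x4A, 0x00, 0x00, 0x00])  # 0x0000004A stored little-endian
big = bytes([0x00, 0x00, 0x00, 0x4A])     # the same value stored big-endian

# Little-endian: 8-, 16- and 32-bit reads at address 0 all give 0x4A.
le_reads = [read(little, 0, w, "little") for w in (1, 2, 4)]
print([hex(v) for v in le_reads])  # ['0x4a', '0x4a', '0x4a']

# Big-endian: narrower reads at address 0 see only the high-order zero bytes.
be_reads = [read(big, 0, w, "big") for w in (1, 2, 4)]
print([hex(v) for v in be_reads])  # ['0x0', '0x0', '0x4a']
```

In the big-endian layout the 8-bit read would have to use address 3, and the 16-bit read address 2, to obtain the same value.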
'A note on some non-idiomatic usages':
Some authors extend the usage of the word "endianness", and of related terms, to entities such as street addresses, date formats and others. Such usages, which reduce ''endianness'' to a mere synonym of ''ordering of the parts'', are non-standard (e.g., ISO 8601:2004 talks about "descending order year-month-day", not about "big-endian format"), are not widespread, and are generally (other than for date formats) employed in a metaphorical sense.
Examples of storing the value 0x0A0B0C0D in memory
''Note: the prefix 0x indicates hexadecimal notation.''
To further illustrate the above notions this section provides example layouts of a 32-bit number in the most common variants of endianness. There is no general guarantee that a platform will use one of these formats but in practice there are few if any exceptions.
All the examples refer to the storage in memory of the value 0x0A0B0C0D.
Big-endian
★ ''With 8-bit atomic element size and 1-byte (octet) address increment'':
| ''increasing addresses'' → |
| 0x0A | 0x0B | 0x0C | 0x0D |
The most significant byte (''MSB'') value, which is 0x0A in our example, is stored at the memory location with the lowest address; the next byte value in significance, 0x0B, is stored at the following memory location, and so on. This is akin to left-to-right reading order in hexadecimal.
★ ''With 16-bit atomic element size'':
| ''increasing addresses'' → |
| 0x0A0B | 0x0C0D |
The most significant atomic element now stores the value 0x0A0B, followed by 0x0C0D.
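The two big-endian layouts above can be reproduced with Python's standard library. This is a minimal sketch; the variable names are illustrative only.

```python
import struct

value = 0x0A0B0C0D

# 8-bit atomic elements: most significant byte at the lowest address.
octets = value.to_bytes(4, byteorder="big")
print([f"0x{b:02X}" for b in octets])   # ['0x0A', '0x0B', '0x0C', '0x0D']

# 16-bit atomic elements: reinterpret the same four bytes as two
# big-endian unsigned 16-bit units.
halves = struct.unpack(">HH", octets)
print([f"0x{h:04X}" for h in halves])   # ['0x0A0B', '0x0C0D']
```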
Little-endian
★ ''With 8-bit atomic element size and 1-byte (octet) address increment'':
| ''increasing addresses'' → |
| 0x0D | 0x0C | 0x0B | 0x0A |
The least significant byte (''LSB'') value, 0x0D, is at the lowest address. The other bytes follow in increasing order of significance.
★ ''With 16-bit atomic element size'':
| ''increasing addresses'' → |
| 0x0C0D | 0x0A0B |
The least significant 16-bit unit stores the value 0x0C0D, immediately followed by 0x0A0B.
★ ''With byte addresses increasing from right to left'':
The 16-bit atomic element byte ordering may look a little backwards as written above, but this is because little-endian is best written with addressing increasing towards the left. If we write the bytes this way then the ordering makes slightly more sense:
| ← ''increasing addresses'' |
| 0x0A | 0x0B | 0x0C | 0x0D |
The least significant byte (''LSB'') value, 0x0D, is at the lowest address. The other bytes follow in increasing order of significance.
| ← ''increasing addresses'' |
| 0x0A0B | 0x0C0D |
The least significant 16-bit unit stores the value 0x0C0D, immediately followed by 0x0A0B.
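The little-endian layouts can be derived with shifts and masks. The sketch below assumes a hypothetical helper `little_endian_elements` that returns the atomic elements in order of increasing address. Reversing that list gives the right-to-left presentation, in which the elements read like the original hexadecimal number.

```python
def little_endian_elements(value: int, total_bits: int, element_bits: int) -> list:
    """Split `value` into elements, least significant first (lowest address first)."""
    mask = (1 << element_bits) - 1
    return [(value >> shift) & mask for shift in range(0, total_bits, element_bits)]

value = 0x0A0B0C0D

octets = little_endian_elements(value, 32, 8)
print([f"0x{b:02X}" for b in octets])           # ['0x0D', '0x0C', '0x0B', '0x0A']

units = little_endian_elements(value, 32, 16)
print([f"0x{u:04X}" for u in units])            # ['0x0C0D', '0x0A0B']

# Addresses increasing towards the left: the same elements, listed in reverse.
print([f"0x{b:02X}" for b in reversed(octets)])  # ['0x0A', '0x0B', '0x0C', '0x0D']
```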
Middle-endian
Still other architectures, generically called ''middle-endian'' or ''mixed-endian'', may have a more complicated ordering; PDP-11, for instance, stored some 32-bit words, counting from the most significant, as: 2nd byte first, then 1st, then 4th, and finally 3rd.
★ ''storage of a 32-bit word on a PDP-11''
| ''increasing addresses'' → |
| 0x0B | 0x0A | 0x0D | 0x0C |
Note that this can be interpreted as storing the most significant "half" (16 bits) followed by the less significant half (as if big-endian) but with each half stored in little-endian format. This ordering is known as ''PDP-endianness''.
The ARM architecture can also produce this format when writing a 32-bit word to an address 2 bytes from a 32-bit word alignment.
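The PDP-endian layout follows directly from the description above: the two 16-bit halves are placed most significant first, and each half is written in little-endian order. A minimal sketch, with the hypothetical helper name `pdp_endian_bytes`:

```python
def pdp_endian_bytes(value: int) -> bytes:
    """Lay out a 32-bit value as two little-endian 16-bit halves, high half first."""
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return high.to_bytes(2, "little") + low.to_bytes(2, "little")

layout = pdp_endian_bytes(0x0A0B0C0D)
print([f"0x{b:02X}" for b in layout])  # ['0x0B', '0x0A', '0x0D', '0x0C']
```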
Endianness in networking
Networks generally use big-endian order, which is therefore called ''network order'' when information is sent over a network in a common format. The historical reason is that this allowed routing while a telephone number was being composed. The Internet Protocol in particular defines a standard big-endian ''network byte order'', which is used for all numeric values in the packet headers and by many higher-level protocols and file formats designed for use over IP. The Berkeley sockets API defines a set of functions to convert 16- and 32-bit integers to and from network byte order: htonl (host-to-network-long) and htons (host-to-network-short) convert 32-bit and 16-bit values respectively from machine (''host'') to network order, whereas ntohl and ntohs convert from network to host order.
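Python's `socket` module exposes these conversion functions under the same names, which allows a short illustration. In the sketch below the variable names are illustrative. The numeric result of `htonl` depends on the host: on a big-endian machine it returns its argument unchanged, on a little-endian one it returns the byte-swapped value. In both cases the bytes in memory end up in network order.

```python
import socket
import struct

value = 0x0A0B0C0D

# Serialise directly in network (big-endian) order; "!" means network order.
wire = struct.pack("!I", value)
print(wire.hex())  # 0a0b0c0d

# htonl returns an integer whose native in-memory bytes are in network order.
net = socket.htonl(value)
native_bytes = struct.pack("=I", net)   # "=" means native byte order
print(native_bytes == wire)             # True on either kind of host

# ntohl and ntohs undo htonl and htons.
print(socket.ntohl(net) == value)                     # True
print(socket.ntohs(socket.htons(0x1234)) == 0x1234)   # True
```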
While the lowest network protocols may deal with sub-byte formatting, all the layers above them usually consider the ''byte'' (mostly meant as ''octet'') as their atomic unit.
"Bit endianness"
The terms ''bit endianness'' or ''bit-level endianness'' are seldom used when talking about the representation of a stored value, as they are only meaningful for the rare computer architectures which support addressing of individual bits. They are used however to refer to the transmission order of bits over a serial medium. Most often that order is transparently managed by the hardware and is the bit-level analogue of little-endian (low-bit first), although protocols exist which require the opposite ordering (e.g. I²C). In networking, the decision about the order of transmission of bits is made in the very bottom of the data link layer of the OSI model.
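The two possible transmission orders for the bits of a byte can be sketched as follows. The function names are hypothetical, and the sketch models only the ordering, not any particular serial protocol.

```python
def bits_lsb_first(byte: int) -> list:
    """Bits of an 8-bit value in transmission order, low bit first."""
    return [(byte >> i) & 1 for i in range(8)]

def bits_msb_first(byte: int) -> list:
    """Bits of an 8-bit value in transmission order, high bit first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]

# 0x4A is 0b01001010.
print(bits_lsb_first(0x4A))  # [0, 1, 0, 1, 0, 0, 1, 0]
print(bits_msb_first(0x4A))  # [0, 1, 0, 0, 1, 0, 1, 0]
```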
Notes
1. For hardware, the Jargon File also reports the less common expression ''byte sex'' [1]. It is unclear whether this terminology is also used when more than two orderings are possible. Similarly, the manual for the ORCA/M assembler refers to a field indicating the order of the bytes in a number field as NUMSEX, and the Mac OS X operating system refers to "byte sex" in its compiler tools [2].
2. Note that, in these expressions, the term "end" is meant as "extremity", not as "last part"; and that ''big'' and ''little'' say which extremity is written first.
3. ''Gulliver's Travels'' on Wikisource
4. Endian FAQ – includes the paper Internet Engineering Note (IEN) 137: ''On Holy Wars and a Plea for Peace'' [ftp://ftp.rfc-editor.org/in-notes/ien/ien137.txt;type=a ftp mirror] by Danny Cohen (1 April 1980), but adds much more context.
5. ''Cf.'' entries 539 and 704 of the Linguistic Universals Database
External links
★ Endianness in Embedded Systems
★ White Paper: Endianness or Where is Byte 0?
★ Byte Ordering PPC
★ The Layout of Data in Memory
This article provided by Wikipedia.