From 04d079fa613cdb5157013cb25a2e6c4ae4986cfc Mon Sep 17 00:00:00 2001
From: Beman
Date: Fri, 2 Jan 2015 10:46:53 -0500
Subject: [PATCH] Refactor "Choosing the approach" into its own page. It was dominating the main home page, yet is really a topic for study after other material is absorbed.

---
 doc/arithmetic.html        |   7 +-
 doc/buffers.html           |   7 +-
 doc/choosing_approach.html | 305 +++++++++++++++++++++++++++++++++++++
 doc/conversion.html        |   7 +-
 doc/index.html             | 258 +++----------------------------
 5 files changed, 335 insertions(+), 249 deletions(-)
 create mode 100644 doc/choosing_approach.html

diff --git a/doc/arithmetic.html b/doc/arithmetic.html index 848bf3c..ddb9c5c 100644 --- a/doc/arithmetic.html +++ b/doc/arithmetic.html @@ -27,11 +27,12 @@ - + Buffer Types     + Choosing Approach
Boost Home     + Endian Home     Conversion Functions     Arithmetic Types     - Buffer Types
@@ -681,7 +682,7 @@ differs from endian representation size. Vicente Botet and other reviewers suggested supporting floating point types.


Last revised: -05 December, 2014

+02 January, 2015

© Copyright Beman Dawes, 2006-2009, 2013

Distributed under the Boost Software License, Version 1.0. See www.boost.org/ LICENSE_1_0.txt

diff --git a/doc/buffers.html b/doc/buffers.html index e171fbb..54c7cd0 100644 --- a/doc/buffers.html +++ b/doc/buffers.html @@ -27,11 +27,12 @@ - + Buffer Types     + Choosing Approach
Boost Home     + Endian Home     Conversion Functions     Arithmetic Types     - Buffer Types
@@ -617,7 +618,7 @@ any Boost object libraries.


Last revised: -06 December, 2014

+02 January, 2015

© Copyright Beman Dawes, 2006-2009, 2013

Distributed under the Boost Software License, Version 1.0. See www.boost.org/ LICENSE_1_0.txt

diff --git a/doc/choosing_approach.html b/doc/choosing_approach.html new file mode 100644 index 0000000..338aa97 --- /dev/null +++ b/doc/choosing_approach.html @@ -0,0 +1,305 @@ + + + + + + +Choosing Approach + + + + + + + + + + +
+ +Boost logo + Choosing the Approach
+ + + + + +
+ Endian Home     + Conversion Functions     + Arithmetic Types     + Buffer Types     + Choosing Approach
+

+ + + + + + + + + + + + + + +
+ Contents
+Introduction
+Choosing between conversion functions,
+   buffer types, and arithmetic types
+   Characteristics
+      Endianness invariants
+      Conversion explicitness
+      Arithmetic operations
+      Sizes
+      Alignments
+   Design patterns
+      Convert only as needed (i.e. lazy)
+      Convert in anticipation of need
+      Generally +as needed, locally in anticipation
+   Use case examples
+      Porting endian unaware codebase
+      Porting endian aware codebase
+      Reliability and arithmetic-speed
+      Reliability and ease-of-use
+ Headers
+ <boost/endian/conversion.hpp>
+ <boost/endian/buffers.hpp>
+ <boost/endian/arithmetic.hpp>
+ +

Introduction

+ +

Deciding which is the best endianness approach (conversion functions, buffer +types, or arithmetic types) for a particular application involves complex +engineering trade-offs. It is hard to assess those trade-offs without some +understanding of the different interfaces, so you might want to read the +conversion functions, +buffer types, and arithmetic types pages +before diving into this page.

+ +

Choosing between conversion functions, buffer types, +and arithmetic types

+ +

The best approach to endianness for a particular application depends on the interaction between +the application's needs and the characteristics of each of the three approaches.

+ +

Recommendation: If you are new to endianness, uncertain, or don't want to invest +the time to +study +engineering trade-offs, use endian arithmetic types. They are safe, easy +to use, and easy to maintain. Use the + +anticipating need design pattern locally around performance hot spots like lengthy loops, +if needed. 

+ +

Characteristics

+ +

The characteristics that differentiate the three approaches to endianness are the endianness +invariants, conversion explicitness, arithmetic operations, sizes available, and +alignment requirements.

+ +

Endianness invariants

+ +
+ +

Endian conversion functions use objects of the ordinary C++ arithmetic +types like int or unsigned short to hold values. That +breaks the implicit invariant that the C++ language rules apply. The usual +language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can +make it very hard to reason about complex logic flow, and can result in +difficult-to-find bugs.

+ +

Endian buffer and arithmetic types hold values internally as arrays of +characters with an invariant that the endianness of the array never changes. +That makes these types easier to use and programs easier to maintain.
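
The invariant described above can be illustrated with a minimal sketch. The class name big_u32_buffer and its members are hypothetical stand-ins written for this page, not the library's own class; the only point is that the stored bytes are always big endian, and native values appear only through explicit calls.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for an endian buffer type (not the library's class).
// Invariant: bytes_ always holds the value in big-endian order.
class big_u32_buffer {
public:
    big_u32_buffer() = default;
    explicit big_u32_buffer(std::uint32_t native) { store(native); }

    // Explicit conversion: big-endian storage -> native value.
    std::uint32_t value() const {
        return (std::uint32_t{bytes_[0]} << 24) |
               (std::uint32_t{bytes_[1]} << 16) |
               (std::uint32_t{bytes_[2]} << 8)  |
                std::uint32_t{bytes_[3]};
    }

    // Explicit conversion: native value -> big-endian storage.
    void store(std::uint32_t native) {
        bytes_[0] = static_cast<unsigned char>(native >> 24);
        bytes_[1] = static_cast<unsigned char>(native >> 16);
        bytes_[2] = static_cast<unsigned char>(native >> 8);
        bytes_[3] = static_cast<unsigned char>(native);
    }

    // Raw bytes, suitable for binary I/O.
    const unsigned char* data() const { return bytes_.data(); }

private:
    std::array<unsigned char, 4> bytes_{};  // always big endian
};
```

Because the shifts operate on values rather than on memory, the same code produces the same byte layout on big and little endian platforms.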

+ +
+ +

Conversion explicitness

+ +
+ +

Endian conversion functions and buffer types never perform +implicit conversions. This gives users explicit control of when conversion +occurs, and may help avoid unnecessary conversions.

+ +

Endian arithmetic types perform conversion implicitly. That makes +these types very easy to use, but can result in unnecessary conversions. Failure +to hoist conversions out of inner loops can bring a performance penalty.
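
As a rough illustration of hoisting, the sketch below uses a hypothetical big_u32 type with implicit conversions. It is an assumption made for this example, not the library's arithmetic type. sum_in_place converts on every iteration, while sum_hoisted converts once before the loop and once after it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an endian arithmetic type: big-endian storage
// with implicit conversion to and from the native value.
struct big_u32 {
    unsigned char bytes[4] = {0, 0, 0, 0};

    big_u32() = default;
    big_u32(std::uint32_t v) {                  // implicit: native -> big endian
        bytes[0] = static_cast<unsigned char>(v >> 24);
        bytes[1] = static_cast<unsigned char>(v >> 16);
        bytes[2] = static_cast<unsigned char>(v >> 8);
        bytes[3] = static_cast<unsigned char>(v);
    }
    operator std::uint32_t() const {            // implicit: big endian -> native
        return (std::uint32_t{bytes[0]} << 24) | (std::uint32_t{bytes[1]} << 16) |
               (std::uint32_t{bytes[2]} << 8)  |  std::uint32_t{bytes[3]};
    }
    big_u32& operator+=(std::uint32_t rhs) {    // two conversions per call
        return *this = big_u32(std::uint32_t(*this) + rhs);
    }
};

// Converts to native and back on every iteration.
inline void sum_in_place(big_u32& total, const std::vector<std::uint32_t>& xs) {
    for (std::uint32_t x : xs) total += x;
}

// Conversions hoisted out of the loop: one in, one out.
inline void sum_hoisted(big_u32& total, const std::vector<std::uint32_t>& xs) {
    std::uint32_t native = total;
    for (std::uint32_t x : xs) native += x;
    total = native;
}
```

Both functions leave the same bytes in total; only the number of conversions differs.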

+ +
+ +

Arithmetic operations

+ +
+ +

Endian conversion functions do not supply arithmetic +operations, but this is not a concern since this approach uses ordinary C++ +arithmetic types to hold values.

+ +

Endian buffer types do not supply arithmetic operations. Although this +approach avoids unnecessary conversions, it can result in the introduction of +additional variables and confuse maintenance programmers.

+ +

Endian arithmetic types do supply arithmetic operations. They +are very easy to use if lots of arithmetic is involved.

+ +
+ +

Sizes

+ +
+ +

Endianness conversion functions only support 1, 2, 4, and 8 byte +integers. That's sufficient for many applications.

+ +

Endian buffer and arithmetic types support 1, 2, 3, 4, 5, 6, 7, and 8 +byte integers. For an application where memory use or I/O speed is the limiting +factor, using sizes tailored to application needs can be useful.
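
To make the odd sizes concrete, here is a small sketch of what a 3-byte big endian integer amounts to in memory. The helper names load_big_u24 and store_big_u24 are hypothetical, chosen for this example; the buffer and arithmetic types wrap this kind of byte handling behind a type.

```cpp
#include <cstdint>

// Hypothetical helpers: a 24-bit unsigned value stored in exactly 3 bytes,
// most significant byte first.
inline std::uint32_t load_big_u24(const unsigned char* p) {
    return (std::uint32_t{p[0]} << 16) |
           (std::uint32_t{p[1]} << 8)  |
            std::uint32_t{p[2]};
}

inline void store_big_u24(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>(v >> 16);  // bits above 24 are discarded
    p[1] = static_cast<unsigned char>(v >> 8);
    p[2] = static_cast<unsigned char>(v);
}
```

A record holding many such fields uses 3 bytes per value instead of 4, which is where the memory or I/O saving comes from.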

+ +
+ +

Alignments

+ +
+ +

Endianness conversion functions only support aligned integer and +floating-point types. That's sufficient for most applications.

+ +

Endian buffer and arithmetic types support both aligned and unaligned +integer and floating-point types. Unaligned types are rarely needed, but when +needed they are often very useful and workarounds are painful. For example,

+ +
+

Non-portable code like this:

+

struct S {
+   uint16_t a;  // big endian
+   uint32_t b;  // big endian
+ } __attribute__ ((packed));
+

+

Can be replaced with portable code like this:

+
+

struct S {
+   big_uint16_ut a;
+   big_uint32_ut b;
+ };
+

+
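
The reason the second form needs no packing attribute can be sketched with plain byte arrays. The types u16_bytes, u32_bytes, and packed_layout below are hypothetical stand-ins for the unaligned endian types, and the layout checks hold on mainstream ABIs but are not guaranteed by the C++ standard.

```cpp
#include <cstddef>
#include <cstdint>

// Native integer members: the compiler typically inserts padding before b.
struct native_layout {
    std::uint16_t a;
    std::uint32_t b;
};

// Hypothetical stand-ins for unaligned endian types: plain byte arrays.
struct u16_bytes { unsigned char b[2]; };
struct u32_bytes { unsigned char b[4]; };

struct packed_layout {
    u16_bytes a;
    u32_bytes b;
};

// Assumption: true on mainstream ABIs, so the checks double as a portability guard.
static_assert(sizeof(packed_layout) == 6, "unexpected padding");
static_assert(offsetof(packed_layout, b) == 2, "unexpected member offset");
```

On a typical platform sizeof(native_layout) is 8, so it cannot be read or written directly as a 6-byte external record.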
+ +
+ +

Design patterns

+ +

Applications often traffic in endian data as records or packets containing +multiple endian data elements. For simplicity, we will just call them records.

+ +

If desired endianness differs from native endianness, a conversion has to be +performed. When should that conversion occur? Three design patterns have +evolved.

+ +

Convert only as needed (i.e. lazy)

+ +

This pattern defers conversion to the point in the code where the data +element is actually used.

+ +

This pattern is appropriate when which endian element is actually used varies +greatly according to record content or other circumstances.
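
A minimal sketch of the lazy pattern follows. The record layout and the names wire_message, length_of, sequence_of, and payload_bytes are assumptions made for this example. Each accessor converts one field at the point of use, and a field that is never consulted is never converted.

```cpp
#include <cstdint>

// Hypothetical record layout; multi-byte fields are stored big endian.
struct wire_message {
    unsigned char kind;         // 0 = heartbeat, 1 = data
    unsigned char length[2];    // meaningful only for data messages
    unsigned char sequence[4];
};

// Each accessor converts exactly one field, at the point of use.
inline std::uint16_t length_of(const wire_message& m) {
    return static_cast<std::uint16_t>((m.length[0] << 8) | m.length[1]);
}

inline std::uint32_t sequence_of(const wire_message& m) {
    return (std::uint32_t{m.sequence[0]} << 24) | (std::uint32_t{m.sequence[1]} << 16) |
           (std::uint32_t{m.sequence[2]} << 8)  |  std::uint32_t{m.sequence[3]};
}

// Lazy use: the length field is converted only for data messages.
inline std::uint32_t payload_bytes(const wire_message& m) {
    return m.kind == 1 ? length_of(m) : 0u;
}
```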

+ +

Convert in anticipation of need

+ +

This pattern performs conversion to native endianness in anticipation of use, +such as immediately after reading records. If needed, conversion to the output +endianness is performed after all possible needs have passed, such as just +before writing records.

+ +

One implementation of this pattern is to create a proxy record with +endianness converted to native in a read function, and expose only that proxy to +the rest of the implementation. A write function, if needed, handles the +conversion from native to the desired output endianness.
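
The proxy record implementation might look roughly like this. The names wire_record, native_record, to_native, and to_wire are hypothetical, and actual file or socket I/O is left out so that only the conversion boundary is shown.

```cpp
#include <cstdint>

// Hypothetical external layout: fields stored big endian, exactly as read.
struct wire_record {
    unsigned char id[4];
    unsigned char count[2];
};

// Proxy exposed to the rest of the program: ordinary native integers.
struct native_record {
    std::uint32_t id;
    std::uint16_t count;
};

// Called by the read function: convert every field once, up front.
inline native_record to_native(const wire_record& w) {
    native_record n;
    n.id = (std::uint32_t{w.id[0]} << 24) | (std::uint32_t{w.id[1]} << 16) |
           (std::uint32_t{w.id[2]} << 8)  |  std::uint32_t{w.id[3]};
    n.count = static_cast<std::uint16_t>((w.count[0] << 8) | w.count[1]);
    return n;
}

// Called by the write function: convert back just before output.
inline wire_record to_wire(const native_record& n) {
    wire_record w;
    w.id[0] = static_cast<unsigned char>(n.id >> 24);
    w.id[1] = static_cast<unsigned char>(n.id >> 16);
    w.id[2] = static_cast<unsigned char>(n.id >> 8);
    w.id[3] = static_cast<unsigned char>(n.id);
    w.count[0] = static_cast<unsigned char>(n.count >> 8);
    w.count[1] = static_cast<unsigned char>(n.count);
    return w;
}
```

Code between the read and the write works only with native_record, so no endianness reasoning is needed there.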

+ +

This pattern is appropriate when all endian elements in a record are +typically used regardless of record content or other circumstances.

+ +

Convert +generally only as needed, but locally in anticipation of need

+ +

This pattern in general defers conversion but for specific local needs does +anticipatory conversion.

+ +

This pattern is particularly appropriate when coupled with the endian buffer +or arithmetic types.

+ +

Use case examples

+ +

Porting endian unaware codebase

+ +

An existing codebase runs on big endian systems. It does not +currently deal with endianness. The codebase needs to be modified so it can run +on little endian systems under various operating systems. To ease the +transition and protect the value of existing files, external data will continue to +be maintained as big endian.

+ +

The endian +arithmetic approach is recommended to meet these needs. A relatively small +number of header files dealing with binary I/O layouts need to change types. For +example,  +short or int16_t would change to big_int16_t. No +changes are required for .cpp files.

+ +

Porting endian aware codebase

+ +

An existing codebase runs on little-endian Linux systems. It already +deals with endianness via +Linux provided +functions. Because of a business merger, the codebase has to be quickly +modified for Windows and possibly other operating systems, while still +supporting Linux. The codebase is reliable and the programmers are all +well-aware of endian issues.

+ +

These factors all argue for an endian conversion +approach that just mechanically changes the calls to htobe32, +etc. to boost::endian::native_to_big, etc. and replaces <endian.h> +with <boost/endian/conversion.hpp>.

+ +

Reliability and arithmetic-speed

+ +

A new, complex, multi-threaded application is to be developed that must run +on little endian machines, but do big endian network I/O. The developers believe +computational speed for endian variables is critical but have seen numerous bugs +result from inability to reason about endian conversion state. They are also +worried that future maintenance changes could inadvertently introduce a lot of +slow conversions if full-blown endian arithmetic types are used.

+ +

The endian buffers approach is made-to-order for +this use case.

+ +

Reliability and ease-of-use

+ +

A new, complex, multi-threaded application is to be developed that must run +on little endian machines, but do big endian network I/O. The developers believe +computational speed for endian variables is not critical but have seen +numerous bugs result from inability to reason about endian conversion state. +They are also concerned about ease-of-use both during development and long-term +maintenance.

+ +

Removing concern about conversion speed and adding concern about ease-of-use +tips the balance strongly in favor of the endian +arithmetic approach.

+ +
+

Last revised: +02 January, 2015

+

© Copyright Beman Dawes, 2011, 2013, 2014

+

Distributed under the Boost Software License, Version 1.0. See +www.boost.org/ LICENSE_1_0.txt

+ +

 

+ + + + \ No newline at end of file diff --git a/doc/conversion.html b/doc/conversion.html index 5ea86b5..8a5e81b 100644 --- a/doc/conversion.html +++ b/doc/conversion.html @@ -24,11 +24,12 @@ - + Buffer Types     + Choosing Approach
Boost Home     + Endian Home     Conversion Functions     Arithmetic Types     - Buffer Types
@@ -376,7 +377,7 @@ Pierre Talbot provided the int8_t endian_reverse() and templated endian_reverse_inplace() implementations.


Last revised: -16 December, 2014

+02 January, 2015

© Copyright Beman Dawes, 2011, 2013

Distributed under the Boost Software License, Version 1.0. See www.boost.org/ LICENSE_1_0.txt

diff --git a/doc/index.html b/doc/index.html index 4914b12..539b816 100644 --- a/doc/index.html +++ b/doc/index.html @@ -22,16 +22,19 @@ - +
- + Buffer Types     + Choosing Approach
Boost Home     + Endian Home     Conversion Functions     Arithmetic Types     - Buffer Types

+
@@ -44,22 +47,6 @@ Introduction to the Boost.Endian library
Choosing between conversion functions,
  buffer types, and arithmetic types
-   Characteristics
-      Endianness invariants
-      Conversion explicitness
-      Arithmetic operations
-      Sizes
-      Alignments
-   Design patterns
-      Convert only as needed (i.e. lazy)
-      Convert in anticipation of need
-      Generally -as needed, locally in anticipation
-   Use case examples
-      Porting endian unaware codebase
-      Porting endian aware codebase
-      Reliability and arithmetic-speed
-      Reliability and ease-of-use
Built-in support for Intrinsics
Performance
   Timings for Example 2
@@ -88,7 +75,7 @@ as needed, locally in anticipation
endianness of integers, floating point numbers, and user-defined types.

    -
  • Three approaches to dealing with endianness are supported. Each has a +
  • Three approaches to endianness are supported. Each has a long history of successful use, and each approach has use cases where it is preferred over the other approaches.
     
  • @@ -111,13 +98,6 @@ floating point numbers, and user-defined types.

-

 

- -

 

- -

 

-

 

-

Introduction to endianness

Consider the following code:

@@ -196,223 +176,21 @@ integers. The types may be aligned.

Choosing between conversion functions, buffer types, and arithmetic types

-

The best approach to endianness for a particular application depends on the interaction between -the application's needs and the characteristics of each of the three (conversion -functions, buffer types, and arithmetic types) approaches.

- -

Recommendation: If you are new to endianness, uncertain, or don't want to invest -the time to -study -engineering trade-offs, use endian arithmetic types. They are safe, easy -to use, and easy to maintain. Use the - -anticipating need design pattern locally around performance hot spots like lengthy loops, -if needed. 

- -

Characteristics

- -

The characteristics that differentiate the three approaches to endianness are the endianness -invariants, conversion explicitness, arithmetic operations, sizes available, and -alignment requirements.

- -

Endianness invariants

- -
- -

Endian conversion functions use objects of the ordinary C++ arithmetic -types like int or unsigned short to hold values. That -breaks the implicit invariant that the C++ language rules apply. The usual -language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can -make it very hard to reason about complex logic flow, and result in difficult to -find bugs.

- -

Endian buffer and arithmetic types hold values internally as arrays of -characters with an invariant that the endianness of the array never changes. -That makes these types easier to use and programs easier to maintain.

- -
- -

Conversion explicitness

- -
- -

Endian conversion functions and buffer types never perform -implicit conversions. This gives users explicit control of when conversion -occurs, and may help avoid unnecessary conversions.

- -

Endian arithmetic types perform conversion implicitly. That makes -these types very easy to use, but can result in unnecessary conversions. Failure -to hoist conversions out of inner loops can bring a performance penalty.

- -
- -

Arithmetic operations

- -
- -

Endian conversion functions do not supply arithmetic -operations, but this is not a concern since this approach uses ordinary C++ -arithmetic types to hold values.

- -

Endian buffer types do not supply arithmetic operations. Although this -approach avoids unnecessary conversions, it can result in the introduction of -additional variables and confuse maintenance programmers.

- -

Endian arithmetic types do supply arithmetic operations. They -are very easy to use if lots of arithmetic is involved.

- -
- -

Sizes

- -
- -

Endianness conversion functions only support 1, 2, 4, and 8 byte -integers. That's sufficient for many applications.

- -

Endian buffer and arithmetic types support 1, 2, 3, 4, 5, 6, 7, and 8 -byte integers. For an application where memory use or I/O speed is the limiting -factor, using sizes tailored to application needs can be useful.

- -
- -

Alignments

- -
- -

Endianness conversion functions only support aligned integer and -floating-point types. That's sufficient for most applications.

- -

Endian buffer and arithmetic types support both aligned and unaligned -integer and floating-point types. Unaligned types are rarely needed, but when -needed they are often very useful and workarounds are painful. For example,

- -
-

Non-portable code like this:

-

struct S {
-   uint16_t a;  // big endian
-   uint32_t b;  // big endian
- } __attribute__ ((packed));
-

-

Can be replaced with portable code like this:

-
-

struct S {
-   big_uint16_ut a;
-   big_uint32_ut b;
- };
-

-
- -
- -

Design patterns

- -

Applications often traffic in endian data as records or packets containing -multiple endian data elements. For simplicity, we will just call them records.

- -

If desired endianness differs from native endianness, a conversion has to be -performed. When should that conversion occur? Three design patterns have -evolved.

- -

Convert only as needed (i.e. lazy)

- -

This pattern defers conversion to the point in the code where the data -element is actually used.

- -

This pattern is appropriate when which endian element is actually used varies -greatly according to record content or other circumstances

- -

Convert in anticipation of need

- -

This pattern performs conversion to native endianness in anticipation of use, -such as immediately after reading records. If needed, conversion to the output -endianness is performed after all possible needs have passed, such as just -before writing records.

- -

One implementation of this pattern is to create a proxy record with -endianness converted to native in a read function, and expose only that proxy to -the rest of the implementation. If a write function, if needed, handles the -conversion from native to the desired output endianness.

- -

This pattern is appropriate when all endian elements in a record are -typically used regardless of record content or other circumstances

- -

Convert -generally only as needed, but locally in anticipation of need

- -

This pattern in general defers conversion but for specific local needs does -anticipatory conversion.

- -

This pattern is particularly appropriate when coupled with the endian buffer -or arithmetic types.

- -

Use case examples

- -

Porting endian unaware codebase

- -

An existing codebase runs on big endian systems. It does not -currently deal with endianness. The codebase needs to be modified so it can run -on  little endian systems under various operating systems. To ease -transition and protect value of existing files, external data will continue to -be maintained as big endian.

- -

The endian -arithmetic approach is recommended to meet these needs. A relatively small -number of header files dealing with binary I/O layouts need to change types. For -example,  -short or int16_t would change to big_int16_t. No -changes are required for .cpp files.

- -

Porting endian aware codebase

- -

An existing codebase runs on little-endian Linux systems. It already -deals with endianness via -Linux provided -functions. Because of a business merger, the codebase has to be quickly -modified for Windows and possibly other operating systems, while still -supporting Linux. The codebase is reliable and the programmers are all -well-aware of endian issues.

- -

These factors all argue for an endian conversion -approach that just mechanically changes the calls to htobe32, -etc. to boost::endian::native_to_big, etc. and replaces <endian.h> -with <boost/endian/conversion.hpp>.

- -

Reliability and arithmetic-speed

- -

A new, complex, multi-threaded application is to be developed that must run -on little endian machines, but do big endian network I/O. The developers believe -computational speed for endian variable is critical but have seen numerous bugs -result from inability to reason about endian conversion state. They are also -worried that future maintenance changes could inadvertently introduce a lot of -slow conversions if full-blown endian arithmetic types are used.

- -

The endian buffers approach is made-to-order for -this use case.

- -

Reliability and ease-of-use

- -

A new, complex, multi-threaded application is to be developed that must run -on little endian machines, but do big endian network I/O. The developers believe -computational speed for endian variables is not critical but have seen -numerous bugs result from inability to reason about endian conversion state. -They are also concerned about ease-of-use both during development and long-term -maintenance.

- -

Removing concern about conversion speed and adding concern about ease-of-use -tips the balance strongly in favor the endian -arithmetic approach.

+

This section has been moved to its own +Choosing the Approach page.

Built-in support for Intrinsics

-

Supply compilers, including GCC, Clang, and Visual C++, supply built-in support for byte swapping intrinsics. -The library uses these intrinsics when available since they may result in smaller and faster generated code, particularly for release +

Most compilers, including GCC, Clang, and Visual C++, supply built-in support for byte swapping intrinsics. +The Endian library uses these intrinsics when available since they may result in smaller and faster generated code, particularly for release builds.

-

Defining BOOST_ENDIAN_NO_INTRINSICS will suppress use +

Defining the macro BOOST_ENDIAN_NO_INTRINSICS will suppress use of the intrinsics. Useful when intrinsic headers such as -byteswap.h are not being found on your platform.

+byteswap.h are not being found by your compiler, perhaps because it +is an older release or has very limited supporting libraries.

The macro BOOST_ENDIAN_INTRINSIC_MSG is defined as either "no byte swap intrinsics" or a string describing the -particular set of intrinsics being used.

+particular set of intrinsics being used. Useful for eliminating missing +intrinsics as a source of performance issues.

Performance

@@ -762,7 +540,7 @@ stores, multiple instructions are required on common platforms.

I/O formats?

Using the unaligned integer types to save internal or external -memory space is a minor secondary use case.

+memory space is a minor secondary use.

Why bother with binary I/O? Why not just use C++ Standard Library stream inserters and extractors?

@@ -876,7 +654,7 @@ Blechmann, Tim Moore, tymofey, Tomas Puverle, Vincente Botet, Yuval Ronen and Vitaly Budovski,.


Last revised: -31 December, 2014

+02 January, 2015

© Copyright Beman Dawes, 2011, 2013

Distributed under the Boost Software License, Version 1.0. See www.boost.org/ LICENSE_1_0.txt