Integer Patch
The Integer Patch adds little-endian support and output base awareness to the Integer class. The endian support enables the Integer class to construct Integers
from strings and byte arrays in little-endian format. It's useful for algorithms like Poly1305 and MS CAPI interop, where many parameters are provided in little-endian format.
The base support allows Integers
to honor std::ios_base::showbase
and std::ios_base::noshowbase
flags. C++ streams do not set std::ios_base::showbase
by default, so with the base support enabled the suffix is omitted unless you request it. If std::noshowbase
is in effect, then the Integer
class will not emit the suffix indicating the base. The suffixes are b, o, h, or . (the last is for decimal).
An optional lookup table was also added for parsing ASCII strings. The lookup table avoids 4 if/then/else's and 6 compares. It should be useful on processors like ARM in thumb mode because it avoids the branching (and subsequent stalls) at the expense of 256 bytes.
There is a test source file named integer-test.c++
that allows testing of the little-endian conversion routines.
Note well: this is not part of the Crypto++ library. You must download and install the patch below.
Patch
The changes the patch makes can be found in integer.diff
. The essence of the patch is to add an additional ByteOrder
parameter to select Integer
constructors, and then parse the incoming array in little endian format if LITTLE_ENDIAN_ORDER
is specified. A default ByteOrder
parameter of BIG_ENDIAN_ORDER
is used, so existing code works as expected. The patch also adds awareness for ostream
's std::showbase
and std::noshowbase
.
The files that changed are:
- config.h
- integer.h integer.cpp
- integer-test.c++
Little Endian Integers
For little endian support, the constructors that changed (with their new signatures) are:
explicit Integer(const char *str, ByteOrder order = BIG_ENDIAN_ORDER)
explicit Integer(const wchar_t *str, ByteOrder order = BIG_ENDIAN_ORDER)
Integer(const byte *encodedInteger, size_t byteCount, Signedness s=UNSIGNED, ByteOrder o=BIG_ENDIAN_ORDER)
Integer(BufferedTransformation &bt, size_t byteCount, Signedness s=UNSIGNED, ByteOrder o=BIG_ENDIAN_ORDER)
The little-endian functionality was not extended to the various Decode
(and Encode
) functions.
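To illustrate how a defaulted byte-order parameter keeps existing callers working, here is a minimal, self-contained sketch. It is not the patch's code: ByteOrderSketch, its enumerators and DecodeUnsigned are hypothetical names invented for this example, and the sketch only handles values that fit in 64 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's ByteOrder type (assumption, not the real enum).
enum ByteOrderSketch { LITTLE_ENDIAN_SKETCH, BIG_ENDIAN_SKETCH };

// Decode count bytes (count must be at most 8) into an unsigned value.
// The defaulted parameter means callers that never pass an order still
// get big-endian decoding, just as they did before the parameter existed.
std::uint64_t DecodeUnsigned(const unsigned char* in, std::size_t count,
                             ByteOrderSketch order = BIG_ENDIAN_SKETCH)
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        // Always consume bytes from most significant to least significant.
        // Big endian: that is front to back. Little endian: back to front.
        const std::size_t idx = (order == BIG_ENDIAN_SKETCH) ? i : count - 1 - i;
        value = (value << 8) | in[idx];
    }
    return value;
}
```

With the two bytes 0x01, 0x23 as input, the default call yields 0x0123, while passing LITTLE_ENDIAN_SKETCH yields 0x2301.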
For binary, octal and decimal, the incoming string is simply parsed in reverse. So 1234567890 (base 10, little endian) converts to 0987654321 (base 10, big endian). The tricky part of little endian support is handling an odd number of hexadecimal digits or nibbles (a nibble is a 4-bit chunk). In an ideal world, we would always encounter digits or nibbles in pairs, and there would never be a lone digit or nibble, because each octet takes two of them.
For example, 0x1FF (base 16, big endian) is three nibbles. In big endian, they are the two octets 0x01 and 0xFF. To ensure consistent results with 0xFF1 (base 16, little endian), an odd nibble is shifted down due to the missing nibble. That is, little endian 0xFF1 is interpreted as 0xFF and 0x01, and not as 0xFF and 0x10. If little endian 0xFF1 were interpreted as 0xFF and 0x10, then that would break the most basic case of 0x1. That is, something that should intuitively be 1 (little endian 0x1) would be interpreted as 10 in base 16, or 16 in base 10.
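The odd-nibble rule can be sketched in a few lines. This is an illustration under stated assumptions, not the patch's parser: HexValue and ParseHexLE are hypothetical helpers, the input is assumed to contain only valid hexadecimal digits with no 0x prefix, and the result is limited to 64 bits (at most 16 digits).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Value of one hexadecimal digit. Assumes ch is a valid hex digit.
unsigned int HexValue(char ch)
{
    if (ch >= '0' && ch <= '9') return static_cast<unsigned int>(ch - '0');
    if (ch >= 'a' && ch <= 'f') return static_cast<unsigned int>(ch - 'a' + 10);
    return static_cast<unsigned int>(ch - 'A' + 10);
}

// Parse a little-endian hex string: the first octet in the string is the
// least significant octet of the result.
std::uint64_t ParseHexLE(const std::string& str)
{
    std::uint64_t value = 0;
    unsigned int shift = 0;
    for (std::size_t i = 0; i < str.size() && shift < 64; i += 2, shift += 8)
    {
        unsigned int octet = HexValue(str[i]);
        if (i + 1 < str.size())
            octet = (octet << 4) | HexValue(str[i + 1]);  // full pair of nibbles
        // Otherwise the lone trailing nibble stays in the low half of the
        // octet, so "1" means 0x01 and not 0x10.
        value |= static_cast<std::uint64_t>(octet) << shift;
    }
    return value;
}
```

Here ParseHexLE("FF1") gives 0x1FF and ParseHexLE("1") gives 1, matching the interpretation described above.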
std::showbase and Suffixes
By default, Crypto++ always applies a base suffix to its output and there is no way to control it. If you want Crypto++ Integers
to honor std::showbase
and std::noshowbase
, then uncomment the define CRYPTOPP_USE_STD_SHOWBASE
in config.h
. The define is already present, and it just needs to be uncommented.
C++ I/O streams don't use std::showbase
by default, so the suffixes that Crypto++ normally applies will be suppressed without further action. If you want to show the suffixes, then enable std::showbase
by doing something similar to the following:
    Integer n(32 + 15);

    cout.setf(std::ios::showbase);
    cout << std::oct << n << endl;
    cout << std::dec << n << endl;
    cout << std::hex << n << endl;

    cout.unsetf(std::ios::showbase);
    cout << std::oct << n << endl;
    cout << std::dec << n << endl;
    cout << std::hex << n << endl;

    cout << std::showbase << endl;
    cout << std::oct << n << endl;
    cout << std::dec << n << endl;
    cout << std::hex << n << endl;

    cout << std::noshowbase << endl;
    cout << std::oct << n << endl;
    cout << std::dec << n << endl;
    cout << std::hex << n << endl;
It will produce output similar to:
    $ ./integer-test.exe
    57o
    47.
    2fh
    57
    47
    2f

    57o
    47.
    2fh

    57
    47
    2f
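For readers curious how a class can honor the stream flags, the following is a minimal sketch using only the standard library. It is not the patch's implementation: SuffixedValue and Format are hypothetical names, and the toy type wraps an unsigned long rather than a Crypto++ Integer. It reads the stream's basefield to pick the radix and suffix, and emits the suffix only when std::ios_base::showbase is set.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical toy type standing in for a big integer.
struct SuffixedValue { unsigned long value; };

std::ostream& operator<<(std::ostream& os, const SuffixedValue& v)
{
    const std::ios_base::fmtflags flags = os.flags();
    const std::ios_base::fmtflags base = flags & std::ios_base::basefield;

    // Pick radix and suffix from the stream's base setting.
    unsigned int radix = 10;
    char suffix = '.';
    if (base == std::ios_base::hex) { radix = 16; suffix = 'h'; }
    else if (base == std::ios_base::oct) { radix = 8; suffix = 'o'; }

    // Build the digit string, least significant digit first.
    std::string digits;
    unsigned long n = v.value;
    do {
        digits.insert(digits.begin(), "0123456789abcdef"[n % radix]);
        n /= radix;
    } while (n != 0);

    os << digits;
    if ((flags & std::ios_base::showbase) != 0)
        os << suffix;  // suffix only when the caller asked for it
    return os;
}

// Helper for the example: format a value with a given base and showbase setting.
std::string Format(unsigned long value, std::ios_base::fmtflags baseFlag, bool showBase)
{
    std::ostringstream oss;
    oss.setf(baseFlag, std::ios_base::basefield);
    if (showBase) oss.setf(std::ios_base::showbase);
    else oss.unsetf(std::ios_base::showbase);
    oss << SuffixedValue{value};
    return oss.str();
}
```

For the value 47, Format with std::ios_base::hex and showBase set to true returns "2fh", and with showBase set to false it returns "2f".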
ASCII Lookup Table
An optional lookup table was added for parsing ASCII strings. If you want to use the lookup table, then uncomment the define CRYPTOPP_USE_ASCII_CHAR_VALUE_LOOKUP_TABLE
in config.h
. The define is already present, and it just needs to be uncommented. The lookup table avoids 4 if/then/else's and 6 compares. It should be useful on processors like ARM in thumb mode because it avoids the branching (and subsequent stalls) at the expense of 256 bytes.
The unused values in the table are set to 46, which is the period ('.') character. It's just a filler that can be seen under a debugger. Any value greater than 16 could have been used.
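A lookup table of this kind can be sketched as follows. This is an assumption-laden illustration, not the table from the patch: BuildDigitTable and DigitValue are hypothetical names, and the table is filled at first use rather than written out as a 256-entry literal. Unused slots hold the filler 46, so a caller can reject any result above 15.

```cpp
#include <array>
#include <cstddef>

// Filler for slots that are not hex digits (46 is '.' in ASCII).
const unsigned char FILLER = 46;

// Build a 256-entry table mapping a character code to its digit value.
std::array<unsigned char, 256> BuildDigitTable()
{
    std::array<unsigned char, 256> table;
    table.fill(FILLER);

    const char digits[] = "0123456789";
    const char upper[] = "ABCDEF";
    const char lower[] = "abcdef";

    for (std::size_t i = 0; i < 10; ++i)
        table[static_cast<unsigned char>(digits[i])] = static_cast<unsigned char>(i);
    for (std::size_t i = 0; i < 6; ++i)
    {
        table[static_cast<unsigned char>(upper[i])] = static_cast<unsigned char>(10 + i);
        table[static_cast<unsigned char>(lower[i])] = static_cast<unsigned char>(10 + i);
    }
    return table;
}

// One indexed load replaces the chain of range checks.
// Returns 0..15 for a hex digit and FILLER for anything else.
unsigned int DigitValue(char ch)
{
    static const std::array<unsigned char, 256> table = BuildDigitTable();
    return table[static_cast<unsigned char>(ch)];
}
```

Because the sketch fills the table from character literals, it does not depend on the letters being contiguous in the execution character set.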
Do not use the table for a system that uses EBCDIC as the execution or runtime encoding. EBCDIC character encodings are different from ASCII, and the table won't produce expected results. For example, in ASCII A is 65 (decimal), while in EBCDIC A is 193 (decimal). And EBCDIC lower case letters precede capital letters, while in ASCII capital letters precede lower case letters.
In a morbid sort of humorous way, the C and C++ standards don't guarantee the letters A through F or a through f are contiguous (only the characters 0 through 9). So tests like the following could fail on obscure systems. If you find such a system, then please report it :)
    int digit;
    char ch = str[idx];
    if(ch >= 'A' && ch <= 'F')
        digit = ch - 'A' + 10;
    ...
Testing
To test the patch, drop integer-test.c++
in the cryptopp
directory, and then compile it:
    $ g++ -DDEBUG=1 -g3 -Os -Wall -Wextra \
        -I. integer-test.c++ ./libcryptopp.a -o integer-test.exe
A typical output line is simply the name of the test (for example, H-1 or Hex-1), followed by the big endian value(s) and little endian value(s) of the constructed integer. All the values on a line should be the same. (See the sample output below for the hexadecimal tests.)
Issue the following to determine if there are failures. The grep -B 1
prints the failed message, the test name and values which failed. A failure would look similar to below:
    $ ./integer-test.exe | grep -B 1 FAILED
    ...
    H-XXX: 01, 01, 10, 10
    FAILED
    $ ./integer-test.exe
    H-1a: 0, 0, 0, 0
    H-2a: 0, 0, 0, 0
    H-3a: 0, 0, 0, 0
    H-1b: 0, 0, 0, 0
    H-2b: 0, 0, 0, 0
    H-3b: 0, 0, 0, 0
    H-4a: 1, 1, 1, 1
    H-5a: 1, 1, 1, 1
    H-4b: -1, -1, -1, -1
    H-5b: -1, -1, -1, -1
    H-6a: 1, 1, 1, 1
    H-7a: 1, 1, 1, 1
    H-8a: 1, 1, 1, 1
    H-9a: 1, 1, 1, 1
    H-6b: -1, -1, -1, -1
    H-7b: -1, -1, -1, -1
    H-8b: -1, -1, -1, -1
    H-9b: -1, -1, -1, -1
    H-10: 1, 1, 1, 1
    H-11: 123, 123, 123, 123
    H-12: 12345, 12345, 12345, 12345
    H-13: 1234567, 1234567, 1234567, 1234567
    H-14: 123456789, 123456789, 123456789, 123456789
    H-15: 123456789ab, 123456789ab, 123456789ab, 123456789ab
    H-16: 123456789ab, 123456789ab, 123456789ab, 123456789ab
    H-17: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
    H-18: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
    H-19: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
    H-20: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
    H-21: 1, 1, 1, 1
    H-22: 123, 123, 123, 123
    H-23: 12345, 12345, 12345, 12345
    H-24: 1234567, 1234567, 1234567, 1234567
    H-25: 123456789, 123456789, 123456789, 123456789
    H-26: 123456789ab, 123456789ab, 123456789ab, 123456789ab
    H-27: 123456789ab, 123456789ab, 123456789ab, 123456789ab
    H-28: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
    H-29: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
    H-30: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
    H-31: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
    H-32: 1, 1, 1, 1
    H-33: 2301, 2301, 2301, 2301
    H-34: 452301, 452301, 452301, 452301
    H-35: 67452301, 67452301, 67452301, 67452301
    H-36: 8967452301, 8967452301, 8967452301, 8967452301
    H-37: ab8967452301, ab8967452301, ab8967452301, ab8967452301
    H-38: ab8967452301, ab8967452301, ab8967452301, ab8967452301
    H-39: cdab8967452301, cdab8967452301, cdab8967452301, cdab8967452301
    H-40: cdab8967452301, cdab8967452301, cdab8967452301, cdab8967452301
    H-41: efcdab8967452301, efcdab8967452301, efcdab8967452301, efcdab8967452301
    H-41: efcdab8967452301, efcdab8967452301, efcdab8967452301, efcdab8967452301
    H-42: 1ff, 1ff, 1ff, 1ff
    H-43: ff01, ff01, ff01, ff01
    H-44: 1ff, 1ff, 1ff, 1ff
    H-45: 0, 0, 0, 0
    H-46: 0, 0, 0, 0
    ...
Downloads
cryptopp-integer.zip - Patch for little-endian support and output base awareness to the Integer class.