Std::byte
C++17 added std::byte
in 2016 by way of P0298R0, A byte type definition. The Crypto++ library also provides a byte
in the global namespace. Additionally, Windows provides a byte
in their Windows Kit. Compile problems can arise under certain conditions when the language's byte or Windows' byte collides with the library's byte. A secondary problem for the library is, the language's byte is not semantically equivalent to the library's byte.
The incompatibilities introduced by C++17 were design choices, like ensuring a std::byte
was neither a character type or integer type, and only bit operations are allowed on it. Generally speaking, they were good design choices that intersected badly with some libraries and programs. Not all libraries were affected. For example, Jack Lloyd's Botan uses uint8_t
rather than a byte, so there was no clash of the types.
Romain Geissler performed the early testing and made a couple of pull requests to reconcile the issues with the library and user programs when using C++17 and std::byte
were in effect and std
symbols were placed in the global namespace; see PR 438, Use ::byte instead of byte. This page will discuss how and why the Crypto++ library was modified to minimize user discomfort when using both C++17 and the Crypto++ library. Uri Blumenthal, Romain Geissler, Marcel Raad and Jeffrey Walton worked on finding the change that caused the least amount of pain.
We opened Issue 442, Test C++17 byte change with dry runs from various projects to document the changes and estimate the impact on dependent projects. In Issue 442, we called for testers based on GitHub projects with activity in 2016 and 2017. We also performed the fix on several open source projects so we could witness the pain points first hand.
If you need to setup a rig to test the changes below, then Fedora 26 ships with GCC 7.1. You can also find GCC 7 in Debian 10/Buster (Debian 9 with Testing repo enabled). For Debian, you must apt-get install gcc-7 g++-7
. Its not clear which version of the Microsoft compilers support C++17 and std::byte
.
The change detailed below occurred at Crypto++ 6.0 because it broke ABI compatibility. The check-in occurred at 00f9818b5d8e. Also see config.h on the Crypto++ wiki.
The Problem
The C++17 language provides a std::byte
. Crypto++ also provides a byte
and its in the global namespace. The situation gives rise to at least two problems. The first problem is a collision of symbols under some circumstances. The second is, C++17 std::byte
is not semantically equivalent to the library's byte
type.
Symbol collision
The first problem arises when std::byte
and Crypto++ byte
clash. It can happen, for example, if the std
namespace is dumped into the global namespace (the using namespace std;
below). It can also happen when Windows SDK and Kits are used in a program. It was first reported by Romain Geissler; see PR 437, Fix C++17 build and PR 438, Use ::byte instead of byte.
using namespace std; int main(int argc, char* argv[]) { // Ambiguous: is it std::byte or Crypto++ byte? byte block[16]; ... }
The Crypto++ byte
has been around since the early 1990s. Its presence predates C++ namespaces by about 5 years. When C++ added namespaces around 1998, all Crypto++ symbols were added to the CryptoPP
namespace. However, byte
remained in the global namespace.
The decision to leave byte
in the global namespace was documented in code comments, and probably had something to do with Microsoft platforms, early C++ compilers and name lookups (from config.h
):
// put in global namespace to avoid ambiguity with other byte typedefs typedef unsigned char byte;
Semantic equivalence
A secondary problem is, the C++ std::byte
is not semantically equivalent to the Crypto++ byte
. For example, the following code will not compile using std::byte
. Similar code would compile using the Crypto++ byte
, including integral operations like addition.
$ cat -n byte.cxx 1 #include <cstddef> 2 3 int main(int argc, char* argv[]) 4 { 5 std::byte b1{0x00}; 6 7 std::byte b2 = {0x00}; 8 9 std::byte b3{0x00}; b3 |= 0xAA; 10 11 std::byte b4{0x00}; b4 += 0xAA; 12 13 return 0; 14 }
Attempting to compile it results in:
$ g++ -std=c++17 byte.cxx -o test.exe byte.cxx: In function ‘int main(int, char**)’: byte.cxx:7:23: error: cannot convert ‘int’ to ‘std::byte’ in initialization std::byte b2 = {0x00}; ^ byte.cxx:9:26: error: no match for ‘operator|=’ (operand types are ‘std::byte’ and ‘int’) std::byte b3{0x00}; b3 |= 0xAA; ~~~^~~~~~~ In file included from byte.cxx:1:0: /usr/include/c++/7/cstddef:136:3: note: candidate: constexpr std::byte& std::operator|=(std::byte&, std::byte) operator|=(byte& __l, byte __r) noexcept ^~~~~~~~ /usr/include/c++/7/cstddef:136:3: note: no known conversion for argument 2 from ‘int’ to ‘std::byte’ byte.cxx:11:26: error: no match for ‘operator+=’ (operand types are ‘std::byte’ and ‘int’) std::byte b4{0x00}; b4 += 0xAA; ~~~^~~~~~~
The Decision
The compromise to accommodate C++17 std::byte
was move the Crypto++ byte
from the global namespace into the CryptoPP
namespace. The change occurred at check-in 00f9818b5d8e and it will be available in Crypto++ 6.0. The library's change addresses the root cause, but it could cause issues in some user programs.
Fortunately, the issues in user programs is easily addressable in many instances. A user program can simply add typedef unsigned char byte;
to a common header, like stdafx.h
, and the previous balance is restored. See Fixing Programs below for more alternatives.
Root Cause
Placing symbols in other namespaces was the root cause of the problem. More precisely, the library adding byte
type to the global namespace. From the C++ engineering perspective, the library's byte
always belonged at CryptoPP::byte
.
Prior to the clash, we were mostly OK with supplying a byte
in the global namespace for convenience. It played well with Microsoft's byte
which was also placed in the global namespace. However the C++17 changes showed how fragile adding symbols in a namespace other than our own could be.
Alternatives
While studying the issue and contemplating changes, the library had several alternatives to choose from. They included the following.
- Do nothing. Let users figure it out.
- Leave
byte
, and scope function parameters with::byte
. This is Geissler's patch. - Remove
byte
. Avoid collisions in library, let users figure out the rest. - Add
CryptoPP::byte
. Avoid collisions in library. - Add function overloads. Accept
std::byte
in addition toCryptoPP::byte
. - Use
std::byte
. Cast tostd::byte
for copy and assignment. Cast to integral for non-bitops. - Leave
byte
, addCryptoPP::byte
. Avoid collisions in library, let users figure out the rest.
Geissler's patch scoped all uses of ::byte
in the library. It was a good, intermediate solution to the issue under C++17. However, the patch was effectively treating the symptom and not the underlying problem. The underlying problem was the library adding symbols in a namespace other than its own. Ultimately we chose to address the underlying problem.
Fixing Programs
Some user programs which link to the Crypto++ library will need to be fixed. There are several ways a user program can be fixed and some of the alternatives are listed below. Allowing users to make a choice is important because the library does not want to make policy decisions that bind user programs. We apologize for forcing other projects into action over this.
The work-arounds listed below are in random order, and do not indicate an order of preference. We expect Supply byte definition to be an expedient fix, but it may not be a good long term fix. It may not be a good long term fix because the program is doing what got the library into trouble.
Use SecByteBlock
Some uses of the former unscoped byte
are a little fast and loose because sensitive information was not zeroized, and a SecByteBlock
should have been used. If the data is sensitive, then consider switching to a SecByteBlock
. SecByteBlock
has both a begin()
and size()
, so it can be used almost everywhere you are using a byte array.
int main(int argc, char* argv[]) { // formerly byte block[16]; CryptoPP::SecByteBlock block(16); CryptoPP::OS_GenerateRandomBlock(false, block.begin(), block.size()); ... }
Use Crypto++ byte
If you don't handle sensitive information and you want to use the library's byte
, then CryptoPP::byte
is available as shown below.
int main(int argc, char* argv[]) { // formerly byte block[16]; CryptoPP::byte block[16]; CryptoPP::OS_GenerateRandomBlock(false, block, sizeof(block)); ... }
In fact, Crypto++ used a similar strategy for test.cpp
because Microsoft Windows Kits provide a byte
in the global namespace that creates the ambiguous symbol. Also see Issue 447, Windows and "error C2872: 'byte' : ambiguous symbol" and Commit 00e133745663.
Use <stdint.h> uint8_t
You can simply use uint8_t
in your programs. Some libraries, like Botan, side stepped the problem by using uint8_t
.
$ cat test.cxx #include "cryptlib.h" #include "osrng.h" #include "hex.h" #include <iostream> #include <cstdint> int main(int argc, char* argv[]) { using namespace CryptoPP; AutoSeededRandomPool prng; uint8_t key[32]; prng.GenerateBlock(key, sizeof(key)); std::cout << "Key: "; StringSource(key, 32, true, new HexEncoder(new FileSink(std::cout))); std::cout << std::endl; return 0; }
Supply byte definition
A user program supplying its own definition for a byte
is probably the easiest solution to fill the void created by removing byte
from the global namespace. It gets you back to where things were before the library made its change, and allows you to deal with declarations like using namespace std
how you see fit.
typedef unsigned char byte; int main(int argc, char* argv[]) { byte block[16]; CryptoPP::OS_GenerateRandomBlock(false, block, sizeof(block)); ... }
The library used typedef unsigned char byte;
for Kalyna's anonymous namespace at check-in fe637956388f8dd6. The change insulated Kalyna from byte
related changes in outer scopes. It also means Kalyna's code has access to at least two definitions of a byte
. The first at ::byte
or CryptoPP::byte
, and the second at CryptoPP::<kalyna anonymous>::byte
.
When we were investigating the impact of the change, we also examined popular projects like Pycryptopp. Pycryptopp needed the byte
added to three source file just like Kalyna (and Pycryptopp did not need it for all project files). Also see Pull Request 43 on the Pycryptopp GitHub.
You can get fancier, if desired. In the code below, an unscoped byte
alias is created with using byte = CryptoPP::byte
if Crypto++ is 6.0 or later and C++11 or later are available. Otherwise, the user program simply provides its own compatible definition of a byte
in the global namespace.
#include <cryptopp/config.h> #if (CRYPTOPP_VERSION >= 600) && (__cplusplus >= 201103L) using byte = CryptoPP::byte; #else typedef unsigned char byte; #endif int main(int argc, char* argv[]) { byte block[16]; CryptoPP::OS_GenerateRandomBlock(false, block, sizeof(block)); ... }
Scope use of byte
Romain Geissler's patch scoped all uses of byte
in the library. It was a good, intermediate solution to the issue under C++17. Geissler's patch also had the benefit of enduring using namespace std;
. However, the patch does not treat the underlying problem of adding symbols in a namespace other than the library's namespace.
You can find the patch at Pull Request 438, Use ::byte instead of byte. Below is an example of the change in signature for a library function. A similar change occurred in every library source file which used byte
.
Former, unscoped byte:
namespace CryptoPP { OS_GenerateRandomBlock(bool blocking, byte* output, size_t size); };
Patched, scoped byte:
namespace CryptoPP { OS_GenerateRandomBlock(bool blocking, ::byte* output, size_t size); };
using namespace X
By far, using namespace std
is one of the most criticized statements in a program during review. As early as 2000 Herb Sutter advised against it at Migrating to Namespaces. In the context of Crypto++, using namespace CryptoPP
falls into the same category. User programs should not do it. With that said, we understand it shows up in some user programs.
The Crypto++ project slowly changed the library to follow Sutter's advice. The change for the library was minimal because most uses of using namespace std;
were limited to several source files (*.cpp
files) in its test framework (test.cpp
, datatest.cpp
, validate.cpp
, benchmark.cpp
, etc). The last library check-in for the goal occurred at 80c4033baf09.
Sutter's advice and the byte
collision mean user programs should change:
using namespace std; int main(int argc, char* argv[]) { cout << "Hello World" << endl; ... }
To:
int main(int argc, char* argv[]) { std::cout << "Hello World" << std::endl; ... }
Or:
int main(int argc, char* argv[]) { using std::cout; using std::endl; cout << "Hello World" << endl; ... }
Investigation
Before providing the check-in at 00f9818b5d8e, several alternatives were tested to see how well they integrated and how much discomfort they could cause a typical user program. Testing occurred on Fedora 26 with GCC 7.1 and the -std=c++17
compiler option. The baseline user program is shown below and its written according to Herb Sutter's recommendations in Migrating to Namespaces, and includes:
- avoid
using
directives entirely, especially in header files - never write namespace using declarations in header files
- user code should prefix all standard library names with
std::
Internally, the Crypto++ library does not violate Sutter's rules in header files. The Crypto++ test program and source files, like test.cpp
, do use using namespace CryptoPP
, however.
$ cat test.cxx #include "cryptlib.h" #include "secblock.h" #include "osrng.h" #include "files.h" #include "hex.h" #include <iostream> #include <cstddef> void GenerateBlock(byte* block, size_t size) { CryptoPP::OS_GenerateRandomBlock(false, block, size); } int main(int argc, char* argv[]) { CryptoPP::HexEncoder encoder(new CryptoPP::FileSink(std::cout)); CryptoPP::SecByteBlock block1(16); std::byte block2[16]; GenerateBlock(block1, block1.size()); GenerateBlock(reinterpret_cast<byte*>(block2), sizeof(block2)); std::cout << "Block 1: "; encoder.Put(block1, block1.size()); encoder.MessageEnd(); std::cout << std::endl; std::cout << "Block 2: "; encoder.Put(reinterpret_cast<const byte*>(block2), sizeof(block2)); encoder.MessageEnd(); std::cout << std::endl; return 0; }
Variations of the program were tested with using
statements. They were:
- Crypto++
byte
in global namespace using std::byte
using CryptoPP::byte
using std::byte
andusing CryptoPP::byte
using namespace std
using byte = std::byte
The first attempt to uptake std::byte
was a using byte = std::byte
in the library. The attempt failed miserably because the library treats byte
as an integral type and performs both bit and math operations on them. The library's use of byte
is incompatible with the standard.
Below is an example of the failure introduced by using byte = std::byte
from misc.h
.
In file included from cryptlib.cpp:19:0: misc.h: In function ‘void CryptoPP::IncrementCounterByOne(byte*, unsigned int)’: misc.h:1118:12: error: no match for ‘operator++’ (operand type is ‘byte {aka std::byte}’) carry = !++inout[i]; ^~~~~~~~~~ misc.h: In function ‘void CryptoPP::IncrementCounterByOne(byte*, const byte*, unsigned int)’: misc.h:1134:33: error: no match for ‘operator+’ (operand types are ‘const byte {aka const std::byte}’ and ‘int’) carry = ((output[i] = input[i]+1) == 0); ~~~~~~~~^~ misc.h: In function ‘byte CryptoPP::BitReverse(byte)’: misc.h:1847:22: error: no match for ‘operator&’ (operand types are ‘byte {aka std::byte}’ and ‘int’) value = byte((value & 0xAA) >> 1) | byte((value & 0x55) << 1); ~~~~~~^~~~~~ ...
CryptoPP::byte
Moving byte
into the CryptoPP
namespace produced initial errors due to use of anonymous namespaces. For example:
g++ -DNDEBUG -g2 -O3 -std=c++17 -Wall -Wextra -fPIC -march=native -pipe -c kalynatab.cpp kalyna.cpp: In function ‘void {anonymous}::MakeOddKey(const word64*, CryptoPP::word64*)’: kalyna.cpp:76:11: error: ‘byte’ does not name a type const byte* even = reinterpret_cast<const byte*>(evenkey); ^~~~ kalyna.cpp:77:5: error: ‘byte’ was not declared in this scope byte* odd = reinterpret_cast<byte*>(oddkey); ^~~~ ...
Once the source files were cleaned up at Commit fe637956388f8dd6 the library files compiled as expected. User programs experienced the expected 'byte' was not declared in this scope
messages due to the namespace change.
using CryptoPP::byte
using CryptoPP::byte;
is a natural continuation of moving to CryptoPP::byte
for user programs. Unfortunately using byte = CryptoPP::byte;
produced the error below due to using namespace std;
and using namespace CryptoPP;
. In fact, using byte = CryptoPP::byte;
made the problem worse because it introduced a third symbol into the name lookup by way of an alias.
test.cxx: In function ‘int main(int, char**)’: test.cxx:27:3: error: reference to ‘byte’ is ambiguous byte block3[16]; ^~~~ test.cxx:12:28: note: candidates are: using byte = CryptoPP::byte using byte = CryptoPP::byte; ^ In file included from stdcpp.h:48:0, from cryptlib.h:97, from test.cxx:1: /usr/include/c++/7/cstddef:64:14: note: enum class std::byte enum class byte : unsigned char {}; ^~~~
A twist on Windows with a Windows Kit installed is:
1> test.cpp 1>test.cpp(159): error C2872: 'byte' : ambiguous symbol 1> could be 'c:\program files (x86)\windows kits\8.0\include\shared\rpcndr.h(164) : unsigned char byte' 1> or 'c:\users\cryptopp\config.h(203) : CryptoPP::byte'
According to StoryTeller on Stack Overflow at byte and ambiguous symbol due to using declarations?:
A using declaration (as opposed to a type alias) will not resolve the issue [of ambiguous symbols]. Both identifiers will be visible to unqualified name lookup. Neither is defined in the enclosing declarative region.
Function Overloads
A potential path to avoid user casting when passing a std::byte
to a function that expected a byte
or CryptoPP::byte
was to provide overloads in key classes. For example, in class BufferedTransformation
:
class BufferedTransformation : public Algorithm, public Waitable { public: virtual ~BufferedTransformation() {} ... size_t Put(const byte *inString, size_t length, bool blocking=true) {return Put2(inString, length, 0, blocking);} #if CRYPTOPP_CXX17 size_t Put(const std::byte *inString, size_t length, bool blocking=true) {return Put2(reinterpret_cast<const byte*>(inString), length, 0, blocking);} #endif ... };
This made it somewhat easier for user code to use both std::byte
and CryptoPP::byte
by moving casts into the library, but it was an incomplete remediation as some overloads were missing. Additionally, there was no way to overload some functions, like those where the byte*
is a return value. For example, virtual byte * CreatePutSpace(size_t &size)
.
Finally, the conditional compilation in the header leaves something to be desired. We don't want users to witness hacks that may or may not be present. We would prefer they be hidden away in a source file and forgotten about once compilation of the library completed.
using namespace std
The code that caused the most problems during testing was user code with using namespace std
. While Sutter strongly discourages it, we expect it to show up in a fair number of places in user code. This is also the impetus for Romain-Geissler's Pull Request 438.
When using namespace std
was present in user code, one of the better decisions for the library was to move byte
into the CryptoPP
namespace. The test code was changed as follows.
$ cat test.cxx #include "cryptlib.h" #include "secblock.h" #include "osrng.h" #include "files.h" #include "hex.h" #include <iostream> #include <cstddef> using namespace std; void GenerateBlock(CryptoPP::byte* block, size_t size) { CryptoPP::OS_GenerateRandomBlock(false, block, size); } int main(int argc, char* argv[]) { CryptoPP::HexEncoder encoder(new CryptoPP::FileSink(std::cout)); CryptoPP::SecByteBlock block1(16); std::byte block2[16]; GenerateBlock(block1, block1.size()); GenerateBlock(reinterpret_cast<CryptoPP::byte*>(block2), sizeof(block2)); std::cout << "Block 1: "; encoder.Put(block1, block1.size()); encoder.MessageEnd(); std::cout << std::endl; std::cout << "Block 2: "; encoder.Put(reinterpret_cast<const CryptoPP::byte*>(block2), sizeof(block2)); encoder.MessageEnd(); std::cout << std::endl; return 0; }
The code in main
could avoid the extra casts by overloading the exemplary GenerateBlock
:
// Original void GenerateBlock(CryptoPP::byte* block, size_t size) { CryptoPP::OS_GenerateRandomBlock(false, block, size); } // Overload void GenerateBlock(std::byte* block, size_t size) { CryptoPP::OS_GenerateRandomBlock(false, reinterpret_cast<CryptoPP::byte*>(block), size); }