Crypto++ 8.9
Free C++ class library of cryptographic schemes
integer.cpp
1// integer.cpp - originally written and placed in the public domain by Wei Dai
2// contains public domain code contributed by Alister Lee and Leonard Janke
3
4// Notes by JW: The Integer class needs to do two things. First, it needs
5// to set function pointers on some platforms, like X86 and X64. The
6// function pointers select a fast multiply and addition based on the cpu.
7// Second, it wants to create Integer::Zero(), Integer::One() and
8// Integer::Two().
9// The function pointers are initialized in the InitializeInteger class by
10// calling SetFunctionPointers(). The call to SetFunctionPointers() is
11// guarded to run once using a double-checked pattern. We don't use C++
12// std::call_once due to bad interactions between libstdc++, glibc and
13// pthreads. The bad interactions were causing crashes for us on platforms
14// like Sparc and PowerPC. Since we are only setting function pointers we
15// don't have to worry about leaking memory. The worst case seems to be the
16// pointers gets written twice until the init flag is set and visible to
17// all threads.
18// For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19// strategies. First, if initialization priorities are available then we use
20// them. Initialization priorities are init_priority() on Linux and init_seg()
21// on Windows. OS X and several other platforms lack them. Initialization
22// priorities are platform specific but they are also the most trouble free
23// with deterministic destruction.
24// Second, if C++11 dynamic initialization is available, then we use it. After
25// the std::call_once fiasco we moved to dynamic initialization to avoid
26// unknown troubles platforms that are tested less frequently. In addition
27// Microsoft platforms mostly do not provide dynamic initialization.
28// The MSDN docs claim they do but they don't in practice because we need
29// Visual Studio 2017 and Windows 10 or above.
30// Third, we fall back to Wei's original code of a Singleton. Wei's original
31// code was much simpler. It simply used the Singleton pattern, but it always
32// produced memory findings on some platforms. The Singleton generates memory
33// findings because it uses a Create On First Use pattern (a dumb Nifty
34// Counter) and the compiler had to be smart enough to fold them to return
35// the same object. Unix and Linux compilers do a good job of folding objects,
36// but Microsoft compilers do a rather poor job for some versions of the
37// compiler.
38// Another problem with the Singleton is resource destruction requires running
39// resource acquisition in reverse. For resources provided through the
40// Singletons, there is no way to express the dependency order to safely
41// destroy resources. (That's one of the problems C++11 dynamic
42// initialization with concurrent execution is supposed to solve).
43// The final problem with Singletons is resource/memory exhaustion in languages
44// like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45// thousands of times during the life of a program. Each load produces a
46// memory leak and they can add up quickly. If they library is being used in
47// Java or .Net then Singleton must be avoided at all costs.
48//
49// The code below has a path cut-in for BMI2 using mulx and adcx instructions.
50// There was a modest speedup of approximately 0.03 ms in public key Integer
51// operations. We had to disable BMI2 for the moment because some OS X machines
52// were advertising BMI/BMI2 support but caused SIGILL's at runtime. Also see
53// https://github.com/weidai11/cryptopp/issues/850.
54
55#include "pch.h"
56#include "config.h"
57
58#ifndef CRYPTOPP_IMPORTS
59
60#include "integer.h"
61#include "secblock.h"
62#include "modarith.h"
63#include "nbtheory.h"
64#include "smartptr.h"
65#include "algparam.h"
66#include "filters.h"
67#include "stdcpp.h"
68#include "asn.h"
69#include "oids.h"
70#include "words.h"
71#include "pubkey.h" // for P1363_KDF2
72#include "sha.h"
73#include "cpu.h"
74#include "misc.h"
75
76#include <iostream>
77
78#if (CRYPTOPP_MSC_VERSION >= 1400) && !defined(_M_ARM)
79 #include <intrin.h>
80#endif
81
82#ifdef __DECCXX
83 #include <c_asm.h>
84#endif
85
86// "Error: The operand ___LKDB cannot be assigned to",
87// http://github.com/weidai11/cryptopp/issues/188
88#if (__SUNPRO_CC >= 0x5130)
89# define MAYBE_CONST
90# define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
91#else
92# define MAYBE_CONST const
93# define MAYBE_UNCONST_CAST(x) x
94#endif
95
96// "Inline assembly operands don't work with .intel_syntax",
97// http://llvm.org/bugs/show_bug.cgi?id=24232
98#if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
99# undef CRYPTOPP_X86_ASM_AVAILABLE
100# undef CRYPTOPP_X32_ASM_AVAILABLE
101# undef CRYPTOPP_X64_ASM_AVAILABLE
102# undef CRYPTOPP_SSE2_ASM_AVAILABLE
103# undef CRYPTOPP_SSSE3_ASM_AVAILABLE
104#else
105# define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
106#endif
107
108// ***************** C++ Static Initialization ********************
109
110NAMESPACE_BEGIN(CryptoPP)
111
112// Function body near the middle of the file
113static void SetFunctionPointers();
114
115// Use a double-checked pattern. We are not leaking anything so it
116// does not matter if a pointer is written twice during a race.
117// Avoid std::call_once due to too many problems on platforms like
118// Solaris and Sparc. Also see
119// http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
120// http://github.com/weidai11/cryptopp/issues/707.
121InitializeInteger::InitializeInteger()
122{
123 static bool s_flag;
125 if (s_flag == false)
126 {
127 SetFunctionPointers();
128 s_flag = true;
130 }
131}
132
133template <long i>
134struct NewInteger
135{
136 Integer * operator()() const
137 {
138 return new Integer(i);
139 }
140};
141
142// ***************** Library code ********************
143
144inline static int Compare(const word *A, const word *B, size_t N)
145{
146 while (N--)
147 if (A[N] > B[N])
148 return 1;
149 else if (A[N] < B[N])
150 return -1;
151
152 return 0;
153}
154
155inline static int Increment(word *A, size_t N, word B=1)
156{
158 word t = A[0];
159 A[0] = t+B;
160 if (A[0] >= t)
161 return 0;
162 for (unsigned i=1; i<N; i++)
163 if (++A[i])
164 return 0;
165 return 1;
166}
167
168inline static int Decrement(word *A, size_t N, word B=1)
169{
171 word t = A[0];
172 A[0] = t-B;
173 if (A[0] <= t)
174 return 0;
175 for (unsigned i=1; i<N; i++)
176 if (A[i]--)
177 return 0;
178 return 1;
179}
180
181static void TwosComplement(word *A, size_t N)
182{
183 Decrement(A, N);
184 for (unsigned i=0; i<N; i++)
185 A[i] = ~A[i];
186}
187
188static word AtomicInverseModPower2(word A)
189{
190 CRYPTOPP_ASSERT(A%2==1);
191
192 word R=A%8;
193
194 for (unsigned i=3; i<WORD_BITS; i*=2)
195 R = R*(2-R*A);
196
197 CRYPTOPP_ASSERT(R*A==1);
198 return R;
199}
200
201// ********************************************************
202
203#if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || ((defined(__aarch64__) || defined(__x86_64__)) && defined(CRYPTOPP_WORD128_AVAILABLE))
204 #define TWO_64_BIT_WORDS 1
205 #define Declare2Words(x) word x##0, x##1;
206 #define AssignWord(a, b) a##0 = b; a##1 = 0;
207 #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
208 #define LowWord(a) a##0
209 #define HighWord(a) a##1
210 #ifdef CRYPTOPP_MSC_VERSION
211 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
212 #ifndef __INTEL_COMPILER
213 #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
214 #endif
215 #elif defined(__aarch32__) || defined(__aarch64__)
216 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; asm ("umulh %0,%1,%2" : "=r"(p1) : "r"(a), "r"(b));
217 #elif defined(__DECCXX)
218 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
219 #elif defined(__x86_64__)
220 #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
221 // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
222 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
223 #elif defined(__BMI2__) && 0
224 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulxq %3, %0, %1" : "=r"(p0), "=r"(p1) : "d"(a), "r"(b));
225 #define MulAcc(c, d, a, b) asm ("mulxq %6, %3, %4; addq %3, %0; adcxq %4, %1; adcxq %7, %2;" : "+&r"(c), "+&r"(d##0), "+&r"(d##1), "=&r"(p0), "=&r"(p1) : "d"(a), "r"(b), "r"(W64LIT(0)) : "cc");
226 #define Double3Words(c, d) asm ("addq %0, %0; adcxq %1, %1; adcxq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
227 #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+&r"(a##0), "+r"(a##1) : "r"(b), "r"(W64LIT(0)) : "cc");
228 #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
229 #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcxq %6, %1; adcxq %7, %2;" : "+r"(c), "=&r"(e##0), "=&r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1), "r"(W64LIT(0)) : "cc");
230 #else
231 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
232 #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
233 #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
234 #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
235 #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
236 #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
237 #endif
238 #endif
239 #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
240 #ifndef Double3Words
241 #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
242 #endif
243 #ifndef Acc2WordsBy2
244 #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
245 #endif
246 #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
247 #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
248 #define GetCarry(u) u##1
249 #define GetBorrow(u) u##1
250#else
251 #define Declare2Words(x) dword x;
252 #if CRYPTOPP_MSC_VERSION >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
253 #define MultiplyWords(p, a, b) p = __emulu(a, b);
254 #else
255 #define MultiplyWords(p, a, b) p = (dword)a*b;
256 #endif
257 #define AssignWord(a, b) a = b;
258 #define Add2WordsBy1(a, b, c) a = b + c;
259 #define Acc2WordsBy2(a, b) a += b;
260 #define LowWord(a) word(a)
261 #define HighWord(a) word(a>>WORD_BITS)
262 #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
263 #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
264 #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
265 #define GetCarry(u) HighWord(u)
266 #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
267#endif
268#ifndef MulAcc
269 #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
270#endif
271#ifndef Acc2WordsBy1
272 #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
273#endif
274#ifndef Acc3WordsBy2
275 #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
276#endif
277
278class DWord
279{
280public:
281#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
282 DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
283#else
284 DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
285#endif
286
287#ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
288 explicit DWord(word low) : m_whole(low) { }
289#else
290 explicit DWord(word low)
291 {
292 m_halfs.high = 0;
293 m_halfs.low = low;
294 }
295#endif
296
297#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
298 DWord(word low, word high) : m_whole()
299#else
300 DWord(word low, word high) : m_halfs()
301#endif
302 {
303#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
304# if (CRYPTOPP_LITTLE_ENDIAN)
305 const word t[2] = {low,high};
306 std::memcpy(&m_whole, t, sizeof(m_whole));
307# else
308 const word t[2] = {high,low};
309 std::memcpy(&m_whole, t, sizeof(m_whole));
310# endif
311#else
312 m_halfs.low = low;
313 m_halfs.high = high;
314#endif
315 }
316
317 static DWord Multiply(word a, word b)
318 {
319 DWord r;
320 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
321 r.m_whole = (dword)a * b;
322 #elif defined(MultiplyWordsLoHi)
323 MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
324 #else
326 #endif
327 return r;
328 }
329
330 static DWord MultiplyAndAdd(word a, word b, word c)
331 {
332 DWord r = Multiply(a, b);
333 return r += c;
334 }
335
336 DWord & operator+=(word a)
337 {
338 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
339 m_whole = m_whole + a;
340 #else
341 m_halfs.low += a;
342 m_halfs.high += (m_halfs.low < a);
343 #endif
344 return *this;
345 }
346
347 DWord operator+(word a)
348 {
349 DWord r;
350 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
351 r.m_whole = m_whole + a;
352 #else
353 r.m_halfs.low = m_halfs.low + a;
354 r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
355 #endif
356 return r;
357 }
358
359 DWord operator-(DWord a)
360 {
361 DWord r;
362 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
363 r.m_whole = m_whole - a.m_whole;
364 #else
365 r.m_halfs.low = m_halfs.low - a.m_halfs.low;
366 r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
367 #endif
368 return r;
369 }
370
371 DWord operator-(word a)
372 {
373 DWord r;
374 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
375 r.m_whole = m_whole - a;
376 #else
377 r.m_halfs.low = m_halfs.low - a;
378 r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
379 #endif
380 return r;
381 }
382
383 // returns quotient, which must fit in a word
384 word operator/(word divisor);
385
386 word operator%(word a);
387
388 bool operator!() const
389 {
390 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
391 return !m_whole;
392 #else
393 return !m_halfs.high && !m_halfs.low;
394 #endif
395 }
396
397 // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
398 // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
399 word GetLowHalf() const {return m_halfs.low;}
400 word GetHighHalf() const {return m_halfs.high;}
401 word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
402
403private:
404 // Issue 274, "Types cannot be declared in anonymous union",
405 // http://github.com/weidai11/cryptopp/issues/274
406 // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
407 struct half_words
408 {
409 #if (CRYPTOPP_LITTLE_ENDIAN)
410 word low;
411 word high;
412 #else
413 word high;
414 word low;
415 #endif
416 };
417 union
418 {
419 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
420 dword m_whole;
421 #endif
422 half_words m_halfs;
423 };
424};
425
426class Word
427{
428public:
429 Word() : m_whole(0) {}
430 Word(word value) : m_whole(value) {}
431 Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
432
433 static Word Multiply(hword a, hword b)
434 {
435 Word r;
436 r.m_whole = (word)a * b;
437 return r;
438 }
439
440 Word operator-(Word a)
441 {
442 Word r;
443 r.m_whole = m_whole - a.m_whole;
444 return r;
445 }
446
447 Word operator-(hword a)
448 {
449 Word r;
450 r.m_whole = m_whole - a;
451 return r;
452 }
453
454 // returns quotient, which must fit in a word
455 hword operator/(hword divisor)
456 {
457 return hword(m_whole / divisor);
458 }
459
460 bool operator!() const
461 {
462 return !m_whole;
463 }
464
465 word GetWhole() const {return m_whole;}
466 hword GetLowHalf() const {return hword(m_whole);}
467 hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
468 hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
469
470private:
471 word m_whole;
472};
473
474// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
475template <class S, class D>
476S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
477{
478 CRYPTOPP_UNUSED(dummy);
479
480 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
481 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
482
483 // Profiling guided the flow below.
484
485 // estimate the quotient: do a 2 S by 1 S divide.
486 S Q; bool pre = (S(B1+1) == 0);
487 if (B1 > 0 && !pre)
488 Q = D(A[1], A[2]) / S(B1+1);
489 else if (pre)
490 Q = A[2];
491 else
492 Q = D(A[0], A[1]) / B0;
493
494 // now subtract Q*B from A
495 D p = D::Multiply(B0, Q);
496 D u = (D) A[0] - p.GetLowHalf();
497 A[0] = u.GetLowHalf();
498 u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
499 A[1] = u.GetLowHalf();
500 A[2] += u.GetHighHalf();
501
502 // Q <= actual quotient, so fix it
503 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
504 {
505 u = (D) A[0] - B0;
506 A[0] = u.GetLowHalf();
507 u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
508 A[1] = u.GetLowHalf();
509 A[2] += u.GetHighHalf();
510 Q++;
511 CRYPTOPP_ASSERT(Q); // shouldn't overflow
512 }
513
514 return Q;
515}
516
517// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
518template <class S, class D>
519inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
520{
521 // Profiling guided the flow below.
522
523 if (!!B)
524 {
525 S Q[2];
526 T[0] = Al.GetLowHalf();
527 T[1] = Al.GetHighHalf();
528 T[2] = Ah.GetLowHalf();
529 T[3] = Ah.GetHighHalf();
530 Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
531 Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
532 return D(Q[0], Q[1]);
533 }
534 else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
535 {
536 return D(Ah.GetLowHalf(), Ah.GetHighHalf());
537 }
538}
539
540// returns quotient, which must fit in a word
541inline word DWord::operator/(word a)
542{
543 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
544 return word(m_whole / a);
545 #else
546 hword r[4];
547 return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
548 #endif
549}
550
551inline word DWord::operator%(word a)
552{
553 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
554 return word(m_whole % a);
555 #else
556 if (a < (word(1) << (WORD_BITS/2)))
557 {
558 hword h = hword(a);
559 word r = m_halfs.high % h;
560 r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
561 return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
562 }
563 else
564 {
565 hword r[4];
566 DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
567 return Word(r[0], r[1]).GetWhole();
568 }
569 #endif
570}
571
572// ********************************************************
573
574// Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
575#if defined(__GNUC__)
576 #define AddPrologue \
577 int result; \
578 __asm__ __volatile__ \
579 ( \
580 INTEL_NOPREFIX
581 #define AddEpilogue \
582 ATT_PREFIX \
583 : "=a" (result)\
584 : "d" (C), "a" (A), "D" (B), "c" (N) \
585 : "%esi", "memory", "cc" \
586 );\
587 return result;
588 #define MulPrologue \
589 __asm__ __volatile__ \
590 ( \
591 INTEL_NOPREFIX \
592 AS1( push ebx) \
593 AS2( mov ebx, edx)
594 #define MulEpilogue \
595 AS1( pop ebx) \
596 ATT_PREFIX \
597 : \
598 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
599 : "%esi", "memory", "cc" \
600 );
601 #define SquPrologue MulPrologue
602 #define SquEpilogue \
603 AS1( pop ebx) \
604 ATT_PREFIX \
605 : \
606 : "d" (s_maskLow16), "c" (C), "a" (A) \
607 : "%esi", "%edi", "memory", "cc" \
608 );
609 #define TopPrologue MulPrologue
610 #define TopEpilogue \
611 AS1( pop ebx) \
612 ATT_PREFIX \
613 : \
614 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
615 : "memory", "cc" \
616 );
617#else
618 #define AddPrologue \
619 __asm push edi \
620 __asm push esi \
621 __asm mov eax, [esp+12] \
622 __asm mov edi, [esp+16]
623 #define AddEpilogue \
624 __asm pop esi \
625 __asm pop edi \
626 __asm ret 8
627 #define SaveEBX
628 #define RestoreEBX
629 #define SquPrologue \
630 AS2( mov eax, A) \
631 AS2( mov ecx, C) \
632 SaveEBX \
633 AS2( lea ebx, s_maskLow16)
634 #define MulPrologue \
635 AS2( mov eax, A) \
636 AS2( mov edi, B) \
637 AS2( mov ecx, C) \
638 SaveEBX \
639 AS2( lea ebx, s_maskLow16)
640 #define TopPrologue \
641 AS2( mov eax, A) \
642 AS2( mov edi, B) \
643 AS2( mov ecx, C) \
644 AS2( mov esi, L) \
645 SaveEBX \
646 AS2( lea ebx, s_maskLow16)
647 #define SquEpilogue RestoreEBX
648 #define MulEpilogue RestoreEBX
649 #define TopEpilogue RestoreEBX
650#endif
651
652#ifdef CRYPTOPP_X64_MASM_AVAILABLE
653extern "C" {
654int Baseline_Add(size_t N, word *C, const word *A, const word *B);
655int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
656}
657#elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
658int Baseline_Add(size_t N, word *C, const word *A, const word *B)
659{
660 word result;
661 __asm__ __volatile__
662 (
663 INTEL_NOPREFIX
664 AS1( neg %1)
665 ASJ( jz, 1, f)
666 AS2( mov %0,[%3+8*%1])
667 AS2( add %0,[%4+8*%1])
668 AS2( mov [%2+8*%1],%0)
669 ASL(0)
670 AS2( mov %0,[%3+8*%1+8])
671 AS2( adc %0,[%4+8*%1+8])
672 AS2( mov [%2+8*%1+8],%0)
673 AS2( lea %1,[%1+2])
674 ASJ( jrcxz, 1, f)
675 AS2( mov %0,[%3+8*%1])
676 AS2( adc %0,[%4+8*%1])
677 AS2( mov [%2+8*%1],%0)
678 ASJ( jmp, 0, b)
679 ASL(1)
680 AS2( mov %0, 0)
681 AS2( adc %0, %0)
682 ATT_NOPREFIX
683 : "=&r" (result), "+c" (N)
684 : "r" (C+N), "r" (A+N), "r" (B+N)
685 : "memory", "cc"
686 );
687 return (int)result;
688}
689
690int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
691{
692 word result;
693 __asm__ __volatile__
694 (
695 INTEL_NOPREFIX
696 AS1( neg %1)
697 ASJ( jz, 1, f)
698 AS2( mov %0,[%3+8*%1])
699 AS2( sub %0,[%4+8*%1])
700 AS2( mov [%2+8*%1],%0)
701 ASL(0)
702 AS2( mov %0,[%3+8*%1+8])
703 AS2( sbb %0,[%4+8*%1+8])
704 AS2( mov [%2+8*%1+8],%0)
705 AS2( lea %1,[%1+2])
706 ASJ( jrcxz, 1, f)
707 AS2( mov %0,[%3+8*%1])
708 AS2( sbb %0,[%4+8*%1])
709 AS2( mov [%2+8*%1],%0)
710 ASJ( jmp, 0, b)
711 ASL(1)
712 AS2( mov %0, 0)
713 AS2( adc %0, %0)
714 ATT_NOPREFIX
715 : "=&r" (result), "+c" (N)
716 : "r" (C+N), "r" (A+N), "r" (B+N)
717 : "memory", "cc"
718 );
719 return (int)result;
720}
721#elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
722CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
723{
724 AddPrologue
725
726 // now: eax = A, edi = B, edx = C, ecx = N
727 AS2( lea eax, [eax+4*ecx])
728 AS2( lea edi, [edi+4*ecx])
729 AS2( lea edx, [edx+4*ecx])
730
731 AS1( neg ecx) // ecx is negative index
732 AS2( test ecx, 2) // this clears carry flag
733 ASJ( jz, 0, f)
734 AS2( sub ecx, 2)
735 ASJ( jmp, 1, f)
736
737 ASL(0)
738 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
739 AS2( mov esi,[eax+4*ecx])
740 AS2( adc esi,[edi+4*ecx])
741 AS2( mov [edx+4*ecx],esi)
742 AS2( mov esi,[eax+4*ecx+4])
743 AS2( adc esi,[edi+4*ecx+4])
744 AS2( mov [edx+4*ecx+4],esi)
745 ASL(1)
746 AS2( mov esi,[eax+4*ecx+8])
747 AS2( adc esi,[edi+4*ecx+8])
748 AS2( mov [edx+4*ecx+8],esi)
749 AS2( mov esi,[eax+4*ecx+12])
750 AS2( adc esi,[edi+4*ecx+12])
751 AS2( mov [edx+4*ecx+12],esi)
752
753 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
754 ASJ( jmp, 0, b)
755
756 ASL(2)
757 AS2( mov eax, 0)
758 AS1( setc al) // store carry into eax (return result register)
759
760 AddEpilogue
761
762 // http://github.com/weidai11/cryptopp/issues/340
763 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
764 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
765}
766
767CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
768{
769 AddPrologue
770
771 // now: eax = A, edi = B, edx = C, ecx = N
772 AS2( lea eax, [eax+4*ecx])
773 AS2( lea edi, [edi+4*ecx])
774 AS2( lea edx, [edx+4*ecx])
775
776 AS1( neg ecx) // ecx is negative index
777 AS2( test ecx, 2) // this clears carry flag
778 ASJ( jz, 0, f)
779 AS2( sub ecx, 2)
780 ASJ( jmp, 1, f)
781
782 ASL(0)
783 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
784 AS2( mov esi,[eax+4*ecx])
785 AS2( sbb esi,[edi+4*ecx])
786 AS2( mov [edx+4*ecx],esi)
787 AS2( mov esi,[eax+4*ecx+4])
788 AS2( sbb esi,[edi+4*ecx+4])
789 AS2( mov [edx+4*ecx+4],esi)
790 ASL(1)
791 AS2( mov esi,[eax+4*ecx+8])
792 AS2( sbb esi,[edi+4*ecx+8])
793 AS2( mov [edx+4*ecx+8],esi)
794 AS2( mov esi,[eax+4*ecx+12])
795 AS2( sbb esi,[edi+4*ecx+12])
796 AS2( mov [edx+4*ecx+12],esi)
797
798 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
799 ASJ( jmp, 0, b)
800
801 ASL(2)
802 AS2( mov eax, 0)
803 AS1( setc al) // store carry into eax (return result register)
804
805 AddEpilogue
806
807 // http://github.com/weidai11/cryptopp/issues/340
808 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
809 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
810}
811
812#if CRYPTOPP_INTEGER_SSE2
813CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
814{
815 AddPrologue
816
817 // now: eax = A, edi = B, edx = C, ecx = N
818 AS2( lea eax, [eax+4*ecx])
819 AS2( lea edi, [edi+4*ecx])
820 AS2( lea edx, [edx+4*ecx])
821
822 AS1( neg ecx) // ecx is negative index
823 AS2( pxor mm2, mm2)
824 ASJ( jz, 2, f)
825 AS2( test ecx, 2) // this clears carry flag
826 ASJ( jz, 0, f)
827 AS2( sub ecx, 2)
828 ASJ( jmp, 1, f)
829
830 ASL(0)
831 AS2( movd mm0, DWORD PTR [eax+4*ecx])
832 AS2( movd mm1, DWORD PTR [edi+4*ecx])
833 AS2( paddq mm0, mm1)
834 AS2( paddq mm2, mm0)
835 AS2( movd DWORD PTR [edx+4*ecx], mm2)
836 AS2( psrlq mm2, 32)
837
838 AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
839 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
840 AS2( paddq mm0, mm1)
841 AS2( paddq mm2, mm0)
842 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
843 AS2( psrlq mm2, 32)
844
845 ASL(1)
846 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
847 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
848 AS2( paddq mm0, mm1)
849 AS2( paddq mm2, mm0)
850 AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
851 AS2( psrlq mm2, 32)
852
853 AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
854 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
855 AS2( paddq mm0, mm1)
856 AS2( paddq mm2, mm0)
857 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
858 AS2( psrlq mm2, 32)
859
860 AS2( add ecx, 4)
861 ASJ( jnz, 0, b)
862
863 ASL(2)
864 AS2( movd eax, mm2)
865 AS1( emms)
866
867 AddEpilogue
868
869 // http://github.com/weidai11/cryptopp/issues/340
870 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
871 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
872}
873CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
874{
875 AddPrologue
876
877 // now: eax = A, edi = B, edx = C, ecx = N
878 AS2( lea eax, [eax+4*ecx])
879 AS2( lea edi, [edi+4*ecx])
880 AS2( lea edx, [edx+4*ecx])
881
882 AS1( neg ecx) // ecx is negative index
883 AS2( pxor mm2, mm2)
884 ASJ( jz, 2, f)
885 AS2( test ecx, 2) // this clears carry flag
886 ASJ( jz, 0, f)
887 AS2( sub ecx, 2)
888 ASJ( jmp, 1, f)
889
890 ASL(0)
891 AS2( movd mm0, DWORD PTR [eax+4*ecx])
892 AS2( movd mm1, DWORD PTR [edi+4*ecx])
893 AS2( psubq mm0, mm1)
894 AS2( psubq mm0, mm2)
895 AS2( movd DWORD PTR [edx+4*ecx], mm0)
896 AS2( psrlq mm0, 63)
897
898 AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
899 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
900 AS2( psubq mm2, mm1)
901 AS2( psubq mm2, mm0)
902 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
903 AS2( psrlq mm2, 63)
904
905 ASL(1)
906 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
907 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
908 AS2( psubq mm0, mm1)
909 AS2( psubq mm0, mm2)
910 AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
911 AS2( psrlq mm0, 63)
912
913 AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
914 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
915 AS2( psubq mm2, mm1)
916 AS2( psubq mm2, mm0)
917 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
918 AS2( psrlq mm2, 63)
919
920 AS2( add ecx, 4)
921 ASJ( jnz, 0, b)
922
923 ASL(2)
924 AS2( movd eax, mm2)
925 AS1( emms)
926
927 AddEpilogue
928
929 // http://github.com/weidai11/cryptopp/issues/340
930 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
931 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
932}
933#endif // CRYPTOPP_INTEGER_SSE2
934#else // CRYPTOPP_SSE2_ASM_AVAILABLE
935int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
936{
937 CRYPTOPP_ASSERT (N%2 == 0);
938
939 Declare2Words(u);
940 AssignWord(u, 0);
941 for (size_t i=0; i<N; i+=2)
942 {
943 AddWithCarry(u, A[i], B[i]);
944 C[i] = LowWord(u);
945 AddWithCarry(u, A[i+1], B[i+1]);
946 C[i+1] = LowWord(u);
947 }
948 return int(GetCarry(u));
949}
950
951int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
952{
953 CRYPTOPP_ASSERT (N%2 == 0);
954
955 Declare2Words(u);
956 AssignWord(u, 0);
957 for (size_t i=0; i<N; i+=2)
958 {
959 SubtractWithBorrow(u, A[i], B[i]);
960 C[i] = LowWord(u);
961 SubtractWithBorrow(u, A[i+1], B[i+1]);
962 C[i+1] = LowWord(u);
963 }
964 return int(GetBorrow(u));
965}
966#endif
967
968static word LinearMultiply(word *C, const word *AA, word B, size_t N)
969{
970 // http://github.com/weidai11/cryptopp/issues/188
972
973 word carry=0;
974 for(unsigned i=0; i<N; i++)
975 {
976 Declare2Words(p);
977 MultiplyWords(p, A[i], B);
978 Acc2WordsBy1(p, carry);
979 C[i] = LowWord(p);
980 carry = HighWord(p);
981 }
982 return carry;
983}
984
985#ifndef CRYPTOPP_DOXYGEN_PROCESSING
986
987#define Mul_2 \
988 Mul_Begin(2) \
989 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
990 Mul_End(1, 1)
991
992#define Mul_4 \
993 Mul_Begin(4) \
994 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
995 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
996 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
997 Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
998 Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
999 Mul_End(5, 3)
1000
1001#define Mul_8 \
1002 Mul_Begin(8) \
1003 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1004 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1005 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1006 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1007 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1008 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1009 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1010 Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1011 Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1012 Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1013 Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1014 Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1015 Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1016 Mul_End(13, 7)
1017
1018#define Mul_16 \
1019 Mul_Begin(16) \
1020 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1021 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1022 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1023 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1024 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1025 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1026 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1027 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1028 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1029 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1030 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1031 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1032 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1033 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1034 Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1035 Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1036 Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1037 Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1038 Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1039 Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1040 Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1041 Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1042 Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1043 Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1044 Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1045 Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1046 Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1047 Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1048 Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1049 Mul_End(29, 15)
1050
1051#define Squ_2 \
1052 Squ_Begin(2) \
1053 Squ_End(2)
1054
1055#define Squ_4 \
1056 Squ_Begin(4) \
1057 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1058 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1059 Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1060 Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1061 Squ_End(4)
1062
1063#define Squ_8 \
1064 Squ_Begin(8) \
1065 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1066 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1067 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1068 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1069 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1070 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1071 Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1072 Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1073 Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1074 Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1075 Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1076 Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1077 Squ_End(8)
1078
1079#define Squ_16 \
1080 Squ_Begin(16) \
1081 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1082 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1083 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1084 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1085 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1086 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1087 Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1088 Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1089 Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1090 Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1091 Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1092 Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1093 Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1094 Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1095 Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1096 Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1097 Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1098 Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1099 Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1100 Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1101 Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1102 Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1103 Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1104 Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1105 Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1106 Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1107 Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1108 Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1109 Squ_End(16)
1110
1111#define Bot_2 \
1112 Mul_Begin(2) \
1113 Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1114 Bot_End(2)
1115
1116#define Bot_4 \
1117 Mul_Begin(4) \
1118 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1119 Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1120 Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1121 Bot_End(4)
1122
1123#define Bot_8 \
1124 Mul_Begin(8) \
1125 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1126 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1127 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1128 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1129 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1130 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1131 Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1132 Bot_End(8)
1133
1134#define Bot_16 \
1135 Mul_Begin(16) \
1136 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1137 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1138 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1139 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1140 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1141 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1142 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1143 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1144 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1145 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1146 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1147 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1148 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1149 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1150 Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1151 Bot_End(16)
1152
1153#endif
1154
1155#if 0
1156#define Mul_Begin(n) \
1157 Declare2Words(p) \
1158 Declare2Words(c) \
1159 Declare2Words(d) \
1160 MultiplyWords(p, A[0], B[0]) \
1161 AssignWord(c, LowWord(p)) \
1162 AssignWord(d, HighWord(p))
1163
1164#define Mul_Acc(i, j) \
1165 MultiplyWords(p, A[i], B[j]) \
1166 Acc2WordsBy1(c, LowWord(p)) \
1167 Acc2WordsBy1(d, HighWord(p))
1168
1169#define Mul_SaveAcc(k, i, j) \
1170 R[k] = LowWord(c); \
1171 Add2WordsBy1(c, d, HighWord(c)) \
1172 MultiplyWords(p, A[i], B[j]) \
1173 AssignWord(d, HighWord(p)) \
1174 Acc2WordsBy1(c, LowWord(p))
1175
1176#define Mul_End(n) \
1177 R[2*n-3] = LowWord(c); \
1178 Acc2WordsBy1(d, HighWord(c)) \
1179 MultiplyWords(p, A[n-1], B[n-1])\
1180 Acc2WordsBy2(d, p) \
1181 R[2*n-2] = LowWord(d); \
1182 R[2*n-1] = HighWord(d);
1183
1184#define Bot_SaveAcc(k, i, j) \
1185 R[k] = LowWord(c); \
1186 word e = LowWord(d) + HighWord(c); \
1187 e += A[i] * B[j];
1188
1189#define Bot_Acc(i, j) \
1190 e += A[i] * B[j];
1191
1192#define Bot_End(n) \
1193 R[n-1] = e;
1194#else
1195#define Mul_Begin(n) \
1196 Declare2Words(p) \
1197 word c; \
1198 Declare2Words(d) \
1199 MultiplyWords(p, A[0], B[0]) \
1200 c = LowWord(p); \
1201 AssignWord(d, HighWord(p))
1202
1203#define Mul_Acc(i, j) \
1204 MulAcc(c, d, A[i], B[j])
1205
1206#define Mul_SaveAcc(k, i, j) \
1207 R[k] = c; \
1208 c = LowWord(d); \
1209 AssignWord(d, HighWord(d)) \
1210 MulAcc(c, d, A[i], B[j])
1211
1212#define Mul_End(k, i) \
1213 R[k] = c; \
1214 MultiplyWords(p, A[i], B[i]) \
1215 Acc2WordsBy2(p, d) \
1216 R[k+1] = LowWord(p); \
1217 R[k+2] = HighWord(p);
1218
1219#define Bot_SaveAcc(k, i, j) \
1220 R[k] = c; \
1221 c = LowWord(d); \
1222 c += A[i] * B[j];
1223
1224#define Bot_Acc(i, j) \
1225 c += A[i] * B[j];
1226
1227#define Bot_End(n) \
1228 R[n-1] = c;
1229#endif
1230
1231#define Squ_Begin(n) \
1232 Declare2Words(p) \
1233 word c; \
1234 Declare2Words(d) \
1235 Declare2Words(e) \
1236 MultiplyWords(p, A[0], A[0]) \
1237 R[0] = LowWord(p); \
1238 AssignWord(e, HighWord(p)) \
1239 MultiplyWords(p, A[0], A[1]) \
1240 c = LowWord(p); \
1241 AssignWord(d, HighWord(p)) \
1242 Squ_NonDiag \
1243
1244#define Squ_NonDiag \
1245 Double3Words(c, d)
1246
1247#define Squ_SaveAcc(k, i, j) \
1248 Acc3WordsBy2(c, d, e) \
1249 R[k] = c; \
1250 MultiplyWords(p, A[i], A[j]) \
1251 c = LowWord(p); \
1252 AssignWord(d, HighWord(p)) \
1253
1254#define Squ_Acc(i, j) \
1255 MulAcc(c, d, A[i], A[j])
1256
1257#define Squ_Diag(i) \
1258 Squ_NonDiag \
1259 MulAcc(c, d, A[i], A[i])
1260
1261#define Squ_End(n) \
1262 Acc3WordsBy2(c, d, e) \
1263 R[2*n-3] = c; \
1264 MultiplyWords(p, A[n-1], A[n-1])\
1265 Acc2WordsBy2(p, e) \
1266 R[2*n-2] = LowWord(p); \
1267 R[2*n-1] = HighWord(p);
1268
1269
1270void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1271{
1272 // http://github.com/weidai11/cryptopp/issues/188
1275
1276 Mul_2
1277}
1278
1279void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1280{
1281 // http://github.com/weidai11/cryptopp/issues/188
1284
1285 Mul_4
1286}
1287
1288void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1289{
1290 // http://github.com/weidai11/cryptopp/issues/188
1293
1294 Mul_8
1295}
1296
1297void Baseline_Square2(word *R, const word *AA)
1298{
1299 // http://github.com/weidai11/cryptopp/issues/188
1301
1302 Squ_2
1303}
1304
1305void Baseline_Square4(word *R, const word *AA)
1306{
1307 // http://github.com/weidai11/cryptopp/issues/188
1309
1310 Squ_4
1311}
1312
1313void Baseline_Square8(word *R, const word *AA)
1314{
1315 // http://github.com/weidai11/cryptopp/issues/188
1317
1318 Squ_8
1319}
1320
1321void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1322{
1323 // http://github.com/weidai11/cryptopp/issues/188
1326
1327 Bot_2
1328
1329// http://github.com/weidai11/cryptopp/issues/340
1330#if defined(TWO_64_BIT_WORDS)
1331 CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1332#endif
1333}
1334
1335void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1336{
1337 // http://github.com/weidai11/cryptopp/issues/188
1340
1341 Bot_4
1342}
1343
1344void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1345{
1346 // http://github.com/weidai11/cryptopp/issues/188
1349
1350 Bot_8
1351}
1352
1353#define Top_Begin(n) \
1354 Declare2Words(p) \
1355 word c; \
1356 Declare2Words(d) \
1357 MultiplyWords(p, A[0], B[n-2]);\
1358 AssignWord(d, HighWord(p));
1359
1360#define Top_Acc(i, j) \
1361 MultiplyWords(p, A[i], B[j]);\
1362 Acc2WordsBy1(d, HighWord(p));
1363
1364#define Top_SaveAcc0(i, j) \
1365 c = LowWord(d); \
1366 AssignWord(d, HighWord(d)) \
1367 MulAcc(c, d, A[i], B[j])
1368
1369#define Top_SaveAcc1(i, j) \
1370 c = L<c; \
1371 Acc2WordsBy1(d, c); \
1372 c = LowWord(d); \
1373 AssignWord(d, HighWord(d)) \
1374 MulAcc(c, d, A[i], B[j])
1375
1376void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1377{
1378 CRYPTOPP_UNUSED(L);
1379 word T[4];
1380 Baseline_Multiply2(T, A, B);
1381 R[0] = T[2];
1382 R[1] = T[3];
1383}
1384
1385void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1386{
1387 // http://github.com/weidai11/cryptopp/issues/188
1390
1391 Top_Begin(4)
1392 Top_Acc(1, 1) Top_Acc(2, 0) \
1393 Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1394 Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1395 Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1396 Mul_End(1, 3)
1397}
1398
1399void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1400{
1401 // http://github.com/weidai11/cryptopp/issues/188
1404
1405 Top_Begin(8)
1406 Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1407 Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1408 Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1409 Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1410 Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1411 Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1412 Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1413 Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1414 Mul_End(5, 7)
1415}
1416
1417#if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1418void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1419{
1420 // http://github.com/weidai11/cryptopp/issues/188
1423
1424 Mul_16
1425}
1426
1427void Baseline_Square16(word *R, const word *AA)
1428{
1429 // http://github.com/weidai11/cryptopp/issues/188
1431
1432 Squ_16
1433}
1434
1435void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1436{
1437 // http://github.com/weidai11/cryptopp/issues/188
1440
1441 Bot_16
1442}
1443
1444void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1445{
1446 // http://github.com/weidai11/cryptopp/issues/188
1449
1450 Top_Begin(16)
1451 Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1452 Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1453 Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1454 Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1455 Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1456 Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1457 Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1458 Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1459 Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1460 Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1461 Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1462 Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1463 Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1464 Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1465 Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1466 Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1467 Mul_End(13, 15)
1468}
1469#endif
1470
1471// ********************************************************
1472
1473#if CRYPTOPP_INTEGER_SSE2
1474
1475CRYPTOPP_ALIGN_DATA(16)
1477const word32 s_maskLow16[4] = {
1478 0xffff,0xffff,0xffff,0xffff
1479};
1480
1481#undef Mul_Begin
1482#undef Mul_Acc
1483#undef Top_Begin
1484#undef Top_Acc
1485#undef Squ_Acc
1486#undef Squ_NonDiag
1487#undef Squ_Diag
1488#undef Squ_SaveAcc
1489#undef Squ_Begin
1490#undef Mul_SaveAcc
1491#undef Bot_Acc
1492#undef Bot_SaveAcc
1493#undef Bot_End
1494#undef Squ_End
1495#undef Mul_End
1496
1497#define SSE2_FinalSave(k) \
1498 AS2( psllq xmm5, 16) \
1499 AS2( paddq xmm4, xmm5) \
1500 AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1501
1502#define SSE2_SaveShift(k) \
1503 AS2( movq xmm0, xmm6) \
1504 AS2( punpckhqdq xmm6, xmm0) \
1505 AS2( movq xmm1, xmm7) \
1506 AS2( punpckhqdq xmm7, xmm1) \
1507 AS2( paddd xmm6, xmm0) \
1508 AS2( pslldq xmm6, 4) \
1509 AS2( paddd xmm7, xmm1) \
1510 AS2( paddd xmm4, xmm6) \
1511 AS2( pslldq xmm7, 4) \
1512 AS2( movq xmm6, xmm4) \
1513 AS2( paddd xmm5, xmm7) \
1514 AS2( movq xmm7, xmm5) \
1515 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1516 AS2( psrlq xmm6, 16) \
1517 AS2( paddq xmm6, xmm7) \
1518 AS2( punpckhqdq xmm4, xmm0) \
1519 AS2( punpckhqdq xmm5, xmm0) \
1520 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1521 AS2( psrlq xmm6, 3*16) \
1522 AS2( paddd xmm4, xmm6) \
1523
1524#define Squ_SSE2_SaveShift(k) \
1525 AS2( movq xmm0, xmm6) \
1526 AS2( punpckhqdq xmm6, xmm0) \
1527 AS2( movq xmm1, xmm7) \
1528 AS2( punpckhqdq xmm7, xmm1) \
1529 AS2( paddd xmm6, xmm0) \
1530 AS2( pslldq xmm6, 4) \
1531 AS2( paddd xmm7, xmm1) \
1532 AS2( paddd xmm4, xmm6) \
1533 AS2( pslldq xmm7, 4) \
1534 AS2( movhlps xmm6, xmm4) \
1535 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1536 AS2( paddd xmm5, xmm7) \
1537 AS2( movhps QWORD PTR [esp+12], xmm5)\
1538 AS2( psrlq xmm4, 16) \
1539 AS2( paddq xmm4, xmm5) \
1540 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1541 AS2( psrlq xmm4, 3*16) \
1542 AS2( paddd xmm4, xmm6) \
1543 AS2( movq QWORD PTR [esp+4], xmm4)\
1544
1545#define SSE2_FirstMultiply(i) \
1546 AS2( movdqa xmm7, [esi+(i)*16])\
1547 AS2( movdqa xmm5, [edi-(i)*16])\
1548 AS2( pmuludq xmm5, xmm7) \
1549 AS2( movdqa xmm4, [ebx])\
1550 AS2( movdqa xmm6, xmm4) \
1551 AS2( pand xmm4, xmm5) \
1552 AS2( psrld xmm5, 16) \
1553 AS2( pmuludq xmm7, [edx-(i)*16])\
1554 AS2( pand xmm6, xmm7) \
1555 AS2( psrld xmm7, 16)
1556
1557#define Squ_Begin(n) \
1558 SquPrologue \
1559 AS2( mov esi, esp)\
1560 AS2( and esp, 0xfffffff0)\
1561 AS2( lea edi, [esp-32*n])\
1562 AS2( sub esp, 32*n+16)\
1563 AS1( push esi)\
1564 AS2( mov esi, edi) \
1565 AS2( xor edx, edx) \
1566 ASL(1) \
1567 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1568 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1569 AS2( movdqa [edi+2*edx], xmm0) \
1570 AS2( psrlq xmm0, 32) \
1571 AS2( movdqa [edi+2*edx+16], xmm0) \
1572 AS2( movdqa [edi+16*n+2*edx], xmm1) \
1573 AS2( psrlq xmm1, 32) \
1574 AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1575 AS2( add edx, 16) \
1576 AS2( cmp edx, 8*(n)) \
1577 ASJ( jne, 1, b) \
1578 AS2( lea edx, [edi+16*n])\
1579 SSE2_FirstMultiply(0) \
1580
1581#define Squ_Acc(i) \
1582 ASL(LSqu##i) \
1583 AS2( movdqa xmm1, [esi+(i)*16]) \
1584 AS2( movdqa xmm0, [edi-(i)*16]) \
1585 AS2( movdqa xmm2, [ebx]) \
1586 AS2( pmuludq xmm0, xmm1) \
1587 AS2( pmuludq xmm1, [edx-(i)*16]) \
1588 AS2( movdqa xmm3, xmm2) \
1589 AS2( pand xmm2, xmm0) \
1590 AS2( psrld xmm0, 16) \
1591 AS2( paddd xmm4, xmm2) \
1592 AS2( paddd xmm5, xmm0) \
1593 AS2( pand xmm3, xmm1) \
1594 AS2( psrld xmm1, 16) \
1595 AS2( paddd xmm6, xmm3) \
1596 AS2( paddd xmm7, xmm1) \
1597
1598#define Squ_Acc1(i)
1599#define Squ_Acc2(i) ASC(call, LSqu##i)
1600#define Squ_Acc3(i) Squ_Acc2(i)
1601#define Squ_Acc4(i) Squ_Acc2(i)
1602#define Squ_Acc5(i) Squ_Acc2(i)
1603#define Squ_Acc6(i) Squ_Acc2(i)
1604#define Squ_Acc7(i) Squ_Acc2(i)
1605#define Squ_Acc8(i) Squ_Acc2(i)
1606
1607#define SSE2_End(E, n) \
1608 SSE2_SaveShift(2*(n)-3) \
1609 AS2( movdqa xmm7, [esi+16]) \
1610 AS2( movdqa xmm0, [edi]) \
1611 AS2( pmuludq xmm0, xmm7) \
1612 AS2( movdqa xmm2, [ebx]) \
1613 AS2( pmuludq xmm7, [edx]) \
1614 AS2( movdqa xmm6, xmm2) \
1615 AS2( pand xmm2, xmm0) \
1616 AS2( psrld xmm0, 16) \
1617 AS2( paddd xmm4, xmm2) \
1618 AS2( paddd xmm5, xmm0) \
1619 AS2( pand xmm6, xmm7) \
1620 AS2( psrld xmm7, 16) \
1621 SSE2_SaveShift(2*(n)-2) \
1622 SSE2_FinalSave(2*(n)-1) \
1623 AS1( pop esp)\
1624 E
1625
1626#define Squ_End(n) SSE2_End(SquEpilogue, n)
1627#define Mul_End(n) SSE2_End(MulEpilogue, n)
1628#define Top_End(n) SSE2_End(TopEpilogue, n)
1629
1630#define Squ_Column1(k, i) \
1631 Squ_SSE2_SaveShift(k) \
1632 AS2( add esi, 16) \
1633 SSE2_FirstMultiply(1)\
1634 Squ_Acc##i(i) \
1635 AS2( paddd xmm4, xmm4) \
1636 AS2( paddd xmm5, xmm5) \
1637 AS2( movdqa xmm3, [esi]) \
1638 AS2( movq xmm1, QWORD PTR [esi+8]) \
1639 AS2( pmuludq xmm1, xmm3) \
1640 AS2( pmuludq xmm3, xmm3) \
1641 AS2( movdqa xmm0, [ebx])\
1642 AS2( movdqa xmm2, xmm0) \
1643 AS2( pand xmm0, xmm1) \
1644 AS2( psrld xmm1, 16) \
1645 AS2( paddd xmm6, xmm0) \
1646 AS2( paddd xmm7, xmm1) \
1647 AS2( pand xmm2, xmm3) \
1648 AS2( psrld xmm3, 16) \
1649 AS2( paddd xmm6, xmm6) \
1650 AS2( paddd xmm7, xmm7) \
1651 AS2( paddd xmm4, xmm2) \
1652 AS2( paddd xmm5, xmm3) \
1653 AS2( movq xmm0, QWORD PTR [esp+4])\
1654 AS2( movq xmm1, QWORD PTR [esp+12])\
1655 AS2( paddd xmm4, xmm0)\
1656 AS2( paddd xmm5, xmm1)\
1657
1658#define Squ_Column0(k, i) \
1659 Squ_SSE2_SaveShift(k) \
1660 AS2( add edi, 16) \
1661 AS2( add edx, 16) \
1662 SSE2_FirstMultiply(1)\
1663 Squ_Acc##i(i) \
1664 AS2( paddd xmm6, xmm6) \
1665 AS2( paddd xmm7, xmm7) \
1666 AS2( paddd xmm4, xmm4) \
1667 AS2( paddd xmm5, xmm5) \
1668 AS2( movq xmm0, QWORD PTR [esp+4])\
1669 AS2( movq xmm1, QWORD PTR [esp+12])\
1670 AS2( paddd xmm4, xmm0)\
1671 AS2( paddd xmm5, xmm1)\
1672
1673#define SSE2_MulAdd45 \
1674 AS2( movdqa xmm7, [esi]) \
1675 AS2( movdqa xmm0, [edi]) \
1676 AS2( pmuludq xmm0, xmm7) \
1677 AS2( movdqa xmm2, [ebx]) \
1678 AS2( pmuludq xmm7, [edx]) \
1679 AS2( movdqa xmm6, xmm2) \
1680 AS2( pand xmm2, xmm0) \
1681 AS2( psrld xmm0, 16) \
1682 AS2( paddd xmm4, xmm2) \
1683 AS2( paddd xmm5, xmm0) \
1684 AS2( pand xmm6, xmm7) \
1685 AS2( psrld xmm7, 16)
1686
1687#define Mul_Begin(n) \
1688 MulPrologue \
1689 AS2( mov esi, esp)\
1690 AS2( and esp, 0xfffffff0)\
1691 AS2( sub esp, 48*n+16)\
1692 AS1( push esi)\
1693 AS2( xor edx, edx) \
1694 ASL(1) \
1695 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1696 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1697 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1698 AS2( movdqa [esp+20+2*edx], xmm0) \
1699 AS2( psrlq xmm0, 32) \
1700 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1701 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1702 AS2( psrlq xmm1, 32) \
1703 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1704 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1705 AS2( psrlq xmm2, 32) \
1706 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1707 AS2( add edx, 16) \
1708 AS2( cmp edx, 8*(n)) \
1709 ASJ( jne, 1, b) \
1710 AS2( lea edi, [esp+20])\
1711 AS2( lea edx, [esp+20+16*n])\
1712 AS2( lea esi, [esp+20+32*n])\
1713 SSE2_FirstMultiply(0) \
1714
1715#define Mul_Acc(i) \
1716 ASL(LMul##i) \
1717 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1718 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1719 AS2( movdqa xmm2, [ebx]) \
1720 AS2( pmuludq xmm0, xmm1) \
1721 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1722 AS2( movdqa xmm3, xmm2) \
1723 AS2( pand xmm2, xmm0) \
1724 AS2( psrld xmm0, 16) \
1725 AS2( paddd xmm4, xmm2) \
1726 AS2( paddd xmm5, xmm0) \
1727 AS2( pand xmm3, xmm1) \
1728 AS2( psrld xmm1, 16) \
1729 AS2( paddd xmm6, xmm3) \
1730 AS2( paddd xmm7, xmm1) \
1731
1732#define Mul_Acc1(i)
1733#define Mul_Acc2(i) ASC(call, LMul##i)
1734#define Mul_Acc3(i) Mul_Acc2(i)
1735#define Mul_Acc4(i) Mul_Acc2(i)
1736#define Mul_Acc5(i) Mul_Acc2(i)
1737#define Mul_Acc6(i) Mul_Acc2(i)
1738#define Mul_Acc7(i) Mul_Acc2(i)
1739#define Mul_Acc8(i) Mul_Acc2(i)
1740#define Mul_Acc9(i) Mul_Acc2(i)
1741#define Mul_Acc10(i) Mul_Acc2(i)
1742#define Mul_Acc11(i) Mul_Acc2(i)
1743#define Mul_Acc12(i) Mul_Acc2(i)
1744#define Mul_Acc13(i) Mul_Acc2(i)
1745#define Mul_Acc14(i) Mul_Acc2(i)
1746#define Mul_Acc15(i) Mul_Acc2(i)
1747#define Mul_Acc16(i) Mul_Acc2(i)
1748
1749#define Mul_Column1(k, i) \
1750 SSE2_SaveShift(k) \
1751 AS2( add esi, 16) \
1752 SSE2_MulAdd45\
1753 Mul_Acc##i(i) \
1754
1755#define Mul_Column0(k, i) \
1756 SSE2_SaveShift(k) \
1757 AS2( add edi, 16) \
1758 AS2( add edx, 16) \
1759 SSE2_MulAdd45\
1760 Mul_Acc##i(i) \
1761
1762#define Bot_Acc(i) \
1763 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1764 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1765 AS2( pmuludq xmm0, xmm1) \
1766 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1767 AS2( paddq xmm4, xmm0) \
1768 AS2( paddd xmm6, xmm1)
1769
1770#define Bot_SaveAcc(k) \
1771 SSE2_SaveShift(k) \
1772 AS2( add edi, 16) \
1773 AS2( add edx, 16) \
1774 AS2( movdqa xmm6, [esi]) \
1775 AS2( movdqa xmm0, [edi]) \
1776 AS2( pmuludq xmm0, xmm6) \
1777 AS2( paddq xmm4, xmm0) \
1778 AS2( psllq xmm5, 16) \
1779 AS2( paddq xmm4, xmm5) \
1780 AS2( pmuludq xmm6, [edx])
1781
1782#define Bot_End(n) \
1783 AS2( movhlps xmm7, xmm6) \
1784 AS2( paddd xmm6, xmm7) \
1785 AS2( psllq xmm6, 32) \
1786 AS2( paddd xmm4, xmm6) \
1787 AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1788 AS1( pop esp)\
1789 MulEpilogue
1790
1791#define Top_Begin(n) \
1792 TopPrologue \
1793 AS2( mov edx, esp)\
1794 AS2( and esp, 0xfffffff0)\
1795 AS2( sub esp, 48*n+16)\
1796 AS1( push edx)\
1797 AS2( xor edx, edx) \
1798 ASL(1) \
1799 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1800 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1801 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1802 AS2( movdqa [esp+20+2*edx], xmm0) \
1803 AS2( psrlq xmm0, 32) \
1804 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1805 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1806 AS2( psrlq xmm1, 32) \
1807 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1808 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1809 AS2( psrlq xmm2, 32) \
1810 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1811 AS2( add edx, 16) \
1812 AS2( cmp edx, 8*(n)) \
1813 ASJ( jne, 1, b) \
1814 AS2( mov eax, esi) \
1815 AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1816 AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1817 AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1818 AS2( pxor xmm4, xmm4)\
1819 AS2( pxor xmm5, xmm5)
1820
1821#define Top_Acc(i) \
1822 AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1823 AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1824 AS2( psrlq xmm0, 48) \
1825 AS2( paddd xmm5, xmm0)\
1826
1827#define Top_Column0(i) \
1828 AS2( psllq xmm5, 32) \
1829 AS2( add edi, 16) \
1830 AS2( add edx, 16) \
1831 SSE2_MulAdd45\
1832 Mul_Acc##i(i) \
1833
1834#define Top_Column1(i) \
1835 SSE2_SaveShift(0) \
1836 AS2( add esi, 16) \
1837 SSE2_MulAdd45\
1838 Mul_Acc##i(i) \
1839 AS2( shr eax, 16) \
1840 AS2( movd xmm0, eax)\
1841 AS2( movd xmm1, [ecx+4])\
1842 AS2( psrld xmm1, 16)\
1843 AS2( pcmpgtd xmm1, xmm0)\
1844 AS2( psrld xmm1, 31)\
1845 AS2( paddd xmm4, xmm1)\
1846
1847void SSE2_Square4(word *C, const word *A)
1848{
1849 Squ_Begin(2)
1850 Squ_Column0(0, 1)
1851 Squ_End(2)
1852}
1853
1854void SSE2_Square8(word *C, const word *A)
1855{
1856 Squ_Begin(4)
1857#ifndef __GNUC__
1858 ASJ( jmp, 0, f)
1859 Squ_Acc(2)
1860 AS1( ret) ASL(0)
1861#endif
1862 Squ_Column0(0, 1)
1863 Squ_Column1(1, 1)
1864 Squ_Column0(2, 2)
1865 Squ_Column1(3, 1)
1866 Squ_Column0(4, 1)
1867 Squ_End(4)
1868}
1869
1870void SSE2_Square16(word *C, const word *A)
1871{
1872 Squ_Begin(8)
1873#ifndef __GNUC__
1874 ASJ( jmp, 0, f)
1875 Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1876 AS1( ret) ASL(0)
1877#endif
1878 Squ_Column0(0, 1)
1879 Squ_Column1(1, 1)
1880 Squ_Column0(2, 2)
1881 Squ_Column1(3, 2)
1882 Squ_Column0(4, 3)
1883 Squ_Column1(5, 3)
1884 Squ_Column0(6, 4)
1885 Squ_Column1(7, 3)
1886 Squ_Column0(8, 3)
1887 Squ_Column1(9, 2)
1888 Squ_Column0(10, 2)
1889 Squ_Column1(11, 1)
1890 Squ_Column0(12, 1)
1891 Squ_End(8)
1892}
1893
1894void SSE2_Square32(word *C, const word *A)
1895{
1896 Squ_Begin(16)
1897 ASJ( jmp, 0, f)
1898 Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1899 AS1( ret) ASL(0)
1900 Squ_Column0(0, 1)
1901 Squ_Column1(1, 1)
1902 Squ_Column0(2, 2)
1903 Squ_Column1(3, 2)
1904 Squ_Column0(4, 3)
1905 Squ_Column1(5, 3)
1906 Squ_Column0(6, 4)
1907 Squ_Column1(7, 4)
1908 Squ_Column0(8, 5)
1909 Squ_Column1(9, 5)
1910 Squ_Column0(10, 6)
1911 Squ_Column1(11, 6)
1912 Squ_Column0(12, 7)
1913 Squ_Column1(13, 7)
1914 Squ_Column0(14, 8)
1915 Squ_Column1(15, 7)
1916 Squ_Column0(16, 7)
1917 Squ_Column1(17, 6)
1918 Squ_Column0(18, 6)
1919 Squ_Column1(19, 5)
1920 Squ_Column0(20, 5)
1921 Squ_Column1(21, 4)
1922 Squ_Column0(22, 4)
1923 Squ_Column1(23, 3)
1924 Squ_Column0(24, 3)
1925 Squ_Column1(25, 2)
1926 Squ_Column0(26, 2)
1927 Squ_Column1(27, 1)
1928 Squ_Column0(28, 1)
1929 Squ_End(16)
1930}
1931
1932void SSE2_Multiply4(word *C, const word *A, const word *B)
1933{
1934 Mul_Begin(2)
1935#ifndef __GNUC__
1936 ASJ( jmp, 0, f)
1937 Mul_Acc(2)
1938 AS1( ret) ASL(0)
1939#endif
1940 Mul_Column0(0, 2)
1941 Mul_End(2)
1942}
1943
1944void SSE2_Multiply8(word *C, const word *A, const word *B)
1945{
1946 Mul_Begin(4)
1947#ifndef __GNUC__
1948 ASJ( jmp, 0, f)
1949 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1950 AS1( ret) ASL(0)
1951#endif
1952 Mul_Column0(0, 2)
1953 Mul_Column1(1, 3)
1954 Mul_Column0(2, 4)
1955 Mul_Column1(3, 3)
1956 Mul_Column0(4, 2)
1957 Mul_End(4)
1958}
1959
1960void SSE2_Multiply16(word *C, const word *A, const word *B)
1961{
1962 Mul_Begin(8)
1963#ifndef __GNUC__
1964 ASJ( jmp, 0, f)
1965 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1966 AS1( ret) ASL(0)
1967#endif
1968 Mul_Column0(0, 2)
1969 Mul_Column1(1, 3)
1970 Mul_Column0(2, 4)
1971 Mul_Column1(3, 5)
1972 Mul_Column0(4, 6)
1973 Mul_Column1(5, 7)
1974 Mul_Column0(6, 8)
1975 Mul_Column1(7, 7)
1976 Mul_Column0(8, 6)
1977 Mul_Column1(9, 5)
1978 Mul_Column0(10, 4)
1979 Mul_Column1(11, 3)
1980 Mul_Column0(12, 2)
1981 Mul_End(8)
1982}
1983
1984void SSE2_Multiply32(word *C, const word *A, const word *B)
1985{
1986 Mul_Begin(16)
1987 ASJ( jmp, 0, f)
1988 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1989 AS1( ret) ASL(0)
1990 Mul_Column0(0, 2)
1991 Mul_Column1(1, 3)
1992 Mul_Column0(2, 4)
1993 Mul_Column1(3, 5)
1994 Mul_Column0(4, 6)
1995 Mul_Column1(5, 7)
1996 Mul_Column0(6, 8)
1997 Mul_Column1(7, 9)
1998 Mul_Column0(8, 10)
1999 Mul_Column1(9, 11)
2000 Mul_Column0(10, 12)
2001 Mul_Column1(11, 13)
2002 Mul_Column0(12, 14)
2003 Mul_Column1(13, 15)
2004 Mul_Column0(14, 16)
2005 Mul_Column1(15, 15)
2006 Mul_Column0(16, 14)
2007 Mul_Column1(17, 13)
2008 Mul_Column0(18, 12)
2009 Mul_Column1(19, 11)
2010 Mul_Column0(20, 10)
2011 Mul_Column1(21, 9)
2012 Mul_Column0(22, 8)
2013 Mul_Column1(23, 7)
2014 Mul_Column0(24, 6)
2015 Mul_Column1(25, 5)
2016 Mul_Column0(26, 4)
2017 Mul_Column1(27, 3)
2018 Mul_Column0(28, 2)
2019 Mul_End(16)
2020}
2021
2022void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2023{
2024 Mul_Begin(2)
2025 Bot_SaveAcc(0) Bot_Acc(2)
2026 Bot_End(2)
2027}
2028
2029void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2030{
2031 Mul_Begin(4)
2032#ifndef __GNUC__
2033 ASJ( jmp, 0, f)
2034 Mul_Acc(3) Mul_Acc(2)
2035 AS1( ret) ASL(0)
2036#endif
2037 Mul_Column0(0, 2)
2038 Mul_Column1(1, 3)
2039 Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2040 Bot_End(4)
2041}
2042
2043void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2044{
2045 Mul_Begin(8)
2046#ifndef __GNUC__
2047 ASJ( jmp, 0, f)
2048 Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2049 AS1( ret) ASL(0)
2050#endif
2051 Mul_Column0(0, 2)
2052 Mul_Column1(1, 3)
2053 Mul_Column0(2, 4)
2054 Mul_Column1(3, 5)
2055 Mul_Column0(4, 6)
2056 Mul_Column1(5, 7)
2057 Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2058 Bot_End(8)
2059}
2060
2061void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2062{
2063 Mul_Begin(16)
2064#ifndef __GNUC__
2065 ASJ( jmp, 0, f)
2066 Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2067 AS1( ret) ASL(0)
2068#endif
2069 Mul_Column0(0, 2)
2070 Mul_Column1(1, 3)
2071 Mul_Column0(2, 4)
2072 Mul_Column1(3, 5)
2073 Mul_Column0(4, 6)
2074 Mul_Column1(5, 7)
2075 Mul_Column0(6, 8)
2076 Mul_Column1(7, 9)
2077 Mul_Column0(8, 10)
2078 Mul_Column1(9, 11)
2079 Mul_Column0(10, 12)
2080 Mul_Column1(11, 13)
2081 Mul_Column0(12, 14)
2082 Mul_Column1(13, 15)
2083 Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2084 Bot_End(16)
2085}
2086
2087void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2088{
2089 Top_Begin(4)
2090 Top_Acc(3) Top_Acc(2) Top_Acc(1)
2091#ifndef __GNUC__
2092 ASJ( jmp, 0, f)
2093 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2094 AS1( ret) ASL(0)
2095#endif
2096 Top_Column0(4)
2097 Top_Column1(3)
2098 Mul_Column0(0, 2)
2099 Top_End(2)
2100}
2101
2102void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2103{
2104 Top_Begin(8)
2105 Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2106#ifndef __GNUC__
2107 ASJ( jmp, 0, f)
2108 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2109 AS1( ret) ASL(0)
2110#endif
2111 Top_Column0(8)
2112 Top_Column1(7)
2113 Mul_Column0(0, 6)
2114 Mul_Column1(1, 5)
2115 Mul_Column0(2, 4)
2116 Mul_Column1(3, 3)
2117 Mul_Column0(4, 2)
2118 Top_End(4)
2119}
2120
2121void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2122{
2123 Top_Begin(16)
2124 Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2125#ifndef __GNUC__
2126 ASJ( jmp, 0, f)
2127 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2128 AS1( ret) ASL(0)
2129#endif
2130 Top_Column0(16)
2131 Top_Column1(15)
2132 Mul_Column0(0, 14)
2133 Mul_Column1(1, 13)
2134 Mul_Column0(2, 12)
2135 Mul_Column1(3, 11)
2136 Mul_Column0(4, 10)
2137 Mul_Column1(5, 9)
2138 Mul_Column0(6, 8)
2139 Mul_Column1(7, 7)
2140 Mul_Column0(8, 6)
2141 Mul_Column1(9, 5)
2142 Mul_Column0(10, 4)
2143 Mul_Column1(11, 3)
2144 Mul_Column0(12, 2)
2145 Top_End(8)
2146}
2147
2148#endif // #if CRYPTOPP_INTEGER_SSE2
2149
2150// ********************************************************
2151
2152typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2153typedef void (* PMul)(word *C, const word *A, const word *B);
2154typedef void (* PSqu)(word *C, const word *A);
2155typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2156
2157#if CRYPTOPP_INTEGER_SSE2
2158static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2159static size_t s_recursionLimit = 8;
2160#else
2161static const size_t s_recursionLimit = 16;
2162#endif // CRYPTOPP_INTEGER_SSE2
2163
2164static PMul s_pMul[9], s_pBot[9];
2165static PSqu s_pSqu[9];
2166static PMulTop s_pTop[9];
2167
2168void SetFunctionPointers()
2169{
2170 s_pMul[0] = &Baseline_Multiply2;
2171 s_pBot[0] = &Baseline_MultiplyBottom2;
2172 s_pSqu[0] = &Baseline_Square2;
2173 s_pTop[0] = &Baseline_MultiplyTop2;
2174 s_pTop[1] = &Baseline_MultiplyTop4;
2175
2176#if CRYPTOPP_INTEGER_SSE2
2177 if (HasSSE2())
2178 {
2179 if (IsP4())
2180 {
2181 s_pAdd = &SSE2_Add;
2182 s_pSub = &SSE2_Sub;
2183 }
2184
2185 s_recursionLimit = 32;
2186
2187 s_pMul[1] = &SSE2_Multiply4;
2188 s_pMul[2] = &SSE2_Multiply8;
2189 s_pMul[4] = &SSE2_Multiply16;
2190 s_pMul[8] = &SSE2_Multiply32;
2191
2192 s_pBot[1] = &SSE2_MultiplyBottom4;
2193 s_pBot[2] = &SSE2_MultiplyBottom8;
2194 s_pBot[4] = &SSE2_MultiplyBottom16;
2195 s_pBot[8] = &SSE2_MultiplyBottom32;
2196
2197 s_pSqu[1] = &SSE2_Square4;
2198 s_pSqu[2] = &SSE2_Square8;
2199 s_pSqu[4] = &SSE2_Square16;
2200 s_pSqu[8] = &SSE2_Square32;
2201
2202 s_pTop[2] = &SSE2_MultiplyTop8;
2203 s_pTop[4] = &SSE2_MultiplyTop16;
2204 s_pTop[8] = &SSE2_MultiplyTop32;
2205 }
2206 else
2207#endif // CRYPTOPP_INTEGER_SSE2
2208 {
2209 s_pMul[1] = &Baseline_Multiply4;
2210 s_pMul[2] = &Baseline_Multiply8;
2211
2212 s_pBot[1] = &Baseline_MultiplyBottom4;
2213 s_pBot[2] = &Baseline_MultiplyBottom8;
2214
2215 s_pSqu[1] = &Baseline_Square4;
2216 s_pSqu[2] = &Baseline_Square8;
2217
2218 s_pTop[2] = &Baseline_MultiplyTop8;
2219
2220#if !CRYPTOPP_INTEGER_SSE2
2221 s_pMul[4] = &Baseline_Multiply16;
2222 s_pBot[4] = &Baseline_MultiplyBottom16;
2223 s_pSqu[4] = &Baseline_Square16;
2224 s_pTop[4] = &Baseline_MultiplyTop16;
2225#endif // !CRYPTOPP_INTEGER_SSE2
2226 }
2227}
2228
2229inline int Add(word *C, const word *A, const word *B, size_t N)
2230{
2231#if CRYPTOPP_INTEGER_SSE2
2232 return s_pAdd(N, C, A, B);
2233#else
2234 return Baseline_Add(N, C, A, B);
2235#endif // CRYPTOPP_INTEGER_SSE2
2236}
2237
2238inline int Subtract(word *C, const word *A, const word *B, size_t N)
2239{
2240#if CRYPTOPP_INTEGER_SSE2
2241 return s_pSub(N, C, A, B);
2242#else
2243 return Baseline_Sub(N, C, A, B);
2244#endif // CRYPTOPP_INTEGER_SSE2
2245}
2246
2247// ********************************************************
2248
2249
2250#define A0 A
2251#define A1 (A+N2)
2252#define B0 B
2253#define B1 (B+N2)
2254
2255#define T0 T
2256#define T1 (T+N2)
2257#define T2 (T+N)
2258#define T3 (T+N+N2)
2259
2260#define R0 R
2261#define R1 (R+N2)
2262#define R2 (R+N)
2263#define R3 (R+N+N2)
2264
2265// R[2*N] - result = A*B
2266// T[2*N] - temporary work space
2267// A[N] --- multiplier
2268// B[N] --- multiplicant
2269
2270void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2271{
2272 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2273
2274 if (N <= s_recursionLimit)
2275 s_pMul[N/4](R, A, B);
2276 else
2277 {
2278 const size_t N2 = N/2;
2279
2280 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2281 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2282
2283 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2284 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2285
2286 RecursiveMultiply(R2, T2, A1, B1, N2);
2287 RecursiveMultiply(T0, T2, R0, R1, N2);
2288 RecursiveMultiply(R0, T2, A0, B0, N2);
2289
2290 // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2291
2292 int c2 = Add(R2, R2, R1, N2);
2293 int c3 = c2;
2294 c2 += Add(R1, R2, R0, N2);
2295 c3 += Add(R2, R2, R3, N2);
2296
2297 if (AN2 == BN2)
2298 c3 -= Subtract(R1, R1, T0, N);
2299 else
2300 c3 += Add(R1, R1, T0, N);
2301
2302 c3 += Increment(R2, N2, c2);
2303 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2304 Increment(R3, N2, c3);
2305 }
2306}
2307
2308// R[2*N] - result = A*A
2309// T[2*N] - temporary work space
2310// A[N] --- number to be squared
2311
2312void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2313{
2314 CRYPTOPP_ASSERT(N && N%2==0);
2315
2316 if (N <= s_recursionLimit)
2317 s_pSqu[N/4](R, A);
2318 else
2319 {
2320 const size_t N2 = N/2;
2321
2322 RecursiveSquare(R0, T2, A0, N2);
2323 RecursiveSquare(R2, T2, A1, N2);
2324 RecursiveMultiply(T0, T2, A0, A1, N2);
2325
2326 int carry = Add(R1, R1, T0, N);
2327 carry += Add(R1, R1, T0, N);
2328 Increment(R3, N2, carry);
2329 }
2330}
2331
2332// R[N] - bottom half of A*B
2333// T[3*N/2] - temporary work space
2334// A[N] - multiplier
2335// B[N] - multiplicant
2336
2337void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2338{
2339 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2340
2341 if (N <= s_recursionLimit)
2342 s_pBot[N/4](R, A, B);
2343 else
2344 {
2345 const size_t N2 = N/2;
2346
2347 RecursiveMultiply(R, T, A0, B0, N2);
2348 RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2349 Add(R1, R1, T0, N2);
2350 RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2351 Add(R1, R1, T0, N2);
2352 }
2353}
2354
2355// R[N] --- upper half of A*B
2356// T[2*N] - temporary work space
2357// L[N] --- lower half of A*B
2358// A[N] --- multiplier
2359// B[N] --- multiplicant
2360
2361void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2362{
2363 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2364
2365 if (N <= s_recursionLimit)
2366 s_pTop[N/4](R, A, B, L[N-1]);
2367 else
2368 {
2369 const size_t N2 = N/2;
2370
2371 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2372 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2373
2374 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2375 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2376
2377 RecursiveMultiply(T0, T2, R0, R1, N2);
2378 RecursiveMultiply(R0, T2, A1, B1, N2);
2379
2380 // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2381
2382 int t, c3;
2383 int c2 = Subtract(T2, L+N2, L, N2);
2384
2385 if (AN2 == BN2)
2386 {
2387 c2 -= Add(T2, T2, T0, N2);
2388 t = (Compare(T2, R0, N2) == -1);
2389 c3 = t - Subtract(T2, T2, T1, N2);
2390 }
2391 else
2392 {
2393 c2 += Subtract(T2, T2, T0, N2);
2394 t = (Compare(T2, R0, N2) == -1);
2395 c3 = t + Add(T2, T2, T1, N2);
2396 }
2397
2398 c2 += t;
2399 if (c2 >= 0)
2400 c3 += Increment(T2, N2, c2);
2401 else
2402 c3 -= Decrement(T2, N2, -c2);
2403 c3 += Add(R0, T2, R1, N2);
2404
2405 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2406 Increment(R1, N2, c3);
2407 }
2408}
2409
2410inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2411{
2412 RecursiveMultiply(R, T, A, B, N);
2413}
2414
2415inline void Square(word *R, word *T, const word *A, size_t N)
2416{
2417 RecursiveSquare(R, T, A, N);
2418}
2419
2420inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2421{
2422 RecursiveMultiplyBottom(R, T, A, B, N);
2423}
2424
2425// R[NA+NB] - result = A*B
2426// T[NA+NB] - temporary work space
2427// A[NA] ---- multiplier
2428// B[NB] ---- multiplicant
2429
2430void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2431{
2432 if (NA == NB)
2433 {
2434 // Profiling guided the flow below.
2435 if (A != B)
2436 Multiply(R, T, A, B, NA);
2437 else
2438 Square(R, T, A, NA);
2439
2440 return;
2441 }
2442
2443 if (NA > NB)
2444 {
2445 std::swap(A, B);
2446 std::swap(NA, NB);
2447 }
2448
2449 CRYPTOPP_ASSERT(NB % NA == 0);
2450
2451 if (NA==2 && !A[1])
2452 {
2453 // Profiling guided the flow below.
2454 switch (A[0])
2455 {
2456 default:
2457 R[NB] = LinearMultiply(R, B, A[0], NB);
2458 R[NB+1] = 0;
2459 return;
2460 case 0:
2461 SetWords(R, 0, NB+2);
2462 return;
2463 case 1:
2464 CopyWords(R, B, NB);
2465 R[NB] = R[NB+1] = 0;
2466 return;
2467 }
2468 }
2469
2470 size_t i;
2471 if ((NB/NA)%2 == 0)
2472 {
2473 Multiply(R, T, A, B, NA);
2474 CopyWords(T+2*NA, R+NA, NA);
2475
2476 for (i=2*NA; i<NB; i+=2*NA)
2477 Multiply(T+NA+i, T, A, B+i, NA);
2478 for (i=NA; i<NB; i+=2*NA)
2479 Multiply(R+i, T, A, B+i, NA);
2480 }
2481 else
2482 {
2483 for (i=0; i<NB; i+=2*NA)
2484 Multiply(R+i, T, A, B+i, NA);
2485 for (i=NA; i<NB; i+=2*NA)
2486 Multiply(T+NA+i, T, A, B+i, NA);
2487 }
2488
2489 if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2490 Increment(R+NB, NA);
2491}
2492
2493// R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2494// T[3*N/2] - temporary work space
2495// A[N] ----- an odd number as input
2496
2497void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2498{
2499 // Profiling guided the flow below.
2500 if (N!=2)
2501 {
2502 const size_t N2 = N/2;
2503 RecursiveInverseModPower2(R0, T0, A0, N2);
2504 T0[0] = 1;
2505 SetWords(T0+1, 0, N2-1);
2506 MultiplyTop(R1, T1, T0, R0, A0, N2);
2507 MultiplyBottom(T0, T1, R0, A1, N2);
2508 Add(T0, R1, T0, N2);
2509 TwosComplement(T0, N2);
2510 MultiplyBottom(R1, T1, R0, T0, N2);
2511 }
2512 else
2513 {
2514 T[0] = AtomicInverseModPower2(A[0]);
2515 T[1] = 0;
2516 s_pBot[0](T+2, T, A);
2517 TwosComplement(T+2, 2);
2518 Increment(T+2, 2, 2);
2519 s_pBot[0](R, T, T+2);
2520 }
2521}
2522
2523// R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2524// T[3*N] - temporary work space
2525// X[2*N] - number to be reduced
2526// M[N] --- modulus
2527// U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2528
2529void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2530{
2531#if 1
2532 MultiplyBottom(R, T, X, U, N);
2533 MultiplyTop(T, T+N, X, R, M, N);
2534 word borrow = Subtract(T, X+N, T, N);
2535 // defend against timing attack by doing this Add even when not needed
2536 word carry = Add(T+N, T, M, N);
2537 CRYPTOPP_ASSERT(carry | !borrow);
2538 CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2539 CopyWords(R, T + ((0-borrow) & N), N);
2540#elif 0
2541 const word u = 0-U[0];
2542 Declare2Words(p)
2543 for (size_t i=0; i<N; i++)
2544 {
2545 const word t = u * X[i];
2546 word c = 0;
2547 for (size_t j=0; j<N; j+=2)
2548 {
2549 MultiplyWords(p, t, M[j]);
2550 Acc2WordsBy1(p, X[i+j]);
2551 Acc2WordsBy1(p, c);
2552 X[i+j] = LowWord(p);
2553 c = HighWord(p);
2554 MultiplyWords(p, t, M[j+1]);
2555 Acc2WordsBy1(p, X[i+j+1]);
2556 Acc2WordsBy1(p, c);
2557 X[i+j+1] = LowWord(p);
2558 c = HighWord(p);
2559 }
2560
2561 if (Increment(X+N+i, N-i, c))
2562 while (!Subtract(X+N, X+N, M, N)) {}
2563 }
2564
2565 std::memcpy(R, X+N, N*WORD_SIZE);
2566#else
2567 __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2568 for (size_t i=0; i<N; i++)
2569 {
2570 __m64 t = _mm_cvtsi32_si64(X[i]);
2571 t = _mm_mul_su32(t, u);
2572 __m64 c = _mm_setzero_si64();
2573 for (size_t j=0; j<N; j+=2)
2574 {
2575 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2576 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2577 c = _mm_add_si64(c, p);
2578 X[i+j] = _mm_cvtsi64_si32(c);
2579 c = _mm_srli_si64(c, 32);
2580 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2581 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2582 c = _mm_add_si64(c, p);
2583 X[i+j+1] = _mm_cvtsi64_si32(c);
2584 c = _mm_srli_si64(c, 32);
2585 }
2586
2587 if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2588 while (!Subtract(X+N, X+N, M, N)) {}
2589 }
2590
2591 std::memcpy(R, X+N, N*WORD_SIZE);
2592 _mm_empty();
2593#endif
2594}
2595
2596// R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2597// T[2*N] - temporary work space
2598// X[2*N] - number to be reduced
2599// M[N] --- modulus
2600// U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2601// V[N] --- 2**(WORD_BITS*3*N/2) mod M
2602
2603void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2604{
2605 CRYPTOPP_ASSERT(N%2==0 && N>=4);
2606
2607#define M0 M
2608#define M1 (M+N2)
2609#define V0 V
2610#define V1 (V+N2)
2611
2612#define X0 X
2613#define X1 (X+N2)
2614#define X2 (X+N)
2615#define X3 (X+N+N2)
2616
2617 const size_t N2 = N/2;
2618 Multiply(T0, T2, V0, X3, N2);
2619 int c2 = Add(T0, T0, X0, N);
2620 MultiplyBottom(T3, T2, T0, U, N2);
2621 MultiplyTop(T2, R, T0, T3, M0, N2);
2622 c2 -= Subtract(T2, T1, T2, N2);
2623 Multiply(T0, R, T3, M1, N2);
2624 c2 -= Subtract(T0, T2, T0, N2);
2625 int c3 = -(int)Subtract(T1, X2, T1, N2);
2626 Multiply(R0, T2, V1, X3, N2);
2627 c3 += Add(R, R, T, N);
2628
2629 if (c2>0)
2630 c3 += Increment(R1, N2);
2631 else if (c2<0)
2632 c3 -= Decrement(R1, N2, -c2);
2633
2634 CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2635 if (c3>0)
2636 Subtract(R, R, M, N);
2637 else if (c3<0)
2638 Add(R, R, M, N);
2639
2640#undef M0
2641#undef M1
2642#undef V0
2643#undef V1
2644
2645#undef X0
2646#undef X1
2647#undef X2
2648#undef X3
2649}
2650
2651#undef A0
2652#undef A1
2653#undef B0
2654#undef B1
2655
2656#undef T0
2657#undef T1
2658#undef T2
2659#undef T3
2660
2661#undef R0
2662#undef R1
2663#undef R2
2664#undef R3
2665
2666/*
2667// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2668static word SubatomicDivide(word *A, word B0, word B1)
2669{
2670 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2671 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2672
2673 // estimate the quotient: do a 2 word by 1 word divide
2674 word Q;
2675 if (B1+1 == 0)
2676 Q = A[2];
2677 else
2678 Q = DWord(A[1], A[2]).DividedBy(B1+1);
2679
2680 // now subtract Q*B from A
2681 DWord p = DWord::Multiply(B0, Q);
2682 DWord u = (DWord) A[0] - p.GetLowHalf();
2683 A[0] = u.GetLowHalf();
2684 u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2685 A[1] = u.GetLowHalf();
2686 A[2] += u.GetHighHalf();
2687
2688 // Q <= actual quotient, so fix it
2689 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2690 {
2691 u = (DWord) A[0] - B0;
2692 A[0] = u.GetLowHalf();
2693 u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2694 A[1] = u.GetLowHalf();
2695 A[2] += u.GetHighHalf();
2696 Q++;
2697 CRYPTOPP_ASSERT(Q); // shouldn't overflow
2698 }
2699
2700 return Q;
2701}
2702
2703// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2704static inline void AtomicDivide(word *Q, const word *A, const word *B)
2705{
2706 if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2707 {
2708 Q[0] = A[2];
2709 Q[1] = A[3];
2710 }
2711 else
2712 {
2713 word T[4];
2714 T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2715 Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2716 Q[0] = SubatomicDivide(T, B[0], B[1]);
2717
2718#if defined(CRYPTOPP_DEBUG)
2719 // multiply quotient and divisor and add remainder, make sure it equals dividend
2720 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2721 word P[4];
2722 LowLevel::Multiply2(P, Q, B);
2723 Add(P, P, T, 4);
2724 CRYPTOPP_ASSERT(std::memcmp(P, A, 4*WORD_SIZE)==0);
2725#endif
2726 }
2727}
2728*/
2729
2730static inline void AtomicDivide(word *Q, const word *A, const word *B)
2731{
2732 word T[4];
2733 DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2734 Q[0] = q.GetLowHalf();
2735 Q[1] = q.GetHighHalf();
2736
2737#if defined(CRYPTOPP_DEBUG)
2738 if (B[0] || B[1])
2739 {
2740 // multiply quotient and divisor and add remainder, make sure it equals dividend
2741 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2742 word P[4];
2743 s_pMul[0](P, Q, B);
2744 Add(P, P, T, 4);
2745 CRYPTOPP_ASSERT(std::memcmp(P, A, 4*WORD_SIZE)==0);
2746 }
2747#endif
2748}
2749
2750// for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2751static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2752{
2753 CRYPTOPP_ASSERT(N && N%2==0);
2754
2755 AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2756
2757 word borrow = Subtract(R, R, T, N+2);
2758 CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2759 CRYPTOPP_UNUSED(borrow);
2760
2761 while (R[N] || Compare(R, B, N) >= 0)
2762 {
2763 R[N] -= Subtract(R, R, B, N);
2764 Q[1] += (++Q[0]==0);
2765 CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2766 }
2767}
2768
2769// R[NB] -------- remainder = A%B
2770// Q[NA-NB+2] --- quotient = A/B
2771// T[NA+3*(NB+2)] - temp work space
2772// A[NA] -------- dividend
2773// B[NB] -------- divisor
2774
2775void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2776{
2777 CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2778 CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2779 CRYPTOPP_ASSERT(NB <= NA);
2780
2781 // set up temporary work space
2782 word *const TA=T;
2783 word *const TB=T+NA+2;
2784 word *const TP=T+NA+2+NB;
2785
2786 // copy B into TB and normalize it so that TB has highest bit set to 1
2787 unsigned shiftWords = (B[NB-1]==0);
2788 TB[0] = TB[NB-1] = 0;
2789 CopyWords(TB+shiftWords, B, NB-shiftWords);
2790 unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2791 CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2792 ShiftWordsLeftByBits(TB, NB, shiftBits);
2793
2794 // copy A into TA and normalize it
2795 TA[0] = TA[NA] = TA[NA+1] = 0;
2796 CopyWords(TA+shiftWords, A, NA);
2797 ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2798
2799 if (TA[NA+1]==0 && TA[NA] <= 1)
2800 {
2801 Q[NA-NB+1] = Q[NA-NB] = 0;
2802 while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2803 {
2804 TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2805 ++Q[NA-NB];
2806 }
2807 }
2808 else
2809 {
2810 NA+=2;
2811 CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2812 }
2813
2814 word BT[2];
2815 BT[0] = TB[NB-2] + 1;
2816 BT[1] = TB[NB-1] + (BT[0]==0);
2817
2818 // start reducing TA mod TB, 2 words at a time
2819 for (size_t i=NA-2; i>=NB; i-=2)
2820 {
2821 AtomicDivide(Q+i-NB, TA+i-2, BT);
2822 CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2823 }
2824
2825 // copy TA into R, and denormalize it
2826 CopyWords(R, TA+shiftWords, NB);
2827 ShiftWordsRightByBits(R, NB, shiftBits);
2828}
2829
2830static inline size_t EvenWordCount(const word *X, size_t N)
2831{
2832 while (N && X[N-2]==0 && X[N-1]==0)
2833 N-=2;
2834 return N;
2835}
2836
2837// return k
2838// R[N] --- result = A^(-1) * 2^k mod M
2839// T[4*N] - temporary work space
2840// A[NA] -- number to take inverse of
2841// M[N] --- modulus
2842
2843unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2844{
2845 CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2846
2847 word *b = T;
2848 word *c = T+N;
2849 word *f = T+2*N;
2850 word *g = T+3*N;
2851 size_t bcLen=2, fgLen=EvenWordCount(M, N);
2852 unsigned int k=0;
2853 bool s=false;
2854
2855 SetWords(T, 0, 3*N);
2856 b[0]=1;
2857 CopyWords(f, A, NA);
2858 CopyWords(g, M, N);
2859
2860 while (1)
2861 {
2862 word t=f[0];
2863 while (!t)
2864 {
2865 if (EvenWordCount(f, fgLen)==0)
2866 {
2867 SetWords(R, 0, N);
2868 return 0;
2869 }
2870
2871 ShiftWordsRightByWords(f, fgLen, 1);
2872 bcLen += 2 * (c[bcLen-1] != 0);
2873 CRYPTOPP_ASSERT(bcLen <= N);
2874 ShiftWordsLeftByWords(c, bcLen, 1);
2875 k+=WORD_BITS;
2876 t=f[0];
2877 }
2878
2879 // t must be non-0; otherwise undefined.
2880 unsigned int i = TrailingZeros(t);
2881 t >>= i;
2882 k += i;
2883
2884 if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2885 {
2886 if (s)
2887 Subtract(R, M, b, N);
2888 else
2889 CopyWords(R, b, N);
2890 return k;
2891 }
2892
2893 ShiftWordsRightByBits(f, fgLen, i);
2894 t = ShiftWordsLeftByBits(c, bcLen, i);
2895 c[bcLen] += t;
2896 bcLen += 2 * (t!=0);
2897 CRYPTOPP_ASSERT(bcLen <= N);
2898
2899 bool swap = Compare(f, g, fgLen)==-1;
2900 ConditionalSwapPointers(swap, f, g);
2901 ConditionalSwapPointers(swap, b, c);
2902 s ^= swap;
2903
2904 fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2905
2906 Subtract(f, f, g, fgLen);
2907 t = Add(b, b, c, bcLen);
2908 b[bcLen] += t;
2909 bcLen += 2*t;
2910 CRYPTOPP_ASSERT(bcLen <= N);
2911 }
2912}
2913
2914// R[N] - result = A/(2^k) mod M
2915// A[N] - input
2916// M[N] - modulus
2917
2918void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2919{
2920 CopyWords(R, A, N);
2921
2922 while (k--)
2923 {
2924 if (R[0]%2==0)
2925 ShiftWordsRightByBits(R, N, 1);
2926 else
2927 {
2928 word carry = Add(R, R, M, N);
2929 ShiftWordsRightByBits(R, N, 1);
2930 R[N-1] += carry<<(WORD_BITS-1);
2931 }
2932 }
2933}
2934
2935// R[N] - result = A*(2^k) mod M
2936// A[N] - input
2937// M[N] - modulus
2938
2939void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2940{
2941 CopyWords(R, A, N);
2942
2943 while (k--)
2944 if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2945 Subtract(R, R, M, N);
2946}
2947
2948// ******************************************************************
2949
2950static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2951
2952static inline size_t RoundupSize(size_t n)
2953{
2954 if (n<=8)
2955 return RoundupSizeTable[n];
2956 else if (n<=16)
2957 return 16;
2958 else if (n<=32)
2959 return 32;
2960 else if (n<=64)
2961 return 64;
2962 else
2963 return size_t(1) << BitPrecision(n-1);
2964}
2965
2967 : reg(2), sign(POSITIVE)
2968{
2969 reg[0] = reg[1] = 0;
2970}
2971
2972Integer::Integer(const Integer& t)
2973 : reg(RoundupSize(t.WordCount())), sign(t.sign)
2974{
2975 CopyWords(reg, t.reg, reg.size());
2976}
2977
2978Integer::Integer(Sign s, lword value)
2979 : reg(2), sign(s)
2980{
2981 reg[0] = word(value);
2982 reg[1] = word(SafeRightShift<WORD_BITS>(value));
2983}
2984
2985Integer::Integer(signed long value)
2986 : reg(2)
2987{
2988 if (value >= 0)
2989 sign = POSITIVE;
2990 else
2991 {
2992 sign = NEGATIVE;
2993 value = -value;
2994 }
2995 reg[0] = word(value);
2996 reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
2997}
2998
2999Integer::Integer(Sign s, word high, word low)
3000 : reg(2), sign(s)
3001{
3002 reg[0] = low;
3003 reg[1] = high;
3004}
3005
3007{
3008 if (ByteCount() > sizeof(long))
3009 return false;
3010
3011 unsigned long value = (unsigned long)reg[0];
3012 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3013
3014 if (sign==POSITIVE)
3015 return (signed long)value >= 0;
3016 else
3017 return -(signed long)value < 0;
3018}
3019
3020signed long Integer::ConvertToLong() const
3021{
3023
3024 unsigned long value = (unsigned long)reg[0];
3025 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3026 return sign==POSITIVE ? value : -(signed long)value;
3027}
3028
3029Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3030{
3032
3033 if (o != LITTLE_ENDIAN_ORDER)
3034 {
3035 Decode(encodedInteger, byteCount, s);
3036 }
3037 else
3038 {
3039 SecByteBlock block(byteCount);
3040 encodedInteger.Get(block, block.size());
3041 std::reverse(block.begin(), block.begin()+block.size());
3042
3043 Decode(block.begin(), block.size(), s);
3044 }
3045}
3046
3047Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3048{
3049 CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3051
3052 if (o != LITTLE_ENDIAN_ORDER)
3053 {
3054 Decode(encodedInteger, byteCount, s);
3055 }
3056 else
3057 {
3058 SecByteBlock block(byteCount);
3059#if (CRYPTOPP_MSC_VERSION >= 1500)
3060 std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3061 stdext::make_checked_array_iterator(block.begin(), block.size()));
3062#else
3063 std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3064#endif
3065 Decode(block.begin(), block.size(), s);
3066 return;
3067 }
3068}
3069
3071{
3072 // Make explicit call to avoid virtual-dispatch findings in ctor
3074}
3075
3076Integer::Integer(RandomNumberGenerator &rng, size_t bitcount)
3077{
3078 Randomize(rng, bitcount);
3079}
3080
3081Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3082{
3083 if (!Randomize(rng, min, max, rnType, equiv, mod))
3085}
3086
3087Integer Integer::Power2(size_t e)
3088{
3089 Integer r((word)0, BitsToWords(e+1));
3090 r.SetBit(e);
3091 return r;
3092}
3093
3094bool Integer::operator!() const
3095{
3096 return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3097}
3098
3100{
3101 if (this != &t)
3102 {
3103 if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3104 reg.New(RoundupSize(t.WordCount()));
3105 CopyWords(reg, t.reg, reg.size());
3106 sign = t.sign;
3107 }
3108 return *this;
3109}
3110
3111bool Integer::GetBit(size_t n) const
3112{
3113 // Profiling guided the flow below.
3114 if (n/WORD_BITS < reg.size())
3115 return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3116 else
3117 return 0;
3118}
3119
3120void Integer::SetBit(size_t n, bool value)
3121{
3122 if (value)
3123 {
3124 reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3125 reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3126 }
3127 else
3128 {
3129 if (n/WORD_BITS < reg.size())
3130 reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3131 }
3132}
3133
3134byte Integer::GetByte(size_t n) const
3135{
3136 // Profiling guided the flow below.
3137 if (n/WORD_SIZE < reg.size())
3138 return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3139 else
3140 return 0;
3141}
3142
3143void Integer::SetByte(size_t n, byte value)
3144{
3145 reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3146 reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3147 reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3148}
3149
3150lword Integer::GetBits(size_t i, size_t n) const
3151{
3152 lword v = 0;
3153 CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3154 for (unsigned int j=0; j<n; j++)
3155 v |= lword(GetBit(i+j)) << j;
3156 return v;
3157}
3158
3160{
3161 Integer result(*this);
3162 result.Negate();
3163 return result;
3164}
3165
3167{
3168 Integer result(*this);
3169 result.sign = POSITIVE;
3170 return result;
3171}
3172
3173void Integer::swap(Integer &a)
3174{
3175 reg.swap(a.reg);
3176 std::swap(sign, a.sign);
3177}
3178
3179Integer::Integer(word value, size_t length)
3180 : reg(RoundupSize(length)), sign(POSITIVE)
3181{
3182 reg[0] = value;
3183 SetWords(reg+1, 0, reg.size()-1);
3184}
3185
3186template <class T>
3187static Integer StringToInteger(const T *str, ByteOrder order)
3188{
3190
3191 int radix, sign = 1;
3192 // GCC workaround
3193 // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3194 unsigned int length;
3195 for (length = 0; str[length] != 0; length++) {}
3196
3197 Integer v;
3198
3199 if (length == 0)
3200 return Integer::Zero();
3201
3202 // 'str' is of length 1 or more
3203 switch (str[length-1])
3204 {
3205 case 'h':
3206 case 'H':
3207 radix=16;
3208 break;
3209 case 'o':
3210 case 'O':
3211 radix=8;
3212 break;
3213 case 'b':
3214 case 'B':
3215 radix=2;
3216 break;
3217 default:
3218 radix=10;
3219 }
3220
3221 // 'str' is of length 1 or more
3222 if (str[0] == '-')
3223 {
3224 sign = -1;
3225 str += 1, length -= 1;
3226 }
3227
3228 // Recognize common prefixes for hexadecimal, octal and decimal.
3229 // Microsoft's MASM also recognizes 0t for octal, but not here.
3230 if (length > 2 && str[0] == '0')
3231 {
3232 if (str[1] == 'x' || str[1] == 'X')
3233 {
3234 radix = 16;
3235 str += 2, length -= 2;
3236 }
3237 else if (str[1] == 'n' || str[1] == 'N')
3238 {
3239 radix = 10;
3240 str += 2, length -= 2;
3241 }
3242 else if (str[1] == 'o' || str[1] == 'O')
3243 {
3244 radix = 8;
3245 str += 2, length -= 2;
3246 }
3247 }
3248
3249 if (order == BIG_ENDIAN_ORDER)
3250 {
3251 for (unsigned int i=0; i<length; i++)
3252 {
3253 int digit, ch = static_cast<int>(str[i]);
3254
3255 // Profiling guided the flow below.
3256 if (ch >= '0' && ch <= '9')
3257 digit = ch - '0';
3258 else if (ch >= 'a' && ch <= 'f')
3259 digit = ch - 'a' + 10;
3260 else if (ch >= 'A' && ch <= 'F')
3261 digit = ch - 'A' + 10;
3262 else
3263 digit = radix;
3264
3265 if (digit < radix)
3266 {
3267 v *= radix;
3268 v += digit;
3269 }
3270 }
3271 }
3272 else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3273 {
3274 // Nibble high, low and count
3275 unsigned int nh = 0, nl = 0, nc = 0;
3276 Integer position(Integer::One());
3277
3278 for (unsigned int i=0; i<length; i++)
3279 {
3280 int digit, ch = static_cast<int>(str[i]);
3281
3282 if (ch >= '0' && ch <= '9')
3283 digit = ch - '0';
3284 else if (ch >= 'a' && ch <= 'f')
3285 digit = ch - 'a' + 10;
3286 else if (ch >= 'A' && ch <= 'F')
3287 digit = ch - 'A' + 10;
3288 else
3289 digit = radix;
3290
3291 if (digit < radix)
3292 {
3293 if (nc++ == 0)
3294 nh = digit;
3295 else
3296 nl = digit;
3297
3298 if (nc == 2)
3299 {
3300 v += position * (nh << 4 | nl);
3301 nc = 0, position <<= 8;
3302 }
3303 }
3304 }
3305
3306 if (nc == 1)
3307 v += nh * position;
3308 }
3309 else // LITTLE_ENDIAN_ORDER && radix != 16
3310 {
3311 for (int i=static_cast<int>(length)-1; i>=0; i--)
3312 {
3313 int digit, ch = static_cast<int>(str[i]);
3314
3315 if (ch >= '0' && ch <= '9')
3316 digit = ch - '0';
3317 else if (ch >= 'a' && ch <= 'f')
3318 digit = ch - 'a' + 10;
3319 else if (ch >= 'A' && ch <= 'F')
3320 digit = ch - 'A' + 10;
3321 else
3322 digit = radix;
3323
3324 if (digit < radix)
3325 {
3326 v *= radix;
3327 v += digit;
3328 }
3329 }
3330 }
3331
3332 if (sign == -1)
3333 v.Negate();
3334
3335 return v;
3336}
3337
3338Integer::Integer(const char *str, ByteOrder order)
3339 : reg(2), sign(POSITIVE)
3340{
3341 *this = StringToInteger(str,order);
3342}
3343
3344Integer::Integer(const wchar_t *str, ByteOrder order)
3345 : reg(2), sign(POSITIVE)
3346{
3347 *this = StringToInteger(str,order);
3348}
3349
3350unsigned int Integer::WordCount() const
3351{
3352 return (unsigned int)CountWords(reg, reg.size());
3353}
3354
3355unsigned int Integer::ByteCount() const
3356{
3357 unsigned wordCount = WordCount();
3358 if (wordCount)
3359 return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3360 else
3361 return 0;
3362}
3363
3364unsigned int Integer::BitCount() const
3365{
3366 unsigned wordCount = WordCount();
3367 if (wordCount)
3368 return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3369 else
3370 return 0;
3371}
3372
3373void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3374{
3375 CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3376 StringStore store(input, inputLen);
3377 Decode(store, inputLen, s);
3378}
3379
3380void Integer::Decode(BufferedTransformation &bt, size_t inputLen, Signedness s)
3381{
3382 CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3383 if (bt.MaxRetrievable() < inputLen)
3384 throw InvalidArgument("Integer: input length is too small");
3385
3386 byte b;
3387 bt.Peek(b);
3388 sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3389
3390 while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3391 {
3392 bt.Skip(1);
3393 inputLen--;
3394 bt.Peek(b);
3395 }
3396
3397 reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3398 for (size_t i=inputLen; i > 0; i--)
3399 {
3400 (void)bt.Get(b);
3401 reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3402 }
3403
3404 if (sign == NEGATIVE)
3405 {
3406 for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3407 reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3408 TwosComplement(reg, reg.size());
3409 }
3410}
3411
3412size_t Integer::MinEncodedSize(Signedness signedness) const
3413{
3414 // Profiling guided the flow below.
3415 unsigned int outputLen = STDMAX(1U, ByteCount());
3416 const bool pre = (signedness == UNSIGNED);
3417 if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3418 outputLen++;
3419 if (pre)
3420 return outputLen;
3421 if (IsNegative() && *this < -Power2(outputLen*8-1))
3422 outputLen++;
3423 return outputLen;
3424}
3425
3426// PKCS12_PBKDF and other classes use undersized buffers
3427void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3428{
3429 CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3430 ArraySink sink(output, outputLen);
3431 Encode(sink, outputLen, signedness);
3432}
3433
3434void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3435{
3436 if (signedness == UNSIGNED || NotNegative())
3437 {
3438 for (size_t i=outputLen; i > 0; i--)
3439 bt.Put(GetByte(i-1));
3440 }
3441 else
3442 {
3443 // take two's complement of *this
3444 Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3445 temp.Encode(bt, outputLen, UNSIGNED);
3446 }
3447}
3448
3450{
3451 DERGeneralEncoder enc(bt, INTEGER);
3453 enc.MessageEnd();
3454}
3455
3456void Integer::BERDecode(const byte *input, size_t len)
3457{
3458 CRYPTOPP_ASSERT(input && len); // NULL buffer
3459 StringStore store(input, len);
3460 BERDecode(store);
3461}
3462
3464{
3465 BERGeneralDecoder dec(bt, INTEGER);
3466 if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3468 Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3469 dec.MessageEnd();
3470}
3471
3472void Integer::DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
3473{
3475 Encode(enc, length);
3476 enc.MessageEnd();
3477}
3478
3480{
3482 if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3484 Decode(dec, length);
3485 dec.MessageEnd();
3486}
3487
3488size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3489{
3490 CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3491 CRYPTOPP_ASSERT(bufferSize >= 2+ByteCount()); // Undersized buffer
3492 ArraySink sink(output, bufferSize);
3493 return OpenPGPEncode(sink);
3494}
3495
3497{
3498 word16 bitCount = word16(BitCount());
3499 bt.PutWord16(bitCount);
3500 size_t byteCount = BitsToBytes(bitCount);
3501 Encode(bt, byteCount);
3502 return 2 + byteCount;
3503}
3504
3505void Integer::OpenPGPDecode(const byte *input, size_t len)
3506{
3507 CRYPTOPP_ASSERT(input && len); // NULL buffer
3508 StringStore store(input, len);
3509 OpenPGPDecode(store);
3510}
3511
3513{
3514 word16 bitCount;
3515 if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3516 throw OpenPGPDecodeErr();
3517 Decode(bt, BitsToBytes(bitCount));
3518}
3519
3520void Integer::Randomize(RandomNumberGenerator &rng, size_t nbits)
3521{
3522 const size_t nbytes = nbits/8 + 1;
3523 SecByteBlock buf(nbytes);
3524 rng.GenerateBlock(buf, nbytes);
3525
3526 // https://github.com/weidai11/cryptopp/issues/1206
3527 // if (nbytes)
3528 // buf[0] = (byte)Crop(buf[0], nbits % 8);
3529
3530 buf[0] = (byte)Crop(buf[0], nbits % 8);
3531 Decode(buf, nbytes, UNSIGNED);
3532}
3533
3534void Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max)
3535{
3536 if (min > max)
3537 throw InvalidArgument("Integer: Min must be no greater than Max");
3538
3539 Integer range = max - min;
3540 const unsigned int nbits = range.BitCount();
3541
3542 do
3543 {
3544 Randomize(rng, nbits);
3545 }
3546 while (*this > range);
3547
3548 *this += min;
3549}
3550
3551bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3552{
3553 return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3554 ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3555}
3556
3557class KDF2_RNG : public RandomNumberGenerator
3558{
3559public:
3560 KDF2_RNG(const byte *seed, size_t seedSize)
3561 : m_counter(0), m_counterAndSeed(ClampSize(seedSize) + 4)
3562 {
3563 std::memcpy(m_counterAndSeed + 4, seed, ClampSize(seedSize));
3564 }
3565
3566 void GenerateBlock(byte *output, size_t size)
3567 {
3568 CRYPTOPP_ASSERT(output && size); // NULL buffer
3569 PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3570 ++m_counter;
3571 P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3572 }
3573
3574 // UBsan finding, -Wstringop-overflow
3575 inline size_t ClampSize(size_t req) const
3576 {
3577 // Clamp at 16 MB
3578 if (req > 16U*1024*1024)
3579 return 16U*1024*1024;
3580 return req;
3581 }
3582
3583private:
3584 word32 m_counter;
3585 SecByteBlock m_counterAndSeed;
3586};
3587
3589{
3590 Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3591 Integer max;
3592 if (!params.GetValue("Max", max))
3593 {
3594 int bitLength;
3595 if (params.GetIntValue("BitLength", bitLength))
3596 max = Integer::Power2(bitLength);
3597 else
3598 throw InvalidArgument("Integer: missing Max argument");
3599 }
3600 if (min > max)
3601 throw InvalidArgument("Integer: Min must be no greater than Max");
3602
3603 Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3604 Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3605
3606 if (equiv.IsNegative() || equiv >= mod)
3607 throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3608
3609 Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3610
3611 member_ptr<KDF2_RNG> kdf2Rng;
3613 if (params.GetValue(Name::Seed(), seed))
3614 {
3615 ByteQueue bq;
3616 DERSequenceEncoder seq(bq);
3617 min.DEREncode(seq);
3618 max.DEREncode(seq);
3619 equiv.DEREncode(seq);
3620 mod.DEREncode(seq);
3621 DEREncodeUnsigned(seq, rnType);
3622 DEREncodeOctetString(seq, seed.begin(), seed.size());
3623 seq.MessageEnd();
3624
3625 SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3626 bq.Get(finalSeed, finalSeed.size());
3627 kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3628 }
3629 RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3630
3631 switch (rnType)
3632 {
3633 case ANY:
3634 if (mod == One())
3635 Randomize(rng, min, max);
3636 else
3637 {
3638 Integer min1 = min + (equiv-min)%mod;
3639 if (max < min1)
3640 return false;
3641 Randomize(rng, Zero(), (max - min1) / mod);
3642 *this *= mod;
3643 *this += min1;
3644 }
3645 return true;
3646
3647 case PRIME:
3648 {
3649 const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3650
3651 int i;
3652 i = 0;
3653 while (1)
3654 {
3655 if (++i==16)
3656 {
3657 // check if there are any suitable primes in [min, max]
3658 Integer first = min;
3659 if (FirstPrime(first, max, equiv, mod, pSelector))
3660 {
3661 // if there is only one suitable prime, we're done
3662 *this = first;
3663 if (!FirstPrime(first, max, equiv, mod, pSelector))
3664 return true;
3665 }
3666 else
3667 return false;
3668 }
3669
3670 Randomize(rng, min, max);
3671 if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3672 return true;
3673 }
3674 }
3675
3676 default:
3677 throw InvalidArgument("Integer: invalid RandomNumberType argument");
3678 }
3679}
3680
3681std::istream& operator>>(std::istream& in, Integer &a)
3682{
3683 char c;
3684 unsigned int length = 0;
3685 SecBlock<char> str(length + 16);
3686
3687 std::ws(in);
3688
3689 do
3690 {
3691 in.read(&c, 1);
3692 str[length++] = c;
3693 if (length >= str.size())
3694 str.Grow(length + 16);
3695 }
3696 while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3697
3698 if (in.gcount())
3699 in.putback(c);
3700 str[length-1] = '\0';
3701 a = Integer(str);
3702
3703 return in;
3704}
3705
3706// Ensure base 10 is default
3707inline int FlagToBase(long f) {
3708 return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3709}
3710
3711inline char FlagToSuffix(long f) {
3712 return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3713}
3714
3715// Ensure base 10 is default
3716std::ostream& operator<<(std::ostream& out, const Integer &a)
3717{
3718 // Get relevant conversion specifications from ostream.
3719 const long f = out.flags() & std::ios::basefield;
3720 const int base = FlagToBase(f);
3721 const char suffix = FlagToSuffix(f);
3722
3723 Integer temp1=a, temp2;
3724 if (a.IsNegative())
3725 {
3726 out << '-';
3727 temp1.Negate();
3728 }
3729
3730 if (!a)
3731 out << '0';
3732
3733 static const char upper[]="0123456789ABCDEF";
3734 static const char lower[]="0123456789abcdef";
3735
3736 const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3737 unsigned int i=0;
3738 SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3739
3740 while (!!temp1)
3741 {
3742 word digit;
3743 Integer::Divide(digit, temp2, temp1, base);
3744 s[i++]=vec[digit];
3745 temp1.swap(temp2);
3746 }
3747
3748 while (i--)
3749 {
3750 out << s[i];
3751 }
3752
3753#ifdef CRYPTOPP_USE_STD_SHOWBASE
3754 if (out.flags() & std::ios_base::showbase)
3755 out << suffix;
3756
3757 return out;
3758#else
3759 return out << suffix;
3760#endif
3761}
3762
3764{
3765 if (NotNegative())
3766 {
3767 if (Increment(reg, reg.size()))
3768 {
3769 reg.CleanGrow(2*reg.size());
3770 reg[reg.size()/2]=1;
3771 }
3772 }
3773 else
3774 {
3775 word borrow = Decrement(reg, reg.size());
3776 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3777
3778 if (WordCount()==0)
3779 *this = Zero();
3780 }
3781 return *this;
3782}
3783
3785{
3786 if (IsNegative())
3787 {
3788 if (Increment(reg, reg.size()))
3789 {
3790 reg.CleanGrow(2*reg.size());
3791 reg[reg.size()/2]=1;
3792 }
3793 }
3794 else
3795 {
3796 if (Decrement(reg, reg.size()))
3797 *this = -One();
3798 }
3799 return *this;
3800}
3801
3802// This is a bit operation. We set sign to POSITIVE, so there's no need to
3803// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3804Integer Integer::And(const Integer& t) const
3805{
3806 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3807 // The temporary Integer 'result' may have fewer blocks than
3808 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3809
3810 if (this == &t)
3811 {
3812 return AbsoluteValue();
3813 }
3814 else if (reg.size() >= t.reg.size())
3815 {
3816 IntegerSecBlock temp(t.reg.size());
3817 // AndWords(temp, reg, t.reg, t.reg.size());
3818 for (size_t i=0; i<t.reg.size(); ++i)
3819 temp[i] = reg[i] & t.reg[i];
3820
3821 Integer result;
3822 std::swap(result.reg, temp);
3823
3824 return result;
3825 }
3826 else // reg.size() < t.reg.size()
3827 {
3828 IntegerSecBlock temp(reg.size());
3829 // AndWords(temp, reg, t.reg, reg.size());
3830 for (size_t i=0; i<reg.size(); ++i)
3831 temp[i] = reg[i] & t.reg[i];
3832
3833 Integer result;
3834 std::swap(result.reg, temp);
3835
3836 return result;
3837 }
3838}
3839
3840// This is a bit operation. We set sign to POSITIVE, so there's no need to
3841// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3842Integer Integer::Or(const Integer& t) const
3843{
3844 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3845 // The temporary Integer 'result' may have fewer blocks than
3846 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3847
3848 if (this == &t)
3849 {
3850 return AbsoluteValue();
3851 }
3852 else if (reg.size() >= t.reg.size())
3853 {
3854 IntegerSecBlock temp(reg, reg.size());
3855 // OrWords(temp, t.reg, t.reg.size());
3856 for (size_t i=0; i<t.reg.size(); ++i)
3857 temp[i] |= t.reg[i];
3858
3859 Integer result;
3860 std::swap(result.reg, temp);
3861
3862 return result;
3863 }
3864 else // reg.size() < t.reg.size()
3865 {
3866 IntegerSecBlock temp(t.reg, t.reg.size());
3867 // OrWords(temp, reg, reg.size());
3868 for (size_t i=0; i<reg.size(); ++i)
3869 temp[i] |= reg[i];
3870
3871 Integer result;
3872 std::swap(result.reg, temp);
3873
3874 return result;
3875 }
3876}
3877
3878// This is a bit operation. We set sign to POSITIVE, so there's no need to
3879// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3880Integer Integer::Xor(const Integer& t) const
3881{
3882 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3883 // The temporary Integer 'result' may have fewer blocks than
3884 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3885
3886 if (this == &t)
3887 {
3888 return Integer::Zero();
3889 }
3890 else if (reg.size() >= t.reg.size())
3891 {
3892 IntegerSecBlock temp(reg, reg.size());
3893 // OrWords(temp, t.reg, t.reg.size());
3894 for (size_t i=0; i<t.reg.size(); ++i)
3895 temp[i] ^= t.reg[i];
3896
3897 Integer result;
3898 std::swap(result.reg, temp);
3899
3900 return result;
3901 }
3902 else // reg.size() < t.reg.size()
3903 {
3904 IntegerSecBlock temp(t.reg, t.reg.size());
3905 // OrWords(temp, reg, reg.size());
3906 for (size_t i=0; i<reg.size(); ++i)
3907 temp[i] ^= reg[i];
3908
3909 Integer result;
3910 std::swap(result.reg, temp);
3911
3912 return result;
3913 }
3914}
3915
3916void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3917{
3918 // Profiling guided the flow below.
3919 int carry; const bool pre = (a.reg.size() == b.reg.size());
3920 if (!pre && a.reg.size() > b.reg.size())
3921 {
3922 carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3923 CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3924 carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3925 }
3926 else if (pre)
3927 {
3928 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3929 }
3930 else
3931 {
3932 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3933 CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3934 carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3935 }
3936
3937 if (carry)
3938 {
3939 sum.reg.CleanGrow(2*sum.reg.size());
3940 sum.reg[sum.reg.size()/2] = 1;
3941 }
3942 sum.sign = Integer::POSITIVE;
3943}
3944
3945void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3946{
3947 unsigned aSize = a.WordCount();
3948 aSize += aSize%2;
3949 unsigned bSize = b.WordCount();
3950 bSize += bSize%2;
3951
3952 // Profiling guided the flow below.
3953 if (aSize > bSize)
3954 {
3955 word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3956 CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3957 borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3958 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3959 diff.sign = Integer::POSITIVE;
3960 }
3961 else if (aSize == bSize)
3962 {
3963 if (Compare(a.reg, b.reg, aSize) >= 0)
3964 {
3965 Subtract(diff.reg, a.reg, b.reg, aSize);
3966 diff.sign = Integer::POSITIVE;
3967 }
3968 else
3969 {
3970 Subtract(diff.reg, b.reg, a.reg, aSize);
3971 diff.sign = Integer::NEGATIVE;
3972 }
3973 }
3974 else
3975 {
3976 word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3977 CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3978 borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3979 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3980 diff.sign = Integer::NEGATIVE;
3981 }
3982}
3983
3984// MSVC .NET 2003 workaround
3985template <class T> inline const T& STDMAX2(const T& a, const T& b)
3986{
3987 return a < b ? b : a;
3988}
3989
3990Integer Integer::Plus(const Integer& b) const
3991{
3992 Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3993 if (NotNegative())
3994 {
3995 if (b.NotNegative())
3996 PositiveAdd(sum, *this, b);
3997 else
3998 PositiveSubtract(sum, *this, b);
3999 }
4000 else
4001 {
4002 if (b.NotNegative())
4003 PositiveSubtract(sum, b, *this);
4004 else
4005 {
4006 PositiveAdd(sum, *this, b);
4007 sum.sign = Integer::NEGATIVE;
4008 }
4009 }
4010 return sum;
4011}
4012
4014{
4015 reg.CleanGrow(t.reg.size());
4016 if (NotNegative())
4017 {
4018 if (t.NotNegative())
4019 PositiveAdd(*this, *this, t);
4020 else
4021 PositiveSubtract(*this, *this, t);
4022 }
4023 else
4024 {
4025 if (t.NotNegative())
4026 PositiveSubtract(*this, t, *this);
4027 else
4028 {
4029 PositiveAdd(*this, *this, t);
4030 sign = Integer::NEGATIVE;
4031 }
4032 }
4033 return *this;
4034}
4035
4036Integer Integer::Minus(const Integer& b) const
4037{
4038 Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
4039 if (NotNegative())
4040 {
4041 if (b.NotNegative())
4042 PositiveSubtract(diff, *this, b);
4043 else
4044 PositiveAdd(diff, *this, b);
4045 }
4046 else
4047 {
4048 if (b.NotNegative())
4049 {
4050 PositiveAdd(diff, *this, b);
4051 diff.sign = Integer::NEGATIVE;
4052 }
4053 else
4054 PositiveSubtract(diff, b, *this);
4055 }
4056 return diff;
4057}
4058
4060{
4061 reg.CleanGrow(t.reg.size());
4062 if (NotNegative())
4063 {
4064 if (t.NotNegative())
4065 PositiveSubtract(*this, *this, t);
4066 else
4067 PositiveAdd(*this, *this, t);
4068 }
4069 else
4070 {
4071 if (t.NotNegative())
4072 {
4073 PositiveAdd(*this, *this, t);
4074 sign = Integer::NEGATIVE;
4075 }
4076 else
4077 PositiveSubtract(*this, t, *this);
4078 }
4079 return *this;
4080}
4081
4083{
4084 const size_t wordCount = WordCount();
4085 const size_t shiftWords = n / WORD_BITS;
4086 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4087
4088 reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4089 ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4090 ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4091 return *this;
4092}
4093
4095{
4096 const size_t wordCount = WordCount();
4097 const size_t shiftWords = n / WORD_BITS;
4098 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4099
4100 ShiftWordsRightByWords(reg, wordCount, shiftWords);
4101 if (wordCount > shiftWords)
4102 ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4103 if (IsNegative() && WordCount()==0) // avoid -0
4104 *this = Zero();
4105 return *this;
4106}
4107
4109{
4110 if (this != &t)
4111 {
4112 const size_t size = STDMIN(reg.size(), t.reg.size());
4113 reg.resize(size);
4114 AndWords(reg, t.reg, size);
4115 }
4116 sign = POSITIVE;
4117 return *this;
4118}
4119
4121{
4122 if (this != &t)
4123 {
4124 if (reg.size() >= t.reg.size())
4125 {
4126 OrWords(reg, t.reg, t.reg.size());
4127 }
4128 else // reg.size() < t.reg.size()
4129 {
4130 const size_t head = reg.size();
4131 const size_t tail = t.reg.size() - reg.size();
4132 reg.resize(head+tail);
4133 OrWords(reg, t.reg, head);
4134 CopyWords(reg+head,t.reg+head,tail);
4135 }
4136 }
4137 sign = POSITIVE;
4138 return *this;
4139}
4140
4142{
4143 if (this == &t)
4144 {
4145 *this = Zero();
4146 }
4147 else
4148 {
4149 if (reg.size() >= t.reg.size())
4150 {
4151 XorWords(reg, t.reg, t.reg.size());
4152 }
4153 else // reg.size() < t.reg.size()
4154 {
4155 const size_t head = reg.size();
4156 const size_t tail = t.reg.size() - reg.size();
4157 reg.resize(head+tail);
4158 XorWords(reg, t.reg, head);
4159 CopyWords(reg+head,t.reg+head,tail);
4160 }
4161 }
4162 sign = POSITIVE;
4163 return *this;
4164}
4165
4166void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4167{
4168 size_t aSize = RoundupSize(a.WordCount());
4169 size_t bSize = RoundupSize(b.WordCount());
4170
4171 product.reg.CleanNew(RoundupSize(aSize+bSize));
4172 product.sign = Integer::POSITIVE;
4173
4174 IntegerSecBlock workspace(aSize + bSize);
4175 AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4176}
4177
4178void Multiply(Integer &product, const Integer &a, const Integer &b)
4179{
4180 PositiveMultiply(product, a, b);
4181
4182 if (a.NotNegative() != b.NotNegative())
4183 product.Negate();
4184}
4185
4186Integer Integer::Times(const Integer &b) const
4187{
4188 Integer product;
4189 Multiply(product, *this, b);
4190 return product;
4191}
4192
4193/*
4194void PositiveDivide(Integer &remainder, Integer &quotient,
4195 const Integer &dividend, const Integer &divisor)
4196{
4197 remainder.reg.CleanNew(divisor.reg.size());
4198 remainder.sign = Integer::POSITIVE;
4199 quotient.reg.New(0);
4200 quotient.sign = Integer::POSITIVE;
4201 unsigned i=dividend.BitCount();
4202 while (i--)
4203 {
4204 word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4205 remainder.reg[0] |= dividend[i];
4206 if (overflow || remainder >= divisor)
4207 {
4208 Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4209 quotient.SetBit(i);
4210 }
4211 }
4212}
4213*/
4214
4215void PositiveDivide(Integer &remainder, Integer &quotient,
4216 const Integer &a, const Integer &b)
4217{
4218 unsigned aSize = a.WordCount();
4219 unsigned bSize = b.WordCount();
4220
4221 if (!bSize)
4222 throw Integer::DivideByZero();
4223
4224 if (aSize < bSize)
4225 {
4226 remainder = a;
4227 remainder.sign = Integer::POSITIVE;
4228 quotient = Integer::Zero();
4229 return;
4230 }
4231
4232 aSize += aSize%2; // round up to next even number
4233 bSize += bSize%2;
4234
4235 remainder.reg.CleanNew(RoundupSize(bSize));
4236 remainder.sign = Integer::POSITIVE;
4237 quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4238 quotient.sign = Integer::POSITIVE;
4239
4240 IntegerSecBlock T(aSize+3*(bSize+2));
4241 Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4242}
4243
4244void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4245{
4246 PositiveDivide(remainder, quotient, dividend, divisor);
4247
4248 if (dividend.IsNegative())
4249 {
4250 quotient.Negate();
4251 if (remainder.NotZero())
4252 {
4253 --quotient;
4254 remainder = divisor.AbsoluteValue() - remainder;
4255 }
4256 }
4257
4258 if (divisor.IsNegative())
4259 quotient.Negate();
4260}
4261
4262void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4263{
4264 q = a;
4265 q >>= n;
4266
4267 const size_t wordCount = BitsToWords(n);
4268 if (wordCount <= a.WordCount())
4269 {
4270 r.reg.resize(RoundupSize(wordCount));
4271 CopyWords(r.reg, a.reg, wordCount);
4272 SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4273 if (n % WORD_BITS != 0)
4274 r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4275 }
4276 else
4277 {
4278 r.reg.resize(RoundupSize(a.WordCount()));
4279 CopyWords(r.reg, a.reg, r.reg.size());
4280 }
4281 r.sign = POSITIVE;
4282
4283 if (a.IsNegative() && r.NotZero())
4284 {
4285 --q;
4286 r = Power2(n) - r;
4287 }
4288}
4289
4290Integer Integer::DividedBy(const Integer &b) const
4291{
4292 Integer remainder, quotient;
4293 Integer::Divide(remainder, quotient, *this, b);
4294 return quotient;
4295}
4296
4297Integer Integer::Modulo(const Integer &b) const
4298{
4299 Integer remainder, quotient;
4300 Integer::Divide(remainder, quotient, *this, b);
4301 return remainder;
4302}
4303
4304void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4305{
4306 if (!divisor)
4307 throw Integer::DivideByZero();
4308
4309 // IsPowerOf2 uses BMI on x86 if available. There is a small
4310 // but measurable improvement during decryption and signing.
4311 if (IsPowerOf2(divisor))
4312 {
4313 quotient = dividend >> (BitPrecision(divisor)-1);
4314 remainder = dividend.reg[0] & (divisor-1);
4315 return;
4316 }
4317
4318 unsigned int i = dividend.WordCount();
4319 quotient.reg.CleanNew(RoundupSize(i));
4320 remainder = 0;
4321 while (i--)
4322 {
4323 quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4324 remainder = DWord(dividend.reg[i], remainder) % divisor;
4325 }
4326
4327 if (dividend.NotNegative())
4328 quotient.sign = POSITIVE;
4329 else
4330 {
4331 quotient.sign = NEGATIVE;
4332 if (remainder)
4333 {
4334 --quotient;
4335 remainder = divisor - remainder;
4336 }
4337 }
4338}
4339
4341{
4342 word remainder;
4343 Integer quotient;
4344 Integer::Divide(remainder, quotient, *this, b);
4345 return quotient;
4346}
4347
4348word Integer::Modulo(word divisor) const
4349{
4350 if (!divisor)
4351 throw Integer::DivideByZero();
4352
4353 word remainder;
4354
4355 // Profiling guided the flow below.
4356 if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4357 {
4358 // Profiling guided the flow below.
4359 unsigned int i = WordCount();
4360 if (divisor > 5)
4361 {
4362 remainder = 0;
4363 while (i--)
4364 remainder = DWord(reg[i], remainder) % divisor;
4365 }
4366 else
4367 {
4368 DWord sum(0, 0);
4369 while (i--)
4370 sum += reg[i];
4371 remainder = sum % divisor;
4372 }
4373 }
4374 else // divisor is a power of 2
4375 {
4376 remainder = reg[0] & (divisor-1);
4377 }
4378
4379 if (IsNegative() && remainder)
4380 remainder = divisor - remainder;
4381
4382 return remainder;
4383}
4384
4385void Integer::Negate()
4386{
4387 if (!!(*this)) // don't flip sign if *this==0
4388 sign = Sign(1-sign);
4389}
4390
4391int Integer::PositiveCompare(const Integer& t) const
4392{
4393 // Profiling guided the flow below.
4394 const unsigned size = WordCount(), tSize = t.WordCount();
4395 if (size != tSize)
4396 return size > tSize ? 1 : -1;
4397 else
4398 return CryptoPP::Compare(reg, t.reg, size);
4399}
4400
4401int Integer::Compare(const Integer& t) const
4402{
4403 if (NotNegative())
4404 {
4405 if (t.NotNegative())
4406 return PositiveCompare(t);
4407 else
4408 return 1;
4409 }
4410 else
4411 {
4412 if (t.NotNegative())
4413 return -1;
4414 else
4415 return -PositiveCompare(t);
4416 }
4417}
4418
4420{
4421 if (!IsPositive())
4422 return Zero();
4423
4424 // overestimate square root
4425 Integer x, y = Power2((BitCount()+1)/2);
4426 CRYPTOPP_ASSERT(y*y >= *this);
4427
4428 do
4429 {
4430 x = y;
4431 y = (x + *this/x) >> 1;
4432 } while (y<x);
4433
4434 return x;
4435}
4436
4437bool Integer::IsSquare() const
4438{
4439 Integer r = SquareRoot();
4440 return *this == r.Squared();
4441}
4442
4443bool Integer::IsUnit() const
4444{
4445 return (WordCount() == 1) && (reg[0] == 1);
4446}
4447
4449{
4450 return IsUnit() ? *this : Zero();
4451}
4452
4453Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4454{
4456 if (m.IsZero())
4457 throw Integer::DivideByZero();
4458
4459 return x*y%m;
4460}
4461
4462Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4463{
4465 if (m.IsZero())
4466 throw Integer::DivideByZero();
4467
4468 ModularArithmetic mr(m);
4469 return mr.Exponentiate(x, e);
4470}
4471
4472Integer Integer::Gcd(const Integer &a, const Integer &b)
4473{
4474 return EuclideanDomainOf<Integer>().Gcd(a, b);
4475}
4476
4477Integer Integer::InverseMod(const Integer &m) const
4478{
4481
4482 if (IsNegative())
4483 return Modulo(m).InverseModNext(m);
4484
4485 // http://github.com/weidai11/cryptopp/issues/602
4486 if (*this >= m)
4487 return Modulo(m).InverseModNext(m);
4488
4489 return InverseModNext(m);
4490}
4491
4492Integer Integer::InverseModNext(const Integer &m) const
4493{
4496
4497 if (m.IsEven())
4498 {
4499 if (!m || IsEven())
4500 return Zero(); // no inverse
4501 if (*this == One())
4502 return One();
4503
4504 Integer u = m.Modulo(*this).InverseModNext(*this);
4505 return !u ? Zero() : (m*(*this-u)+1)/(*this);
4506 }
4507
4508 // AlmostInverse requires a 4x workspace
4509 IntegerSecBlock T(m.reg.size() * 4);
4510 Integer r((word)0, m.reg.size());
4511 unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4512 DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4513 return r;
4514}
4515
4516word Integer::InverseMod(word mod) const
4517{
4518 CRYPTOPP_ASSERT(mod != 0);
4519
4520 word g0 = mod, g1 = *this % mod;
4521 word v0 = 0, v1 = 1;
4522 word y;
4523
4524 while (g1)
4525 {
4526 if (g1 == 1)
4527 return v1;
4528 y = g0 / g1;
4529 g0 = g0 % g1;
4530 v0 += y * v1;
4531
4532 if (!g0)
4533 break;
4534 if (g0 == 1)
4535 return mod-v0;
4536 y = g1 / g0;
4537 g1 = g1 % g0;
4538 v1 += y * v0;
4539 }
4540 return 0;
4541}
4542
4543// ********************************************************
4544
4546{
4547 BERSequenceDecoder seq(bt);
4548 OID oid(seq);
4549 if (oid != ASN1::prime_field())
4551 m_modulus.BERDecode(seq);
4552 seq.MessageEnd();
4553 m_result.reg.resize(m_modulus.reg.size());
4554}
4555
4557{
4558 DERSequenceEncoder seq(bt);
4559 ASN1::prime_field().DEREncode(seq);
4560 m_modulus.DEREncode(seq);
4561 seq.MessageEnd();
4562}
4563
4564void ModularArithmetic::DEREncodeElement(BufferedTransformation &out, const Element &a) const
4565{
4566 a.DEREncodeAsOctetString(out, MaxElementByteLength());
4567}
4568
4570{
4571 a.BERDecodeAsOctetString(in, MaxElementByteLength());
4572}
4573
4574const Integer& ModularArithmetic::Half(const Integer &a) const
4575{
4576 if (a.reg.size()==m_modulus.reg.size())
4577 {
4578 CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4579 return m_result;
4580 }
4581 else
4582 return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4583}
4584
4585const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4586{
4587 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4588 {
4589 if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4590 || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4591 {
4592 CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4593 }
4594 return m_result;
4595 }
4596 else
4597 {
4598 m_result1 = a+b;
4599 if (m_result1 >= m_modulus)
4600 m_result1 -= m_modulus;
4601 return m_result1;
4602 }
4603}
4604
4606{
4607 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4608 {
4609 if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4610 || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4611 {
4612 CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4613 }
4614 }
4615 else
4616 {
4617 a+=b;
4618 if (a>=m_modulus)
4619 a-=m_modulus;
4620 }
4621
4622 return a;
4623}
4624
4625const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4626{
4627 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4628 {
4629 if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4630 CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4631 return m_result;
4632 }
4633 else
4634 {
4635 m_result1 = a-b;
4636 if (m_result1.IsNegative())
4637 m_result1 += m_modulus;
4638 return m_result1;
4639 }
4640}
4641
4643{
4644 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4645 {
4646 if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4647 CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4648 }
4649 else
4650 {
4651 a-=b;
4652 if (a.IsNegative())
4653 a+=m_modulus;
4654 }
4655
4656 return a;
4657}
4658
4659const Integer& ModularArithmetic::Inverse(const Integer &a) const
4660{
4661 if (!a)
4662 return a;
4663
4664 CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4665 if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4666 Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4667
4668 return m_result;
4669}
4670
4671Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4672{
4673 if (m_modulus.IsOdd())
4674 {
4675 MontgomeryRepresentation dr(m_modulus);
4676 return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4677 }
4678 else
4679 return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4680}
4681
4682void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4683{
4684 if (m_modulus.IsOdd())
4685 {
4686 MontgomeryRepresentation dr(m_modulus);
4687 dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4688 for (unsigned int i=0; i<exponentsCount; i++)
4689 results[i] = dr.ConvertOut(results[i]);
4690 }
4691 else
4692 AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4693}
4694
4695MontgomeryRepresentation::MontgomeryRepresentation(const Integer &m) // modulus must be odd
4696 : ModularArithmetic(m),
4697 m_u((word)0, m_modulus.reg.size()),
4698 m_workspace(5*m_modulus.reg.size())
4699{
4700 if (!m_modulus.IsOdd())
4701 throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4702
4703 RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4704}
4705
4706const Integer& MontgomeryRepresentation::Multiply(const Integer &a, const Integer &b) const
4707{
4708 word *const T = m_workspace.begin();
4709 word *const R = m_result.reg.begin();
4710 const size_t N = m_modulus.reg.size();
4711 CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4712
4713 AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4714 SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4715 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4716 return m_result;
4717}
4718
4720{
4721 word *const T = m_workspace.begin();
4722 word *const R = m_result.reg.begin();
4723 const size_t N = m_modulus.reg.size();
4724 CRYPTOPP_ASSERT(a.reg.size()<=N);
4725
4726 CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4727 SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4728 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4729 return m_result;
4730}
4731
4733{
4734 word *const T = m_workspace.begin();
4735 word *const R = m_result.reg.begin();
4736 const size_t N = m_modulus.reg.size();
4737 CRYPTOPP_ASSERT(a.reg.size()<=N);
4738
4739 CopyWords(T, a.reg, a.reg.size());
4740 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4741 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4742 return m_result;
4743}
4744
4746{
4747// return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4748 word *const T = m_workspace.begin();
4749 word *const R = m_result.reg.begin();
4750 const size_t N = m_modulus.reg.size();
4751 CRYPTOPP_ASSERT(a.reg.size()<=N);
4752
4753 CopyWords(T, a.reg, a.reg.size());
4754 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4755 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4756 unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4757
4758// cout << "k=" << k << " N*32=" << 32*N << endl;
4759
4760 if (k>N*WORD_BITS)
4761 DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4762 else
4763 MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4764
4765 return m_result;
4766}
4767
4768// Specialization declared in misc.h to allow us to print integers
4769// with additional control options, like arbitrary bases and uppercase.
4770template <> CRYPTOPP_DLL
4771std::string IntToString<Integer>(Integer value, unsigned int base)
4772{
4773 // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4774 static const unsigned int BIT_32 = (1U << 31);
4775 const bool UPPER = !!(base & BIT_32);
4776 static const unsigned int BIT_31 = (1U << 30);
4777 const bool BASE = !!(base & BIT_31);
4778
4779 const char CH = UPPER ? 'A' : 'a';
4780 base &= ~(BIT_32|BIT_31);
4781 CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4782
4783 if (value == 0)
4784 return "0";
4785
4786 bool negative = false, zero = false;
4787 if (value.IsNegative())
4788 {
4789 negative = true;
4790 value.Negate();
4791 }
4792
4793 if (!value)
4794 zero = true;
4795
4796 SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4797 Integer temp;
4798
4799 unsigned int i=0;
4800 while (!!value)
4801 {
4802 word digit;
4803 Integer::Divide(digit, temp, value, word(base));
4804 s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4805 value.swap(temp);
4806 }
4807
4808 std::string result;
4809 result.reserve(i+2);
4810
4811 if (negative)
4812 result += '-';
4813
4814 if (zero)
4815 result += '0';
4816
4817 while (i--)
4818 result += s[i];
4819
4820 if (BASE)
4821 {
4822 if (base == 10)
4823 result += '.';
4824 else if (base == 16)
4825 result += 'h';
4826 else if (base == 8)
4827 result += 'o';
4828 else if (base == 2)
4829 result += 'b';
4830 }
4831
4832 return result;
4833}
4834
4835// Specialization declared in misc.h to avoid Coverity findings.
4836template <> CRYPTOPP_DLL
4837std::string IntToString<word64>(word64 value, unsigned int base)
4838{
4839 // Hack... set the high bit for uppercase.
4840 static const unsigned int HIGH_BIT = (1U << 31);
4841 const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4842 base &= ~HIGH_BIT;
4843
4844 CRYPTOPP_ASSERT(base >= 2);
4845 if (value == 0)
4846 return "0";
4847
4848 std::string result;
4849 while (value > 0)
4850 {
4851 word64 digit = value % base;
4852 result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4853 value /= base;
4854 }
4855 return result;
4856}
4857
4858#ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4859// Allow the linker to discard Integer code if not needed.
4860// Also see http://github.com/weidai11/cryptopp/issues/389.
4861bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4862{
4863 if (valueType != typeid(Integer))
4864 return false;
4865 *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4866 return true;
4867}
4868#endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4869
4870// *************************** C++ Static Initialization ***************************
4871
4872class InitInteger
4873{
4874public:
4875 InitInteger()
4876 {
4877 SetFunctionPointers();
4878 }
4879};
4880
4881// This is not really needed because each Integer can dynamically initialize
4882// itself, but we take a peephole optimization and initialize the class once
4883// if init priorities are available. Dynamic initialization will be used if
4884// init priorities are not available.
4885
4886#if defined(HAVE_GCC_INIT_PRIORITY)
4887 const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4888 const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4889 const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4890 const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4891#elif defined(HAVE_MSC_INIT_PRIORITY)
4892 #pragma warning(disable: 4075)
4893 #pragma init_seg(".CRT$XCU")
4894 const InitInteger s_init;
4895 const Integer g_zero(0L);
4896 const Integer g_one(1L);
4897 const Integer g_two(2L);
4898 #pragma warning(default: 4075)
4899#elif HAVE_XLC_INIT_PRIORITY
4900 // XLC needs constant, not a define
4901 #pragma priority(280)
4902 const InitInteger s_init;
4903 const Integer g_zero(0L);
4904 const Integer g_one(1L);
4905 const Integer g_two(2L);
4906#else
4907 const InitInteger s_init;
4908#endif
4909
4910// ***************** Library code ********************
4911
4912const Integer &Integer::Zero()
4913{
4914#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4915 return g_zero;
4916#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4917 static const Integer s_zero(0L);
4918 return s_zero;
4919#else // Potential memory leak. Avoid if possible.
4920 return Singleton<Integer, NewInteger<0L> >().Ref();
4921#endif
4922}
4923
4924const Integer &Integer::One()
4925{
4926#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4927 return g_one;
4928#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4929 static const Integer s_one(1L);
4930 return s_one;
4931#else // Potential memory leak. Avoid if possible.
4932 return Singleton<Integer, NewInteger<1L> >().Ref();
4933#endif
4934}
4935
4936const Integer &Integer::Two()
4937{
4938#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4939 return g_two;
4940#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4941 static const Integer s_two(2L);
4942 return s_two;
4943#else // Potential memory leak. Avoid if possible.
4944 return Singleton<Integer, NewInteger<2L> >().Ref();
4945#endif
4946}
4947
4948NAMESPACE_END
4949
4950#endif // CRYPTOPP_IMPORTS
#define MAYBE_UNCONST_CAST(T, x)
SunCC workaround.
Definition adv_simd.h:595
#define MAYBE_CONST
SunCC workaround.
Definition adv_simd.h:590
Classes for working with NameValuePairs.
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition algparam.h:508
Classes and functions for working with ANS.1 objects.
std::ostream & operator<<(std::ostream &out, const OID &oid)
Print a OID value.
OID operator+(const OID &lhs, unsigned long rhs)
Append a value to an OID.
CRYPTOPP_DLL size_t DEREncodeOctetString(BufferedTransformation &bt, const byte *str, size_t strLen)
DER encode octet string.
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition asn.h:821
@ OCTET_STRING
ASN.1 Octet string.
Definition asn.h:38
@ INTEGER
ASN.1 Integer.
Definition asn.h:34
void BERDecodeError()
Raises a BERDecodeErr.
Definition asn.h:104
virtual const Element & Gcd(const Element &a, const Element &b) const
Calculates the greatest common denominator in the ring.
Definition algebra.cpp:56
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition algebra.cpp:334
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition algebra.cpp:323
Copy input to a memory buffer.
Definition filters.h:1200
BER General Decoder.
Definition asn.h:381
BER Sequence Decoder.
Definition asn.h:526
Interface for buffered transformations.
Definition cryptlib.h:1657
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition cryptlib.h:1678
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Data structure used to store byte strings.
Definition queue.h:23
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition queue.h:40
Used to pass byte array input as part of a NameValuePairs object.
Definition algparam.h:25
const byte * begin() const
Pointer to the first byte in the memory block.
Definition algparam.h:84
size_t size() const
Length of the memory block.
Definition algparam.h:88
DER General Encoder.
Definition asn.h:492
DER Sequence Encoder.
Definition asn.h:558
Euclidean domain.
Definition algebra.h:316
Exception thrown when division by 0 is encountered.
Definition integer.h:56
Exception thrown when a random number cannot be found that satisfies the condition.
Definition integer.h:64
Multiple precision integer with arithmetic operations.
Definition integer.h:50
static void Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Integer & operator>>=(size_t n)
Right-shift Assignment.
Integer & operator&=(const Integer &t)
Bitwise AND Assignment.
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
void SetByte(size_t n, byte value)
Set the n-th byte to value.
static void DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
bool IsPositive() const
Determines if the Integer is positive.
Definition integer.h:347
Integer Minus(const Integer &b) const
Subtraction.
signed long ConvertToLong() const
Convert the Integer to Long.
Integer operator-() const
Subtraction.
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Integer & operator+=(const Integer &t)
Addition Assignment.
Integer And(const Integer &t) const
Bitwise AND.
bool IsSquare() const
Determine whether this integer is a perfect square.
Integer Plus(const Integer &b) const
Addition.
Integer DividedBy(const Integer &b) const
Division.
Integer & operator++()
Pre-increment.
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
bool NotZero() const
Determines if the Integer is non-0.
Definition integer.h:338
Integer Times(const Integer &b) const
Multiplication.
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Integer & operator--()
Pre-decrement.
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
static const Integer & Zero()
Integer representing 0.
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Integer Or(const Integer &t) const
Bitwise OR.
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
static Integer Power2(size_t e)
Exponentiates to a power of 2.
Integer Squared() const
Multiply this integer by itself.
Definition integer.h:634
Integer()
Creates the zero integer.
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
void Negate()
Reverse the Sign of the Integer.
bool NotNegative() const
Determines if the Integer is non-negative.
Definition integer.h:344
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Integer & operator=(const Integer &t)
Assignment.
Integer & operator-=(const Integer &t)
Subtraction Assignment.
RandomNumberType
Properties of a random integer.
Definition integer.h:91
@ ANY
a number with no special properties
Definition integer.h:93
@ PRIME
a number which is probabilistically prime
Definition integer.h:95
bool operator!() const
Negation.
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
@ SIGNED
a signed value
Definition integer.h:87
@ UNSIGNED
an unsigned value
Definition integer.h:85
Integer Xor(const Integer &t) const
Bitwise XOR.
int Compare(const Integer &a) const
Perform signed comparison.
Integer Modulo(const Integer &b) const
Remainder.
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
void swap(Integer &a)
Swaps this Integer with another Integer.
static const Integer & Two()
Integer representing 2.
bool IsZero() const
Determines if the Integer is 0.
Definition integer.h:335
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
bool IsNegative() const
Determines if the Integer is negative.
Definition integer.h:341
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Integer & operator<<=(size_t n)
Left-shift Assignment.
Sign
Used internally to represent the integer.
Definition integer.h:73
@ NEGATIVE
the value is negative
Definition integer.h:77
@ POSITIVE
the value is positive or 0
Definition integer.h:75
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
bool IsUnit() const
Determine if 1 or -1.
bool IsOdd() const
Determines if the Integer is odd parity.
Definition integer.h:356
static Integer Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Integer SquareRoot() const
Extract square root.
static const Integer & One()
Integer representing 1.
bool IsEven() const
Determines if the Integer is even parity.
Definition integer.h:353
An invalid argument was detected.
Definition cryptlib.h:208
Ring of congruence classes modulo n.
Definition modarith.h:44
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition modarith.h:54
const Integer & Half(const Integer &a) const
Divides an element by 2.
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition modarith.h:248
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Performs modular arithmetic in Montgomery representation for increased speed.
Definition modarith.h:296
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
const Integer & Square(const Integer &a) const
Square an element in the ring.
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Interface for retrieving values given their names.
Definition cryptlib.h:327
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition cryptlib.h:397
bool GetValue(const char *name, T &value) const
Get a named value.
Definition cryptlib.h:384
CRYPTOPP_DLL bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition cryptlib.h:420
Object Identifier.
Definition asn.h:265
static void DeriveKey(byte *output, size_t outputLength, const byte *input, size_t inputLength, const byte *derivationParams, size_t derivationParamsLength)
P1363 key derivation function.
Definition pubkey.h:760
Application callback to signal suitability of a candidate prime.
Definition nbtheory.h:117
Interface for random number generators.
Definition cryptlib.h:1440
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Secure memory block with allocator and cleanup.
Definition secblock.h:731
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition secblock.h:836
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition secblock.h:1143
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition secblock.h:1209
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition secblock.h:1179
void New(size_type newSize)
Change size without preserving contents.
Definition secblock.h:1126
size_type size() const
Provides the count of elements in the SecBlock.
Definition secblock.h:867
void resize(size_type newSize)
Change size and preserve contents.
Definition secblock.h:1198
SecBlock typedef.
Definition secblock.h:1226
Restricts the instantiation of a class to one static object without locks.
Definition misc.h:309
Square block cipher.
Definition square.h:25
String-based implementation of Store interface.
Definition filters.h:1259
Pointer that overloads operator ->
Definition smartptr.h:38
Library configuration file.
#define CRYPTOPP_TABLE
Override for internal linkage.
Definition config_dll.h:111
unsigned char byte
8-bit unsigned datatype
Definition config_int.h:66
word64 word
Full word used for multiprecision integer arithmetic.
Definition config_int.h:192
const unsigned int WORD_BITS
Size of a platform word in bits.
Definition config_int.h:260
unsigned int word32
32-bit unsigned datatype
Definition config_int.h:72
unsigned short word16
16-bit unsigned datatype
Definition config_int.h:69
word128 dword
Double word used for multiprecision integer arithmetic.
Definition config_int.h:203
unsigned long long word64
64-bit unsigned datatype
Definition config_int.h:101
word32 hword
Half word used for multiprecision integer arithmetic.
Definition config_int.h:184
const unsigned int WORD_SIZE
Size of a platform word in bytes.
Definition config_int.h:255
word64 lword
Large word type.
Definition config_int.h:168
Functions for CPU features and intrinsics.
ByteOrder
Provides the byte ordering.
Definition cryptlib.h:148
@ LITTLE_ENDIAN_ORDER
byte order is little-endian
Definition cryptlib.h:150
@ BIG_ENDIAN_ORDER
byte order is big-endian
Definition cryptlib.h:152
Implementation of BufferedTransformation's attachment interface.
Multiple precision integer with arithmetic operations.
Utility functions for the Crypto++ library.
CRYPTOPP_DLL std::string IntToString< Integer >(Integer value, unsigned int base)
Converts an Integer to a string.
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition misc.h:668
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition misc.h:1047
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition misc.h:1024
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition misc.h:1163
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition misc.h:1072
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branch-less swap of pointers a and b if condition c is true.
Definition misc.h:1567
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition misc.h:1131
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition misc.h:1215
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition misc.h:657
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition misc.h:1143
#define MEMORY_BARRIER
A memory barrier.
Definition misc.h:272
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition misc.h:1153
CRYPTOPP_DLL std::string IntToString< word64 >(word64 value, unsigned int base)
Converts an unsigned value to a string.
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition misc.h:2948
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition misc.h:1319
Class file for performing modular arithmetic.
Crypto++ library namespace.
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition argnames.h:42
const char * Seed()
ConstByteArrayParameter.
Definition argnames.h:19
Classes and functions for number theoretic operations.
CRYPTOPP_DLL bool FirstPrime(Integer &p, const Integer &max, const Integer &equiv, const Integer &mod, const PrimeSelector *pSelector)
Finds a random prime of special form.
ASN.1 object identifiers for algorithms and schemes.
Precompiled header file.
This file contains helper classes/functions for implementing public key algorithms.
Classes and functions for secure memory allocations.
Classes for SHA-1 and SHA-2 family of message digests.
Classes for automatic resource management.
Common C++ header files.
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition trap.h:68
Support functions for word operations.
void ShiftWordsRightByWords(word *r, size_t n, size_t shiftWords)
Right shift word array.
Definition words.h:212
void XorWords(word *r, const word *a, const word *b, size_t n)
XOR word arrays.
Definition words.h:66
void SetWords(word *r, word a, size_t n)
Set the value of words.
Definition words.h:35
void OrWords(word *r, const word *a, const word *b, size_t n)
OR word arrays.
Definition words.h:120
word ShiftWordsLeftByBits(word *r, size_t n, unsigned int shiftBits)
Left shift word array.
Definition words.h:149
void ShiftWordsLeftByWords(word *r, size_t n, size_t shiftWords)
Left shift word array.
Definition words.h:194
size_t CountWords(const word *x, size_t n)
Count the number of words.
Definition words.h:21
word ShiftWordsRightByBits(word *r, size_t n, unsigned int shiftBits)
Right shift word array.
Definition words.h:172
void CopyWords(word *r, const word *a, size_t n)
Copy word array.
Definition words.h:48
void AndWords(word *r, const word *a, const word *b, size_t n)
AND word arrays.
Definition words.h:93