1 This is ../../gmp/doc/gmp.info, produced by makeinfo version 4.8 from
2 ../../gmp/doc/gmp.texi.
4 This manual describes how to install and use the GNU multiple
5 precision arithmetic library, version 5.0.1.
7 Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
8 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free
9 Software Foundation, Inc.
11 Permission is granted to copy, distribute and/or modify this
12 document under the terms of the GNU Free Documentation License, Version
13 1.3 or any later version published by the Free Software Foundation;
14 with no Invariant Sections, with the Front-Cover Texts being "A GNU
15 Manual", and with the Back-Cover Texts being "You have freedom to copy
16 and modify this GNU Manual, like GNU software". A copy of the license
17 is included in *Note GNU Free Documentation License::.
19 INFO-DIR-SECTION GNU libraries
21 * gmp: (gmp). GNU Multiple Precision Arithmetic Library.
25 File: gmp.info, Node: Powering Algorithms, Next: Root Extraction Algorithms, Prev: Greatest Common Divisor Algorithms, Up: Algorithms
27 16.4 Powering Algorithms
28 ========================
32 * Normal Powering Algorithm::
33 * Modular Powering Algorithm::
36 File: gmp.info, Node: Normal Powering Algorithm, Next: Modular Powering Algorithm, Prev: Powering Algorithms, Up: Powering Algorithms
38 16.4.1 Normal Powering
39 ----------------------
41 Normal `mpz' or `mpf' powering uses a simple binary algorithm,
42 successively squaring and then multiplying by the base when a 1 bit is
43 seen in the exponent, as per Knuth section 4.6.3. The "left to right"
44 variant described there is used rather than algorithm A, since it's
45 just as easy and can be done with somewhat less temporary memory.
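
As an illustration only, the left to right scheme can be sketched in
Python on plain integers.  This is not GMP's C code, and the name
`pow_left_to_right' is hypothetical.

```python
def pow_left_to_right(base, exponent):
    """Left-to-right binary powering on Python integers."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    # Scan the exponent from its most significant bit down to bit 0.
    for bit in bin(exponent)[2:]:
        result *= result        # one squaring per exponent bit
        if bit == "1":
            result *= base      # extra multiply when a 1 bit is seen
    return result
```

Only `result' and the base are live at any time, which is the sense
in which this variant needs little temporary memory.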
48 File: gmp.info, Node: Modular Powering Algorithm, Prev: Normal Powering Algorithm, Up: Powering Algorithms
50 16.4.2 Modular Powering
51 -----------------------
53 Modular powering is implemented using a 2^k-ary sliding window
54 algorithm, as per "Handbook of Applied Cryptography" algorithm 14.85
55 (*note References::). k is chosen according to the size of the
56 exponent. Larger exponents use larger values of k, the choice being
57 made to minimize the average number of multiplications that must
58 supplement the squaring.
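
The following Python sketch shows the general shape of a 2^k-ary
sliding window exponentiation.  It uses a plain `%' reduction instead
of REDC, and a fixed default k=4 instead of a k chosen from the
exponent size; both are simplifying assumptions, and the function
name is hypothetical.

```python
def powmod_sliding_window(base, exponent, modulus, k=4):
    """Compute base**exponent % modulus with a sliding window of k bits."""
    if exponent < 0 or modulus <= 0 or k < 1:
        raise ValueError("need exponent >= 0, modulus > 0, k >= 1")
    base %= modulus
    # Precompute the odd powers base^1, base^3, ..., base^(2^k - 1).
    base_sq = base * base % modulus
    odd_powers = {1: base}
    for i in range(3, 1 << k, 2):
        odd_powers[i] = odd_powers[i - 2] * base_sq % modulus
    result = 1 % modulus
    i = exponent.bit_length() - 1
    while i >= 0:
        if not (exponent >> i) & 1:
            # A 0 bit costs just one squaring.
            result = result * result % modulus
            i -= 1
            continue
        # Window covers bits i..low, at most k wide, ending on a 1 bit.
        low = max(i - k + 1, 0)
        while not (exponent >> low) & 1:
            low += 1
        width = i - low + 1
        window = (exponent >> low) & ((1 << width) - 1)
        for _ in range(width):
            result = result * result % modulus
        result = result * odd_powers[window] % modulus
        i = low - 1
    return result
```

Because every window ends on a 1 bit its value is odd, so only the
odd powers need to be tabulated.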
60 The modular multiplies and squares use either a simple division or
61 the REDC method by Montgomery (*note References::). REDC is a little
62 faster, essentially saving N single limb divisions in a fashion similar
63 to an exact remainder (*note Exact Remainder::).
66 File: gmp.info, Node: Root Extraction Algorithms, Next: Radix Conversion Algorithms, Prev: Powering Algorithms, Up: Algorithms
68 16.5 Root Extraction Algorithms
69 ===============================
73 * Square Root Algorithm::
74 * Nth Root Algorithm::
75 * Perfect Square Algorithm::
76 * Perfect Power Algorithm::
79 File: gmp.info, Node: Square Root Algorithm, Next: Nth Root Algorithm, Prev: Root Extraction Algorithms, Up: Root Extraction Algorithms

16.5.1 Square Root
------------------

84 Square roots are taken using the "Karatsuba Square Root" algorithm by
85 Paul Zimmermann (*note References::).
An input n is split into four parts of k bits each, so with b=2^k we
have n = a3*b^3 + a2*b^2 + a1*b + a0.  Part a3 must be "normalized" so
that either the high or second highest bit is set.  In GMP, k is kept
on a limb boundary and the input is left shifted (by an even number of
bits) to normalize.

The square root of the high two parts is taken, by recursive
application of the algorithm (bottoming out in a one-limb Newton's
method),

     s1,r1 = sqrtrem (a3*b + a2)

This is an approximation to the desired root and is extended by a
division to give s,r,

     q,u = divrem (r1*b + a1, 2*s1)
     s = s1*b + q
     r = u*b + a0 - q^2

The normalization requirement on a3 means at this point s is either
correct or 1 too big.  r is negative in the latter case, so

     if r < 0 then
       r = r + 2*s - 1
       s = s - 1

113 The algorithm is expressed in a divide and conquer form, but as
114 noted in the paper it can also be viewed as a discrete variant of
115 Newton's method, or as a variation on the schoolboy method (no longer
116 taught) for square roots two digits at a time.
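
A compact Python sketch of the divide and conquer form follows.  It
works on bit counts rather than limbs, uses `math.isqrt' as a
stand-in for the small base case, and recomputes the remainder after
undoing the normalizing shift; these are assumptions made for
brevity, not a description of `mpn_sqrtrem'.

```python
import math

def sqrtrem(n):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < (1 << 64):
        s = math.isqrt(n)          # stand-in for the one-limb base case
        return s, n - s * s
    # Four parts of k bits; shift left by an even amount so the top part
    # has its high or second highest bit set.
    k = (n.bit_length() + 3) // 4
    shift = (4 * k - n.bit_length()) & ~1
    m = n << shift
    b = 1 << k
    a0 = m & (b - 1)
    a1 = (m >> k) & (b - 1)
    s1, r1 = sqrtrem(m >> (2 * k))  # root of the high two parts a3*b + a2
    q, u = divmod(r1 * b + a1, 2 * s1)
    s = s1 * b + q
    r = u * b + a0 - q * q
    if r < 0:                       # s was 1 too big
        r += 2 * s - 1
        s -= 1
    if shift:
        s >>= shift // 2            # undo the normalization
        r = n - s * s
    return s, r
```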
118 If the remainder r is not required then usually only a few high limbs
119 of r and u need to be calculated to determine whether an adjustment to
120 s is required. This optimization is not currently implemented.
122 In the Karatsuba multiplication range this algorithm is
123 O(1.5*M(N/2)), where M(n) is the time to multiply two numbers of n
124 limbs. In the FFT multiplication range this grows to a bound of
125 O(6*M(N/2)). In practice a factor of about 1.5 to 1.8 is found in the
126 Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
128 The algorithm does all its calculations in integers and the resulting
129 `mpn_sqrtrem' is used for both `mpz_sqrt' and `mpf_sqrt'. The extended
130 precision given by `mpf_sqrt_ui' is obtained by padding with zero limbs.
133 File: gmp.info, Node: Nth Root Algorithm, Next: Perfect Square Algorithm, Prev: Square Root Algorithm, Up: Root Extraction Algorithms

16.5.2 Nth Root
---------------

138 Integer Nth roots are taken using Newton's method with the following
139 iteration, where A is the input and n is the root to be taken.

     a[i+1] = (1/n) * ( A / a[i]^(n-1) + (n-1)*a[i] )

145 The initial approximation a[1] is generated bitwise by successively
146 powering a trial root with or without new 1 bits, aiming to be just
147 above the true root. The iteration converges quadratically when
148 started from a good approximation. When n is large more initial bits
149 are needed to get good convergence. The current implementation is not
150 particularly well optimized.
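
The iteration can be sketched in Python with floor divisions
throughout.  The starting value here is simply a power of two known to
be at or above the root, which is an assumption of this sketch and
cruder than the bitwise initial approximation described above.  The
function name is hypothetical.

```python
def integer_nth_root(A, n):
    """Return floor(A ** (1/n)) for A >= 0 and n >= 1."""
    if A < 0 or n < 1:
        raise ValueError("need A >= 0 and n >= 1")
    if A < 2 or n == 1:
        return A
    # 2^ceil(bits/n) is never below the true root.
    a = 1 << -(-A.bit_length() // n)
    while True:
        # a[i+1] = (A / a[i]^(n-1) + (n-1)*a[i]) / n, rounded down
        nxt = ((n - 1) * a + A // a ** (n - 1)) // n
        if nxt >= a:
            return a        # no further decrease: a is the floor root
        a = nxt
```

Starting above the root, the iterates decrease until they reach the
floor of the true root, which is why the first non-decreasing step
ends the loop.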
153 File: gmp.info, Node: Perfect Square Algorithm, Next: Perfect Power Algorithm, Prev: Nth Root Algorithm, Up: Root Extraction Algorithms
155 16.5.3 Perfect Square
156 ---------------------
158 A significant fraction of non-squares can be quickly identified by
159 checking whether the input is a quadratic residue modulo small integers.
`mpz_perfect_square_p' first tests the input mod 256, which means
just examining the low byte.  Only 44 different values occur for
squares mod 256, so 82.8% of inputs can be immediately identified as
non-squares.

On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17,
for a total of 99.25% of inputs identified as non-squares.  On a 64-bit
system 97 is tested too, for a total of 99.62%.

170 These moduli are chosen because they're factors of 2^24-1 (or 2^48-1
171 for 64-bits), and such a remainder can be quickly taken just using
172 additions (see `mpn_mod_34lsub1').
174 When nails are in use moduli are instead selected by the `gen-psqr.c'
175 program and applied with an `mpn_mod_1'. The same 2^24-1 or 2^48-1
176 could be done with nails using some extra bit shifts, but this is not
177 currently implemented.
179 In any case each modulus is applied to the `mpn_mod_34lsub1' or
180 `mpn_mod_1' remainder and a table lookup identifies non-squares. By
181 using a "modexact" style calculation, and suitably permuted tables,
182 just one multiply each is required, see the code for details. Moduli
183 are also combined to save operations, so long as the lookup tables
184 don't become too big. `gen-psqr.c' does all the pre-calculations.
A square root must still be taken for any value that passes these
tests, to verify it's really a square and not one of the small fraction
of non-squares that get through (ie. a pseudo-square to all the tested
moduli).

Clearly more residue tests could be done; `mpz_perfect_square_p' only
uses a compact and efficient set.  Big inputs would probably benefit
from more residue testing, while small inputs might be better off with
less.
194 The assumed distribution of squares versus non-squares in the input
195 would affect such considerations.
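
The residue filtering and the quoted percentages can be reproduced
with a short Python sketch.  It takes each remainder with a plain `%'
rather than through a single 2^24-1 remainder and permuted tables, so
it shows the idea only; all names are hypothetical.

```python
import math

# Residues that squares can take modulo 256 and the 32-bit small moduli.
SQUARES_MOD_256 = frozenset(i * i % 256 for i in range(256))
SMALL_MODULI = (9, 5, 7, 13, 17)
RESIDUES = {m: frozenset(i * i % m for i in range(m)) for m in SMALL_MODULI}

def is_perfect_square(n):
    """Residue tests first, then a real square root for the survivors."""
    if n < 0:
        return False
    if (n & 0xFF) not in SQUARES_MOD_256:   # just the low byte
        return False
    for m, residues in RESIDUES.items():
        if n % m not in residues:
            return False
    s = math.isqrt(n)
    return s * s == n

def rejected_fraction():
    """Fraction of inputs rejected by the residue tests alone."""
    passing = len(SQUARES_MOD_256) / 256
    for m, residues in RESIDUES.items():
        passing *= len(residues) / m
    return 1 - passing
```

Since the moduli are pairwise coprime the passing fractions multiply,
which gives the 99.25% figure for the 32-bit set.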
198 File: gmp.info, Node: Perfect Power Algorithm, Prev: Perfect Square Algorithm, Up: Root Extraction Algorithms

16.5.4 Perfect Power
--------------------

203 Detecting perfect powers is required by some factorization algorithms.
204 Currently `mpz_perfect_power_p' is implemented using repeated Nth root
205 extractions, though naturally only prime roots need to be considered.
206 (*Note Nth Root Algorithm::.)
If a prime divisor p with multiplicity e can be found, then only
roots which are divisors of e need to be considered, much reducing the
work necessary.  To this end divisibility by a set of small primes is
checked.

214 File: gmp.info, Node: Radix Conversion Algorithms, Next: Other Algorithms, Prev: Root Extraction Algorithms, Up: Algorithms
216 16.6 Radix Conversion
217 =====================
Radix conversions are less important than other algorithms.  A program
dominated by conversions should probably use a different data
representation.

229 File: gmp.info, Node: Binary to Radix, Next: Radix to Binary, Prev: Radix Conversion Algorithms, Up: Radix Conversion Algorithms
231 16.6.1 Binary to Radix
232 ----------------------
234 Conversions from binary to a power-of-2 radix use a simple and fast
235 O(N) bit extraction algorithm.
237 Conversions from binary to other radices use one of two algorithms.
238 Sizes below `GET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method.
239 Repeated divisions by b^n are made, where b is the radix and n is the
240 biggest power that fits in a limb. But instead of simply using the
241 remainder r from such divisions, an extra divide step is done to give a
242 fractional limb representing r/b^n. The digits of r can then be
243 extracted using multiplications by b rather than divisions. Special
244 case code is provided for decimal, allowing multiplications by 10 to
245 optimize to shifts and adds.
247 Above `GET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is
248 used. For an input t, powers b^(n*2^i) of the radix are calculated,
249 until a power between t and sqrt(t) is reached. t is then divided by
250 that largest power, giving a quotient which is the digits above that
251 power, and a remainder which is those below. These two parts are in
252 turn divided by the second highest power, and so on recursively. When
253 a piece has been divided down to less than `GET_STR_DC_THRESHOLD'
254 limbs, the basecase algorithm described above is used.
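
A Python sketch of the sub-quadratic scheme is given below.  The
`chunk' parameter plays the role of n (digits per limb), the recursion
runs all the way down to a single chunk rather than stopping at a
tuned threshold, and the base case uses plain divisions instead of the
fractional-limb trick.  These choices and all names are assumptions
made for illustration.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def basecase(t, base, width):
    """Simple quadratic conversion, zero padded to `width' digits."""
    out = []
    while t:
        t, d = divmod(t, base)
        out.append(DIGITS[d])
    return "".join(reversed(out)).rjust(width, "0")

def to_string(t, base=10, chunk=9):
    """Convert t >= 0 to a digit string by recursive big divisions."""
    if t < 0 or not 2 <= base <= 36:
        raise ValueError("need t >= 0 and 2 <= base <= 36")
    if t == 0:
        return "0"
    # powers[i] = base^(chunk * 2^i), up to the first power above sqrt(t)
    powers = [base ** chunk]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def convert(x, level, width):
        # Invariant: x < base^width and width == chunk * 2^(level+1)
        if level < 0:
            return basecase(x, base, width)
        high, low = divmod(x, powers[level])
        half = width // 2
        return convert(high, level - 1, half) + convert(low, level - 1, half)

    top = len(powers) - 1
    return convert(t, top, chunk << (top + 1)).lstrip("0")
```

Each level of recursion divides by the next lower power, so both the
quotient and remainder pieces are converted with the same table of
precomputed powers.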
256 The advantage of this algorithm is that big divisions can make use
257 of the sub-quadratic divide and conquer division (*note Divide and
258 Conquer Division::), and big divisions tend to have less overheads than
259 lots of separate single limb divisions anyway. But in any case the
260 cost of calculating the powers b^(n*2^i) must first be overcome.
`GET_STR_PRECOMPUTE_THRESHOLD' and `GET_STR_DC_THRESHOLD' represent
the same basic thing, the point where it becomes worth doing a big
division to cut the input in half.  `GET_STR_PRECOMPUTE_THRESHOLD'
includes the cost of calculating the radix power required, whereas
`GET_STR_DC_THRESHOLD' assumes that's already available, which is the
case when recursing.

269 Since the base case produces digits from least to most significant
270 but they want to be stored from most to least, it's necessary to
271 calculate in advance how many digits there will be, or at least be sure
272 not to underestimate that. For GMP the number of input bits is
273 multiplied by `chars_per_bit_exactly' from `mp_bases', rounding up.
274 The result is either correct or one too big.
276 Examining some of the high bits of the input could increase the
277 chance of getting the exact number of digits, but an exact result every
278 time would not be practical, since in general the difference between
279 numbers 100... and 99... is only in the last few bits and the work to
280 identify 99... might well be almost as much as a full conversion.
`mpf_get_str' doesn't currently use the algorithm described here.  It
multiplies or divides by a power of b to move the radix point to
just above the highest non-zero digit (or at worst one above that
location), then multiplies by b^n to bring out digits.  This is O(N^2)
and is certainly not optimal.
The r/b^n scheme described above for using multiplications to bring
out digits might be useful for more than a single limb.  Some brief
experiments with it on the base case when recursing didn't give a
noticeable improvement, but perhaps that was only due to the
implementation.  Something similar would work for the sub-quadratic
divisions too, though there would be the cost of calculating a bigger
radix power.

296 Another possible improvement for the sub-quadratic part would be to
297 arrange for radix powers that balanced the sizes of quotient and
remainder produced, ie. the highest power would be a b^(n*k)
299 approximately equal to sqrt(t), not restricted to a 2^i factor. That
300 ought to smooth out a graph of times against sizes, but may or may not
304 File: gmp.info, Node: Radix to Binary, Prev: Binary to Radix, Up: Radix Conversion Algorithms
306 16.6.2 Radix to Binary
307 ----------------------
309 *This section needs to be rewritten, it currently describes the
310 algorithms used before GMP 4.3.*
312 Conversions from a power-of-2 radix into binary use a simple and fast
313 O(N) bitwise concatenation algorithm.
315 Conversions from other radices use one of two algorithms. Sizes
316 below `SET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method. Groups
317 of n digits are converted to limbs, where n is the biggest power of the
318 base b which will fit in a limb, then those groups are accumulated into
319 the result by multiplying by b^n and adding. This saves
320 multi-precision operations, as per Knuth section 4.4 part E (*note
321 References::). Some special case code is provided for decimal, giving
322 the compiler a chance to optimize multiplications by 10.
Above `SET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is
used.  First groups of n digits are converted into limbs.  Then adjacent
limbs are combined into limb pairs with x*b^n+y, where x and y are the
limbs.  Adjacent limb pairs are combined into quads similarly with
x*b^(2n)+y.  This continues until a single block remains, that being
the result.

331 The advantage of this method is that the multiplications for each x
332 are big blocks, allowing Karatsuba and higher algorithms to be used.
333 But the cost of calculating the powers b^(n*2^i) must be overcome.
334 `SET_STR_PRECOMPUTE_THRESHOLD' usually ends up quite big, around 5000
335 digits, and on some processors much bigger still.
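
The pairwise combining scheme described in this section can be
sketched in Python as follows.  Python's `int' is used as a stand-in
for the conversion of each group of digits to a limb, and `chunk'
stands for n; both, like the function name, are assumptions of the
sketch.

```python
def from_string(digits, base=10, chunk=9):
    """Convert a non-empty digit string by combining adjacent blocks."""
    if not digits:
        raise ValueError("digit string must not be empty")
    # Groups of `chunk' digits, least significant group first.
    blocks = [int(digits[max(i - chunk, 0):i], base)
              for i in range(len(digits), 0, -chunk)]
    power = base ** chunk
    while len(blocks) > 1:
        merged = []
        # Combine neighbours as x*power + y, x being the more significant.
        for j in range(0, len(blocks) - 1, 2):
            merged.append(blocks[j + 1] * power + blocks[j])
        if len(blocks) % 2:
            merged.append(blocks[-1])   # odd block out moves up unchanged
        blocks = merged
        power *= power                  # b^n, b^(2n), b^(4n), ...
    return blocks[0]
```

Each pass halves the number of blocks and doubles their size, so the
later multiplications are between large operands.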
337 `SET_STR_PRECOMPUTE_THRESHOLD' is based on the input digits (and
338 tuned for decimal), though it might be better based on a limb count, so
339 as to be independent of the base. But that sort of count isn't used by
340 the base case and so would need some sort of initial calculation or
343 The main reason `SET_STR_PRECOMPUTE_THRESHOLD' is so much bigger
344 than the corresponding `GET_STR_PRECOMPUTE_THRESHOLD' is that
345 `mpn_mul_1' is much faster than `mpn_divrem_1' (often by a factor of 5,
349 File: gmp.info, Node: Other Algorithms, Next: Assembly Coding, Prev: Radix Conversion Algorithms, Up: Algorithms
351 16.7 Other Algorithms
352 =====================
356 * Prime Testing Algorithm::
357 * Factorial Algorithm::
358 * Binomial Coefficients Algorithm::
359 * Fibonacci Numbers Algorithm::
360 * Lucas Numbers Algorithm::
361 * Random Number Algorithms::
364 File: gmp.info, Node: Prime Testing Algorithm, Next: Factorial Algorithm, Prev: Other Algorithms, Up: Other Algorithms

16.7.1 Prime Testing
--------------------

369 The primality testing in `mpz_probab_prime_p' (*note Number Theoretic
370 Functions::) first does some trial division by small factors and then
371 uses the Miller-Rabin probabilistic primality testing algorithm, as
372 described in Knuth section 4.5.4 algorithm P (*note References::).
For an odd input n, and with n = q*2^k+1 where q is odd, this
algorithm selects a random base x and tests whether x^q mod n is 1 or
-1, or an x^(q*2^j) mod n is -1, for 1<=j<k.  If so then n is probably
prime, if not then n is definitely composite.
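
A Python sketch of the test is shown below, in the standard form where
the repeated squares of x^q are compared against -1.  The list of
trial divisors, the default of 25 rounds and the function names are
assumptions chosen for the example, not values taken from
`mpz_probab_prime_p'.

```python
import random

def miller_rabin_round(n, x):
    """One round for odd n > 2: True if n is a probable prime to base x."""
    q, k = n - 1, 0
    while q % 2 == 0:           # write n - 1 = q * 2^k with q odd
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):      # x^(q*2^j) for j = 1 .. k-1
        y = y * y % n
        if y == n - 1:
            return True
    return False

def is_probable_prime(n, rounds=25, rng=random):
    """Trial division by a few small primes, then random-base rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    return all(miller_rabin_round(n, rng.randrange(2, n - 1))
               for _ in range(rounds))
```

For example 2047 = 23*89 passes a round with base 2 but fails with
base 3, which is why several independent bases are tried.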
379 Any prime n will pass the test, but some composites do too. Such
380 composites are known as strong pseudoprimes to base x. No n is a
381 strong pseudoprime to more than 1/4 of all bases (see Knuth exercise
382 22), hence with x chosen at random there's no more than a 1/4 chance a
383 "probable prime" will in fact be composite.
385 In fact strong pseudoprimes are quite rare, making the test much more
386 powerful than this analysis would suggest, but 1/4 is all that's proven
390 File: gmp.info, Node: Factorial Algorithm, Next: Binomial Coefficients Algorithm, Prev: Prime Testing Algorithm, Up: Other Algorithms

16.7.2 Factorial
----------------

Factorials are calculated by a combination of removal of twos,
powering, and binary splitting.  The procedure can be best illustrated
with an example,

399 23! = 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23
401 has factors of two removed,
403 23! = 2^19.1.1.3.1.5.3.7.1.9.5.11.3.13.7.15.1.17.9.19.5.21.11.23
405 and the resulting terms collected up according to their multiplicity,
407 23! = 2^19.(3.5)^3.(7.9.11)^2.(13.15.17.19.21.23)
409 Each sequence such as 13.15.17.19.21.23 is evaluated by splitting
410 into every second term, as for instance (13.17.21).(15.19.23), and the
411 same recursively on each half. This is implemented iteratively using
414 Such splitting is more efficient than repeated Nx1 multiplies since
415 it forms big multiplies, allowing Karatsuba and higher algorithms to be
416 used. And even below the Karatsuba threshold a big block of work can
417 be more efficient for the basecase algorithm.
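
The whole procedure can be sketched in Python.  The grouping used is
that the odd numbers in the interval (n/2^(j+1), n/2^j] occur with
multiplicity j+1, which reproduces the 23! example above.  The
splitting is written recursively here for clarity, and the names are
hypothetical.

```python
def product_every_second(terms):
    """Binary splitting, sending every second term to each half."""
    if not terms:
        return 1
    if len(terms) == 1:
        return terms[0]
    return product_every_second(terms[0::2]) * product_every_second(terms[1::2])

def factorial(n):
    """n! from odd-number sequences raised to powers, times a power of 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    j = 0
    while (n >> j) >= 3:
        high = n >> j
        low = n >> (j + 1)
        # Odd numbers in (low, high], each contributing j+1 times.
        odds = list(range(low + 1 + (low % 2), high + 1, 2))
        result *= product_every_second(odds) ** (j + 1)
        j += 1
    # The exponent of 2 in n! is n minus the number of 1 bits in n.
    twos = n - bin(n).count("1")
    return result << twos
```

For n=23 the loop produces 13.15.17.19.21.23, then (7.9.11)^2, then
(3.5)^3, and the final shift supplies 2^19.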
Splitting into subsequences of every second term keeps the resulting
products more nearly equal in size than would the simpler approach of
say taking the first half and second half of the sequence.  Nearly
equal products are more efficient for the current multiply
implementation.

426 File: gmp.info, Node: Binomial Coefficients Algorithm, Next: Fibonacci Numbers Algorithm, Prev: Factorial Algorithm, Up: Other Algorithms
428 16.7.3 Binomial Coefficients
429 ----------------------------
431 Binomial coefficients C(n,k) are calculated by first arranging k <= n/2
432 using C(n,k) = C(n,n-k) if necessary, and then evaluating the following
433 product simply from i=2 to i=k.

     C(n,k) = (n-k+1) * prod(i=2..k) (n-k+i)/i

439 It's easy to show that each denominator i will divide the product so
440 far, so the exact division algorithm is used (*note Exact Division::).
442 The numerators n-k+i and denominators i are first accumulated into
443 as many fit a limb, to save multi-precision operations, though for
444 `mpz_bin_ui' this applies only to the divisors, since n is an `mpz_t'
445 and n-k+i in general won't fit in a limb at all.
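
The product can be sketched directly in Python.  This version
multiplies and divides one term at a time, without accumulating
numerators and denominators into limbs first, so it illustrates only
the exactness of each division.  The function name is hypothetical.

```python
def binomial(n, k):
    """C(n,k) by the running product, with an exact division per step."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)           # arrange k <= n/2 using C(n,k) = C(n,n-k)
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        result *= n - k + i
        # After this multiply the product is i * C(n-k+i, i), so the
        # division below leaves no remainder.
        result //= i
    return result
```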
448 File: gmp.info, Node: Fibonacci Numbers Algorithm, Next: Lucas Numbers Algorithm, Prev: Binomial Coefficients Algorithm, Up: Other Algorithms
450 16.7.4 Fibonacci Numbers
451 ------------------------
453 The Fibonacci functions `mpz_fib_ui' and `mpz_fib2_ui' are designed for
454 calculating isolated F[n] or F[n],F[n-1] values efficiently.
456 For small n, a table of single limb values in `__gmp_fib_table' is
457 used. On a 32-bit limb this goes up to F[47], or on a 64-bit limb up
458 to F[93]. For convenience the table starts at F[-1].
460 Beyond the table, values are generated with a binary powering
461 algorithm, calculating a pair F[n] and F[n-1] working from high to low
462 across the bits of n. The formulas used are
464 F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
465 F[2k-1] = F[k]^2 + F[k-1]^2
467 F[2k] = F[2k+1] - F[2k-1]
469 At each step, k is the high b bits of n. If the next bit of n is 0
470 then F[2k],F[2k-1] is used, or if it's a 1 then F[2k+1],F[2k] is used,
471 and the process repeated until all bits of n are incorporated. Notice
472 these formulas require just two squares per bit of n.
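
A Python sketch of this high to low process follows, starting from
k=1 (the top bit of n) rather than from a table entry.  The name
`fib2' and the restriction to n >= 1 are assumptions of the sketch.

```python
def fib2(n):
    """Return (F[n], F[n-1]) for n >= 1, two squares per bit of n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    fk, fk1 = 1, 0              # F[1], F[0]: k is the leading 1 bit of n
    k = 1
    for bit in bin(n)[3:]:      # remaining bits, high to low
        sq, sq1 = fk * fk, fk1 * fk1
        f2k_plus1 = 4 * sq - sq1 + (2 if k % 2 == 0 else -2)
        f2k_minus1 = sq + sq1
        f2k = f2k_plus1 - f2k_minus1
        if bit == "1":
            fk, fk1, k = f2k_plus1, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus1, 2 * k
    return fk, fk1
```

For instance `fib2(10)' passes through k = 1, 2, 5, 10 and returns
(55, 34).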
474 It'd be possible to handle the first few n above the single limb
475 table with simple additions, using the defining Fibonacci recurrence
476 F[k+1]=F[k]+F[k-1], but this is not done since it usually turns out to
477 be faster for only about 10 or 20 values of n, and including a block of
478 code for just those doesn't seem worthwhile. If they really mattered
479 it'd be better to extend the data table.
481 Using a table avoids lots of calculations on small numbers, and
482 makes small n go fast. A bigger table would make more small n go fast,
483 it's just a question of balancing size against desired speed. For GMP
484 the code is kept compact, with the emphasis primarily on a good
487 `mpz_fib2_ui' returns both F[n] and F[n-1], but `mpz_fib_ui' is only
488 interested in F[n]. In this case the last step of the algorithm can
489 become one multiply instead of two squares. One of the following two
490 formulas is used, according as n is odd or even.
492 F[2k] = F[k]*(F[k]+2F[k-1])
494 F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
496 F[2k+1] here is the same as above, just rearranged to be a multiply.
497 For interest, the 2*(-1)^k term both here and above can be applied
498 just to the low limb of the calculation, without a carry or borrow into
499 further limbs, which saves some code size. See comments with
500 `mpz_fib_ui' and the internal `mpn_fib2_ui' for how this is done.
503 File: gmp.info, Node: Lucas Numbers Algorithm, Next: Random Number Algorithms, Prev: Fibonacci Numbers Algorithm, Up: Other Algorithms

16.7.5 Lucas Numbers
--------------------

508 `mpz_lucnum2_ui' derives a pair of Lucas numbers from a pair of
509 Fibonacci numbers with the following simple formulas.
511 L[k] = F[k] + 2*F[k-1]
512 L[k-1] = 2*F[k] - F[k-1]
`mpz_lucnum_ui' is only interested in L[n], and some work can be
saved.  Trailing zero bits on n can be handled with a single square
each.

518 L[2k] = L[k]^2 - 2*(-1)^k
520 And the lowest 1 bit can be handled with one multiply of a pair of
521 Fibonacci numbers, similar to what `mpz_fib_ui' does.
523 L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
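
The two formulas for a single L[n] can be combined in a Python sketch.
The Fibonacci pair is obtained here by plain iteration as a stand-in
for the powering algorithm, and the function names are hypothetical.

```python
def fib_pair(k):
    """(F[k], F[k-1]) by simple iteration from F[0]=0, F[-1]=1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = a + b, a
    return a, b

def lucnum(n):
    """L[n] using one multiply for the odd part and a square per zero bit."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 2
    zeros = (n & -n).bit_length() - 1   # trailing zero bits of n
    m = n >> zeros                      # odd part, m = 2k+1
    k = m // 2
    fk, fk1 = fib_pair(k)
    sign = 1 if k % 2 == 0 else -1
    value = 5 * fk1 * (2 * fk + fk1) - 4 * sign     # L[2k+1]
    j = m
    for _ in range(zeros):
        # L[2j] = L[j]^2 - 2*(-1)^j
        value = value * value - (2 if j % 2 == 0 else -2)
        j *= 2
    return value
```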
526 File: gmp.info, Node: Random Number Algorithms, Prev: Lucas Numbers Algorithm, Up: Other Algorithms
528 16.7.6 Random Numbers
529 ---------------------
For the `urandomb' functions, random numbers are generated simply by
concatenating bits produced by the generator.  As long as the generator
has good randomness properties this will produce well-distributed N bit
numbers.

536 For the `urandomm' functions, random numbers in a range 0<=R<N are
537 generated by taking values R of ceil(log2(N)) bits each until one
538 satisfies R<N. This will normally require only one or two attempts,
539 but the attempts are limited in case the generator is somehow
540 degenerate and produces only 1 bits or similar.
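
The rejection scheme can be sketched in Python as below.  The limit of
80 attempts and the final fallback of subtracting N are assumptions
made so the example always terminates with a value in range; the
`getbits' parameter is a hypothetical hook standing in for the
underlying generator.

```python
import random

def urandomm(n, getbits=random.getrandbits, max_attempts=80):
    """Return R with 0 <= R < n by drawing ceil(log2(n)) bits at a time."""
    if n <= 0:
        raise ValueError("n must be positive")
    nbits = (n - 1).bit_length()        # ceil(log2(n))
    if nbits == 0:
        return 0                        # n == 1 leaves only R = 0
    r = 0
    for _ in range(max_attempts):
        r = getbits(nbits)
        if r < n:
            return r
    # Degenerate generator: r < 2^nbits < 2*n, so r - n is in range.
    return r - n
```

Since N is more than half of 2^ceil(log2(N)), each draw succeeds with
probability above 1/2.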
The Mersenne Twister generator is by Matsumoto and Nishimura (*note
References::).  It has a non-repeating period of 2^19937-1, which is a
Mersenne prime, hence the name of the generator.  The state is 624
words of 32-bits each, which is iterated with one XOR and shift for each
32-bit word generated, making the algorithm very fast.  Randomness
properties are also very good and this is the default algorithm used by
`gmp_randinit_default'.

550 Linear congruential generators are described in many text books, for
551 instance Knuth volume 2 (*note References::). With a modulus M and
parameters A and C, an integer state S is iterated by the formula S <-
553 A*S+C mod M. At each step the new state is a linear function of the
554 previous, mod M, hence the name of the generator.
556 In GMP only moduli of the form 2^N are supported, and the current
557 implementation is not as well optimized as it could be. Overheads are
558 significant when N is small, and when N is large clearly the multiply
559 at each step will become slow. This is not a big concern, since the
560 Mersenne Twister generator is better in every respect and is therefore
561 recommended for all normal applications.
563 For both generators the current state can be deduced by observing
564 enough output and applying some linear algebra (over GF(2) in the case
565 of the Mersenne Twister). This generally means raw output is
566 unsuitable for cryptographic applications without further hashing or
570 File: gmp.info, Node: Assembly Coding, Prev: Other Algorithms, Up: Algorithms

16.8 Assembly Coding
====================

575 The assembly subroutines in GMP are the most significant source of
576 speed at small to moderate sizes. At larger sizes algorithm selection
577 becomes more important, but of course speedups in low level routines
578 will still speed up everything proportionally.
580 Carry handling and widening multiplies that are important for GMP
581 can't be easily expressed in C. GCC `asm' blocks help a lot and are
582 provided in `longlong.h', but hand coding low level routines invariably
583 offers a speedup over generic C by a factor of anything from 2 to 10.
587 * Assembly Code Organisation::
* Assembly Basics::
589 * Assembly Carry Propagation::
590 * Assembly Cache Handling::
591 * Assembly Functional Units::
592 * Assembly Floating Point::
593 * Assembly SIMD Instructions::
594 * Assembly Software Pipelining::
595 * Assembly Loop Unrolling::
596 * Assembly Writing Guide::
599 File: gmp.info, Node: Assembly Code Organisation, Next: Assembly Basics, Prev: Assembly Coding, Up: Assembly Coding
601 16.8.1 Code Organisation
602 ------------------------
604 The various `mpn' subdirectories contain machine-dependent code, written
605 in C or assembly. The `mpn/generic' subdirectory contains default code,
606 used when there's no machine-specific version of a particular file.
Each `mpn' subdirectory is for an ISA family.  Generally 32-bit and
64-bit variants in a family cannot share code and have separate
directories.  Within a family further subdirectories may exist for CPU
variants.

613 In each directory a `nails' subdirectory may exist, holding code with
614 nails support for that CPU variant. A `NAILS_SUPPORT' directive in each
615 file indicates the nails values the code handles. Nails code only
616 exists where it's faster, or promises to be faster, than plain code.
617 There's no effort put into nails if they're not going to enhance a
621 File: gmp.info, Node: Assembly Basics, Next: Assembly Carry Propagation, Prev: Assembly Code Organisation, Up: Assembly Coding
623 16.8.2 Assembly Basics
624 ----------------------
626 `mpn_addmul_1' and `mpn_submul_1' are the most important routines for
627 overall GMP performance. All multiplications and divisions come down to
628 repeated calls to these. `mpn_add_n', `mpn_sub_n', `mpn_lshift' and
629 `mpn_rshift' are next most important.
631 On some CPUs assembly versions of the internal functions
632 `mpn_mul_basecase' and `mpn_sqr_basecase' give significant speedups,
633 mainly through avoiding function call overheads. They can also
634 potentially make better use of a wide superscalar processor, as can
635 bigger primitives like `mpn_addmul_2' or `mpn_addmul_4'.
637 The restrictions on overlaps between sources and destinations (*note
638 Low-level Functions::) are designed to facilitate a variety of
639 implementations. For example, knowing `mpn_add_n' won't have partly
640 overlapping sources and destination means reading can be done far ahead
641 of writing on superscalar processors, and loops can be vectorized on a
642 vector processor, depending on the carry handling.
645 File: gmp.info, Node: Assembly Carry Propagation, Next: Assembly Cache Handling, Prev: Assembly Basics, Up: Assembly Coding
647 16.8.3 Carry Propagation
648 ------------------------
650 The problem that presents most challenges in GMP is propagating carries
651 from one limb to the next. In functions like `mpn_addmul_1' and
652 `mpn_add_n', carries are the only dependencies between limb operations.
654 On processors with carry flags, a straightforward CISC style `adc' is
655 generally best. AMD K6 `mpn_addmul_1' however is an example of an
656 unusual set of circumstances where a branch works out better.
On RISC processors generally an add and compare for overflow is
used.  This sort of thing can be seen in `mpn/generic/aors_n.c'.  Some
carry propagation schemes require 4 instructions, meaning at least 4
cycles per limb, but other schemes may use just 1 or 2.  On wide
superscalar processors performance may be completely determined by the
number of dependent instructions between carry-in and carry-out for
each limb.
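
The add and compare idiom can be modelled in Python by masking to a
limb width.  The 64-bit limb size and the function name are
assumptions, and the code models the arithmetic only, not its
instruction scheduling.

```python
MASK = (1 << 64) - 1    # assumed limb width of 64 bits

def add_n(xs, ys):
    """Add equal-length limb lists (least significant first).

    Returns (sum_limbs, carry_out), detecting each carry by comparing
    a wrapped sum against one of its operands, with no carry flag.
    """
    if len(xs) != len(ys):
        raise ValueError("operands must have the same number of limbs")
    out, carry = [], 0
    for x, y in zip(xs, ys):
        s = (x + y) & MASK
        c1 = s < x                  # x + y wrapped around
        s2 = (s + carry) & MASK
        c2 = s2 < s                 # adding the carry-in wrapped around
        out.append(s2)
        carry = int(c1 or c2)       # the two can never both be set
    return out, carry
```

The chain from carry-in to carry-out in each iteration is the add of
`carry', the compare giving `c2' and the final OR, which is the kind
of dependency the paragraph above refers to.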
666 On vector processors good use can be made of the fact that a carry
667 bit only very rarely propagates more than one limb. When adding a
668 single bit to a limb, there's only a carry out if that limb was
669 `0xFF...FF' which on random data will be only 1 in 2^mp_bits_per_limb.
670 `mpn/cray/add_n.c' is an example of this, it adds all limbs in
671 parallel, adds one set of carry bits in parallel and then only rarely
672 needs to fall through to a loop propagating further carries.
On the x86s, GCC (as of version 2.95.2) doesn't generate
particularly good code for the RISC style idioms that are necessary to
handle carry bits in C.  Often conditional jumps are generated where
`adc' or `sbb' forms would be better.  And so unfortunately almost any
loop involving carry bits needs to be coded in assembly for best
results.

682 File: gmp.info, Node: Assembly Cache Handling, Next: Assembly Functional Units, Prev: Assembly Carry Propagation, Up: Assembly Coding
684 16.8.4 Cache Handling
685 ---------------------
687 GMP aims to perform well both on operands that fit entirely in L1 cache
688 and those which don't.
690 Basic routines like `mpn_add_n' or `mpn_lshift' are often used on
691 large operands, so L2 and main memory performance is important for them.
692 `mpn_mul_1' and `mpn_addmul_1' are mostly used for multiply and square
693 basecases, so L1 performance matters most for them, unless assembly
694 versions of `mpn_mul_basecase' and `mpn_sqr_basecase' exist, in which
695 case the remaining uses are mostly for larger operands.
697 For L2 or main memory operands, memory access times will almost
698 certainly be more than the calculation time. The aim therefore is to
699 maximize memory throughput, by starting a load of the next cache line
700 while processing the contents of the previous one. Clearly this is
701 only possible if the chip has a lock-up free cache or some sort of
702 prefetch instruction. Most current chips have both these features.
704 Prefetching sources combines well with loop unrolling, since a
705 prefetch can be initiated once per unrolled loop (or more than once if
706 the loop covers more than one cache line).
708 On CPUs without write-allocate caches, prefetching destinations will
709 ensure individual stores don't go further down the cache hierarchy,
710 limiting bandwidth. Of course for calculations which are slow anyway,
711 like `mpn_divrem_1', write-throughs might be fine.
713 The distance ahead to prefetch will be determined by memory latency
714 versus throughput. The aim of course is to have data arriving
715 continuously, at peak throughput. Some CPUs have limits on the number
716 of fetches or prefetches in progress.
If a special prefetch instruction doesn't exist then a plain load
can be used, but in that case care must be taken not to attempt to read
past the end of an operand, since that might produce a segmentation
violation.

723 Some CPUs or systems have hardware that detects sequential memory
724 accesses and initiates suitable cache movements automatically, making
728 File: gmp.info, Node: Assembly Functional Units, Next: Assembly Floating Point, Prev: Assembly Cache Handling, Up: Assembly Coding
730 16.8.5 Functional Units
731 -----------------------
733 When choosing an approach for an assembly loop, consideration is given
734 to what operations can execute simultaneously and what throughput can
735 thereby be achieved. In some cases an algorithm can be tweaked to
736 accommodate available resources.
Loop control will generally require a counter and pointer updates,
costing as much as 5 instructions, plus any delays a branch introduces.
CPU addressing modes might reduce pointer updates, perhaps by allowing
just one updating pointer and others expressed as offsets from it, or
on CISC chips with all addressing done with the loop counter as a
scaled index.

745 The final loop control cost can be amortised by processing several
746 limbs in each iteration (*note Assembly Loop Unrolling::). This at
747 least ensures loop control isn't a big fraction of the work done.
749 Memory throughput is always a limit. If perhaps only one load or
750 one store can be done per cycle then 3 cycles/limb will be the top speed
751 for "binary" operations like `mpn_add_n', and any code achieving that
754 Integer resources can be freed up by having the loop counter in a
755 float register, or by pressing the float units into use for some
756 multiplying, perhaps doing every second limb on the float side (*note
757 Assembly Floating Point::).
759 Float resources can be freed up by doing carry propagation on the
760 integer side, or even by doing integer to float conversions in integers
764 File: gmp.info, Node: Assembly Floating Point, Next: Assembly SIMD Instructions, Prev: Assembly Functional Units, Up: Assembly Coding
766 16.8.6 Floating Point
767 ---------------------
769 Floating point arithmetic is used in GMP for multiplications on CPUs
770 with poor integer multipliers. It's mostly useful for `mpn_mul_1',
771 `mpn_addmul_1' and `mpn_submul_1' on 64-bit machines, and
772 `mpn_mul_basecase' on both 32-bit and 64-bit machines.
774 With IEEE 53-bit double precision floats, integer multiplications
775 producing up to 53 bits will give exact results. Breaking a 64x64
776 multiplication into eight 16x32->48 bit pieces is convenient. With
777 some care though six 21x32->53 bit products can be used, if one of the
778 lower two 21-bit pieces also uses the sign bit.
780 For the `mpn_mul_1' family of functions on a 64-bit machine, the
781 invariant single limb is split at the start, into 3 or 4 pieces.
782 Inside the loop, the bignum operand is split into 32-bit pieces. Fast
783 conversion of these unsigned 32-bit pieces to floating point is highly
784 machine-dependent. In some cases, reading the data into the integer
785 unit, zero-extending to 64 bits, then transferring to the floating
786 point unit via memory is the only option.
788 Converting partial products back to 64-bit limbs is usually best
789 done as a signed conversion. Since all values are smaller than 2^53,
790 signed and unsigned are the same, but most processors lack unsigned
795 Here is a diagram showing 16x32 bit products for an `mpn_mul_1' or
796 `mpn_addmul_1' with a 64-bit limb. The single limb operand V is split
797 into four 16-bit parts. The multi-limb operand U is split in the loop
798 into two 32-bit parts.
801 |v48|v32|v16|v00| V operand
805 x | u32 | u00 | U operand (one limb)
808 ---------------------------------
811 | u00 x v00 | p00 48-bit products
835 p32 and r32 can be summed using floating-point addition, and
836 likewise p48 and r48. p00 and p16 can be summed with r64 and r80 from
837 the previous iteration.
839 For each loop then, four 49-bit quantities are transferred to the
840 integer unit, aligned as follows,
842 |-----64bits----|-----64bits----|
856 The challenge then is to sum these efficiently and add in a carry
857 limb, generating a low 64-bit result limb and a high 33-bit carry limb
858 (i48 extends 33 bits into the high half).
861 File: gmp.info, Node: Assembly SIMD Instructions, Next: Assembly Software Pipelining, Prev: Assembly Floating Point, Up: Assembly Coding
863 16.8.7 SIMD Instructions
864 ------------------------
866 The single-instruction multiple-data support in current microprocessors
867 is aimed at signal processing algorithms where each data point can be
868 treated more or less independently. There's generally not much support
869 for propagating the sort of carries that arise in GMP.
871 SIMD multiplications of say four 16x16 bit multiplies only do as much
872 work as one 32x32 from GMP's point of view, and need some shifts and
873 adds besides. But of course if say the SIMD form is fully pipelined
874 and uses less instruction decoding then it may still be worthwhile.
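To make the comparison concrete, the following scalar C sketch forms one 32x32 product from four 16x16 products plus the shifts and adds mentioned above. It uses no SIMD instructions; the four multiplies merely stand in for what a SIMD unit might do at once, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32x32 -> 64 bit product from four 16x16 pieces.  */
uint64_t
toy_mul_32x32_from_16x16 (uint32_t a, uint32_t b)
{
  uint32_t a0 = a & 0xFFFF, a1 = a >> 16;
  uint32_t b0 = b & 0xFFFF, b1 = b >> 16;

  /* The four independent 16x16 -> 32 bit products a SIMD unit could do
     in one instruction.  Each fits in 32 bits.  */
  uint32_t p00 = a0 * b0;
  uint32_t p01 = a0 * b1;
  uint32_t p10 = a1 * b0;
  uint32_t p11 = a1 * b1;

  /* The shifts and adds needed besides, done here in 64-bit scalar code.  */
  uint64_t r = (uint64_t) p00
             + ((uint64_t) p01 << 16)
             + ((uint64_t) p10 << 16)
             + ((uint64_t) p11 << 32);

  assert (r == (uint64_t) a * b);   /* sanity check against a plain multiply */
  return r;
}
```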
876 On the x86 chips, MMX has so far found a use in `mpn_rshift' and
877 `mpn_lshift', and is used in a special case for 16-bit multipliers in
878 the P55 `mpn_mul_1'. SSE2 is used for Pentium 4 `mpn_mul_1',
879 `mpn_addmul_1', and `mpn_submul_1'.
882 File: gmp.info, Node: Assembly Software Pipelining, Next: Assembly Loop Unrolling, Prev: Assembly SIMD Instructions, Up: Assembly Coding
884 16.8.8 Software Pipelining
885 --------------------------
887 Software pipelining consists of scheduling instructions around the
888 branch point in a loop. For example a loop might issue a load not for
889 use in the present iteration but the next, thereby allowing extra
890 cycles for the data to arrive from memory.
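The shape of such a loop can be sketched in C, though a compiler is free to reschedule it and the real benefit is only seen in hand-scheduled assembly. The function name is hypothetical; the multiply stands in for whatever work uses the loaded limb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dst[i] = src[i] * v modulo 2^64, with the load
   for iteration i+1 issued during iteration i.  Requires n >= 1.  */
void
toy_scale_pipelined (uint64_t *dst, const uint64_t *src, size_t n, uint64_t v)
{
  uint64_t cur, next;
  size_t i;

  assert (n >= 1);

  cur = src[0];                   /* feed-in: first load before the loop */
  for (i = 0; i + 1 < n; i++)
    {
      next = src[i + 1];          /* load for the next iteration */
      dst[i] = cur * v;           /* use the value loaded one iteration ago */
      cur = next;
    }
  dst[i] = cur * v;               /* wind-down: last value, no further load */
}
```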
892 Naturally this is wanted only when doing things like loads or
893 multiplies that take several cycles to complete, and only where a CPU
894 has multiple functional units so that other work can be done in the
897 A pipeline with several stages will have a data value in progress at
898 each stage and each loop iteration moves them along one stage. This is
901 If the latency of some instruction is greater than the loop time
902 then it will be necessary to unroll, so one register has a result ready
903 to use while another (or multiple others) are still in progress
904 (*note Assembly Loop Unrolling::).
907 File: gmp.info, Node: Assembly Loop Unrolling, Next: Assembly Writing Guide, Prev: Assembly Software Pipelining, Up: Assembly Coding
909 16.8.9 Loop Unrolling
910 ---------------------
912 Loop unrolling consists of replicating code so that several limbs are
913 processed in each loop. At a minimum this reduces loop overheads by a
914 corresponding factor, but it can also allow better register usage, for
915 example alternately using one register combination and then another.
916 Judicious use of `m4' macros can help avoid lots of duplication in the
919 Any amount of unrolling can be handled with a loop counter that's
920 decremented by N each time, stopping when the remaining count is less
921 than the further N the loop will process. Or by subtracting N at the
922 start, the termination condition becomes when the counter C is less
923 than 0 (and the count of remaining limbs is C+N).
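The second counter scheme might look as follows in C, for an assumed unroll factor of 4. The names are hypothetical; the routine adds two limb vectors in the manner of `mpn_add_n' but is only a sketch, not GMP code.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_N = 4 };   /* assumed unroll factor */

/* Add one limb with carry in, store the sum, return the carry out.  */
static uint64_t
toy_add_limb (uint64_t *r, uint64_t a, uint64_t b, uint64_t cy)
{
  uint64_t s = a + b;
  uint64_t t = s + cy;
  *r = t;
  return (uint64_t) (s < a) | (uint64_t) (t < s);
}

/* Hypothetical helper: rp = ap + bp over n limbs, returning the carry.  */
uint64_t
toy_add_n_unrolled (uint64_t *rp, const uint64_t *ap, const uint64_t *bp,
                    long n)
{
  uint64_t cy = 0;
  long i = 0;
  long c;

  assert (n >= 0);

  c = n - TOY_N;                 /* subtract N at the start */
  while (c >= 0)                 /* terminate when the counter goes negative */
    {
      cy = toy_add_limb (&rp[i + 0], ap[i + 0], bp[i + 0], cy);
      cy = toy_add_limb (&rp[i + 1], ap[i + 1], bp[i + 1], cy);
      cy = toy_add_limb (&rp[i + 2], ap[i + 2], bp[i + 2], cy);
      cy = toy_add_limb (&rp[i + 3], ap[i + 3], bp[i + 3], cy);
      i += TOY_N;
      c -= TOY_N;
    }

  /* Now c < 0 and c + TOY_N limbs (0 to TOY_N-1) remain.  */
  for (c += TOY_N; c > 0; c--, i++)
    cy = toy_add_limb (&rp[i], ap[i], bp[i], cy);

  return cy;
}
```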
925 Alternatively, for a power-of-2 unroll, the loop count and remainder can
926 be established with a shift and mask. This is convenient if also
927 making a computed jump into the middle of a large loop.
929 The limbs not a multiple of the unrolling can be handled in various
932 * A simple loop at the end (or the start) to process the excess.
933 Care will be wanted that it isn't too much slower than the
936 * A set of binary tests, for example after an 8-limb unrolling, test
937 for 4 more limbs to process, then a further 2 more or not, and
938 finally 1 more or not. This will probably take more code space
941 * A `switch' statement, providing separate code for each possible
942 excess, for example an 8-limb unrolling would have separate code
943 for 0 remaining, 1 remaining, etc, up to 7 remaining. This might
944 take a lot of code, but may be the best way to optimize all cases
945 in combination with a deep pipelined loop.
947 * A computed jump into the middle of the loop, thus making the first
948 iteration handle the excess. This should make times smoothly
949 increase with size, which is attractive, but setups for the jump
950 and adjustments for pointers can be tricky and could become quite
951 difficult in combination with deep pipelining.
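The last approach, a computed jump into the middle of the loop, has a well-known C analogue in a `switch' that jumps into a `do' loop. The sketch below assumes a 4-limb unrolling, with the pass count and excess obtained by a shift and mask; the function name is hypothetical and a plain copy stands in for the real per-limb work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n limbs with a 4-way unrolled loop, entering
   the loop part way through so the first pass handles the excess.  */
void
toy_copy_jump (uint64_t *dst, const uint64_t *src, size_t n)
{
  size_t passes, excess;

  assert (n == 0 || (dst != NULL && src != NULL));

  if (n == 0)
    return;

  passes = (n + 3) >> 2;       /* passes through the loop, by shift */
  excess = n & 3;              /* limbs in the partial first pass, by mask */

  switch (excess)              /* the "computed jump" */
    {
    case 0: do { *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
               } while (--passes > 0);
    }
}
```

With pipelined assembly the pointer and register setup for each entry point is what becomes tricky; the C form hides that entirely.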
954 File: gmp.info, Node: Assembly Writing Guide, Prev: Assembly Loop Unrolling, Up: Assembly Coding
956 16.8.10 Writing Guide
957 ---------------------
959 This is a guide to writing software pipelined loops for processing limb
962 First determine the algorithm and which instructions are needed.
963 Code it without unrolling or scheduling, to make sure it works. On a
964 3-operand CPU try to write each new value to a new register; this will
965 greatly simplify later steps.
967 Then note for each instruction the functional unit and/or issue port
968 requirements. If an instruction can use either of two units, like U0
969 or U1 then make a category "U0/U1". Count the total using each unit
970 (or combined unit), and count all instructions.
972 Figure out from those counts the best possible loop time. The goal
973 will be to find a perfect schedule where instruction latencies are
974 completely hidden. The total instruction count might be the limiting
975 factor, or perhaps a particular functional unit. It might be possible
976 to tweak the instructions to help the limiting factor.
978 Suppose the loop time is N, then make N issue buckets, with the
979 final loop branch at the end of the last. Now fill the buckets with
980 dummy instructions using the functional units desired. Run this to
981 make sure the intended speed is reached.
983 Now replace the dummy instructions with the real instructions from
984 the slow but correct loop you started with. The first will typically
985 be a load instruction. Then the instruction using that value is placed
986 in a bucket an appropriate distance down. Run the loop again, to check
987 it still runs at target speed.
989 Keep placing instructions, frequently measuring the loop. After a
990 few you will need to wrap around from the last bucket back to the top
991 of the loop. If you used the new-register for new-value strategy above
992 then there will be no register conflicts. If not then take care not to
993 clobber something already in use. Changing registers at this time is
996 The loop will overlap two or more of the original loop iterations,
997 and the computation of one vector element result will be started in one
998 iteration of the new loop, and completed one or several iterations
1001 The final step is to create feed-in and wind-down code for the loop.
1002 A good way to do this is to make a copy (or copies) of the loop at the
1003 start and delete those instructions which don't have valid antecedents,
1004 and at the end replicate and delete those whose results are unwanted
1005 (including any further loads).
1007 The loop will have a minimum number of limbs loaded and processed,
1008 so the feed-in code must test if the request size is smaller and skip
1009 either to a suitable part of the wind-down or to special code for small
1013 File: gmp.info, Node: Internals, Next: Contributors, Prev: Algorithms, Up: Top
1018 *This chapter is provided only for informational purposes and the
1019 various internals described here may change in future GMP releases.
1020 Applications expecting to be compatible with future releases should use
1021 only the documented interfaces described in previous chapters.*
1025 * Integer Internals::
1026 * Rational Internals::
1028 * Raw Output Internals::
1029 * C++ Interface Internals::
1032 File: gmp.info, Node: Integer Internals, Next: Rational Internals, Prev: Internals, Up: Internals
1034 17.1 Integer Internals
1035 ======================
1037 `mpz_t' variables represent integers using sign and magnitude, in space
1038 dynamically allocated and reallocated. The fields are as follows.
1041 The number of limbs, or the negative of that when representing a
1042 negative integer. Zero is represented by `_mp_size' set to zero,
1043 in which case the `_mp_d' data is unused.
1046 A pointer to an array of limbs which is the magnitude. These are
1047 stored "little endian" as per the `mpn' functions, so `_mp_d[0]'
1048 is the least significant limb and `_mp_d[ABS(_mp_size)-1]' is the
1049 most significant. Whenever `_mp_size' is non-zero, the most
1050 significant limb is non-zero.
1052 Currently there's always at least one limb allocated, so for
1053 instance `mpz_set_ui' never needs to reallocate, and `mpz_get_ui'
1054 can fetch `_mp_d[0]' unconditionally (though its value is then
1055 only wanted if `_mp_size' is non-zero).
1058 `_mp_alloc' is the number of limbs currently allocated at `_mp_d',
1059 and naturally `_mp_alloc >= ABS(_mp_size)'. When an `mpz' routine
1060 is about to (or might be about to) increase `_mp_size', it checks
1061 `_mp_alloc' to see whether there's enough space, and reallocates
1062 if not. `MPZ_REALLOC' is generally used for this.
1064 The various bitwise logical functions like `mpz_and' behave as if
1065 negative values were twos complement. But sign and magnitude is always
1066 used internally, and necessary adjustments are made during the
1067 calculations. Sometimes this isn't pretty, but sign and magnitude are
1068 best for other routines.
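The kind of adjustment involved can be seen in a single-limb C sketch of a bitwise AND on sign and magnitude values. It relies on the identity that twos complement -x is ~(x-1). The `toy_int' type and `toy_and' function are hypothetical, and magnitudes are assumed to be below 2^63 so the result always fits in one limb; the real `mpz_and' handles multiple limbs and is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-limb sign and magnitude integer.  */
typedef struct
{
  int neg;          /* non-zero for a negative value */
  uint64_t mag;     /* magnitude, never zero when neg is set */
} toy_int;

/* AND behaving as if negative values were twos complement.  */
toy_int
toy_and (toy_int a, toy_int b)
{
  toy_int r;

  assert (a.mag < ((uint64_t) 1 << 63) && b.mag < ((uint64_t) 1 << 63));
  assert (!(a.neg && a.mag == 0) && !(b.neg && b.mag == 0));

  if (!a.neg && !b.neg)
    {
      r.neg = 0;
      r.mag = a.mag & b.mag;
    }
  else if (a.neg && b.neg)
    {
      /* ~(a-1) & ~(b-1) = ~((a-1) | (b-1)), which is a negative value
         of magnitude ((a-1) | (b-1)) + 1.  */
      r.neg = 1;
      r.mag = ((a.mag - 1) | (b.mag - 1)) + 1;
    }
  else
    {
      uint64_t n = a.neg ? a.mag : b.mag;   /* magnitude of the negative */
      uint64_t p = a.neg ? b.mag : a.mag;   /* the non-negative operand */
      r.neg = 0;
      r.mag = ~(n - 1) & p;
    }
  return r;
}
```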
1070 Some internal temporary variables are set up with `MPZ_TMP_INIT' and
1071 these have `_mp_d' space obtained from `TMP_ALLOC' rather than the
1072 memory allocation functions. Care is taken to ensure that these are
1073 big enough that no reallocation is necessary (since it would have
1074 unpredictable consequences).
1076 `_mp_size' and `_mp_alloc' are `int', although `mp_size_t' is
1077 usually a `long'. This is done to make the fields just 32 bits on some
1078 64-bit systems, thereby saving a few bytes of data space but still
1079 providing plenty of range.
1082 File: gmp.info, Node: Rational Internals, Next: Float Internals, Prev: Integer Internals, Up: Internals
1084 17.2 Rational Internals
1085 =======================
1087 `mpq_t' variables represent rationals using an `mpz_t' numerator and
1088 denominator (*note Integer Internals::).
1090 The canonical form adopted is denominator positive (and non-zero),
1091 no common factors between numerator and denominator, and zero uniquely
1094 It's believed that casting out common factors at each stage of a
1095 calculation is best in general. A GCD is an O(N^2) operation so it's
1096 better to do a few small ones immediately than to delay and have to do
1097 a big one later. Knowing the numerator and denominator have no common
1098 factors can be used for example in `mpq_mul' to make only two cross
1099 GCDs necessary, not four.
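The two cross GCDs can be illustrated with a small C sketch on machine-word rationals. It assumes both inputs are already in canonical form and that no intermediate product overflows; `toy_rat' and its functions are hypothetical and only mirror the idea, not the actual `mpq_mul' code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical machine-word rational, assumed canonical on input.  */
typedef struct
{
  int64_t num;
  int64_t den;      /* always positive */
} toy_rat;

/* GCD of absolute values by Euclid's algorithm (INT64_MIN not handled).  */
static int64_t
toy_gcd (int64_t a, int64_t b)
{
  if (a < 0) a = -a;
  if (b < 0) b = -b;
  while (b != 0)
    {
      int64_t t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Multiply using only the two cross GCDs.  Since num/den pairs have no
   common factors already, the result needs no further reduction.  */
toy_rat
toy_rat_mul (toy_rat x, toy_rat y)
{
  toy_rat r;
  int64_t g1, g2;

  assert (x.den > 0 && y.den > 0);

  g1 = toy_gcd (x.num, y.den);
  g2 = toy_gcd (y.num, x.den);

  r.num = (x.num / g1) * (y.num / g2);
  r.den = (x.den / g2) * (y.den / g1);
  return r;
}
```

For example 2/3 times 3/4 has cross GCDs 2 and 3, giving 1/2 directly.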
1101 This general approach to common factors is badly sub-optimal in the
1102 presence of simple factorizations or little prospect for cancellation,
1103 but GMP has no way to know when this will occur. As per *Note
1104 Efficiency::, that's left to applications. The `mpq_t' framework might
1105 still suit, with `mpq_numref' and `mpq_denref' for direct access to the
1106 numerator and denominator, or of course `mpz_t' variables can be used
1110 File: gmp.info, Node: Float Internals, Next: Raw Output Internals, Prev: Rational Internals, Up: Internals
1112 17.3 Float Internals
1113 ====================
1115 Efficient calculation is the primary aim of GMP floats and the use of
1116 whole limbs and simple rounding facilitates this.
1118 `mpf_t' floats have a variable precision mantissa and a single
1119 machine word signed exponent. The mantissa is represented using sign
1123 significant significant
1127 |---- _mp_exp ---> |
1128 _____ _____ _____ _____ _____
1129 |_____|_____|_____|_____|_____|
1130 . <------------ radix point
1132 <-------- _mp_size --------->
1134 The fields are as follows.
1137 The number of limbs currently in use, or the negative of that when
1138 representing a negative value. Zero is represented by `_mp_size'
1139 and `_mp_exp' both set to zero, and in that case the `_mp_d' data
1140 is unused. (In the future `_mp_exp' might be undefined when
1144 The precision of the mantissa, in limbs. In any calculation the
1145 aim is to produce `_mp_prec' limbs of result (the most significant
1149 A pointer to the array of limbs which is the absolute value of the
1150 mantissa. These are stored "little endian" as per the `mpn'
1151 functions, so `_mp_d[0]' is the least significant limb and
1152 `_mp_d[ABS(_mp_size)-1]' the most significant.
1154 The most significant limb is always non-zero, but there are no
1155 other restrictions on its value, in particular the highest 1 bit
1156 can be anywhere within the limb.
1158 `_mp_prec+1' limbs are allocated to `_mp_d', the extra limb being
1159 for convenience (see below). There are no reallocations during a
1160 calculation, only in a change of precision with `mpf_set_prec'.
1163 The exponent, in limbs, determining the location of the implied
1164 radix point. Zero means the radix point is just above the most
1165 significant limb. Positive values mean a radix point offset
1166 towards the lower limbs and hence a value >= 1, as for example in
1167 the diagram above. Negative exponents mean a radix point further
1168 above the highest limb.
1170 Naturally the exponent can be any value; it doesn't have to fall
1171 within the limbs as the diagram shows, and can be a long way above
1172 or a long way below. Limbs other than those included in the
1173 `{_mp_d,_mp_size}' data are treated as zero.
1175 The `_mp_size' and `_mp_prec' fields are `int', although the
1176 `mp_size_t' type is usually a `long'. The `_mp_exp' field is usually
1177 `long'. This is done to make some fields just 32 bits on some 64-bit
1178 systems, thereby saving a few bytes of data space but still providing
1179 plenty of precision and a very large range.
1182 The following various points should be noted.
1185 The least significant limbs `_mp_d[0]' etc can be zero, though
1186 such low zeros can always be ignored. Routines likely to produce
1187 low zeros check and avoid them to save time in subsequent
1188 calculations, but for most routines they're quite unlikely and
1192 The `_mp_size' count of limbs in use can be less than `_mp_prec' if
1193 the value can be represented in less. This means low precision
1194 values or small integers stored in a high precision `mpf_t' can
1195 still be operated on efficiently.
1197 `_mp_size' can also be greater than `_mp_prec'. Firstly a value is
1198 allowed to use all of the `_mp_prec+1' limbs available at `_mp_d',
1199 and secondly when `mpf_set_prec_raw' lowers `_mp_prec' it leaves
1200 `_mp_size' unchanged and so the size can be arbitrarily bigger than
1204 All rounding is done on limb boundaries. Calculating `_mp_prec'
1205 limbs with the high non-zero will ensure the application requested
1206 minimum precision is obtained.
1208 The use of simple "trunc" rounding towards zero is efficient,
1209 since there's no need to examine extra limbs and increment or
1213 Since the exponent is in limbs, there are no bit shifts in basic
1214 operations like `mpf_add' and `mpf_mul'. When differing exponents
1215 are encountered all that's needed is to adjust pointers to line up
1218 Of course `mpf_mul_2exp' and `mpf_div_2exp' will require bit
1219 shifts, but the choice is between an exponent in limbs which
1220 requires shifts there, or one in bits which requires them almost
1223 Use of `_mp_prec+1' Limbs
1224 The extra limb on `_mp_d' (`_mp_prec+1' rather than just
1225 `_mp_prec') helps when an `mpf' routine might get a carry from its
1226 operation. `mpf_add' for instance will do an `mpn_add' of
1227 `_mp_prec' limbs. If there's no carry then that's the result, but
1228 if there is a carry then it's stored in the extra limb of space and
1229 `_mp_size' becomes `_mp_prec+1'.
1231 Whenever `_mp_prec+1' limbs are held in a variable, the low limb
1232 is not needed for the intended precision, only the `_mp_prec' high
1233 limbs. But zeroing it out or moving the rest down is unnecessary.
1234 Subsequent routines reading the value will simply take the high
1235 limbs they need, and this will be `_mp_prec' if their target has
1236 that same precision. This is no more than a pointer adjustment,
1237 and must be checked anyway since the destination precision can be
1238 different from the sources.
1240 Copy functions like `mpf_set' will retain a full `_mp_prec+1' limbs
1241 if available. This ensures that a variable which has `_mp_size'
1242 equal to `_mp_prec+1' will get its full exact value copied.
1243 Strictly speaking this is unnecessary since only `_mp_prec' limbs
1244 are needed for the application's requested precision, but it's
1245 considered that an `mpf_set' from one variable into another of the
1246 same precision ought to produce an exact copy.
1248 Application Precisions
1249 `__GMPF_BITS_TO_PREC' converts an application requested precision
1250 to an `_mp_prec'. The value in bits is rounded up to a whole limb
1251 then an extra limb is added since the most significant limb of
1252 `_mp_d' is only guaranteed to be non-zero and therefore might contain only one bit.
1254 `__GMPF_PREC_TO_BITS' does the reverse conversion, and removes the
1255 extra limb from `_mp_prec' before converting to bits. The net
1256 effect of reading back with `mpf_get_prec' is simply the precision
1257 rounded up to a multiple of `mp_bits_per_limb'.
1259 Note that the extra limb added here, for the high limb being only
1260 non-zero, is in addition to the extra limb allocated to `_mp_d'.
1261 For example with a 32-bit limb, an application request for 250
1262 bits will be rounded up to 8 limbs, then an extra added for the
1263 high being only non-zero, giving an `_mp_prec' of 9. `_mp_d' then
1264 gets 10 limbs allocated. Reading back with `mpf_get_prec' will
1265 take `_mp_prec', subtract 1 limb and multiply by 32, giving 256
1268 Strictly speaking, the fact the high limb has at least one bit
1269 means that a float with, say, 3 limbs of 32-bits each will be
1270 holding at least 65 bits, but for the purposes of `mpf_t' it's
1271 considered simply to be 64 bits, a nice multiple of the limb size.
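The arithmetic of the 250-bit example can be written out as a short C sketch. The functions below are hypothetical stand-ins that follow the description in this section for a 32-bit limb; the real `__GMPF_BITS_TO_PREC' and `__GMPF_PREC_TO_BITS' macros may differ in detail.

```c
#include <assert.h>

enum { TOY_LIMB_BITS = 32 };   /* assumed limb size for this example */

/* Round the requested bits up to whole limbs, then add one limb since
   the high limb is only guaranteed to be non-zero.  */
long
toy_bits_to_prec (long bits)
{
  assert (bits > 0);
  return (bits + TOY_LIMB_BITS - 1) / TOY_LIMB_BITS + 1;
}

/* Reverse conversion: drop the extra limb, then convert to bits.  */
long
toy_prec_to_bits (long prec)
{
  assert (prec >= 1);
  return (prec - 1) * TOY_LIMB_BITS;
}

/* Limbs allocated at _mp_d: one more than the precision.  */
long
toy_alloc_limbs (long prec)
{
  assert (prec >= 1);
  return prec + 1;
}
```

A request for 250 bits gives a precision of 9 limbs and an allocation of 10, and reads back as 256 bits.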
1274 File: gmp.info, Node: Raw Output Internals, Next: C++ Interface Internals, Prev: Float Internals, Up: Internals
1276 17.4 Raw Output Internals
1277 =========================
1279 `mpz_out_raw' uses the following format.
1281 +------+------------------------+
1282 | size | data bytes |
1283 +------+------------------------+
1285 The size is 4 bytes written most significant byte first, being the
1286 number of subsequent data bytes, or the twos complement negative of
1287 that when a negative integer is represented. The data bytes are the
1288 absolute value of the integer, written most significant byte first.
1290 The most significant data byte is always non-zero, so the output is
1291 the same on all systems, irrespective of limb size.
1293 In GMP 1, leading zero bytes were written to pad the data bytes to a
1294 multiple of the limb size. `mpz_inp_raw' will still accept this, for
1297 The use of "big endian" for both the size and data fields is
1298 deliberate; it makes the data easy to read in a hex dump of a file.
1299 Unfortunately it also means that the limb data must be reversed when
1300 reading or writing, so neither a big endian nor little endian system
1301 can just read and write `_mp_d'.
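The layout can be sketched in C for a value whose magnitude fits in 64 bits. The function is a hypothetical illustration of the format described above, writing into a caller-supplied buffer rather than a file; it is not the `mpz_out_raw' implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode sign and magnitude in the raw format.
   buf must have room for 12 bytes.  Returns the number of bytes used.  */
size_t
toy_out_raw (unsigned char *buf, int negative, uint64_t magnitude)
{
  unsigned char data[8];
  size_t nbytes = 0, i;
  uint32_t size_field;
  uint64_t tmp;

  assert (buf != NULL);

  /* Collect the data bytes least significant first, stopping at the
     highest non-zero byte so there are no leading zero bytes.  */
  for (tmp = magnitude; tmp != 0; tmp >>= 8)
    data[nbytes++] = (unsigned char) (tmp & 0xFF);

  /* Size is the byte count, or its twos complement negative.  */
  size_field = (uint32_t) nbytes;
  if (negative && nbytes != 0)
    size_field = 0u - size_field;

  /* 4-byte size, most significant byte first.  */
  buf[0] = (unsigned char) (size_field >> 24);
  buf[1] = (unsigned char) (size_field >> 16);
  buf[2] = (unsigned char) (size_field >> 8);
  buf[3] = (unsigned char) size_field;

  /* Data bytes, most significant byte first.  */
  for (i = 0; i < nbytes; i++)
    buf[4 + i] = data[nbytes - 1 - i];

  return 4 + nbytes;
}
```

Under these assumptions the value 258 (0x0102) is encoded as the bytes 00 00 00 02 01 02, and -258 as FF FF FF FE 01 02.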
1304 File: gmp.info, Node: C++ Interface Internals, Prev: Raw Output Internals, Up: Internals
1306 17.5 C++ Interface Internals
1307 ============================
1309 A system of expression templates is used to ensure something like
1310 `a=b+c' turns into a simple call to `mpz_add' etc. For `mpf_class' the
1311 scheme also ensures the precision of the final destination is used for
1312 any temporaries within a statement like `f=w*x+y*z'. These are
1313 important features which a naive implementation cannot provide.
1315 A simplified description of the scheme follows. The true scheme is
1316 complicated by the fact that expressions have different return types.
1317 For detailed information, refer to the source code.
1319 To perform an operation, say, addition, we first define a "function
1320 object" evaluating it,
1322 struct __gmp_binary_plus
1324 static void eval(mpf_t f, mpf_t g, mpf_t h) { mpf_add(f, g, h); }
1327 And an "additive expression" object,
1329 __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
1330 operator+(const mpf_class &f, const mpf_class &g)
1333 <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
1336 The seemingly redundant `__gmp_expr<__gmp_binary_expr<...>>' is used
1337 to encapsulate any possible kind of expression into a single template
1338 type. In fact even `mpf_class' etc are `typedef' specializations of
1341 Next we define assignment of `__gmp_expr' to `mpf_class'.
1344 mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
1346 expr.eval(this->get_mpf_t(), this->precision());
1351 void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
1352 (mpf_t f, mp_bitcnt_t precision)
1354 Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
1357 where `expr.val1' and `expr.val2' are references to the expression's
1358 operands (here `expr' is the `__gmp_binary_expr' stored within the
1361 This way, the expression is actually evaluated only at the time of
1362 assignment, when the required precision (that of `f') is known.
1363 Furthermore the target `mpf_t' is now available, thus we can call
1364 `mpf_add' directly with `f' as the output argument.
1366 Compound expressions are handled by defining operators taking
1367 subexpressions as their arguments, like this:
1369 template <class T, class U>
1371 <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
1372 operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
1375 <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
1379 And the corresponding specializations of `__gmp_expr::eval':
1381 template <class T, class U, class Op>
1383 <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
1384 (mpf_t f, mp_bitcnt_t precision)
1386 // declare two temporaries
1387 mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
1388 Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
1391 The expression is thus recursively evaluated to any level of
1392 complexity and all subexpressions are evaluated to the precision of `f'.
1395 File: gmp.info, Node: Contributors, Next: References, Prev: Internals, Up: Top
1397 Appendix A Contributors
1398 ***********************
1400 Torbjo"rn Granlund wrote the original GMP library and is still the main
1401 developer. Code not explicitly attributed to others was contributed by
1402 Torbjo"rn. Several other individuals and organizations have contributed
1403 to GMP. Here is a list in chronological order of first contribution:
1405 Gunnar Sjo"din and Hans Riesel helped with mathematical problems in
1406 early versions of the library.
1408 Richard Stallman helped with the interface design and revised the
1409 first version of this manual.
1411 Brian Beuning and Doug Lea helped with testing of early versions of
1412 the library and made creative suggestions.
1414 John Amanatides of York University in Canada contributed the function
1415 `mpz_probab_prime_p'.
1417 Paul Zimmermann wrote the REDC-based mpz_powm code, the
1418 Scho"nhage-Strassen FFT multiply code, and the Karatsuba square root
1419 code. He also improved the Toom3 code for GMP 4.2. Paul sparked the
1420 development of GMP 2, with his comparisons between bignum packages.
1421 The ECMNET project Paul is organizing was a driving force behind many
1422 of the optimizations in GMP 3. Paul also wrote the new GMP 4.3 nth
1423 root code (with Torbjo"rn).
1425 Ken Weber (Kent State University, Universidade Federal do Rio Grande
1426 do Sul) contributed now defunct versions of `mpz_gcd', `mpz_divexact',
1427 `mpn_gcd', and `mpn_bdivmod', partially supported by CNPq (Brazil)
1430 Per Bothner of Cygnus Support helped to set up GMP to use Cygnus'
1431 configure. He has also made valuable suggestions and tested numerous
1432 intermediary releases.
1434 Joachim Hollman was involved in the design of the `mpf' interface,
1435 and in the `mpz' design revisions for version 2.
1437 Bennet Yee contributed the initial versions of `mpz_jacobi' and
1440 Andreas Schwab contributed the files `mpn/m68k/lshift.S' and
1441 `mpn/m68k/rshift.S' (now in `.asm' form).
1443 Robert Harley of Inria, France and David Seal of ARM, England,
1444 suggested clever improvements for population count. Robert also wrote
1445 highly optimized Karatsuba and 3-way Toom multiplication functions for
1446 GMP 3, and contributed the ARM assembly code.
1448 Torsten Ekedahl of the Mathematics department of Stockholm
1449 University provided significant inspiration during several phases of
1450 the GMP development. His mathematical expertise helped improve several
1453 Linus Nordberg wrote the new configure system based on autoconf and
1454 implemented the new random functions.
1456 Kevin Ryde worked on a large number of things: optimized x86 code,
1457 m4 asm macros, parameter tuning, speed measuring, the configure system,
1458 function inlining, divisibility tests, bit scanning, Jacobi symbols,
1459 Fibonacci and Lucas number functions, printf and scanf functions, perl
1460 interface, demo expression parser, the algorithms chapter in the
1461 manual, `gmpasm-mode.el', and various miscellaneous improvements
1464 Kent Boortz made the Mac OS 9 port.
1466 Steve Root helped write the optimized alpha 21264 assembly code.
1468 Gerardo Ballabio wrote the `gmpxx.h' C++ class interface and the C++
1469 `istream' input routines.
1471 Jason Moxham rewrote `mpz_fac_ui'.
1473 Pedro Gimeno implemented the Mersenne Twister and made other random
1474 number improvements.
1476 Niels Mo"ller wrote the sub-quadratic GCD and extended GCD code, the
1477 quadratic Hensel division code, and (with Torbjo"rn) the new divide and
1478 conquer division code for GMP 4.3. Niels also helped implement the new
1479 Toom multiply code for GMP 4.3 and implemented helper functions to
1480 simplify Toom evaluations for GMP 5.0. He wrote the original version
1483 Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply
1484 strategy, and found the optimal strategies for evaluation and
1485 interpolation in Toom multiplication.
1487 Marco Bodrato helped implement the new Toom multiply code for GMP
1488 4.3 and implemented most of the new Toom multiply and squaring code for
1489 5.0. He is the main author of the current mpn_mulmod_bnm1 and
1490 mpn_mullo_n. Marco also wrote the functions mpn_invert and
1493 David Harvey suggested the internal function `mpn_bdiv_dbm1',
1494 implementing division relevant to Toom multiplication. He also worked
1495 on fast assembly sequences, in particular on a fast AMD64
1498 Martin Boij wrote `mpn_perfect_power_p'.
1500 (This list is chronological, not ordered by significance. If you
1501 have contributed to GMP but are not listed above, please tell
1502 <gmp-devel@gmplib.org> about the omission!)
1504 The development of the floating point functions of GNU MP 2 was
1505 supported in part by the ESPRIT-BRA (Basic Research Activities) 6846
1506 project POSSO (POlynomial System SOlving).
1508 The development of GMP 2, 3, and 4 was supported in part by the IDA
1509 Center for Computing Sciences.
1511 Thanks go to Hans Thorsen for donating an SGI system for the GMP
1512 test system environment.
1515 File: gmp.info, Node: References, Next: GNU Free Documentation License, Prev: Contributors, Up: Top
1517 Appendix B References
1518 *********************
1523 * Jonathan M. Borwein and Peter B. Borwein, "Pi and the AGM: A Study
1524 in Analytic Number Theory and Computational Complexity", Wiley,
1527 * Richard Crandall and Carl Pomerance, "Prime Numbers: A
1528 Computational Perspective", 2nd edition, Springer-Verlag, 2005.
1529 `http://math.dartmouth.edu/~carlp/'
1531 * Henri Cohen, "A Course in Computational Algebraic Number Theory",
1532 Graduate Texts in Mathematics number 138, Springer-Verlag, 1993.
1533 `http://www.math.u-bordeaux.fr/~cohen/'
1535 * Donald E. Knuth, "The Art of Computer Programming", volume 2,
1536 "Seminumerical Algorithms", 3rd edition, Addison-Wesley, 1998.
1537 `http://www-cs-faculty.stanford.edu/~knuth/taocp.html'
1539 * John D. Lipson, "Elements of Algebra and Algebraic Computing", The
1540 Benjamin Cummings Publishing Company Inc, 1981.
1542 * Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone,
1543 "Handbook of Applied Cryptography",
1544 `http://www.cacr.math.uwaterloo.ca/hac/'
1546 * Richard M. Stallman and the GCC Developer Community, "Using the
1547 GNU Compiler Collection", Free Software Foundation, 2008,
1548 available online `http://gcc.gnu.org/onlinedocs/', and in the GCC
1549 package `ftp://ftp.gnu.org/gnu/gcc/'
1551 B.2 Papers
1552 ==========
1554 * Yves Bertot, Nicolas Magaud and Paul Zimmermann, "A Proof of GMP
1555 Square Root", Journal of Automated Reasoning, volume 29, 2002, pp.
1556 225-252. Also available online as INRIA Research Report 4475,
1557 June 2001, `http://www.inria.fr/rrrt/rr-4475.html'
1559 * Christoph Burnikel and Joachim Ziegler, "Fast Recursive Division",
1560 Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
1561 `http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022'
1563 * Torbjörn Granlund and Peter L. Montgomery, "Division by Invariant
1564 Integers using Multiplication", in Proceedings of the SIGPLAN
1565 PLDI'94 Conference, June 1994. Also available
1566 `ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz' (and .psl.gz).
1568 * Niels Möller and Torbjörn Granlund, "Improved division by
1569 invariant integers", to appear.
1571 * Torbjörn Granlund and Niels Möller, "Division of integers large
1572 and small", to appear.
1574 * Tudor Jebelean, "An algorithm for exact division", Journal of
1575 Symbolic Computation, volume 15, 1993, pp. 169-180. Research
1576 report version available
1577 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz'
1579 * Tudor Jebelean, "Exact Division with Karatsuba Complexity -
1580 Extended Abstract", RISC-Linz technical report 96-31,
1581 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz'
1583 * Tudor Jebelean, "Practical Integer Division with Karatsuba
1584 Complexity", ISSAC 97, pp. 339-341. Technical report available
1585 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz'
1587 * Tudor Jebelean, "A Generalization of the Binary GCD Algorithm",
1588 ISSAC 93, pp. 111-116. Technical report version available
1589 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz'
1591 * Tudor Jebelean, "A Double-Digit Lehmer-Euclid Algorithm for
1592 Finding the GCD of Long Integers", Journal of Symbolic
1593 Computation, volume 19, 1995, pp. 145-157. Technical report
1594 version also available
1595 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz'
1597 * Werner Krandick and Tudor Jebelean, "Bidirectional Exact Integer
1598 Division", Journal of Symbolic Computation, volume 21, 1996, pp.
1599 441-455. Early technical report version also available
1600 `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz'
1602 * Makoto Matsumoto and Takuji Nishimura, "Mersenne Twister: A
1603 623-dimensionally equidistributed uniform pseudorandom number
1604 generator", ACM Transactions on Modelling and Computer Simulation,
1605 volume 8, January 1998, pp. 3-30. Available online
1606 `http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz'
1609 * R. Moenck and A. Borodin, "Fast Modular Transforms via Division",
1610 Proceedings of the 13th Annual IEEE Symposium on Switching and
1611 Automata Theory, October 1972, pp. 90-96. Reprinted as "Fast
1612 Modular Transforms", Journal of Computer and System Sciences,
1613 volume 8, number 3, June 1974, pp. 366-386.
1615 * Niels Möller, "On Schönhage's algorithm and subquadratic integer
1616 GCD computation", in Mathematics of Computation, volume 77,
1617 January 2008, pp. 589-607.
1619 * Peter L. Montgomery, "Modular Multiplication Without Trial
1620 Division", in Mathematics of Computation, volume 44, number 170,
1623 * Arnold Schönhage and Volker Strassen, "Schnelle Multiplikation
1624 grosser Zahlen", Computing 7, 1971, pp. 281-292.
1626 * Kenneth Weber, "The accelerated integer GCD algorithm", ACM
1627 Transactions on Mathematical Software, volume 21, number 1, March
1630 * Paul Zimmermann, "Karatsuba Square Root", INRIA Research Report
1631 3805, November 1999, `http://www.inria.fr/rrrt/rr-3805.html'
1633 * Paul Zimmermann, "A Proof of GMP Fast Division and Square Root
1635 `http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz'
1637 * Dan Zuras, "On Squaring and Multiplying Large Integers", ARITH-11:
1638 IEEE Symposium on Computer Arithmetic, 1993, pp. 260-271.
1639 Reprinted as "More on Multiplying and Squaring Large Integers",
1640 IEEE Transactions on Computers, volume 43, number 8, August 1994,
1644 File: gmp.info, Node: GNU Free Documentation License, Next: Concept Index, Prev: References, Up: Top
1646 Appendix C GNU Free Documentation License
1647 *****************************************
1649 Version 1.3, 3 November 2008
1651 Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
1654 Everyone is permitted to copy and distribute verbatim copies
1655 of this license document, but changing it is not allowed.
1657 0. PREAMBLE
1659 The purpose of this License is to make a manual, textbook, or other
1660 functional and useful document "free" in the sense of freedom: to
1661 assure everyone the effective freedom to copy and redistribute it,
1662 with or without modifying it, either commercially or
1663 noncommercially. Secondarily, this License preserves for the
1664 author and publisher a way to get credit for their work, while not
1665 being considered responsible for modifications made by others.
1667 This License is a kind of "copyleft", which means that derivative
1668 works of the document must themselves be free in the same sense.
1669 It complements the GNU General Public License, which is a copyleft
1670 license designed for free software.
1672 We have designed this License in order to use it for manuals for
1673 free software, because free software needs free documentation: a
1674 free program should come with manuals providing the same freedoms
1675 that the software does. But this License is not limited to
1676 software manuals; it can be used for any textual work, regardless
1677 of subject matter or whether it is published as a printed book.
1678 We recommend this License principally for works whose purpose is
1679 instruction or reference.
1681 1. APPLICABILITY AND DEFINITIONS
1683 This License applies to any manual or other work, in any medium,
1684 that contains a notice placed by the copyright holder saying it
1685 can be distributed under the terms of this License. Such a notice
1686 grants a world-wide, royalty-free license, unlimited in duration,
1687 to use that work under the conditions stated herein. The
1688 "Document", below, refers to any such manual or work. Any member
1689 of the public is a licensee, and is addressed as "you". You
1690 accept the license if you copy, modify or distribute the work in a
1691 way requiring permission under copyright law.
1693 A "Modified Version" of the Document means any work containing the
1694 Document or a portion of it, either copied verbatim, or with
1695 modifications and/or translated into another language.
1697 A "Secondary Section" is a named appendix or a front-matter section
1698 of the Document that deals exclusively with the relationship of the
1699 publishers or authors of the Document to the Document's overall
1700 subject (or to related matters) and contains nothing that could
1701 fall directly within that overall subject. (Thus, if the Document
1702 is in part a textbook of mathematics, a Secondary Section may not
1703 explain any mathematics.) The relationship could be a matter of
1704 historical connection with the subject or with related matters, or
1705 of legal, commercial, philosophical, ethical or political position
1706 regarding them.
1708 The "Invariant Sections" are certain Secondary Sections whose
1709 titles are designated, as being those of Invariant Sections, in
1710 the notice that says that the Document is released under this
1711 License. If a section does not fit the above definition of
1712 Secondary then it is not allowed to be designated as Invariant.
1713 The Document may contain zero Invariant Sections. If the Document
1714 does not identify any Invariant Sections then there are none.
1716 The "Cover Texts" are certain short passages of text that are
1717 listed, as Front-Cover Texts or Back-Cover Texts, in the notice
1718 that says that the Document is released under this License. A
1719 Front-Cover Text may be at most 5 words, and a Back-Cover Text may
1720 be at most 25 words.
1722 A "Transparent" copy of the Document means a machine-readable copy,
1723 represented in a format whose specification is available to the
1724 general public, that is suitable for revising the document
1725 straightforwardly with generic text editors or (for images
1726 composed of pixels) generic paint programs or (for drawings) some
1727 widely available drawing editor, and that is suitable for input to
1728 text formatters or for automatic translation to a variety of
1729 formats suitable for input to text formatters. A copy made in an
1730 otherwise Transparent file format whose markup, or absence of
1731 markup, has been arranged to thwart or discourage subsequent
1732 modification by readers is not Transparent. An image format is
1733 not Transparent if used for any substantial amount of text. A
1734 copy that is not "Transparent" is called "Opaque".
1736 Examples of suitable formats for Transparent copies include plain
1737 ASCII without markup, Texinfo input format, LaTeX input format,
1738 SGML or XML using a publicly available DTD, and
1739 standard-conforming simple HTML, PostScript or PDF designed for
1740 human modification. Examples of transparent image formats include
1741 PNG, XCF and JPG. Opaque formats include proprietary formats that
1742 can be read and edited only by proprietary word processors, SGML or
1743 XML for which the DTD and/or processing tools are not generally
1744 available, and the machine-generated HTML, PostScript or PDF
1745 produced by some word processors for output purposes only.
1747 The "Title Page" means, for a printed book, the title page itself,
1748 plus such following pages as are needed to hold, legibly, the
1749 material this License requires to appear in the title page. For
1750 works in formats which do not have any title page as such, "Title
1751 Page" means the text near the most prominent appearance of the
1752 work's title, preceding the beginning of the body of the text.
1754 The "publisher" means any person or entity that distributes copies
1755 of the Document to the public.
1757 A section "Entitled XYZ" means a named subunit of the Document
1758 whose title either is precisely XYZ or contains XYZ in parentheses
1759 following text that translates XYZ in another language. (Here XYZ
1760 stands for a specific section name mentioned below, such as
1761 "Acknowledgements", "Dedications", "Endorsements", or "History".)
1762 To "Preserve the Title" of such a section when you modify the
1763 Document means that it remains a section "Entitled XYZ" according
1764 to this definition.
1766 The Document may include Warranty Disclaimers next to the notice
1767 which states that this License applies to the Document. These
1768 Warranty Disclaimers are considered to be included by reference in
1769 this License, but only as regards disclaiming warranties: any other
1770 implication that these Warranty Disclaimers may have is void and
1771 has no effect on the meaning of this License.
1773 2. VERBATIM COPYING
1775 You may copy and distribute the Document in any medium, either
1776 commercially or noncommercially, provided that this License, the
1777 copyright notices, and the license notice saying this License
1778 applies to the Document are reproduced in all copies, and that you
1779 add no other conditions whatsoever to those of this License. You
1780 may not use technical measures to obstruct or control the reading
1781 or further copying of the copies you make or distribute. However,
1782 you may accept compensation in exchange for copies. If you
1783 distribute a large enough number of copies you must also follow
1784 the conditions in section 3.
1786 You may also lend copies, under the same conditions stated above,
1787 and you may publicly display copies.
1789 3. COPYING IN QUANTITY
1791 If you publish printed copies (or copies in media that commonly
1792 have printed covers) of the Document, numbering more than 100, and
1793 the Document's license notice requires Cover Texts, you must
1794 enclose the copies in covers that carry, clearly and legibly, all
1795 these Cover Texts: Front-Cover Texts on the front cover, and
1796 Back-Cover Texts on the back cover. Both covers must also clearly
1797 and legibly identify you as the publisher of these copies. The
1798 front cover must present the full title with all words of the
1799 title equally prominent and visible. You may add other material
1800 on the covers in addition. Copying with changes limited to the
1801 covers, as long as they preserve the title of the Document and
1802 satisfy these conditions, can be treated as verbatim copying in
1803 other respects.
1805 If the required texts for either cover are too voluminous to fit
1806 legibly, you should put the first ones listed (as many as fit
1807 reasonably) on the actual cover, and continue the rest onto
1808 adjacent pages.
1810 If you publish or distribute Opaque copies of the Document
1811 numbering more than 100, you must either include a
1812 machine-readable Transparent copy along with each Opaque copy, or
1813 state in or with each Opaque copy a computer-network location from
1814 which the general network-using public has access to download
1815 using public-standard network protocols a complete Transparent
1816 copy of the Document, free of added material. If you use the
1817 latter option, you must take reasonably prudent steps, when you
1818 begin distribution of Opaque copies in quantity, to ensure that
1819 this Transparent copy will remain thus accessible at the stated
1820 location until at least one year after the last time you
1821 distribute an Opaque copy (directly or through your agents or
1822 retailers) of that edition to the public.
1824 It is requested, but not required, that you contact the authors of
1825 the Document well before redistributing any large number of
1826 copies, to give them a chance to provide you with an updated
1827 version of the Document.
1829 4. MODIFICATIONS
1831 You may copy and distribute a Modified Version of the Document
1832 under the conditions of sections 2 and 3 above, provided that you
1833 release the Modified Version under precisely this License, with
1834 the Modified Version filling the role of the Document, thus
1835 licensing distribution and modification of the Modified Version to
1836 whoever possesses a copy of it. In addition, you must do these
1837 things in the Modified Version:
1839 A. Use in the Title Page (and on the covers, if any) a title
1840 distinct from that of the Document, and from those of
1841 previous versions (which should, if there were any, be listed
1842 in the History section of the Document). You may use the
1843 same title as a previous version if the original publisher of
1844 that version gives permission.
1846 B. List on the Title Page, as authors, one or more persons or
1847 entities responsible for authorship of the modifications in
1848 the Modified Version, together with at least five of the
1849 principal authors of the Document (all of its principal
1850 authors, if it has fewer than five), unless they release you
1851 from this requirement.
1853 C. State on the Title page the name of the publisher of the
1854 Modified Version, as the publisher.
1856 D. Preserve all the copyright notices of the Document.
1858 E. Add an appropriate copyright notice for your modifications
1859 adjacent to the other copyright notices.
1861 F. Include, immediately after the copyright notices, a license
1862 notice giving the public permission to use the Modified
1863 Version under the terms of this License, in the form shown in
1864 the Addendum below.
1866 G. Preserve in that license notice the full lists of Invariant
1867 Sections and required Cover Texts given in the Document's
1868 license notice.
1870 H. Include an unaltered copy of this License.
1872 I. Preserve the section Entitled "History", Preserve its Title,
1873 and add to it an item stating at least the title, year, new
1874 authors, and publisher of the Modified Version as given on
1875 the Title Page. If there is no section Entitled "History" in
1876 the Document, create one stating the title, year, authors,
1877 and publisher of the Document as given on its Title Page,
1878 then add an item describing the Modified Version as stated in
1879 the previous sentence.
1881 J. Preserve the network location, if any, given in the Document
1882 for public access to a Transparent copy of the Document, and
1883 likewise the network locations given in the Document for
1884 previous versions it was based on. These may be placed in
1885 the "History" section. You may omit a network location for a
1886 work that was published at least four years before the
1887 Document itself, or if the original publisher of the version
1888 it refers to gives permission.
1890 K. For any section Entitled "Acknowledgements" or "Dedications",
1891 Preserve the Title of the section, and preserve in the
1892 section all the substance and tone of each of the contributor
1893 acknowledgements and/or dedications given therein.
1895 L. Preserve all the Invariant Sections of the Document,
1896 unaltered in their text and in their titles. Section numbers
1897 or the equivalent are not considered part of the section
1898 titles.
1900 M. Delete any section Entitled "Endorsements". Such a section
1901 may not be included in the Modified Version.
1903 N. Do not retitle any existing section to be Entitled
1904 "Endorsements" or to conflict in title with any Invariant
1905 Section.
1907 O. Preserve any Warranty Disclaimers.
1909 If the Modified Version includes new front-matter sections or
1910 appendices that qualify as Secondary Sections and contain no
1911 material copied from the Document, you may at your option
1912 designate some or all of these sections as invariant. To do this,
1913 add their titles to the list of Invariant Sections in the Modified
1914 Version's license notice. These titles must be distinct from any
1915 other section titles.
1917 You may add a section Entitled "Endorsements", provided it contains
1918 nothing but endorsements of your Modified Version by various
1919 parties--for example, statements of peer review or that the text
1920 has been approved by an organization as the authoritative
1921 definition of a standard.
1923 You may add a passage of up to five words as a Front-Cover Text,
1924 and a passage of up to 25 words as a Back-Cover Text, to the end
1925 of the list of Cover Texts in the Modified Version. Only one
1926 passage of Front-Cover Text and one of Back-Cover Text may be
1927 added by (or through arrangements made by) any one entity. If the
1928 Document already includes a cover text for the same cover,
1929 previously added by you or by arrangement made by the same entity
1930 you are acting on behalf of, you may not add another; but you may
1931 replace the old one, on explicit permission from the previous
1932 publisher that added the old one.
1934 The author(s) and publisher(s) of the Document do not by this
1935 License give permission to use their names for publicity for or to
1936 assert or imply endorsement of any Modified Version.
1938 5. COMBINING DOCUMENTS
1940 You may combine the Document with other documents released under
1941 this License, under the terms defined in section 4 above for
1942 modified versions, provided that you include in the combination
1943 all of the Invariant Sections of all of the original documents,
1944 unmodified, and list them all as Invariant Sections of your
1945 combined work in its license notice, and that you preserve all
1946 their Warranty Disclaimers.
1948 The combined work need only contain one copy of this License, and
1949 multiple identical Invariant Sections may be replaced with a single
1950 copy. If there are multiple Invariant Sections with the same name
1951 but different contents, make the title of each such section unique
1952 by adding at the end of it, in parentheses, the name of the
1953 original author or publisher of that section if known, or else a
1954 unique number. Make the same adjustment to the section titles in
1955 the list of Invariant Sections in the license notice of the
1956 combined work.
1958 In the combination, you must combine any sections Entitled
1959 "History" in the various original documents, forming one section
1960 Entitled "History"; likewise combine any sections Entitled
1961 "Acknowledgements", and any sections Entitled "Dedications". You
1962 must delete all sections Entitled "Endorsements."
1964 6. COLLECTIONS OF DOCUMENTS
1966 You may make a collection consisting of the Document and other
1967 documents released under this License, and replace the individual
1968 copies of this License in the various documents with a single copy
1969 that is included in the collection, provided that you follow the
1970 rules of this License for verbatim copying of each of the
1971 documents in all other respects.
1973 You may extract a single document from such a collection, and
1974 distribute it individually under this License, provided you insert
1975 a copy of this License into the extracted document, and follow
1976 this License in all other respects regarding verbatim copying of
1977 that document.
1979 7. AGGREGATION WITH INDEPENDENT WORKS
1981 A compilation of the Document or its derivatives with other
1982 separate and independent documents or works, in or on a volume of
1983 a storage or distribution medium, is called an "aggregate" if the
1984 copyright resulting from the compilation is not used to limit the
1985 legal rights of the compilation's users beyond what the individual
1986 works permit. When the Document is included in an aggregate, this
1987 License does not apply to the other works in the aggregate which
1988 are not themselves derivative works of the Document.
1990 If the Cover Text requirement of section 3 is applicable to these
1991 copies of the Document, then if the Document is less than one half
1992 of the entire aggregate, the Document's Cover Texts may be placed
1993 on covers that bracket the Document within the aggregate, or the
1994 electronic equivalent of covers if the Document is in electronic
1995 form. Otherwise they must appear on printed covers that bracket
1996 the whole aggregate.
1998 8. TRANSLATION
2000 Translation is considered a kind of modification, so you may
2001 distribute translations of the Document under the terms of section
2002 4. Replacing Invariant Sections with translations requires special
2003 permission from their copyright holders, but you may include
2004 translations of some or all Invariant Sections in addition to the
2005 original versions of these Invariant Sections. You may include a
2006 translation of this License, and all the license notices in the
2007 Document, and any Warranty Disclaimers, provided that you also
2008 include the original English version of this License and the
2009 original versions of those notices and disclaimers. In case of a
2010 disagreement between the translation and the original version of
2011 this License or a notice or disclaimer, the original version will
2012 prevail.
2014 If a section in the Document is Entitled "Acknowledgements",
2015 "Dedications", or "History", the requirement (section 4) to
2016 Preserve its Title (section 1) will typically require changing the
2017 actual title.
2019 9. TERMINATION
2021 You may not copy, modify, sublicense, or distribute the Document
2022 except as expressly provided under this License. Any attempt
2023 otherwise to copy, modify, sublicense, or distribute it is void,
2024 and will automatically terminate your rights under this License.
2026 However, if you cease all violation of this License, then your
2027 license from a particular copyright holder is reinstated (a)
2028 provisionally, unless and until the copyright holder explicitly
2029 and finally terminates your license, and (b) permanently, if the
2030 copyright holder fails to notify you of the violation by some
2031 reasonable means prior to 60 days after the cessation.
2033 Moreover, your license from a particular copyright holder is
2034 reinstated permanently if the copyright holder notifies you of the
2035 violation by some reasonable means, this is the first time you have
2036 received notice of violation of this License (for any work) from
2037 that copyright holder, and you cure the violation prior to 30 days
2038 after your receipt of the notice.
2040 Termination of your rights under this section does not terminate
2041 the licenses of parties who have received copies or rights from
2042 you under this License. If your rights have been terminated and
2043 not permanently reinstated, receipt of a copy of some or all of
2044 the same material does not give you any rights to use it.
2046 10. FUTURE REVISIONS OF THIS LICENSE
2048 The Free Software Foundation may publish new, revised versions of
2049 the GNU Free Documentation License from time to time. Such new
2050 versions will be similar in spirit to the present version, but may
2051 differ in detail to address new problems or concerns. See
2052 `http://www.gnu.org/copyleft/'.
2054 Each version of the License is given a distinguishing version
2055 number. If the Document specifies that a particular numbered
2056 version of this License "or any later version" applies to it, you
2057 have the option of following the terms and conditions either of
2058 that specified version or of any later version that has been
2059 published (not as a draft) by the Free Software Foundation. If
2060 the Document does not specify a version number of this License,
2061 you may choose any version ever published (not as a draft) by the
2062 Free Software Foundation. If the Document specifies that a proxy
2063 can decide which future versions of this License can be used, that
2064 proxy's public statement of acceptance of a version permanently
2065 authorizes you to choose that version for the Document.
2067 11. RELICENSING
2069 "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
2070 World Wide Web server that publishes copyrightable works and also
2071 provides prominent facilities for anybody to edit those works. A
2072 public wiki that anybody can edit is an example of such a server.
2073 A "Massive Multiauthor Collaboration" (or "MMC") contained in the
2074 site means any set of copyrightable works thus published on the MMC
2075 site.
2077 "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
2078 license published by Creative Commons Corporation, a not-for-profit
2079 corporation with a principal place of business in San Francisco,
2080 California, as well as future copyleft versions of that license
2081 published by that same organization.
2083 "Incorporate" means to publish or republish a Document, in whole or
2084 in part, as part of another Document.
2086 An MMC is "eligible for relicensing" if it is licensed under this
2087 License, and if all works that were first published under this
2088 License somewhere other than this MMC, and subsequently
2089 incorporated in whole or in part into the MMC, (1) had no cover
2090 texts or invariant sections, and (2) were thus incorporated prior
2091 to November 1, 2008.
2093 The operator of an MMC Site may republish an MMC contained in the
2094 site under CC-BY-SA on the same site at any time before August 1,
2095 2009, provided the MMC is eligible for relicensing.
2098 ADDENDUM: How to use this License for your documents
2099 ====================================================
2101 To use this License in a document you have written, include a copy of
2102 the License in the document and put the following copyright and license
2103 notices just after the title page:
2105 Copyright (C) YEAR YOUR NAME.
2106 Permission is granted to copy, distribute and/or modify this document
2107 under the terms of the GNU Free Documentation License, Version 1.3
2108 or any later version published by the Free Software Foundation;
2109 with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
2110 Texts. A copy of the license is included in the section entitled ``GNU
2111 Free Documentation License''.
2113 If you have Invariant Sections, Front-Cover Texts and Back-Cover
2114 Texts, replace the "with...Texts." line with this:
2116 with the Invariant Sections being LIST THEIR TITLES, with
2117 the Front-Cover Texts being LIST, and with the Back-Cover Texts
2118 being LIST.
2120 If you have Invariant Sections without Cover Texts, or some other
2121 combination of the three, merge those two alternatives to suit the
2122 situation.
2124 If your document contains nontrivial examples of program code, we
2125 recommend releasing these examples in parallel under your choice of
2126 free software license, such as the GNU General Public License, to
2127 permit their use in free software.
2130 File: gmp.info, Node: Concept Index, Next: Function Index, Prev: GNU Free Documentation License, Up: Top
2138 * #include: Headers and Libraries.
2140 * --build: Build Options. (line 52)
2141 * --disable-fft: Build Options. (line 317)
2142 * --disable-shared: Build Options. (line 45)
2143 * --disable-static: Build Options. (line 45)
2144 * --enable-alloca: Build Options. (line 278)
2145 * --enable-assert: Build Options. (line 327)
2146 * --enable-cxx: Build Options. (line 230)
2147 * --enable-fat: Build Options. (line 164)
2148 * --enable-mpbsd: Build Options. (line 322)
2149 * --enable-profiling <1>: Profiling. (line 6)
2150 * --enable-profiling: Build Options. (line 331)
2151 * --exec-prefix: Build Options. (line 32)
2152 * --host: Build Options. (line 66)
2153 * --prefix: Build Options. (line 32)
2154 * -finstrument-functions: Profiling. (line 66)
2155 * 2exp functions: Efficiency. (line 43)
2156 * 68000: Notes for Particular Systems.
2158 * 80x86: Notes for Particular Systems.
2160 * ABI <1>: Build Options. (line 171)
2161 * ABI: ABI and ISA. (line 6)
2162 * About this manual: Introduction to GMP. (line 58)
2163 * AC_CHECK_LIB: Autoconf. (line 11)
2164 * AIX <1>: ABI and ISA. (line 184)
2165 * AIX <2>: Notes for Particular Systems.
2167 * AIX: ABI and ISA. (line 169)
2168 * Algorithms: Algorithms. (line 6)
2169 * alloca: Build Options. (line 278)
2170 * Allocation of memory: Custom Allocation. (line 6)
2171 * AMD64: ABI and ISA. (line 44)
2172 * Anonymous FTP of latest version: Introduction to GMP. (line 38)
2173 * Application Binary Interface: ABI and ISA. (line 6)
2174 * Arithmetic functions <1>: Float Arithmetic. (line 6)
2175 * Arithmetic functions <2>: Integer Arithmetic. (line 6)
2176 * Arithmetic functions: Rational Arithmetic. (line 6)
2177 * ARM: Notes for Particular Systems.
2179 * Assembly cache handling: Assembly Cache Handling.
2181 * Assembly carry propagation: Assembly Carry Propagation.
2183 * Assembly code organisation: Assembly Code Organisation.
2185 * Assembly coding: Assembly Coding. (line 6)
2186 * Assembly floating Point: Assembly Floating Point.
2188 * Assembly loop unrolling: Assembly Loop Unrolling.
2190 * Assembly SIMD: Assembly SIMD Instructions.
2192 * Assembly software pipelining: Assembly Software Pipelining.
2194 * Assembly writing guide: Assembly Writing Guide.
2196 * Assertion checking <1>: Debugging. (line 79)
2197 * Assertion checking: Build Options. (line 327)
2198 * Assignment functions <1>: Assigning Floats. (line 6)
2199 * Assignment functions <2>: Initializing Rationals.
2201 * Assignment functions <3>: Simultaneous Integer Init & Assign.
2203 * Assignment functions <4>: Simultaneous Float Init & Assign.
2205 * Assignment functions: Assigning Integers. (line 6)
2206 * Autoconf: Autoconf. (line 6)
2207 * Basics: GMP Basics. (line 6)
2208 * Berkeley MP compatible functions <1>: Build Options. (line 322)
2209 * Berkeley MP compatible functions: BSD Compatible Functions.
2211 * Binomial coefficient algorithm: Binomial Coefficients Algorithm.
2213 * Binomial coefficient functions: Number Theoretic Functions.
2215 * Binutils strip: Known Build Problems.
2217 * Bit manipulation functions: Integer Logic and Bit Fiddling.
2219 * Bit scanning functions: Integer Logic and Bit Fiddling.
2221 * Bit shift left: Integer Arithmetic. (line 35)
2222 * Bit shift right: Integer Division. (line 53)
2223 * Bits per limb: Useful Macros and Constants.
2225 * BSD MP compatible functions <1>: Build Options. (line 322)
2226 * BSD MP compatible functions: BSD Compatible Functions.
2228 * Bug reporting: Reporting Bugs. (line 6)
2229 * Build directory: Build Options. (line 19)
2230 * Build notes for binary packaging: Notes for Package Builds.
2232 * Build notes for particular systems: Notes for Particular Systems.
2234 * Build options: Build Options. (line 6)
2235 * Build problems known: Known Build Problems.
2237 * Build system: Build Options. (line 52)
2238 * Building GMP: Installing GMP. (line 6)
2239 * Bus error: Debugging. (line 7)
2240 * C compiler: Build Options. (line 182)
2241 * C++ compiler: Build Options. (line 254)
2242 * C++ interface: C++ Class Interface. (line 6)
2243 * C++ interface internals: C++ Interface Internals.
2245 * C++ istream input: C++ Formatted Input. (line 6)
2246 * C++ ostream output: C++ Formatted Output.
2248 * C++ support: Build Options. (line 230)
2249 * CC: Build Options. (line 182)
2250 * CC_FOR_BUILD: Build Options. (line 217)
2251 * CFLAGS: Build Options. (line 182)
2252 * Checker: Debugging. (line 115)
2253 * checkergcc: Debugging. (line 122)
2254 * Code organisation: Assembly Code Organisation.
2256 * Compaq C++: Notes for Particular Systems.
2258 * Comparison functions <1>: Integer Comparisons. (line 6)
2259 * Comparison functions <2>: Comparing Rationals. (line 6)
2260 * Comparison functions: Float Comparison. (line 6)
2261 * Compatibility with older versions: Compatibility with older versions.
2263 * Conditions for copying GNU MP: Copying. (line 6)
2264 * Configuring GMP: Installing GMP. (line 6)
2265 * Congruence algorithm: Exact Remainder. (line 29)
2266 * Congruence functions: Integer Division. (line 124)
2267 * Constants: Useful Macros and Constants.
2269 * Contributors: Contributors. (line 6)
2270 * Conventions for parameters: Parameter Conventions.
2272 * Conventions for variables: Variable Conventions.
2274 * Conversion functions <1>: Converting Integers. (line 6)
2275 * Conversion functions <2>: Converting Floats. (line 6)
2276 * Conversion functions: Rational Conversions.
2278 * Copying conditions: Copying. (line 6)
2279 * CPPFLAGS: Build Options. (line 208)
2280 * CPU types <1>: Introduction to GMP. (line 24)
2281 * CPU types: Build Options. (line 108)
2282 * Cross compiling: Build Options. (line 66)
2283 * Custom allocation: Custom Allocation. (line 6)
2284 * CXX: Build Options. (line 254)
2285 * CXXFLAGS: Build Options. (line 254)
2286 * Cygwin: Notes for Particular Systems.
2288 * Darwin: Known Build Problems.
2290 * Debugging: Debugging. (line 6)
2291 * Demonstration programs: Demonstration Programs.
2293 * Digits in an integer: Miscellaneous Integer Functions.
2295 * Divisibility algorithm: Exact Remainder. (line 29)
2296 * Divisibility functions: Integer Division. (line 124)
2297 * Divisibility testing: Efficiency. (line 91)
2298 * Division algorithms: Division Algorithms. (line 6)
2299 * Division functions <1>: Rational Arithmetic. (line 22)
2300 * Division functions <2>: Integer Division. (line 6)
2301 * Division functions: Float Arithmetic. (line 33)
2302 * DJGPP <1>: Notes for Particular Systems.
2304 * DJGPP: Known Build Problems.
2306 * DLLs: Notes for Particular Systems.
2308 * DocBook: Build Options. (line 354)
2309 * Documentation formats: Build Options. (line 347)
2310 * Documentation license: GNU Free Documentation License.
2312 * DVI: Build Options. (line 350)
2313 * Efficiency: Efficiency. (line 6)
2314 * Emacs: Emacs. (line 6)
2315 * Exact division functions: Integer Division. (line 102)
2316 * Exact remainder: Exact Remainder. (line 6)
2317 * Example programs: Demonstration Programs.
2319 * Exec prefix: Build Options. (line 32)
2320 * Execution profiling <1>: Profiling. (line 6)
2321 * Execution profiling: Build Options. (line 331)
2322 * Exponentiation functions <1>: Integer Exponentiation.
2324 * Exponentiation functions: Float Arithmetic. (line 41)
2325 * Export: Integer Import and Export.
2327 * Expression parsing demo: Demonstration Programs.
2329 * Extended GCD: Number Theoretic Functions.
2331 * Factor removal functions: Number Theoretic Functions.
2333 * Factorial algorithm: Factorial Algorithm. (line 6)
2334 * Factorial functions: Number Theoretic Functions.
2336 * Factorization demo: Demonstration Programs.
2338 * Fast Fourier Transform: FFT Multiplication. (line 6)
2339 * Fat binary: Build Options. (line 164)
2340 * FFT multiplication <1>: FFT Multiplication. (line 6)
2341 * FFT multiplication: Build Options. (line 317)
2342 * Fibonacci number algorithm: Fibonacci Numbers Algorithm.
2344 * Fibonacci sequence functions: Number Theoretic Functions.
2346 * Float arithmetic functions: Float Arithmetic. (line 6)
2347 * Float assignment functions <1>: Simultaneous Float Init & Assign.
2349 * Float assignment functions: Assigning Floats. (line 6)
2350 * Float comparison functions: Float Comparison. (line 6)
2351 * Float conversion functions: Converting Floats. (line 6)
2352 * Float functions: Floating-point Functions.
2354 * Float initialization functions <1>: Simultaneous Float Init & Assign.
2356 * Float initialization functions: Initializing Floats. (line 6)
2357 * Float input and output functions: I/O of Floats. (line 6)
2358 * Float internals: Float Internals. (line 6)
2359 * Float miscellaneous functions: Miscellaneous Float Functions.
2361 * Float random number functions: Miscellaneous Float Functions.
2363 * Float rounding functions: Miscellaneous Float Functions.
2365 * Float sign tests: Float Comparison. (line 33)
2366 * Floating point mode: Notes for Particular Systems.
2368 * Floating-point functions: Floating-point Functions.
2370 * Floating-point number: Nomenclature and Types.
2372 * fnccheck: Profiling. (line 77)
2373 * Formatted input: Formatted Input. (line 6)
2374 * Formatted output: Formatted Output. (line 6)
2375 * Free Documentation License: GNU Free Documentation License.
2377 * frexp <1>: Converting Floats. (line 23)
2378 * frexp: Converting Integers. (line 42)
2379 * FTP of latest version: Introduction to GMP. (line 38)
2380 * Function classes: Function Classes. (line 6)
2381 * FunctionCheck: Profiling. (line 77)
2382 * GCC Checker: Debugging. (line 115)
2383 * GCD algorithms: Greatest Common Divisor Algorithms.
2385 * GCD extended: Number Theoretic Functions.
2387 * GCD functions: Number Theoretic Functions.
2389 * GDB: Debugging. (line 58)
2390 * Generic C: Build Options. (line 153)
2391 * GMP Perl module: Demonstration Programs.
2393 * GMP version number: Useful Macros and Constants.
2395 * gmp.h: Headers and Libraries.
2397 * gmpxx.h: C++ Interface General.
2399 * GNU Debugger: Debugging. (line 58)
2400 * GNU Free Documentation License: GNU Free Documentation License.
2402 * GNU strip: Known Build Problems.
2404 * gprof: Profiling. (line 41)
2405 * Greatest common divisor algorithms: Greatest Common Divisor Algorithms.
2407 * Greatest common divisor functions: Number Theoretic Functions.
2409 * Hardware floating point mode: Notes for Particular Systems.
2411 * Headers: Headers and Libraries.
2413 * Heap problems: Debugging. (line 24)
2414 * Home page: Introduction to GMP. (line 34)
2415 * Host system: Build Options. (line 66)
2416 * HP-UX: ABI and ISA. (line 107)
2417 * HPPA: ABI and ISA. (line 68)
2418 * I/O functions <1>: I/O of Integers. (line 6)
2419 * I/O functions <2>: I/O of Rationals. (line 6)
2420 * I/O functions: I/O of Floats. (line 6)
2421 * i386: Notes for Particular Systems.
2423 * IA-64: ABI and ISA. (line 107)
2424 * Import: Integer Import and Export.
2426 * In-place operations: Efficiency. (line 57)
2427 * Include files: Headers and Libraries.
2429 * info-lookup-symbol: Emacs. (line 6)
2430 * Initialization functions <1>: Initializing Integers.
2432 * Initialization functions <2>: Initializing Rationals.
2434 * Initialization functions <3>: Random State Initialization.
2436 * Initialization functions <4>: Simultaneous Float Init & Assign.
2438 * Initialization functions <5>: Simultaneous Integer Init & Assign.
2440 * Initialization functions: Initializing Floats. (line 6)
2441 * Initializing and clearing: Efficiency. (line 21)
2442 * Input functions <1>: I/O of Integers. (line 6)
2443 * Input functions <2>: I/O of Rationals. (line 6)
2444 * Input functions <3>: I/O of Floats. (line 6)
2445 * Input functions: Formatted Input Functions.
2447 * Install prefix: Build Options. (line 32)
2448 * Installing GMP: Installing GMP. (line 6)
2449 * Instruction Set Architecture: ABI and ISA. (line 6)
2450 * instrument-functions: Profiling. (line 66)
2451 * Integer: Nomenclature and Types.
2453 * Integer arithmetic functions: Integer Arithmetic. (line 6)
2454 * Integer assignment functions <1>: Simultaneous Integer Init & Assign.
2456 * Integer assignment functions: Assigning Integers. (line 6)
2457 * Integer bit manipulation functions: Integer Logic and Bit Fiddling.
2459 * Integer comparison functions: Integer Comparisons. (line 6)
2460 * Integer conversion functions: Converting Integers. (line 6)
2461 * Integer division functions: Integer Division. (line 6)
2462 * Integer exponentiation functions: Integer Exponentiation.
2464 * Integer export: Integer Import and Export.
2466 * Integer functions: Integer Functions. (line 6)
2467 * Integer import: Integer Import and Export.
2469 * Integer initialization functions <1>: Simultaneous Integer Init & Assign.
2471 * Integer initialization functions: Initializing Integers.
2473 * Integer input and output functions: I/O of Integers. (line 6)
2474 * Integer internals: Integer Internals. (line 6)
2475 * Integer logical functions: Integer Logic and Bit Fiddling.
2477 * Integer miscellaneous functions: Miscellaneous Integer Functions.
2479 * Integer random number functions: Integer Random Numbers.
2481 * Integer root functions: Integer Roots. (line 6)
2482 * Integer sign tests: Integer Comparisons. (line 28)
2483 * Integer special functions: Integer Special Functions.
2485 * Interix: Notes for Particular Systems.
2487 * Internals: Internals. (line 6)
2488 * Introduction: Introduction to GMP. (line 6)
2489 * Inverse modulo functions: Number Theoretic Functions.
2491 * IRIX <1>: Known Build Problems.
2493 * IRIX: ABI and ISA. (line 132)
2494 * ISA: ABI and ISA. (line 6)
2495 * istream input: C++ Formatted Input. (line 6)
2496 * Jacobi symbol algorithm: Jacobi Symbol. (line 6)
2497 * Jacobi symbol functions: Number Theoretic Functions.
2499 * Karatsuba multiplication: Karatsuba Multiplication.
2501 * Karatsuba square root algorithm: Square Root Algorithm.
2503 * Kronecker symbol functions: Number Theoretic Functions.
2505 * Language bindings: Language Bindings. (line 6)
2506 * Latest version of GMP: Introduction to GMP. (line 38)
2507 * LCM functions: Number Theoretic Functions.
2509 * Least common multiple functions: Number Theoretic Functions.
2511 * Legendre symbol functions: Number Theoretic Functions.
2513 * libgmp: Headers and Libraries.
2515 * libgmpxx: Headers and Libraries.
2517 * Libraries: Headers and Libraries.
2519 * Libtool: Headers and Libraries.
2521 * Libtool versioning: Notes for Package Builds.
2523 * License conditions: Copying. (line 6)
2524 * Limb: Nomenclature and Types.
2526 * Limb size: Useful Macros and Constants.
2528 * Linear congruential algorithm: Random Number Algorithms.
2530 * Linear congruential random numbers: Random State Initialization.
2532 * Linking: Headers and Libraries.
2534 * Logical functions: Integer Logic and Bit Fiddling.
2536 * Low-level functions: Low-level Functions. (line 6)
2537 * Lucas number algorithm: Lucas Numbers Algorithm.
2539 * Lucas number functions: Number Theoretic Functions.
2541 * MacOS X: Known Build Problems.
2543 * Mailing lists: Introduction to GMP. (line 45)
2544 * Malloc debugger: Debugging. (line 30)
2545 * Malloc problems: Debugging. (line 24)
2546 * Memory allocation: Custom Allocation. (line 6)
2547 * Memory management: Memory Management. (line 6)
2548 * Mersenne twister algorithm: Random Number Algorithms.
2550 * Mersenne twister random numbers: Random State Initialization.
2552 * MINGW: Notes for Particular Systems.
2554 * MIPS: ABI and ISA. (line 132)
2555 * Miscellaneous float functions: Miscellaneous Float Functions.
2557 * Miscellaneous integer functions: Miscellaneous Integer Functions.
2559 * MMX: Notes for Particular Systems.
2561 * Modular inverse functions: Number Theoretic Functions.
2563 * Most significant bit: Miscellaneous Integer Functions.
2565 * mp.h: BSD Compatible Functions.
2567 * MPN_PATH: Build Options. (line 335)
2568 * MS Windows: Notes for Particular Systems.
2570 * MS-DOS: Notes for Particular Systems.
2572 * Multi-threading: Reentrancy. (line 6)
2573 * Multiplication algorithms: Multiplication Algorithms.
2575 * Nails: Low-level Functions. (line 478)
2576 * Native compilation: Build Options. (line 52)
2577 * NeXT: Known Build Problems.
2579 * Next prime function: Number Theoretic Functions.
2581 * Nomenclature: Nomenclature and Types.
2583 * Non-Unix systems: Build Options. (line 11)
2584 * Nth root algorithm: Nth Root Algorithm. (line 6)
2585 * Number sequences: Efficiency. (line 147)
2586 * Number theoretic functions: Number Theoretic Functions.
2588 * Numerator and denominator: Applying Integer Functions.
2590 * obstack output: Formatted Output Functions.
2592 * OpenBSD: Notes for Particular Systems.
2594 * Optimizing performance: Performance optimization.
2596 * ostream output: C++ Formatted Output.
2598 * Other languages: Language Bindings. (line 6)
2599 * Output functions <1>: I/O of Floats. (line 6)
2600 * Output functions <2>: I/O of Rationals. (line 6)
2601 * Output functions <3>: Formatted Output Functions.
2603 * Output functions: I/O of Integers. (line 6)
2604 * Packaged builds: Notes for Package Builds.
2606 * Parameter conventions: Parameter Conventions.
2608 * Parsing expressions demo: Demonstration Programs.
2610 * Particular systems: Notes for Particular Systems.
2612 * Past GMP versions: Compatibility with older versions.
2614 * PDF: Build Options. (line 350)
2615 * Perfect power algorithm: Perfect Power Algorithm.
2617 * Perfect power functions: Integer Roots. (line 27)
2618 * Perfect square algorithm: Perfect Square Algorithm.
2620 * Perfect square functions: Integer Roots. (line 36)
2621 * perl: Demonstration Programs.
2623 * Perl module: Demonstration Programs.
2625 * PostScript: Build Options. (line 350)
2626 * Power/PowerPC <1>: Known Build Problems.
2628 * Power/PowerPC: Notes for Particular Systems.
2630 * Powering algorithms: Powering Algorithms. (line 6)
2631 * Powering functions <1>: Float Arithmetic. (line 41)
2632 * Powering functions: Integer Exponentiation.
2634 * PowerPC: ABI and ISA. (line 167)
2635 * Precision of floats: Floating-point Functions.
2637 * Precision of hardware floating point: Notes for Particular Systems.
2639 * Prefix: Build Options. (line 32)
2640 * Prime testing algorithms: Prime Testing Algorithm.
2642 * Prime testing functions: Number Theoretic Functions.
2644 * printf formatted output: Formatted Output. (line 6)
2645 * Probable prime testing functions: Number Theoretic Functions.
2647 * prof: Profiling. (line 24)
2648 * Profiling: Profiling. (line 6)
2649 * Radix conversion algorithms: Radix Conversion Algorithms.
2651 * Random number algorithms: Random Number Algorithms.
2653 * Random number functions <1>: Integer Random Numbers.
2655 * Random number functions <2>: Miscellaneous Float Functions.
2657 * Random number functions: Random Number Functions.
2659 * Random number seeding: Random State Seeding.
2661 * Random number state: Random State Initialization.
2663 * Random state: Nomenclature and Types.
2665 * Rational arithmetic: Efficiency. (line 113)
2666 * Rational arithmetic functions: Rational Arithmetic. (line 6)
2667 * Rational assignment functions: Initializing Rationals.
2669 * Rational comparison functions: Comparing Rationals. (line 6)
2670 * Rational conversion functions: Rational Conversions.
2672 * Rational initialization functions: Initializing Rationals.
2674 * Rational input and output functions: I/O of Rationals. (line 6)
2675 * Rational internals: Rational Internals. (line 6)
2676 * Rational number: Nomenclature and Types.
2678 * Rational number functions: Rational Number Functions.
2680 * Rational numerator and denominator: Applying Integer Functions.
2682 * Rational sign tests: Comparing Rationals. (line 27)
2683 * Raw output internals: Raw Output Internals.
2685 * Reallocations: Efficiency. (line 30)
2686 * Reentrancy: Reentrancy. (line 6)
2687 * References: References. (line 6)
2688 * Remove factor functions: Number Theoretic Functions.
2690 * Reporting bugs: Reporting Bugs. (line 6)
2691 * Root extraction algorithm: Nth Root Algorithm. (line 6)
2692 * Root extraction algorithms: Root Extraction Algorithms.
2694 * Root extraction functions <1>: Float Arithmetic. (line 37)
2695 * Root extraction functions: Integer Roots. (line 6)
2696 * Root testing functions: Integer Roots. (line 36)
2697 * Rounding functions: Miscellaneous Float Functions.
2699 * Sample programs: Demonstration Programs.
2701 * Scan bit functions: Integer Logic and Bit Fiddling.
2703 * scanf formatted input: Formatted Input. (line 6)
2704 * SCO: Known Build Problems.
2706 * Seeding random numbers: Random State Seeding.
2708 * Segmentation violation: Debugging. (line 7)
2709 * Sequent Symmetry: Known Build Problems.
2711 * Services for Unix: Notes for Particular Systems.
2713 * Shared library versioning: Notes for Package Builds.
2715 * Sign tests <1>: Float Comparison. (line 33)
2716 * Sign tests <2>: Integer Comparisons. (line 28)
2717 * Sign tests: Comparing Rationals. (line 27)
2718 * Size in digits: Miscellaneous Integer Functions.
2720 * Small operands: Efficiency. (line 7)
2721 * Solaris <1>: ABI and ISA. (line 201)
2722 * Solaris: Known Build Problems.
2724 * Sparc: Notes for Particular Systems.
2726 * Sparc V9: ABI and ISA. (line 201)
2727 * Special integer functions: Integer Special Functions.
2729 * Square root algorithm: Square Root Algorithm.
2731 * SSE2: Notes for Particular Systems.
2733 * Stack backtrace: Debugging. (line 50)
2734 * Stack overflow <1>: Debugging. (line 7)
2735 * Stack overflow: Build Options. (line 278)
2736 * Static linking: Efficiency. (line 14)
2737 * stdarg.h: Headers and Libraries.
2739 * stdio.h: Headers and Libraries.
2741 * Stripped libraries: Known Build Problems.
2743 * Sun: ABI and ISA. (line 201)
2744 * SunOS: Notes for Particular Systems.
2746 * Systems: Notes for Particular Systems.
2748 * Temporary memory: Build Options. (line 278)
2749 * Texinfo: Build Options. (line 347)
2750 * Text input/output: Efficiency. (line 153)
2751 * Thread safety: Reentrancy. (line 6)
2752 * Toom multiplication <1>: Other Multiplication.
2754 * Toom multiplication <2>: Toom 4-Way Multiplication.
2756 * Toom multiplication: Toom 3-Way Multiplication.
2758 * Types: Nomenclature and Types.
2760 * ui and si functions: Efficiency. (line 50)
2761 * Unbalanced multiplication: Unbalanced Multiplication.
2763 * Upward compatibility: Compatibility with older versions.
2765 * Useful macros and constants: Useful Macros and Constants.
2767 * User-defined precision: Floating-point Functions.
2769 * Valgrind: Debugging. (line 130)
2770 * Variable conventions: Variable Conventions.
2772 * Version number: Useful Macros and Constants.
2774 * Web page: Introduction to GMP. (line 34)
2775 * Windows: Notes for Particular Systems.
2777 * x86: Notes for Particular Systems.
2779 * x87: Notes for Particular Systems.
2781 * XML: Build Options. (line 354)
2784 File: gmp.info, Node: Function Index, Prev: Concept Index, Up: Top
2786 Function and Type Index
2787 ***********************
2792 * __GMP_CC: Useful Macros and Constants.
2794 * __GMP_CFLAGS: Useful Macros and Constants.
2796 * __GNU_MP_VERSION: Useful Macros and Constants.
2798 * __GNU_MP_VERSION_MINOR: Useful Macros and Constants.
2800 * __GNU_MP_VERSION_PATCHLEVEL: Useful Macros and Constants.