Explicit Inversive Pseudorandom Number Generators
Diploma thesis
submitted in partial fulfillment of the requirements for the degree of Magister
at the Faculty of Natural Sciences
of the University of Salzburg
submitted by
Otmar Lendl
Salzburg, November 1996
As a rule, random number generators are fragile and need to be treated with respect. It’s difficult to be sure that a particular generator is good without an enormous amount of effort in the various statistical tests.
The moral is: do your best to use a good generator, based on the mathematical analysis and the experience of others; just to be sure, examine the numbers to make sure that they “look” random; if anything goes wrong, blame the random-number generator!
– Robert Sedgewick, in “Algorithms” (Second Edition, 1988)
I’d like to thank everybody who helped to make this thesis become reality. Peter for his patience, trust and guidance; Charly, Hannes, Karin, and Stefan for the most creative, helpful and entertaining working environment I have ever experienced; and my mother for her unwavering support.
The purpose of this thesis is to discuss the implementation of the explicit inversive congruential generator (EICG) and the properties of the resulting pseudorandom numbers. But before we delve into the details of the implementation or the theoretical and empirical results we will take a closer look at the basic concept of pseudo-random numbers.
What do we mean when we talk about pseudorandom numbers (PRN)? And for what purpose do we devise such elaborate means to artificially generate megabytes of digital noise?
To the uninitiated, all this pseudo-random number “business” seems to have no serious applications. Most people will think of computer games as a field where pseudo-random numbers are used to make the behaviour of the computer less predictable. Steering the movements of some on-screen monster does not require a high standard of randomness; almost any algorithm will suffice, provided it is easy to implement and does not consume too many computing resources.
Another domain where we need PRN is the modelling of more or less random phenomena of the real world. The simulation of a roulette table or other forms of lottery might still be in the area of non-serious applications, but here the defects of the generator start to be an issue. Imagine this scenario: you try to develop a winning strategy for blackjack and use a simulation to test your algorithm. Any correlation between statistical defects and the strategy will lead to a skewed result and may even change the sign of the expected outcome. Playing that strategy in a real casino might cost you dearly. Thus the choice of the generator is an issue even in such non-scientific applications.
Simulation of random events is far from being limited to gambling; a significant percentage of all simulations of natural phenomena contains a random component. Whether it is quantum effects, rainfall on a certain area, Brownian motion, absorption patterns, bifurcation of tree roots, failure of technical components, solar activity, …, in all cases we know at best the statistical properties of an event. An analytical solution of the given problem based on probabilities is often not possible. Thus one has to resort to stochastic simulation (see [3, 48, 75]), where one calculates the result of the overall simulation by choosing possible outcomes of the underlying random events according to their respective probability. Doing this a number of times should provide enough samples of the outcome to estimate the probability of each possible result. Needless to say, the selection of the realisations of the underlying random events is crucial to the correctness of the whole calculation. Since this selection is done with the aid of PRN, their quality plays an important role in the whole process.
Finding the use of PRN in stochastic simulation is not that surprising, but finding them in algorithms for such mundane tasks as integration might need some more explanation. Numerical integration is a common problem in a great number of real-world applications. A big battery of algorithms (trapezoid method, Simpson’s method, spline quadrature, adaptive quadrature, Runge-Kutta, …) was developed to minimize the calculation costs while increasing the accuracy of the result. All these methods scale very badly with the dimension of the integral, so a completely different approach is more appropriate there. The Monte Carlo method (see [3, 48, 78, 84]) uses randomly selected samples of the function to estimate the integral. Provided we know more about the behaviour of the function (i.e. its total variation) and the distribution of the actual samples used (as measured by their discrepancy), the inequality of Koksma-Hlawka (see [51, 70]) will give an error bound for this method. Since it is generally not possible to calculate the exact value of the discrepancy of the random numbers used for the integration, this error bound will only be a probabilistic one. In order to get a deterministic one, the random numbers, which determine which samples of the function will be evaluated, are replaced by numbers for which the order of the discrepancy is known. This turns the Monte Carlo method into the Quasi-Monte Carlo method.
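To make the idea concrete, here is a minimal sketch of crude Monte Carlo integration in C. The integrand (whose exact integral over $[0,1)$ is $\pi$), the sample size, and the use of the C library’s rand() are illustrative choices only; any of the generators discussed below could supply the PRN.

\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>

/* Sketch: crude Monte Carlo estimate of the integral of f over [0,1). */
static double f(double x)
{
    return 4.0 / (1.0 + x * x);          /* exact integral: pi */
}

int main(void)
{
    const long N = 1000000L;
    double sum = 0.0;
    long i;

    srand(12345);                        /* fixed seed: reproducible run */
    for (i = 0; i < N; i++) {
        double x = rand() / (RAND_MAX + 1.0);   /* sample in [0,1) */
        sum += f(x);
    }
    printf("estimate = %f (exact value: pi)\n", sum / N);
    return 0;
}
\end{verbatim}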
One way to get such good point sets is to explicitly construct them with that goal in mind (see [70] for a discussion of $(t,m,s)$-nets and other methods) or to use a PRNG for which an upper bound on the discrepancy is known. This is one of the reasons why we will take a close look at this quantity in Section 3.3. Not all applications of Quasi-Monte Carlo integration are labeled as such, as the basic algorithm can be regarded as a simple heuristic. For example, distributed ray tracing [32, p. 788] uses a set of randomly distributed rays to implement spatial and temporal antialiasing, which amounts to an integration over both the time-frame of the picture and the pixel’s spatial extension.
Non-deterministic algorithms often use pseudorandom numbers, too. These algorithms are used to tackle problems for which a deterministic solution takes too much time. Although they cannot guarantee success, they promise to find the solution (or a sub-optimal one) within reasonable time. Examples for this kind of algorithm are Pollard’s rho heuristic [8, p. 844] for integer factorization, the Rabin-Miller primality test [8, p. 839], Simulated Annealing [31], and Threshold Accepting [11].
Other algorithms use pseudorandom numbers for a different purpose. Instead of using them directly for solving the problem they are used to randomize the problem (or the algorithm) in order to avoid running into the same worst-case behaviour again and again. See [8, p. 161] for an explanation of the rationale behind randomized quick-sort.
Some cryptographic algorithms and protocols require a good source of random numbers, too. Stream ciphers [77, p. 168f], for example, use the output of a PRNG (termed keystream generator) to encrypt the plaintext. The security of this cipher depends largely on the statistical quality of the keystream. Any regularities of the PRNG can be used by the attacker to predict the next bits and thus crack the code. No other domain of PRNG applications has such a high demand on the “randomness” of the generated PRN. Algorithms which are good enough for stochastic simulations are typically way too predictable to be useful as a keystream generator for a stream cipher. Thus the field of cryptographically secure PRNG has amazingly little in common with the study of PRNG for stochastic simulation on which we will focus in this thesis. Information on cryptographically secure pseudorandom numbers can be found in [52], [82], and [77].
Another application of pseudo-random numbers in the field of cryptology is providing the “random” numbers needed for a variety of cryptographic protocols. A well-known example is the session keys generated for each transaction in hybrid cryptosystems. As the recent debacle involving the Netscape Navigator [33, 69] has shown, one must be very careful not to use a simple PRNG for this task. Since this is more a matter of how to get the entropy needed for non-predictability than one of analysing the properties of sequences of PRN, we will not elaborate on this subject in this thesis.
Now that we know a bit about the various applications of PRN, let’s try to formulate a few criteria for the selection of a good PRN generation algorithm. As we will see later it is crucial for the selection of the right PRNG to keep an eye on the application of the PRN.
This criterion may sound strange at first sight, since reproducibility contradicts the intuitive notion of randomness, and indeed, real random number generators are extremely unlikely to ever repeat their output. So what are the advantages of a generator which will produce the same sequence of pseudorandom numbers when fed with the same parameters? Once again, we have to turn to the application of which the generator is a component. In the case of a stochastic simulation the benefit is twofold: a reproducible run makes debugging the simulation possible, and different strategies or configurations can be compared using the same stream of random numbers.
In some areas, for example stream ciphers, reproducibility is a key requirement of the application. Only very few applications, most of them in the area of cryptography, actually benefit from the use of non-reproducible PRN.
It is clear that when we want to simulate a random variable with a PRNG, the output of the generator should model as closely as possible the expected behaviour of instances of the random variable. If a simulation of a die generates a 7 or strongly favors the 6, we will not accept the generator. Other deviations from the desired behaviour, e.g. correlations, are harder to detect, and methods for systematically testing generators for such deficiencies have been the subject of considerable mathematical work [50, 28, 54, 85], including some parts of this thesis.
As we will see later, it is not possible to prove that a generator really has all the statistical properties a real random number generator is supposed to have. So all we can do is establish faith in the generator by testing it for some properties.
Empirical testing usually involves using the PRN for a stochastic simulation with a known result. If the computed results contradict the expected ones, the generator will be dismissed as not suitable for that kind of stochastic simulation. A passed test will increase the faith that this generator will yield correct results in real world problems. We will examine the significance of empirical test results later in greater detail.
A large battery of such tests was developed over the years, from the well known tests of Knuth [50] and Marsaglia [66] to recent additions like the weighted spectral test [40, 41, 46, 44]. See [54, §3.5.] for further references on testing pseudorandom number generators.
In order to make analytical investigations possible, most modern PRNG are defined in quite simple mathematical terms. It is a tradeoff: The simpler the algorithm, the easier it will be to prove statements concerning the quality of the generated numbers. On the other hand, a convoluted algorithm appeals to the intuition. History has shown [50, 74] that quite a few people could not resist the temptation to build generators based on doing obscure transformations on numbers stored in computers. Empirical analysis has shown that the quality of such generators is often abysmal.
Doing empirical studies on the properties of a PRNG is always possible, but deriving properties of the generator output by pure mathematical study has a lot of advantages. Whereas an empirical test can only cover one specific set of parameters of a generator, it is sometimes possible to make analytically proven statements on the properties of PRN generated by a certain generator regardless of the parameters used. In the same vein, an empirical test on a specific part of the generator’s output, say the first billion numbers, may give us confidence in the behaviour of the next billion numbers, but cannot offer any guarantee that they will be equally good. Analytical results fall into the following categories:
For most generators not all possible parameters will result in a functional generator. A typical question is that of gaining the largest possible period length. For the LCG this is just a set of simple conditions, for the ICG it involves finding IMP polynomials [6, 30, 45, 42, 43, 25]. For compound generators to work, it is also necessary to obey certain analytically derived constraints.
For some generators it is possible to derive statements on some aspects of the output. The well-known fact that tuples of LCG generated numbers form lattices (see the discussion of the lattice structure below) is one example.
The discrepancy in particular has been the subject of analytical study; there are numerous estimates and bounds for various generators.
With all the mathematical discussions about the merits of PRN generated by a new algorithm one should not forget the fact that we need to actually implement this algorithm on a real computer. There are a few things which should be noted here:
Implementing (i.e. programming) an algorithm is usually a one-time investment of effort. Once the code is there, integrating it into a larger project is more or less trivial. What are the difficulties in implementing a typical pseudorandom number generation algorithm? As we will see later in Chapter 5, the main problem lies in the handling of large integers and performing standard mathematical operations like addition and multiplication on them. For inversive generators, finding the multiplicative inverse in $\mathbb{Z}_p$ is a required operation, too.
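As an illustration of the latter operation, the following C sketch computes the multiplicative inverse modulo a prime via the extended Euclidean algorithm. The function name and integer types are our own illustrative choices; the actual implementation issues are the subject of Chapter 5.

\begin{verbatim}
/* Sketch: multiplicative inverse modulo a prime p, computed with the
   extended Euclidean algorithm. Assumes 0 < a < p; the convention
   inverse(0) = 0 used by inversive generators must be handled by the
   caller. */
long mod_inverse(long a, long p)
{
    long r0 = a, r1 = p;                 /* remainder sequence          */
    long s0 = 1, s1 = 0;                 /* invariant: s*a == r (mod p) */

    while (r1 != 0) {
        long q = r0 / r1, t;
        t = r0 - q * r1;  r0 = r1;  r1 = t;
        t = s0 - q * s1;  s0 = s1;  s1 = t;
    }
    /* now r0 == gcd(a, p) == 1, hence s0 * a == 1 (mod p) */
    return (s0 % p + p) % p;             /* normalize to 0 .. p-1 */
}
\end{verbatim}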
The execution of any algorithm requires both CPU and memory resources. Typical PRNG (EICG, LCG, ICG, …) have only very small memory requirements. The code is very compact and the state information requires only a few bytes.
As far as CPU consumption is concerned, a PC with an Intel 486DX2-66 processor is capable of executing the EICG algorithm about 70,000 times per second. The author’s implementation of the LCG runs at about 400,000 PRN per second. The highly optimized system pseudorandom number generator runs at over 700,000 calls per second. These numbers are only provided to give a rough feeling for the speed of the algorithms when using a modulus in the range of $2^{31}$.
As the generation of the PRN is usually performed on demand on the same computer as the stochastic simulation for which they are used, they compete for the same resources. The total running time for the simulation can be described as the sum of the time used for the PRN generation plus the time used for doing the actual calculations. The latter often dominates the former, thus it does not make sense to try to gain overall speed by sacrificing quality in the PRN algorithm.
After implementing the algorithm one has to find good parameters for that generator, too. Fortunately, for some common PRNG tables containing suitable parameters have been published [29, 43, 2, 83], so there is no need to reinvent the wheel there. Other generators like the EICG are known to be rather insensitive to the choice of the parameters.
Another aspect is the possibility to generate independent streams of pseudorandom numbers. Such streams are needed for parallel or vectorized computing. See [55, §8], [1], [12], and [36] for more information on this topic.
There is no shortage of proposed pseudorandom number generation algorithms. Every year new ideas on this topic are published, but only if the resulting PRN have been subjected to intensive theoretical and empirical study does the generator have a chance of being used in real world problems. As is often the case with competing inventions, objective technological superiority does not immediately lead to market domination. Whether the generator is included in standard programming libraries seems to be much more important than any published results on the distribution properties of the numbers. A classic example is the now infamous RANDU generator, which was included in IBM’s Fortran library and features an extremely poor distribution of triples composed of subsequent numbers.
The following list introduces some of the most commonly used generators as well as the inversive generators on which we will focus in this thesis. More complete surveys on the current menagerie of PRNG can be found in [70, 72, 55].
In the following, $m$ denotes a positive integer (termed modulus) and $\mathbb{Z}_m = \{0, 1, \ldots, m-1\}$ represents the system of all residues modulo $m$. With addition and multiplication modulo $m$ the set $\mathbb{Z}_m$ acquires the algebraic structure of a finite ring. If the context makes it clear that we operate in the ring $\mathbb{Z}_m$ we will omit the trailing “mod $m$”.
Definition 1.1 Let $m \ge 2$ and $a, b, y_0 \in \mathbb{Z}_m$. The linear congruential generator (abbreviated as “LCG”) with parameters $m$, $a$, $b$, and $y_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_m$ by
$$y_{n+1} :\equiv a\,y_n + b \pmod{m} \qquad (n \ge 0)$$
and a sequence $(x_n)_{n \ge 0}$ of pseudorandom numbers in $[0,1)$ by
$$x_n := \frac{y_n}{m} \qquad (n \ge 0).$$
As the sequence $(y_n)$ is defined by a recursion of order one on a finite set, it must be periodic. The longest possible period length is $m$ in the case of $b \neq 0$ and $m-1$ in the case of $b = 0$. The necessary conditions for achieving these period lengths are well known [70, p. 169].
The LCG is very popular. Its implementation is quite simple, especially if $m$ is chosen as 2 to the power of the number of bits per native word of the computer (e.g. $m = 2^{32}$), which reduces the modulo operations to just ignoring the overflow. Due to its simplicity and popularity, the LCG has been subjected to intensive analytical and empirical examination. The quality of the resulting PRN depends very much on the choice of the parameters $a$ and $b$. Fortunately, tables containing good parameters have been published, see [29, 27, 53].
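A minimal sketch of such an LCG in C, assuming a 32-bit word: the reduction modulo $m$ happens implicitly through unsigned overflow. The multiplier/increment pair below is a widely published choice for $m = 2^{32}$ and is used here purely for illustration; serious work should rely on the parameter tables cited above.

\begin{verbatim}
#include <stdint.h>

/* Sketch of an LCG with m = 2^32: the modulo operation is implicit in
   unsigned 32-bit overflow. a = 1664525, b = 1013904223 is a published
   pair for this modulus, used here only as an example. */
static uint32_t lcg_state = 1u;          /* the seed y_0 */

double lcg_next(void)
{
    lcg_state = 1664525u * lcg_state + 1013904223u; /* y' = a*y + b mod 2^32 */
    return lcg_state / 4294967296.0;                /* x_n = y_n / m */
}
\end{verbatim}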
The output of a LCG shows a strong intrinsic structure ([65], see also the discussion of the lattice structure below). A number of modifications were proposed to improve the quality of the generator. One approach is to extend the recursion to higher orders by making $y_{n+1}$ a function of several preceding values. Other proposals modify the function which describes the recursion. As the name says, the LCG uses the linear function $y \mapsto ay + b$ to calculate $y_{n+1}$ from $y_n$. If we replace it by an arbitrary function, we refer to the resulting PRNG as a general first-order congruential generator [70, p. 177]. In order to guarantee maximal period length, the function must be carefully selected. For example, the quadratic congruential method, as proposed by Knuth in [50, §3.2.2], uses a polynomial of degree 2 as the recursion and a power of 2 as the modulus. See [70, p. 181f] for the conditions on the parameters and analytical investigations of the resulting PRN.
Shift-register generators differ from standard linear congruential generators in two respects. First, they use a higher-order linear recursion of the form
$$y_{n+k} :\equiv a_{k-1}\,y_{n+k-1} + \cdots + a_0\,y_n \pmod{m} \qquad (n \ge 0), \tag{1.1}$$
where $m$ is the modulus, $k$ is the order of the recursion, and $a_0, \ldots, a_{k-1}$ are elements of $\mathbb{Z}_m$. Second, instead of just scaling the $y_n$ to the unit interval to get the pseudorandom numbers, the $x_n$ are calculated from a block of consecutive values $y_n$. Thus it is no longer necessary to use a large modulus to get a decent resolution of the resulting PRN. In order to simplify and optimize the implementation of the recursion, the common choice for $m$ is the prime 2. On a $w$-bit computer this allows the grouping of $w$ steps into one operation.
Two techniques for the transformation of the sequence $(y_n)$ into a sequence of pseudorandom numbers in $[0,1)$ are commonly used: The digital multistep method puts
$$x_n := \sum_{j=1}^{k} y_{nk+j-1}\, 2^{-j} \qquad (n \ge 0). \tag{1.2}$$
The Tausworthe generator [81] is a special case of this method.
More popular is the generalized feedback shift-register method (GFSR), which can take advantage of the above-mentioned blocking of bits if the offsets $h_1, \ldots, h_s$ are selected suitably:
$$x_n := \sum_{j=1}^{s} y_{n+h_j}\, 2^{-j} \qquad (n \ge 0). \tag{1.3}$$
If the parameters are carefully selected, the period length will in both cases be $2^k - 1$.
Shift register pseudorandom numbers have the advantage of a fast generation algorithm and a period length independent of the limitations of the integers used for the calculation. See [70, Chapter 9] and [55] for a discussion on the properties of shift register pseudorandom numbers.
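To illustrate the word-level blocking, here is a sketch of a GFSR step in C, assuming the primitive trinomial $x^{250} + x^{103} + 1$ (the classic R250 generator of Kirkpatrick and Stoll). Proper seeding of the state (250 words with linearly independent bit columns) is essential and omitted here.

\begin{verbatim}
#include <stdint.h>

/* Sketch of a GFSR step: XORing whole 32-bit words performs 32 copies
   of the bit recursion y_n = y_{n-103} + y_{n-250} (mod 2) in a single
   operation. Seeding of ring[] is omitted. */
static uint32_t ring[250];               /* state: y_{n-250} .. y_{n-1} */
static int pos = 0;                      /* index of the oldest word    */

uint32_t gfsr_next(void)
{
    uint32_t y = ring[pos] ^ ring[(pos + 147) % 250]; /* 147 = 250-103 */
    ring[pos] = y;
    pos = (pos + 1) % 250;
    return y;            /* divide by 2^32 to obtain x_n in [0,1) */
}
\end{verbatim}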
A promising modification of the LCG was proposed by Eichenauer and Lehn in [14]. We will only consider the case of a prime modulus $p$ here. It involves the operation of modular inversion in $\mathbb{Z}_p$, which we will denote by an overline ($\overline{c}$):
$$\overline{c} \cdot c \equiv 1 \pmod{p} \quad (c \neq 0), \qquad \overline{0} := 0. \tag{1.4}$$
The restriction to prime moduli guarantees the unique existence of an inverse element in $\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{0\}$. This definition implies $\overline{\overline{c}} = c$ for all $c \in \mathbb{Z}_p$.
Definition 1.2 Let $p$ be a (large) prime and $a, b, y_0 \in \mathbb{Z}_p$. The inversive congruential generator (abbreviated as “ICG”) with parameters $p$, $a$, $b$, and $y_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_p$ by
$$y_{n+1} :\equiv a\,\overline{y_n} + b \pmod{p} \qquad (n \ge 0)$$
and a sequence $(x_n)_{n \ge 0}$ of pseudorandom numbers in $[0,1)$ by
$$x_n := \frac{y_n}{p} \qquad (n \ge 0).$$
Empirical as well as analytical investigations indicate that the output of an ICG is superior to the output of a LCG in several respects: longer usable sample sizes [85, 61] and fewer correlations between consecutive numbers [72].
Analytical calculations have led to the following observation: We can describe the generator as a function $g$ mapping $y_n$ to $y_{n+1}$. This self-map of the finite field $\mathbb{Z}_p$ can be written as a uniquely defined polynomial with degree less than $p$. If we demand the sequence to have the maximal possible period length $p$, the polynomial maps $\mathbb{Z}_p$ onto itself and thus must be a permutation polynomial, which is either linear or satisfies $\deg(g) \le p-2$ according to [64, Cor. 7.5]. It turns out that the degree plays an important role in the analytical examination of the generator, in the sense that a higher degree seems to indicate better distribution properties [70, Theorems 8.2, 8.3]. The theorem of Euler-Fermat tells us that evaluating $c^{p-2}$ corresponds to the calculation of the multiplicative inverse $\overline{c}$. In this spirit, the definition of the EICG seems quite natural:
Definition 1.3 (Eichenauer-Herrmann [15]) Let $p$ be a (large) prime and $a \in \mathbb{Z}_p^*$, $b, n_0 \in \mathbb{Z}_p$. The explicit inversive congruential generator (abbreviated as “EICG”) with parameters $p$, $a$, $b$, and $n_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_p$ by
$$y_n :\equiv \overline{a(n_0 + n) + b} \pmod{p} \qquad (n \ge 0)$$
and a sequence $(x_n)_{n \ge 0}$ of pseudorandom numbers in $[0,1)$ by
$$x_n := \frac{y_n}{p} \qquad (n \ge 0).$$
As long as $a \neq 0$, this generator will always have period length $p$. Once again, analytical and empirical investigations have shown that the output of this generator is superior to that of an LCG. This is the generator on which we will focus our attention in this thesis; the other generators mainly serve as a reference against which the EICG must compete.
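A minimal sketch of one EICG step in C follows. The Mersenne prime $p = 2^{31}-1$ is an illustrative modulus for which all intermediate products fit into 64 bits; mod_inverse refers to the extended-Euclid sketch given earlier, and the convention $\overline{0} = 0$ is handled explicitly.

\begin{verbatim}
#include <stdint.h>

/* Sketch of one EICG step: y_n = inverse(a(n0 + n) + b) mod p,
   x_n = y_n / p. Assumes 0 <= a, b, n0, n < P. */
#define P 2147483647L                    /* p = 2^31 - 1 */

extern long mod_inverse(long a, long p);

double eicg_nth(int64_t a, int64_t b, int64_t n0, int64_t n)
{
    int64_t t = (a % P) * ((n0 + n) % P) % P;   /* a * (n0 + n) mod p */
    t = (t + b) % P;                            /* ... + b mod p      */
    int64_t y = (t == 0) ? 0 : mod_inverse((long) t, (long) P);
    return (double) y / (double) P;             /* x_n = y_n / p      */
}
\end{verbatim}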
Two variations of the basic explicit inversive congruential generator have been proposed. Both proposals substitute the prime modulus $p$ with a power of 2, $m = 2^\omega$. In the set $\mathbb{Z}_m$ we can define the modular inversion only for odd integers. This inversion is once again defined by $\overline{c}\,c \equiv 1 \pmod{m}$ for all odd $c$.
Definition 1.4 (Eichenauer-Herrmann and Ickstadt [22]) Let $m = 2^\omega \ge 16$ be a power of 2, and $a, b, n_0 \in \mathbb{Z}_m$ with $a \equiv 2 \pmod 4$ and $b \equiv 1 \pmod 2$. The explicit inversive congruential generator with power of two modulus with parameters $m$, $a$, $b$, and $n_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_m$ by
$$y_n :\equiv \overline{a(n_0 + n) + b} \pmod{m} \qquad (n \ge 0)$$
and a sequence $(x_n)_{n \ge 0}$ of pseudorandom numbers in $[0,1)$ by
$$x_n := \frac{y_n}{m} \qquad (n \ge 0).$$
The conditions on $a$ and $b$ guarantee that the sequence is purely periodic with period $m/2$. While powers of 2 as modulus have certain advantages for the implementation of the generator, all theoretical investigations [22, 16] of the quality of the resulting numbers have concluded that this generator is inferior to the original EICG.
In order to achieve a period length of $m$, Eichenauer-Herrmann [17] proposed the following generator:
Definition 1.5 Let $m = 2^\omega \ge 16$ be a power of 2, and $a, b, n_0 \in \mathbb{Z}_m$ with $a \equiv 2 \pmod 4$ and $b \equiv 1 \pmod 2$. The modified explicit inversive congruential generator with parameters $m$, $a$, $b$, and $n_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_m$ by
$$y_n :\equiv (n_0 + n)\,\overline{a(n_0 + n) + b} \pmod{m} \qquad (n \ge 0)$$
and a sequence $(x_n)_{n \ge 0}$ of pseudorandom numbers in $[0,1)$ by
$$x_n := \frac{y_n}{m} \qquad (n \ge 0).$$
Although this modification does indeed increase the period length to $m$, the theoretically derived properties of the resulting numbers are still inferior to those of the original EICG.
An interesting meta-generator is the compound method. This is a very simple and effective way to combine several streams of PRN into one single sequence with (hopefully) superior properties. It works as follows: For $1 \le j \le r$, let $(x_n^{(j)})_{n \ge 0}$ be a purely periodic sequence of pseudorandom numbers. Then we get the compound sequence $(x_n)_{n \ge 0}$ by
$$x_n :\equiv x_n^{(1)} + x_n^{(2)} + \cdots + x_n^{(r)} \pmod 1 \qquad (n \ge 0).$$
If the subsequences are purely periodic with distinct prime periods $p_j$, then we have $\mathrm{per}((x_n)) = p_1 p_2 \cdots p_r$.
This compound method extends the well-known approach of Wichmann and Hill [88]. The properties of the resulting sequence have been the subject of a number of publications; we refer to Niederreiter [72, 4.2] for the references. Generally speaking, the compound method preserves the basic properties of the underlying generators.
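A minimal sketch of the compound step in C, assuming two arbitrary component generators; next1 and next2 are hypothetical stand-ins for any two purely periodic generators, e.g. two EICGs with distinct prime moduli.

\begin{verbatim}
#include <math.h>

/* Sketch of the compound method: two component streams combined by
   addition modulo 1. */
extern double next1(void);
extern double next2(void);

double compound_next(void)
{
    double x = next1() + next2();
    return x - floor(x);                 /* x_n^(1) + x_n^(2) mod 1 */
}
\end{verbatim}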
Examining what we mean by random numbers will help us to understand the difficulties in generating pseudo-random numbers and interpreting test results. We will look at how we all intuitively deal with supposedly random sequences, and touch upon the mathematical treatment of the subject. Regrettably, we will not be able to comprehensively cover this topic, thus we will focus on the subject of testing (finite) sequences of PRN.
First of all, we want to take a closer look at the intuitive notion of randomness. We all intuitively assign probabilities to various events we encounter, from such mundane things as which side a dropped slice of bread will land on, every-day events like rainfall, the number of red traffic lights encountered, or friends met in the bus, to explicitly random events like the outcome of a die roll or the weekly lottery.
But how do we come to the conclusion that one of these events is somehow random? What are the criteria for that decision? In some of the examples above the decision is easy, as we know about the process which leads to the outcome. Watching the die being cast properly is a sure way to convince oneself that the outcome is indeed truly random. But how do we proceed when we cannot look behind the scenes, when the sequence of outcomes is the only information we have got?
The human mind has remarkable capabilities to spot regularities in a sequence of events. If it fails to notice anything suspicious it will declare the sequence to be random.
Let’s test this notion on the most widely used source of random numbers, the die. A die is supposed to select one of the numbers 1 to 6 in a fair and independent fashion each time it is cast. In the following list we will argue on the merits of a few possible outcomes.
Did you see the one big fault in this sequence of would-be random sequences? We did not notice it because we looked only at single sequences. Can you find it now?
Let us summarize the arguments:
If we argue about the “randomness” of a given sequence, we try to find reasons for rejecting it as random. There seems to be no way of asserting that a sequence is random; it is only the absence of arguments to the contrary that will lead to confidence in the sequence. The proper formulation in the language of statistics is the following: The null hypothesis is always that the sequence was indeed generated by a random process with well known statistical properties. As we will see later, it is not possible to reverse the problem and regard non-randomness as the null hypothesis.
Short sequences are likely to contain some sort of perceived regularity, thus it is hard to reject such a sequence based on a suspicious pattern. If the sequence is long enough to check if the pattern continues to appear in it, one can try to determine if the pattern is part of some systematic fault or just coincidence.
Just when we thought we had found a sequence which does not exhibit the patterns we found in all the previous faulty ones, it turns out that there is a different kind of regularity in it. Somehow this is just like the trick question about the first natural number without any special properties: if such a number existed, this very fact would make it special, thus there can be no such number. We get almost the same feeling when we examine sequences for their non-conspicuousness. As there are so many ways a sequence can exhibit a pattern, a complete absence of patterns is just as conspicuous as any weak regularity.
Furthermore, it is worth pondering if there are not so many patterns that all sequences will exhibit one. We will take a closer mathematical look at this question later.
If the “random” sequence exhibits exactly the expected distribution, this will cause suspicion, too. A random sequence is supposed to deviate from its expected distribution. The common measure for this deviation is the variance. A sequence with a perfect distribution will fail to have the variance a random sequence is supposed to have.
As the variance can be viewed as just another test statistic, it, too, should vary in a certain way. From that point of view, a constant and perfect variance is just as suspicious as a constant and perfect distribution. This reasoning leads to the demand that not only the distribution of the numbers should be as wanted, but also that the empirical higher moments should be close to the values predicted by probability theory.
Now that we have examined what we intuitively mean by saying “This sequence looks random”, we can try to formalize this notion and develop a set of properties to check when we have to judge a sequence and its generating algorithm. The goal of this formalisation is to be able to delegate the testing to computer programs. As computers are known to be very bad at spotting patterns, it will not be an easy undertaking to find an algorithm which does as well as the human mind. We can only hope that all systematic faults in the sequence will eventually cause suspicious behaviour of the sequence in a generic test.
In the following we abandon the die as the example, and turn to uniformly distributed numbers in the interval $[0,1)$.
The first step in testing a sequence is usually to test its distribution characteristics. That is, are the numbers equally spread over $[0,1)$?
In order to test the (empirical) distribution, one partitions the interval $[0,1)$ into sets $A_i$ and compares the number of hits in each set to the size (measure) of that set.
In the discrete case this can be done by simply counting how often each possible value appears in the sequence. If the counts differ significantly, the distribution property of the sequence is inadequate.
In order to keep the problem manageable in the case of a huge number of possible outcomes, and in the continuous case, the bins (i.e. the $A_i$) used for counting will cover more than one outcome.
The layout of the partition is a crucial part of the test: If the $A_i$ are simple intervals, the test will measure the overall distribution of the sequence. But the $A_i$ could be unions of small intervals, in which case the test targets irregularities in the fine structure of the sequence.
Once we have finished the counting process, we need some mathematically justified criteria for interpreting the difference between the number of hits in each $A_i$ and the expected count. There are a number of possible algorithms for this, the most popular of which are the $\chi^2$-test and the Kolmogorov-Smirnov test (often abbreviated as KS-test). The former uses a test statistic based on the difference between expected and actual count in each bin, whereas the latter compares the empirical distribution function to the expected one.
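A sketch of the $\chi^2$ statistic for $n$ equal-width bins follows, assuming the sample is already scaled to $[0,1)$; the comparison against the critical value of the $\chi^2$ distribution with $n-1$ degrees of freedom is left to a table lookup.

\begin{verbatim}
#include <stdlib.h>

/* Sketch: chi-square statistic for the equidistribution of N numbers
   x[0..N-1] in [0,1), using n equal-width bins. */
double chi_square(const double *x, long N, int n)
{
    long *count = (long *) calloc(n, sizeof(long));
    double expected = (double) N / n;
    double stat = 0.0;
    long k;
    int i;

    for (k = 0; k < N; k++)
        count[(int)(x[k] * n)]++;        /* x[k] in [0,1) -> bin 0..n-1 */

    for (i = 0; i < n; i++) {
        double d = count[i] - expected;
        stat += d * d / expected;
    }
    free(count);
    return stat;
}
\end{verbatim}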
It should be clear that the numbers in a deterministically generated, and thus reproducible, sequence are trivially correlated. Therefore it makes no sense to look for such intricate dependencies as the generation rule in the sequence. We will restrict our search to much simpler correlations, which makes additional sense because these are the only kind of correlations we can hope to find with the limited capabilities of a computer program. There are two approaches to this:
Tests for special correlations check if the sequence exhibits a given kind of regularity. An often used example is the run-test, which measures the frequency of ascending or descending subsequences. The distribution of these runs in random sequences is known, making it possible to judge the sequence with respect to this type of correlation.
The serial test is a more general way of examining a sequence. It transforms the problem of testing for correlation into the problem of testing for equidistribution by looking at tuples composed of elements from the sequence. The size $s$ of each tuple is called the dimension of the test. Common tests use either overlapping tuples defined as $\mathbf{x}_n = (x_n, x_{n+1}, \ldots, x_{n+s-1})$, or non-overlapping tuples defined as $\mathbf{x}_n = (x_{ns}, x_{ns+1}, \ldots, x_{ns+s-1})$. If there are no correlations in the original sequence, the $s$-tuples are equidistributed in the unit cube of dimension $s$, which can be checked using the techniques outlined above.
To illustrate this, let us examine a sequence in which large and small numbers alternate with the serial test of dimension 2. All overlapping 2-tuples then consist of one small and one large coordinate, so the points cluster in two regions of the unit square: the alternation causes a significant deviation from the equidistribution of the points.
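A sketch of the two-dimensional serial test with overlapping tuples; the grid size and array handling are illustrative choices. For the alternating sequence just described, only cells with one small and one large coordinate ever receive hits, which a $\chi^2$ statistic over the $k \cdot k$ counts would flag immediately.

\begin{verbatim}
/* Sketch: serial test of dimension 2. Each overlapping pair
   (x[n], x[n+1]) is counted in a k-by-k grid of cells; count must
   point to k*k zero-initialized entries. */
void serial_test_2d(const double *x, long N, int k, long *count)
{
    long n;

    for (n = 0; n + 1 < N; n++) {
        int i = (int)(x[n] * k);         /* column from x_n   */
        int j = (int)(x[n + 1] * k);     /* row from x_{n+1}  */
        count[i * k + j]++;
    }
}
\end{verbatim}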
For a number of generators it is possible to derive analytical bounds for the deviation from the equidistribution of $s$-dimensional tuples, as measured by the discrepancy.
If one does not restrict oneself to forming tuples out of consecutive numbers, the resulting test will be able to find more subtle kinds of correlations without resorting to high dimensions $s$. While this modification hardly changes the empirical testing, analytical bounds for this generalized serial test have been derived only in the case of the EICG.
Now that we have clarified the intuitive understanding of the concept of testing pseudorandom sequences, we will turn to the mathematical treatment of the subject. Rather than providing a full scale discussion of the mathematical objects and formalisms involved, which would exceed the scope of this thesis, we want to present an introduction targeted at the mathematical layman. Our aim in this section is to introduce as much of the relevant concepts as is necessary to explain the problems one faces when testing pseudorandom numbers and comparing PRNG. We refer to [85] for an in-depth discussion.
There is more than one mathematical approach to this topic. The following list tries to introduce the different viewpoints and gives references for further reading.
In our context, this branch of mathematics focuses on the equidistribution of a sequence of numbers.
Various measures for the quality of the equidistribution were developed over the years; of all these, the discrepancy is the most common.
Theorems on equidistribution usually deal with infinite sequences, thus they are not particularly useful in conjunction with finite (or periodic) sequences. For example, equidistribution of a sequence can be defined in terms of the discrepancy in the following way: a sequence is equidistributed if and only if its discrepancy $D_N$ tends to 0 as $N \to \infty$.
This approach targets the complexity and information content of the sequence in question. One of the possible measurements is the minimal size of a computer program (or a Turing machine) which can reproduce the sequence. In the optimal case, the program code will have to explicitly contain the sequence in order to print it. Any possible shortcuts the program can take (like exploiting dependencies) are a measure for the lack of randomness of the sequence.
Since all our sequences are generated by short programs, they a priori fail this test. Thus we will not consider this notion in our tests.
A similar approach is to focus on the amount of information contained in the sequence. If the entropy is high enough, we will accept the sequence as a good approximation of random numbers. Another way to express this notion is to state that the sequence is not compressible.
Testing whether a sequence is compressible is not easy since all common implementations cannot achieve the theoretically possible compression. Only really bad PRN can be eliminated with programs like gzip or compress. Extending the capabilities of these programs (for example enlarging the range of the pattern search in gzip) might be a way to get a workable test. As far as we know, nobody has tried this yet.
For longer sequences, the distinction between these ideas starts to blur, as the size of the information needed to transform one representation into the other becomes irrelevant.
A sequence of PRN can be used to construct a stream cipher. If “true random” numbers are used, this cipher is called the one-time pad and is provably secure. So it is natural to ask what properties the PRN must have to achieve a good level of security.
According to Rueppel [76] there are several approaches to the construction of a secure stream cipher: The information-theoretic approach considers whether it is possible in principle to derive the seed (i.e. the key) from an observation of the PRN; the system-theoretic approach tries to design the generator in such a way that all known attacks fail; the complexity-theoretic approach tries to make sure that breaking the cipher is at least as hard as solving known “hard” problems like factoring or the discrete logarithm; randomized stream ciphers increase the magnitude of the code-breaker’s problem by utilizing a public pool of random numbers.
The basic idea of statistical testing can be summarized as follows: From a sample of supposedly random numbers, a function called the test statistic is calculated. As the distribution of this function is known for the case of real random numbers (otherwise the test does not make sense), one can determine which kinds of results are extremely unlikely to occur. Typically these are formulated as intervals in the domain of the test statistic. These intervals (usually called the critical region) are selected in such a way that the probability that real random numbers lead to a test statistic inside them is smaller than the level of significance $\alpha$ (usually 0.05, 0.01 or 0.005). If the test statistic for a sample of PRN falls into the critical region, the common inference is to reject the sample.
All common tests rely on the idea of statistical testing. In the following we will try to elaborate on the motivation behind these tests, their mathematical foundation, their power and limitations, and how to interpret their results.
First of all, let us take a closer look at what we want to simulate. Our target is sequences of random numbers, which are realisations of a sequence of independent, uniformly distributed random variables.
Random variables (RVs) are one of the main building blocks of probability theory. They are used to assign to each possible outcome (or, to be more exact, each reasonable set of outcomes) of an experiment a real number which is interpreted as the probability of this outcome.
But strictly speaking, the mathematical concept of RVs does not explicitly reflect our intuitive ideas about the randomness of events; on the contrary: RVs are just simple, ordinary functions. One is tempted to ascribe mythical powers to RVs, like the ability to randomly select one of a set of possible events. This is not true; they only describe certain aspects of an idealized system which flips the metaphorical coin.
So where is the link between the mathematical world of RVs and the real-life world of roulette tables? Unfortunately, there is none for single events. Even if a RV does in fact model a real world event, hardly any conclusions can be made about the outcome of the next single event. Even such unlikely events as winning the jackpot in a lottery do happen every now and then, and most people are not deterred by the extremely bad odds from playing every week. On the other hand, some people are scared of travelling by plane because the probability of a safe flight is marginally less than one. In both cases our experience tells us that the probability alone cannot predict the next outcome.
But even such definite sounding statements as “this event will occur with probability 1” cannot guarantee the outcome of an event. More insight into measure theory will tell us why such strange things can happen. For example, the probability that the next realisation of a $U([0,1))$-distributed random variable will be a rational number is zero. This does not stop the real world from delivering one of the infinitely many rational numbers, thus rendering the statement “This experiment will only return irrational numbers” incorrect.
We have seen that a RV cannot make concrete statements about a single outcome, so we might ask what statements about outcomes it can make at all. One way to formulate the meaning of probability is the following [85, p. 10]:
The probability assigned to an event expresses the expected average rate of occurrences of the event in an unlinked sequence of experiments.
We need to elaborate on two aspects of this definition as they are not as strict and unambiguous as commonly demanded from a good definition.
First, what do we mean by “expected”? That seems to indicate that probability cannot be an intrinsic property of an event. There is no mathematically satisfying way to assign a probability to an event based on a (finite) set of measurements, as it is extremely unlikely that another set of experiments will result in the same value. The common way out is to make assumptions about some parts of the experiment, like the Laplace assumption which assigns the same probability to all underlying events. These assumptions are based on a mental model of that event which includes a theory on how often something should occur. It is the mathematician, the physicist or just some observer who forms a mental model based on experiences or considerations. Such simplifying mental models of the real world are ubiquitous, as they provide an essential simplification in the way we view the world. Other such simplifications include the concepts of rigid bodies, fluids, or gases, which are abstractions of “a bunch of molecules tied together by various forces”. Just as the laws of leverage rely in their formulation on the concepts of forces and of rigid bodies, the laws of chance depend on the concept of probability assigned to events.
The other critical word in the above definition is “unlinked”. By unlinked we mean that the outcome of one experiment does not influence the outcome of any subsequent experiment. Common examples of unlinked experiments include drawing balls from an urn (with putting them back in!), casting a die, or the roulette wheel. Please note that in all these examples there is a connection between two successive experiments, as the first one does influence the second. It is a conscious decision by the observer that the re-shuffling of the balls in the urn caused by the first experiment does not affect the probability in the second one. This sounds almost like a paradox, as the re-shuffling surely does affect the outcome. But remember, just above we noted that the probability of an event does not determine the next outcome at all, so there is no contradiction here.
We have to be careful with sequences of PRN and their relation to independent random variables, too. The concept of independence is based on the concept of distributions. As we cannot ascribe distributions to numbers, we cannot use the term “independence” for sequences of PRN. We will use the word correlations to refer to any unwanted relationship between elements in the sequence.
As described above, the theory of random variables and probability tries to model aspects of the physical world. The fundamental principles of science demand justification in the form of experiments for all such theories. For typical physical models, such experiments are usually easy to set up and follow the same scheme of comparing an expected (calculated) result to the measurements of the actual physical event. If they differ by more than the inaccuracies in the measurements would allow, the theory is proven wrong. Philosophy of science tells us that it is impossible to positively prove a theory.
Do the same principles hold for conjectures in the field of probability, too? Unfortunately, they do not. Let us illustrate this with an example:
As a theory to test, we might take the assumption that a given coin is fair, meaning that the probability that it lands with the heads side up is $1/2$. What might an experiment designed to test this hypothesis look like? Surely it will involve throwing the coin a number of times and then comparing the result to the prediction. Calculating the prediction based on the theory is simple; unfortunately, the prediction assigns each possible outcome a positive probability. Thus, regardless of the behaviour of the coin, the result is consistent with the theory, as we cannot rule out the measured result. If we have no way to reject a theory, we have to find a different set of criteria according to which we can justify theories.
The common way out is statistical testing. It should be clear that statistical testing can never be as strict as testing in other areas. It is a heuristic approach to the problem. As such, it relies on the good judgement of the tester and is not objective. But before we elaborate on the shortcomings of statistical testing let us summarize the basic procedure again, already using the test for randomness as the example.
The null hypothesis $H_0$ is that the sequence was generated by independent, uniformly distributed random variables; the alternative hypothesis $H_1$ states that $H_0$ is not true.
This is the basic outline of all common empirical tests. We will discuss a few tests and their results later in this thesis. So what are the weak spots in this method of testing pseudorandom numbers?
First of all, when testing sequences of numbers generated by a PRNG basic premises of statistical testing are violated.
The theory behind statistical testing assumes that we actually deal with random events. It is thus a circular argument to conclude from the result of such a test that the numbers in question are “random”. Only their statistical properties are subject to the test, not the basic premise that the concept of random variables is a valid model for that experiment.
It is therefore not correct to speak of “statistical testing” with respect to PRN testing. A more appropriate term is “numerical testing” as the test examines only a numerical property of a fixed set of numbers.
The only “statistical” part of the test is the calculation of which numerical values for the test statistic are considered to be good and which are considered to be bad.
The test statistic determines which aspect of the numbers we want to test. Dividing $[0,1)$ into the intervals $A_i = [\frac{i}{n}, \frac{i+1}{n})$ for $0 \le i < n$, counting the hits in each interval, and then calculating the $\chi^2$ statistic is a straightforward test statistic which targets the overall equidistribution of the pseudorandom numbers in $[0,1)$. The choice of the bins (in this case intervals) seems to be a natural one.
But what bins should we use to measure the finer aspects of equidistribution? We could just use a larger value for $n$ and keep the intervals, but that would cause a problem with the validity of the $\chi^2$ approximation as the number of hits per bin decreases. Another option is to use something like this: Define bin $A_i$ as $A_i = \bigcup_{j=0}^{k-1} \left[\frac{jn+i}{kn}, \frac{jn+i+1}{kn}\right)$ for suitable values of $n$ and $k$. Then the bins are no longer simple intervals, but sets of intervals spread over the unit interval. The values of $n$ and $k$ determine the width $\frac{1}{kn}$ of each component interval.
Is there a natural choice for $k$? We do not think so. But the choice can be important, as the result of the test depends on it. Consider for example the set of numbers $x_i = i/N$ ($0 \le i < N$), which is perfectly equidistributed in $[0,1)$. For certain relations between $N$, $n$, and $k$, the test will result in extremely bad statistics; as an extreme example, consider the case where all numbers fall into the same bin. For other values of $k$, this set of PRN will exhibit no weakness in this test.
We have seen that even simple changes to the test statistic, like modifying the width of the stripes, can completely change the result of the test. One can imagine that completely different layouts of bins will lead to a great variance in the test results, too. Thus we have to keep in mind that the choice of the test statistic, and thus to some extent the result of the test, is arbitrary.
One consequence of this fact is that we cannot declare one sequence of PRN to be better than a second one just because we found a test where the first one rates better, as a slightly modified test might produce exactly the opposite result.
What we strive for are numbers which behave in most respects like realisations of $U([0,1))$-distributed RVs, so it is a natural question how such ideal random numbers will perform in our statistical tests. The answer is quite simple: If we conduct a test at the significance level $\alpha$, then a sequence of random numbers will fail the test with probability $\alpha$. If we have set $\alpha = 0.01$, then we can expect a failure about once every hundred tries.
As PRN should model all aspects of real random numbers, they should fail statistical tests at about the same rate.
It is therefore not advisable to outright reject a sequence of PRN based on its failure in single tests.
Classic statistical tests examine whether the test statistic deviates from its expected value too much. If we are only interested in the expected outcome of a similar simulation problem, such one-level statistical tests are all we need in order to be confident about the accuracy of the simulation.
On the other hand, we might be interested in the distribution of the simulation’s outcome. For this goal, hitting the expected value is not enough; the variance of the result is now important, too. Thus we will demand the same behaviour from the test statistic.
Let us illustrate this principle with an example. We want to test the well-known strategy of doubling the ante in a game of roulette. It is supposed to guarantee winning the initial ante and works like this: If we win in the first round, we win twice the ante. If not, the ante is doubled for the next round. If this round is won, we get back four times the initial ante while we invested three times the initial ante, resulting in a net win of one ante. In case of bad luck we double the ante again, hoping for eight times the ante for an investment of seven. As we hope that we will finally win before our capital is drained, a net win seems to be certain.
In order to simulate this, we need random numbers to determine whether we win the current bet. The probabilities are $18/37$ for winning and $19/37$ for losing each round, respectively. It seems natural to use the length of runs as a test statistic to test our source of PRN for its fitness to simulate a real roulette table. The probability that the maximal run length in 500 tries is greater than 15 is smaller than all usual values for $\alpha$, so according to the corresponding statistical test we should reject all sequences in which such runs do occur.
When we now run the simulation with these prescreened sequences, we will never experience a loss as long as we have enough money for 15 steps of doubling the ante. Thus we should conclude that the strategy works. As we know, this is not true. So what went wrong with our simulation?
The statistical test considered it equally important whether the sequence in question was “well-behaved” or not, whereas the simulation assigned completely different weights to those cases. Thus the region that the test considered to be insignificant (probability smaller than $\alpha$) played a major role in the simulation.
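A sketch of one session of the doubling strategy in C, assuming a win probability of $18/37$ and rand() as a stand-in for the generator under examination; all names are our own illustrative choices.

\begin{verbatim}
#include <stdlib.h>

/* Sketch: one session of the doubling strategy. We bet on a simple
   chance (win probability 18/37) and double the stake after every
   loss; the session ends with the first win or after 15 lost rounds. */
long play_session(void)
{
    long stake = 1, spent = 0;
    int round;

    for (round = 0; round < 15; round++) {
        spent += stake;
        if (rand() / (RAND_MAX + 1.0) < 18.0 / 37.0)
            return 2 * stake - spent;    /* net win is always +1 */
        stake *= 2;
    }
    return -spent;                       /* 15 losses: -(2^15 - 1) */
}
\end{verbatim}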
There are some other cases of simulations where we are not so much interested in the average case, but in the extreme ones. Consider for example all those safety measures in power plants or other machinery where a rare sequence of occurrences might lead to catastrophic results. When simulating these security systems one must not a priori exclude unusual sequences.
Please note that the distinction between level-1 and level-2 tests (tests which test the distribution of the results of a level-1 test) is arbitrary. The test statistic of a level-2 test is just another function of the underlying set of PRN, too.
One might be tempted to use a set of statistical tests to once and for all declare one generator superior to another. Intuitively this makes sense, especially when comparing two generators of the same type. A common use of this heuristic is the selection of optimal parameters for the LCG based on its lattice structure [29, 73].
But is this judgement mathematically justified? Leeb explains in [59] that it is not, as all possible sequences of PRN of a given finite length pass exactly the same number of statistical tests.
In order to be able to use statistical tests as a criterion for the selection of a PRNG, the user has to declare which properties he considers important. With this knowledge it is possible to weight the tests and therefore select a suitable generator for this specific application.
Both statistical tests and simulations use a set of PRN to perform a more or less elaborate calculation.
In a statistical test, we draw a conclusion from the result back to the source: based on the difference between the expected and the calculated value, we judge the quality of the PRN.
A simulation operates the other way round: based on the (hopefully) known quality of the PRN, we trust that the simulation result is correct.
While generic statistical tests can be used for this reasoning, one can increase their value by designing the tests to closely model the simulation. Thus the tests can target exactly those properties in the PRN which will be significant in their application.
In order to conclude this chapter on the notion of randomness let us recapitulate what we know about testing a generator, and how we should proceed when we face the task of selecting a generator for a particular simulation problem.
Furthermore, analytical investigations can yield some insight into the overall structure of a generator’s output, which can be compared to the properties required in the simulation. The lattice structure of the LCG might actually be useful in Quasi-Monte Carlo integration, whereas it can be harmful in geometric problems, e.g. the nearest-pair test [14, 54].
In this chapter we will discuss analytically derived properties of the explicit inversive congruential generator (EICG). We will use results obtained for the LCG as a reference, as the LCG is the most commonly used generator. Let us start by repeating the definition of the EICG:
$$y_n :\equiv \overline{a(n_0 + n) + b} \pmod{p}, \qquad x_n := \frac{y_n}{p} \qquad (n \ge 0). \tag{3.1}$$
Please remember that we perform all calculations except the final scaling in the finite field $\mathbb{Z}_p$. $\mathbb{Z}_p^*$ will denote the non-zero elements of $\mathbb{Z}_p$, that is $\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{0\}$. The overline denotes the multiplicative inversion in $\mathbb{Z}_p$: $\overline{c}\,c \equiv 1 \pmod{p}$ for all non-zero elements $c$. With the special case $\overline{0} := 0$ added, $c \mapsto \overline{c}$ is a bijective function from $\mathbb{Z}_p$ onto $\mathbb{Z}_p$. Furthermore, we have $\overline{\overline{c}} = c$ and $\overline{c} \equiv c^{p-2} \pmod{p}$ for all $c \in \mathbb{Z}_p$. The latter identity is due to Fermat’s Little Theorem.
Note that from the explicit definition of the sequence we can easily derive a recursive description:
$$y_{n+1} :\equiv \overline{\,\overline{y_n} + a\,} \pmod{p} \qquad (n \ge 0).$$
In order to achieve the maximal period length $p$, the parameters $a$, $b$, and $n_0$ can be freely chosen from $\mathbb{Z}_p$, as long as $p$ is prime and $a \neq 0$. To see this, consider the function $n \mapsto \overline{a(n_0 + n) + b}$, which is composed of bijective functions in $\mathbb{Z}_p$ and thus is bijective, too. As in $\mathbb{Z}_p$ we have $n + p \equiv n$ for all $n$, the sequence $(y_n)$ is purely periodic with period length $p$.
We will only consider full period generators, that is, generators with $a \neq 0$.
We will write $\mathrm{EICG}(p, a, b, n_0)$ to denote the output of a particular instance of the EICG method. Unlike Leeb [59, p. 89], we mean the whole infinite (but periodic) sequence, and not just the first $p$ numbers. This way, no special treatment of wrap-arounds is needed when considering subsequences.
The choice of parameters is simple for an EICG, but not all choices will lead to completely different pseudorandom numbers. In this section we will examine the relations between EICGs with the same modulus $p$, but different parameters $a$, $b$, and $n_0$.
These results are helpful for the implementation, as one can eliminate an addition modulo $p$, as well as for the theoretical investigation, as they provide a very elegant way to describe sub-streams. We will elaborate on this idea, which is due to Niederreiter [71, p. 5], later on.
First of all, let us make a rather trivial observation on the role of the parameter $n_0$.
Observation 3.1 Let $(y_n)_{n \ge 0} = \mathrm{EICG}(p, a, b, n_0)$ and $(y'_n)_{n \ge 0} = \mathrm{EICG}(p, a, b, 0)$. Then we can write the first sequence as $y_n = y'_{n_0 + n}$. In other words, the first sequence is the second one shifted by $n_0$.
Proof: This relation follows from the fact that $n$ and $n_0$ appear only as their sum in the definition of the EICG. $\square$
The following observation is taken from Leeb [59, p. 89]; it states that one of the parameters is redundant.

Observation 3.2 Let $p$ be prime, $a \in \mathbb{Z}_p^*$, and $b, n_0 \in \mathbb{Z}_p$. Then
$$\mathrm{EICG}(p, a, b, n_0) = \mathrm{EICG}(p, a, an_0 + b, 0) = \mathrm{EICG}(p, a, 0, n_0 + b\,\overline{a}).$$
Proof: We base the proof on the recursive definition of the EICG. As the recursion depends only on $a$, which is constant, it is sufficient to show that the $y_0$ of these generators are equal. In the first two cases we have
$$y_0 = \overline{a(n_0 + 0) + b} = \overline{a \cdot 0 + (an_0 + b)},$$
and the third equality translates to
$$\overline{a(n_0 + b\,\overline{a}) + 0} = \overline{an_0 + a\,\overline{a}\,b} = \overline{an_0 + b} = y_0. \qquad \square$$
The third equality can be used to rewrite any EICG as an EICG with $b = 0$, but a different value for $n_0$. Thus the generating formula can always be rewritten as
$$y_n :\equiv \overline{a(n_0 + n)} \pmod{p} \qquad (n \ge 0),$$
which saves one addition. The remaining addition $n_0 + n$ can be implemented by simply incrementing the previous value modulo $p$; thus we need to perform only one increment, one multiplication, one inversion, and one division to generate the next pseudorandom number.
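A sketch of this optimized loop in C, reusing the mod_inverse routine sketched in Chapter 1; the 64-bit intermediate assumes $p < 2^{31}$, and all names are our own illustrative choices.

\begin{verbatim}
#include <stdint.h>

/* Sketch of the optimized EICG loop: with b rewritten away, each step
   needs one increment, one multiplication, one inversion, and one
   division, as described above. */
extern long mod_inverse(long a, long p);

void eicg_fill(long a, long n0, long p, long N, double *out)
{
    long n, nprime = n0 % p;

    for (n = 0; n < N; n++) {
        long t = (long) (((int64_t) a * nprime) % p); /* a * n' mod p */
        out[n] = (t ? mod_inverse(t, p) : 0) / (double) p;
        nprime = (nprime + 1) % p;                    /* n' = n0+n+1 */
    }
}
\end{verbatim}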
There is an obvious connection between EICGs with different multipliers $a$, too:
Observation 3.3 Let $p$ be prime, $a, d \in \mathbb{Z}_p^*$, and $b, n_0 \in \mathbb{Z}_p$. The sequence $\mathrm{EICG}(p, ad, an_0 + b, 0)$ can be obtained by selecting every $d$-th element from the sequence $\mathrm{EICG}(p, a, b, n_0)$.
Proof: The sequence generated by taking every $d$-th element of the sequence $\mathrm{EICG}(p, a, b, n_0) = (y_n)_{n \ge 0}$ can be written as $(y_{dn})_{n \ge 0}$. We have
$$y_{dn} = \overline{a(n_0 + dn) + b}$$
and thus
$$y_{dn} = \overline{(ad)n + b'} \pmod{p},$$
where $b' = an_0 + b$. $\square$
These three observations give us the tools to show that all maximal period EICGs can be derived from the “mother-EICG” $\mathrm{EICG}(p, 1, 0, 0)$ by subsampling, shifting, and re-parametrisation as described above.
Can these insights help us in the theoretical investigation of how samples from an EICG behave under various tests ? Yes, they provide a very convenient and elegant formalism to describe subsequences and various kinds of $s$-tuples generated from the stream of pseudorandom numbers. With this formalism, the proofs of discrepancy estimates and non-linearity properties become very concise.
First of all, we do not need to bother with the parameter $n_0$ in the theoretical investigation, as we can always rewrite the EICG as one with $n_0 = 0$.
Second, any property of a sequence of EICG numbers which holds independently of the parameters used is immediately valid for subsequences consisting of every $m$-th element. One direct consequence is that once we can prove that pairs of consecutive numbers are uncorrelated for all valid parameters, we can rule out the possibility of long-range correlations at critical distances. See [10, 20] for a discussion of such problems inherent to the LCG.
The third gain, due to Niederreiter [71], is the ability to write almost arbitrary $s$-tuples formed out of the stream of EICG numbers as parallel streams. Such $s$-tuples are usually used to examine the correlation between successive numbers. For example, the overlapping serial test (see page §) tests the equidistribution of the vectors
$$\mathbf{x}_n := (x_n, x_{n+1}, \ldots, x_{n+s-1}) \qquad (3.6)$$
for $n \ge 0$ in order to test the PRN for correlations. If we pick the first coordinate of each vector we get the original sequence. Picking always the second results in the original sequence shifted by one. According to the above equivalences we can write this shifted sequence as an EICG with the same parameters $p$ and $a$, but a different $b$. Thus we have
$$\mathbf{x}_n = \bigl(x^{(1)}_n, x^{(2)}_n, \ldots, x^{(s)}_n\bigr), \qquad (3.7)$$
where $\bigl(x^{(i)}_n\bigr)_{n\ge0}$ is the sequence generated by the EICG $\mathrm{eicg}(p, a, b+(i-1)a, 0)$. The obvious generalisation is to allow almost arbitrary EICGs for each coordinate.
In the following, we will prove all statements on the behaviour of $s$-tuples in terms of these parallel streams. For that, we will need to restrict the possible values for the $a_i$ and $b_i$ in order to avoid certain special cases, like two coordinate streams that are scalar multiples of each other. As we will see later in the various proofs, we need the condition that the values $b_i\overline{a_i}$ are distinct for all $i$. Thus we have the following definition:
Definition 3.1 (Parallel Streams) Let $p$ be prime, $a_1, \ldots, a_s \in \mathbb{Z}_p^*$, and $b_1, \ldots, b_s \in \mathbb{Z}_p$, such that $b_1\overline{a_1}, \ldots, b_s\overline{a_s}$ are distinct. Then we put
$$y^{(i)}_n := \overline{a_i n + b_i} \qquad (n \ge 0,\ 1 \le i \le s) \qquad (3.8)$$
and define a sequence $(\mathbf{y}_n)_{n\ge0}$ in the $s$-dimensional affine space over $\mathbb{Z}_p$ by putting
$$\mathbf{y}_n := \bigl(y^{(1)}_n, \ldots, y^{(s)}_n\bigr).$$
An interesting special case of parallel streams, which is more general than the overlapping $s$-tuples considered above, can be obtained as follows. Choose integers $0 \le k_1 < k_2 < \cdots < k_s < p$ and put
$$\mathbf{y}_n := (y_{n+k_1}, y_{n+k_2}, \ldots, y_{n+k_s}),$$
where the $y_n$ are as in (3.1) (with $n_0 = 0$). This sequence of points in $\mathbb{Z}_p^s$ can be written in terms of parallel streams according to Definition 3.1 by putting $a_i := a$ and $b_i := b + k_i a$ for $1 \le i \le s$. It is easy to show that the $b_i\overline{a_i} = b\overline{a} + k_i$ are distinct, thus all results concerning parallel streams are valid for this general method of composing $s$-tuples, too.
The non-overlapping tuples are covered by the concept of parallel streams, too. To see this, set $a_i := sa$ and $b_i := b + (i-1)a$ for $1 \le i \le s$.
The best known structural property of any pseudorandom number generator is the lattice structure of the LCG. Coveyou and MacPherson [9] and Marsaglia [65] first noted that $s$-tuples formed from successive LCG numbers form a lattice in the $s$-dimensional cube. Figure 3.1 depicts the lattice formed by plotting overlapping 2-tuples for the full period of two "toy" generators. "Production quality" generators exhibit the same structure; one just has to zoom into the image to see the pattern.
The shape of the lattice depends very strongly on the parameters of the LCG. Thus various measurements of the coarseness of the lattice are used to select suitable parameters. That way, a weakness of the LCG turns into a strength, as one can guarantee a well-behaved lattice for low dimensions as long as the parameters are chosen carefully.
Does the EICG exhibit a similar structure ? Figure 3.2 suggests that the EICG does not possess this linear property, although one can see some other regularities. In fact, one can prove a very stringent non-linearity property for $s$-tuples taken from an EICG. The theorem describing this is due to Niederreiter [71].
Theorem 3.1 Let $\mathbf{y}_n$ be as in Definition 3.1 and put $D := \{n \in \mathbb{Z}_p : a_i n + b_i \ne 0 \text{ for } 1 \le i \le s\}$; note that $\#D \ge p - s$. Then every hyperplane in $\mathbb{Z}_p^s$ contains at most $s$ of the points $\mathbf{y}_n$ with $n \in D$. If the hyperplane passes through the origin of $\mathbb{Z}_p^s$, then it contains at most $s-1$ of these points $\mathbf{y}_n$.
Proof: All calculations in this proof are performed in the finite field $\mathbb{Z}_p$. Furthermore, remember that according to Definition 3.1, the $b_i\overline{a_i}$ are distinct.
A hyperplane $H$ in $\mathbb{Z}_p^s$ is uniquely defined by a vector $\mathbf{h} = (h_1, \ldots, h_s) \ne \mathbf{0}$ and a scalar $h_0$ as $H = \{\mathbf{z} \in \mathbb{Z}_p^s : \mathbf{h} \cdot \mathbf{z} = h_0\}$. We restrict our search for points on the hyperplane to $n \in D$. For $n \in D$ we have according to (3.8) $\overline{y^{(i)}_n} = a_i n + b_i \ne 0$, therefore we can rewrite the hyperplane equation for $\mathbf{y}_n$ as
$$\sum_{i=1}^{s} h_i\, \overline{(a_i n + b_i)} = h_0.$$
By clearing denominators, we see that $n$ is a root of the polynomial
$$f(x) := \sum_{i=1}^{s} h_i \prod_{j \ne i} (a_j x + b_j) \;-\; h_0 \prod_{j=1}^{s} (a_j x + b_j).$$
If $h_0 \ne 0$, then $f$ is a nonzero polynomial of degree $s$ over $\mathbb{Z}_p$. Since such a polynomial has at most $s$ roots in $\mathbb{Z}_p$, the hyperplane contains at most $s$ of the $\mathbf{y}_n$ with $n \in D$.
If $h_0 = 0$, that is, the hyperplane passes through the origin, we get
$$f(x) = \sum_{i=1}^{s} h_i \prod_{j \ne i} (a_j x + b_j),$$
whose degree is at most $s-1$. It remains to show that $f$ is not the zero polynomial. As $\mathbf{h}$ is not the zero vector, one of its coordinates, say $h_k$, is nonzero. For $x_0 := -b_k\overline{a_k}$ we have
$$f(x_0) = h_k \prod_{j \ne k} (a_j x_0 + b_j) \ne 0,$$
because $h_k$ is the chosen nonzero coordinate, and the factors $a_j x_0 + b_j$ are nonzero for $j \ne k$ according to the conditions of the theorem. As we have found $f(x_0) \ne 0$ for some $x_0$, $f$ cannot be the zero polynomial. __
This theorem proves that $s$-tuples taken from an EICG do not form any linear structure such as a lattice. But that does not mean that no other kind of pattern emerges in plots of pairs of consecutive numbers. For example, consider Figure 3.3, where one can see a hyperbola-like structure in the upper left and lower right corner. Eichenauer-Herrmann and Wegenkittl are currently preparing a paper discussing these properties of the EICG.
All the plots so far contained all the overlapping pairs available from the full period of the generator. This way, the underlying structure of the generator is perfectly visible. But usually one does not utilize the full period of any generator; a common rule of thumb is to use not more than the square root of the period. Thus the LCG will never be able to build up the full lattice, and the EICG will contain only a few points on the hyperbolas. Figure 3.4 depicts the lattice of lcg(65536,325,1,1): the first image shows the full lattice in a zoomed view, the second one contains only 256 points, which corresponds to the square root of the number of all possible points.
These figures clearly demonstrate that any regularities a generator develops over the full period are not necessarily present when only a fraction of the available numbers is used. The recommendation never to exhaust the full period can be further justified by the following argument: the PRNG is supposed to simulate drawing numbers from an urn with replacement, but the typical PRNG in fact empties the imaginary urn before it puts all numbers back when the period is exhausted. The difference between "drawing with replacement" and "drawing without replacement" is small as long as only a fraction of all numbers is drawn from the urn.
The discrepancy is a widely used and well studied measure for the equidistribution of a set of points. In this section we will try to give a motivated definition, provide some theoretical background, and summarize all published results concerning the EICG.
There are at least three approaches to the notion of discrepancy: one stems from statistics, one from geometric reasoning, and one from numerical integration. We will use the last one. An extensive introduction to discrepancy can be found in Niederreiter [70, Chapter 2].
We will use the following setting: the closed $s$-dimensional unit cube $\overline{I}^s = [0,1]^s$ will be the integration domain in which we will try to integrate a function $f$ by using the quasi-Monte Carlo approximation
$$\int_{\overline{I}^s} f(\mathbf{x})\, d\mathbf{x} \;\approx\; \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n) \qquad (3.9)$$
with points $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in \overline{I}^s$. Ideally, we hope that the approximation converges to the integral as the number of points increases. If this is the case for a reasonable class of functions, say, for all continuous functions on $\overline{I}^s$, then we call the (infinite) sequence $(\mathbf{x}_n)_{n\ge0}$ uniformly distributed in $\overline{I}^s$. One can show that this definition of "uniformly distributed" is quite independent of the class of functions; using the Riemann-integrable functions yields the same notion as using the characteristic functions of a very simple family of intervals.
Whereas the limit of the integration error can be used as a qualitative measure for the distribution properties of an (infinite) sequence of points, the integration error in the finite case serves as a quantitative measurement of the equidistribution of a finite sequence.
In order to get a workable measurement, we have to state which family of functions we consider for the integration, and how we condense the integration errors of all the functions in the family into one single number.
The general concept of discrepancy uses the set of characteristic functions of axis-parallel intervals in $\overline{I}^s$ as the functions to integrate, and the supremum as the condensing function. Formally, we can write this in the following way:
If $\omega = (\mathbf{x}_0, \ldots, \mathbf{x}_{N-1})$ is a finite sequence in $\overline{I}^s$ and $E$ an arbitrary subset of $\overline{I}^s$, then we can express the quasi-Monte Carlo integration of the characteristic function $c_E$ in terms of the number of $\mathbf{x}_n$ in $E$,
$$A(E; \omega) := \#\{\, 0 \le n < N : \mathbf{x}_n \in E \,\},$$
as
$$\frac{1}{N} \sum_{n=0}^{N-1} c_E(\mathbf{x}_n) = \frac{A(E;\omega)}{N}.$$
Based on this, the error when integrating $c_E$ can be written as $\bigl| \frac{A(E;\omega)}{N} - \lambda_s(E) \bigr|$, where $\lambda_s(E)$ is the $s$-dimensional volume of $E$. Thus we can write the general notion of the discrepancy of a finite sequence of points in $\overline{I}^s$ for an arbitrary family $\mathcal{E}$ of subsets as
$$D_N(\mathcal{E}; \omega) := \sup_{E \in \mathcal{E}} \left| \frac{A(E;\omega)}{N} - \lambda_s(E) \right|. \qquad (3.10)$$
From this general definition we can derive the definition of the two most important incarnations of discrepancy as follows:
Definition 3.2 The star discrepancy $D_N^*(\omega)$ of the finite sequence $\omega$ is defined by $D_N^*(\omega) := D_N(\mathcal{J}^*; \omega)$, where $\mathcal{J}^*$ is the family of all subintervals of $\overline{I}^s$ of the form $\prod_{i=1}^{s} [0, u_i)$.
Definition 3.3 The (extreme) discrepancy $D_N(\omega)$ of the finite sequence $\omega$ is defined by $D_N(\omega) := D_N(\mathcal{J}; \omega)$, where $\mathcal{J}$ is the family of all subintervals of $\overline{I}^s$ of the form $\prod_{i=1}^{s} [u_i, v_i)$.
While $D_N^*$ and $D_N$ are the classical figures for measuring equidistribution, they are far from being the only ones. Interesting variations of the basic idea are the isotropic discrepancy, which uses the family of convex sets instead of axis-parallel intervals, and the $L^2$-discrepancy, which uses the $L^2$-norm instead of the supremum. Especially the $L^2$-discrepancy has received a lot of attention recently, as it is suitable for empirical testing [38] and has a number of interesting theoretical properties [80, 62]. Another measurement worth mentioning is the weighted spectral test [40, 46, 44, 41, 47].
Let us quickly state a few general results concerning discrepancy. They will help us to interpret the main results of this chapter. We once again refer to Niederreiter [70, p. 14ff, p. 166ff] for proofs and further references.
Proposition 3.1 For all finite sequences $\omega$ consisting of $N$ points in $\overline{I}^s$ we have
$$D_N^*(\omega) \le D_N(\omega) \le 2^s\, D_N^*(\omega)$$
and
$$\frac{1}{2N} \le D_N^*(\omega) \le 1.$$
In dimension one, that is $s = 1$, it is possible to express the discrepancy by a relatively simple formula operating on the ordered list of points.
Proposition 3.2 If $0 \le x_1 \le x_2 \le \cdots \le x_N \le 1$, then
$$D_N^* = \frac{1}{2N} + \max_{1 \le i \le N} \left| x_i - \frac{2i-1}{2N} \right|$$
and
$$D_N = \frac{1}{N} + \max_{1 \le i \le N} \left( \frac{i}{N} - x_i \right) - \min_{1 \le i \le N} \left( \frac{i}{N} - x_i \right).$$
From these formulae, as well as the well known fact that sorting is of complexity $O(N \log N)$, it is easy to see that one can calculate the discrepancy in the one-dimensional case in $O(N \log N)$ steps. Using a memory versus speed tradeoff [34] it is possible to get the complexity down to $O(N)$. In higher dimensions calculating the discrepancy is of complexity $O(N^s)$, making any reasonable empirical testing computationally infeasible. Probabilistic algorithms [89] can be employed to calculate tight upper bounds for a given point set.
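To illustrate the one-dimensional case, here is a sketch of the $O(N \log N)$ computation of $D_N^*$ along the lines of Proposition 3.2 (the points are sorted in place; the function names are ours and not part of any published library):

#include <stdlib.h>
#include <math.h>

static int cmp_double(const void *a, const void *b)
{
    double d = *(const double *) a - *(const double *) b;
    return (d > 0) - (d < 0);
}

/* Star discrepancy of x[0..n-1] in [0,1), via Proposition 3.2.
   Sorting dominates the running time: O(n log n). */
double star_discrepancy_1d(double *x, int n)
{
    double dmax = 0.0, d;
    int i;

    qsort(x, n, sizeof(double), cmp_double);
    for (i = 0; i < n; i++) {
        d = fabs(x[i] - (2.0 * i + 1.0) / (2.0 * n)); /* |x_i - (2i-1)/(2N)| */
        if (d > dmax)
            dmax = d;
    }
    return 1.0 / (2.0 * n) + dmax;
}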
What do we know about the behaviour of $D_N$ with increasing $N$ ? If the sequence of points is indeed uniformly distributed in $\overline{I}^s$, then we know that $D_N$ tends to $0$. For a sequence of uniformly distributed random points we know that $D_N$ tends to $0$ with probability one. But in order to use the discrepancy as a figure of merit for finite sequences, we need to know exactly how $D_N$ converges for random sequences. Luckily, the following result (due to Kiefer [49]) provides us with the benchmark against which we can judge the discrepancy bounds derived for PRN.
Proposition 3.3 (Law of the iterated logarithm) Let $\omega$ be a sequence of uniformly distributed random points in $\overline{I}^s$; then we have
$$\limsup_{N \to \infty} \frac{\sqrt{2N}\, D_N^*(\omega)}{\sqrt{\log\log N}} = 1 \qquad \lambda_\infty\text{-almost everywhere},$$
where $\lambda_\infty$ is the Lebesgue measure on the space of all infinite sequences in $\overline{I}^s$.
The discrepancy is by definition an upper bound for the quasi-Monte Carlo integration error for a very limited class of functions, namely the characteristic functions of axis-parallel intervals. A classic result by Hlawka uses the discrepancy to derive an error bound for a much larger class of functions.
Proposition 3.4 (Koksma-Hlawka inequality [70]) If $f$ has bounded variation $V(f)$ on $\overline{I}^s$ in the sense of Hardy and Krause, then for any $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in \overline{I}^s$ we have
$$\left| \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n) - \int_{\overline{I}^s} f(\mathbf{x})\, d\mathbf{x} \right| \;\le\; V(f)\, D_N^*(\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}).$$
Why is this inequality so important ? For Monte Carlo numerical integration, which is based on "random numbers", one cannot derive an a-priori bound on the integration error; it is only possible to state a probabilistic error bound, a shortcoming that is often not acceptable. The inequality of Koksma-Hlawka, on the other hand, is a hard bound on the integration error. In order to get such a bound for the Monte Carlo method, one would have to calculate the discrepancy of the numbers used, which is not feasible in practice. The way out is to use a set of numbers for which bounds on the discrepancy are known in advance, such as $(t,m,s)$-nets or PRNGs for which such bounds are available.
On the other hand, if we want our PRNG to model a uniformly distributed random variable as closely as possible, the law of the iterated logarithm provides us with the correct order of magnitude for the discrepancy. One can argue that any result showing that the discrepancy of a particular generator grows at a rate close to the one prescribed by the law of the iterated logarithm is a sign of the right amount of "randomness" in the generator. Empirical evidence seems to support this argument. In any case, the discrepancy is certainly the most widely used figure of merit in theoretical analyses of pseudorandom number generation algorithms.
In order to prove discrepancy bounds, we need a variety of auxiliary results. Unfortunately it is not possible to include the proofs of these lemmata without exceeding the scope of this thesis and alienating the target audience. Thus we will only list the results and give references to the proofs.
The basic approach to deriving discrepancy bounds for PRNGs is due to Niederreiter [70], who established a link between the discrepancy of a finite sequence of points with rational coordinates and certain exponential sums. As most common PRNGs use integer arithmetic combined with a final scaling operation, this approach is perfectly suited to the study of the output of such generators.
Niederreiter’s proof [70, §3.2] is elementary, although rather tricky. Hellekalek [39] gave a proof based on dyadic harmonic analysis. A detailed proof can be found in Weingartner [87].
In correspondence with the literature [71, 70], we will use the following definitions: for a prime $p$ let $C_s(p)$ be the set of all nonzero $\mathbf{h} = (h_1, \ldots, h_s) \in \mathbb{Z}^s$ with $-p/2 < h_i \le p/2$ for $1 \le i \le s$. For such $\mathbf{h}$, put $r(\mathbf{h}, p) := \prod_{i=1}^{s} r(h_i, p)$ with $r(h, p) := p \sin(\pi |h| / p)$ for nonzero $h$ and $r(0, p) := 1$. Furthermore, set $e_p(t) := e^{2\pi i t / p}$, and let $\mathbf{u} \cdot \mathbf{v}$ denote the standard inner product of $\mathbf{u}, \mathbf{v} \in \mathbb{Z}^s$.
Although some of the lemmata below do not need this restriction, we will consider only the case of prime moduli here.
The following lemma is due to Niederreiter [71, Lemma 2], improving [70, Corollary 3.11].
Lemma 3.2 Let $p$ be a prime and $\mathbf{y}_0, \ldots, \mathbf{y}_{N-1} \in \mathbb{Z}_p^s$. Suppose the real number $B$ is such that
$$\left| \sum_{n=0}^{N-1} e_p(\mathbf{h} \cdot \mathbf{y}_n) \right| \le B \qquad \text{for all } \mathbf{h} \in C_s(p).$$
Then the discrepancy of the finite sequence $\mathbf{x}_n = \mathbf{y}_n / p$, $0 \le n < N$, satisfies
$$D_N \;\le\; \frac{s}{p} + \frac{B}{N}\, c(s,p), \qquad c(s,p) = O\bigl((\log p)^s\bigr),$$
where the explicit constant can be found in [71].
In order to derive the bound for the EICG we need the following two lemmata; the first one is due to Cochrane [7], the second is a variant of the Bombieri-Weil bound for exponential sums (see Moreno and Moreno [68, Theorem 2]).
Lemma 3.3 For every prime $p$ and every $N$ with $1 \le N \le p$ we have
$$\sum_{u=1}^{p-1} \left| \sum_{m=0}^{N-1} e_p(um) \right| \;=\; O(p \log p).$$
Lemma 3.4 Let $f/g$ be a rational function over $\mathbb{Z}_p$ in reduced form which is not of the form $h^p - h$ with a rational function $h$. Let $t$ be the number of distinct roots of the polynomial $g$ in the algebraic closure of $\mathbb{Z}_p$. Then we have
$$\left| \sum_{\substack{n \in \mathbb{Z}_p \\ g(n) \ne 0}} e_p\!\left( \frac{f(n)}{g(n)} \right) \right| \;\le\; \bigl( \max(\deg f, \deg g) + t^* - 2 \bigr) \sqrt{p} + \delta,$$
where $t^* = t$ and $\delta = 1$ if $\deg f \le \deg g$, and $t^* = t + 1$ and $\delta = 0$ otherwise.
The Koksma-Hlawka inequality can also be used to derive a lower bound on the discrepancy of a finite sequence of points. All that is needed is a function with known integral and bounded variation. In light of the previous lemmata it seems natural to use a function for which the Monte Carlo integration can be expressed in terms of exponential sums. The following result is due to Niederreiter [70, Cor. 3.17].
Lemma 3.5 For a prime $p$ and $\mathbf{y}_0, \ldots, \mathbf{y}_{N-1} \in \mathbb{Z}_p^s$, let $\omega$ be the finite sequence $\mathbf{x}_n = \mathbf{y}_n / p$, $0 \le n < N$. Then, for any nonzero $\mathbf{h} \in \mathbb{Z}^s$, we have
$$D_N(\omega) \;\ge\; \frac{c(\mathbf{h})}{N} \left| \sum_{n=0}^{N-1} e_p(\mathbf{h} \cdot \mathbf{y}_n) \right|,$$
where the positive factor $c(\mathbf{h})$ depends only on the nonzero coordinates of $\mathbf{h}$, and $\ell$ denotes the number of nonzero coordinates of $\mathbf{h}$.
Unfortunately, we cannot prove a lower bound on this generic sum for finite sequences of points generated by an EICG. But we are able to prove such bounds for a slightly different exponential sum, which we can link to sequences generated by an EICG, and which is a special case of the sum in Lemma 3.5. This approach is due to Eichenauer-Herrmann and Niederreiter [23]. All lemmata and theorems concerning lower bounds are taken from this paper.
For $1 \le N \le p$ and admissible parameters we define the exponential sums (3.11) used in the sequel, together with a shorthand for their maximal absolute value.
See [23] (Theorem 1 and Corollary 1) for the proofs of the upper bounds on these sums, which are similar to the proofs of Theorems 3.2 and 3.3 below. Besides these upper bounds, we know the average value of the sums, according to the next lemma.
Proof: We expand the square of the modulus and exchange the order of summation. The assertion then follows,
because $\sum_{c \in \mathbb{Z}_p} e_p(ct)$ equals $p$ for $t \equiv 0 \pmod p$ and $0$ otherwise. __
Applying Lemma 3.5 to a finite sequence of points generated by parallel streams of EICG numbers according to Definition 3.1, we get a lower bound of the form
(3.12)
linking the discrepancy to the sums defined in (3.11). Similarly, in the second case we have
(3.13)
We now have the tools necessary to derive upper and lower bounds on the discrepancy of finite sequences of points generated by parallel streams of EICG numbers. But first, let us look at the corresponding result for the LCG, which will once again serve as a reference.
According to [70, Theorem 7.4] we have the following statement for the multiplicative linear congruential generator: for a prime modulus $m$ and for an average multiplier, the discrepancy of the finite sequence of $s$-dimensional points consisting of all overlapping tuples from the full period obeys
$$D = O\!\left( \frac{(\log m)^s}{m} \log\log m \right),$$
where the implied constant depends only on $s$.
We first turn our attention to upper bounds; we will prove bounds for the full period as well as for parts of the period. The latter are relevant because in practice only a fraction of the period is used.
Please note that the following theorems do not depend on the choice of any parameters; they are valid for every single full period EICG. This is in sharp contrast to the bounds known for the LCG, which only deal with the average over all multipliers, and thus tell us nothing about a particular generator. Furthermore, in the case of the LCG no bounds are known for parts of the period in higher dimensions.
As one can guess from the lemmata listed above, the proofs involve exponential sums which need to be rearranged in such a way that Lemma 3.4 can be applied.
The following two theorems are due to Niederreiter [71].
Theorem 3.2 For $s \ge 1$ and $p$ prime, set $\mathbf{x}_n := \mathbf{y}_n / p$ based on the $\mathbf{y}_n$ of Definition 3.1. Then we have for the discrepancy of the finite sequence $\mathbf{x}_0, \ldots, \mathbf{x}_{p-1}$ the upper bound
$$D_p = O\!\left( p^{-1/2} (\log p)^s \right),$$
where the implied constant depends only on $s$.
Proof: For $\mathbf{h} \in C_s(p)$ put
$$S(\mathbf{h}) := \sum_{n \in \mathbb{Z}_p} e_p(\mathbf{h} \cdot \mathbf{y}_n).$$
We now restrict the sum to those terms with $n \in D$, using the same set $D$ as in the proof of Theorem 3.1 as the summation domain. By noting that $\#D \ge p - s$ and using the triangle inequality we get
$$|S(\mathbf{h})| \;\le\; s + \left| \sum_{n \in D} e_p(R(n)) \right|,$$
where $R$ is the rational function over $\mathbb{Z}_p$ given by
$$R(x) := \sum_{i=1}^{s} h_i\, \overline{(a_i x + b_i)} = \frac{f(x)}{g(x)}, \qquad g(x) := \prod_{j=1}^{s} (a_j x + b_j). \qquad (3.14)$$
As all $a_j$ are nonzero, we have $\deg(g) = s$. Furthermore, as at least one of the $h_i$ is nonzero, the uniqueness of the partial fraction decomposition for rational functions implies that $f \ne 0$, $\gcd(f, g) = 1$, and $\deg(f) \le s - 1$. In order to apply Lemma 3.4, we have to show that $R$ is not of the form $h^p - h$ with $h \in \overline{\mathbb{Z}}_p(x)$, where $\overline{\mathbb{Z}}_p$ denotes the algebraic closure of the field $\mathbb{Z}_p$, and $\overline{\mathbb{Z}}_p(x)$ denotes the field of rational functions over $\overline{\mathbb{Z}}_p$. If this were the case, we would have
$$\frac{f}{g} = \left( \frac{u}{v} \right)^{p} - \frac{u}{v}$$
with polynomials $u, v$ over $\overline{\mathbb{Z}}_p$ and $\gcd(u, v) = 1$, and thus
$$f v^p = g \left( u^p - u\, v^{p-1} \right). \qquad (3.15)$$
Since we have demanded $\gcd(u, v) = 1$, $v$ cannot divide $u^p - u v^{p-1}$, thus $v^p$ must divide $g$. As $\deg(g) = s < p$, that can only be the case if $v$ is a nonzero constant polynomial, which implies that $R$ is a polynomial. Comparing the degrees in (3.15) then yields $\deg(f) \ge \deg(g)$, which contradicts the degrees derived from (3.14). Thus we can apply Lemma 3.4, and this leads to
$$\left| \sum_{n \in D} e_p(R(n)) \right| \le (2s - 2)\sqrt{p} + 1. \qquad (3.16)$$
The rest follows from Lemma 3.2. __
Theorem 3.3 For $s \ge 1$ and $p$ prime, let the finite sequence $\mathbf{x}_0, \mathbf{x}_1, \ldots$ be as in Theorem 3.2. Then we have for the discrepancy of the first $N < p$ points the upper bound
$$D_N = O\!\left( N^{-1} p^{1/2} (\log p)^{s+1} \right),$$
where the implied constant depends only on $s$.
Proof: Just as in the proof of Theorem 3.2 we need to derive a bound on an exponential sum in order to apply Lemma 3.2. This time the summation domain is not all of $\mathbb{Z}_p$, thus we need to rewrite the sum in a rather tricky way.
For $\mathbf{h} \in C_s(p)$ put
$$S_N(\mathbf{h}) := \sum_{n=0}^{N-1} e_p(\mathbf{h} \cdot \mathbf{y}_n).$$
We can now rewrite $S_N(\mathbf{h})$ using the fact that $\frac{1}{p} \sum_{u \in \mathbb{Z}_p} e_p(u(n-m))$ evaluates to $1$ for $n \equiv m \pmod p$ and to $0$ otherwise:
$$S_N(\mathbf{h}) = \frac{1}{p} \sum_{u \in \mathbb{Z}_p} \sum_{m=0}^{N-1} e_p(-um) \sum_{n \in \mathbb{Z}_p} e_p(\mathbf{h} \cdot \mathbf{y}_n + un).$$
The summand for $u = 0$ equals, up to the factor $N/p$, the sum treated in the proof of Theorem 3.2. As we need an upper bound on $|S_N(\mathbf{h})|$, we pull out this summand and apply the triangle inequality, yielding
$$|S_N(\mathbf{h})| \;\le\; \frac{N}{p}\, |S(\mathbf{h})| + \frac{1}{p} \sum_{u=1}^{p-1} \left| \sum_{m=0}^{N-1} e_p(-um) \right| \left| \sum_{n \in \mathbb{Z}_p} e_p(\mathbf{h} \cdot \mathbf{y}_n + un) \right|, \qquad (3.17)$$
and examine each of the terms on the right side.
We want to apply Lemma 3.4 to the rightmost term: for $u \ne 0$ we have, by the same argument as the one following (3.14),
$$\left| \sum_{n \in \mathbb{Z}_p} e_p(\mathbf{h} \cdot \mathbf{y}_n + un) \right| \;\le\; s + \left| \sum_{n \in D} e_p(R_u(n)) \right|,$$
where $R_u$ is the rational function over $\mathbb{Z}_p$ given by
$$R_u(x) := R(x) + ux = \frac{f(x) + u\, x\, g(x)}{g(x)} =: \frac{f_u(x)}{g(x)}.$$
Once again, we claim that $R_u$ is not of the form $h^p - h$ with a rational function $h$. From the definition of $f_u$ and $g$ we have $\deg(f_u) = s + 1$ and $\deg(g) = s$, as neither the $a_i$ nor $u$ can be zero. For if we had
$$\frac{f_u}{g} = \left( \frac{v}{w} \right)^{p} - \frac{v}{w}$$
with polynomials $v, w$ over $\overline{\mathbb{Z}}_p$ and $\gcd(v, w) = 1$, then the argument following (3.15) shows that $w$ is a nonzero constant polynomial. Thus we have
$$f_u = c\, g\, (v^p - v)$$
for a suitable constant $c \ne 0$ (and $\deg(v) = 0$ is impossible since $\deg(f_u) > \deg(g)$). Comparing the degrees of the polynomials in this equation we get $s + 1 = s + p \deg(v)$, which implies $p \deg(v) = 1$, hence $p = 1$. This contradiction proves that we can apply Lemma 3.4, yielding
$$\left| \sum_{n \in D} e_p(R_u(n)) \right| \le 2s\sqrt{p}. \qquad (3.18)$$
Furthermore, by rewriting the sum over $u$ using some elementary identities for geometric sums, we can bring it into the form treated in Lemma 3.3.
We now return to (3.17) to put the pieces together. Combining everything with an application of Lemma 3.3 we get
$$|S_N(\mathbf{h})| = O\!\left( \sqrt{p}\, \log p \right)$$
for all $\mathbf{h} \in C_s(p)$, and thus we can apply Lemma 3.2 to obtain the desired upper bound on $D_N$. _
The theorems covering lower bounds are formulated in a different way. Instead of giving hard bounds for all EICGs, these theorems state how many EICGs there must be whose discrepancy exceeds a certain threshold value. This kind of statement follows from the basic approach to the problem, namely combining upper bounds on exponential sums with their average values.
Lower bounds guarantee that the PRN are not perfectly equidistributed; they contain the irregularities found in "truly random" numbers, too.
The following three theorems are due to Eichenauer-Herrmann and Niederreiter [23].
Theorem 3.4 Let $s$, $N$, and the parameters $a_1, \ldots, a_s$ of the parallel streams be fixed, and define a threshold value for the discrepancy as in [23]. Then there exist more than a fixed positive proportion of the admissible values of the remaining parameters such that for $s$-tuples from the corresponding parallel stream of EICG numbers we have a discrepancy exceeding this threshold.
Proof: We rewrite the theorem in terms of the exponential sums by using (3.12); it then suffices to show that sufficiently many parameter values lead to large sums.
Now suppose, on the contrary, that there exist at most the stated number of parameter values with a large sum, i.e., more than the complementary number of values with a small sum. From Lemma 3.6 we have an upper bound for every individual sum. Hence, observing that the small sums contribute almost nothing to the total, we obtain an upper bound on the average of the sums
which contradicts Lemma 3.7. _
Theorem 3.5 Let $s$, $N$, and an additional integer parameter within the range stated in [23]
be fixed, and define
the threshold value accordingly.
Then there exist more than a fixed positive proportion of the admissible parameter values such that for $s$-tuples from the corresponding parallel stream of EICG numbers we have a discrepancy exceeding this threshold.
Proof: The proof is analogous to the last one. The only real difference is the handling of the additional parameter, where we need to apply Lemma 3.6 with a different argument. _
Using (3.13) instead of (3.12) we get a slightly different result. As the proof contains no new ideas, we omit it, too.
Theorem 3.6 Let $s$ and $N$ be fixed. For real numbers fulfilling the conditions stated in [23], define the threshold value accordingly.
Then there exist more than a fixed positive proportion of the admissible parameter values such that for $s$-tuples from the corresponding parallel stream of EICG numbers we have a discrepancy exceeding this threshold.
Restricting oneself to the ordinary serial test, i.e. considering only overlapping $s$-tuples instead of vectors from general parallel streams, it is possible to improve Theorem 3.6. See [23, Cor. 9] for details.
For congruential generators modulo a prime $p$, the following definition, due to Niederreiter [70], specifies another criterion which can be used to classify PRNGs.
Definition 3.4 For given $s \ge 1$, a congruential generator producing the sequence $(y_n)_{n\ge0}$ passes the $s$-dimensional lattice test if the vectors $\mathbf{y}_n - \mathbf{y}_0$, $n \ge 1$, span the $s$-dimensional vector space $\mathbb{Z}_p^s$, where
$$\mathbf{y}_n := (y_n, y_{n+1}, \ldots, y_{n+s-1}).$$
Theorem 3.7 An EICG with modulus $p$ passes the $s$-dimensional lattice test exactly for all $s \le p - 2$. This is the optimal behaviour under this test.
Proof: As explained in 1.3.4, one can visualize any congruential generator modulo $p$ with period length $p$ as a permutation polynomial mapping $\mathbb{Z}_p$ to $\mathbb{Z}_p$. In the case of the EICG, this polynomial has degree $p-2$, as we can write the EICG formula as
$$y_n = \bigl( a(n_0+n) + b \bigr)^{p-2}$$
according to the theorem of Euler-Fermat. The rest follows immediately from a theorem by Eichenauer, Grothe, and Lehn [13], which can also be found in Niederreiter [70, Theorem 8.2]. __
As explained in Section 2.3.2, testing a pseudorandom number generator is a tricky task. Even once one has decided which properties to test for, designing the test itself requires a fair amount of statistical knowledge, both as to how the test should be parameterized and as to how the results should be judged.
In this chapter, we will try to give a survey of empirical tests concerning the EICG carried out by various authors. Since it would be inappropriate to spend too much time discussing all design decisions in detail, we will only describe the motivation, the test procedure, and the results. For more information we refer to the original authors.
Another compilation of empirical test results concerning inversive generators can be found in [21].
The Digit Test, due to Leeb [58, 26, 61], tries to assess the distribution quality of a PRNG by looking at the $b$-adic representation of its output. If we only allow $b = 2^k$, one digit in base $b$ corresponds to $k$ binary digits, which we can obtain by cutting out consecutive bits in the computer representation of the numbers. This is a very efficient procedure which can be done by simple bitwise AND and shift operations. Let's visualize this by an example, using $8$ as the base and selecting the second digit, which corresponds to the three bits starting at position 4.
Decimal | Binary | Base 8 | selected digit |
0.5859375 | 0.100 101 10 | 0.454 | 5 |
0.82421875 | 0.110 100 11 | 0.646 | 4 |
0.21484375 | 0.001 101 11 | 0.156 | 5 |
0.765625 | 0.110 001 00 | 0.610 | 1 |
0.16015625 | 0.001 010 01 | 0.122 | 2 |
0.48046875 | 0.011 110 11 | 0.366 | 6 |
Ideally, we expect the last column to be uniformly distributed over all possible digits. Furthermore, we can assume that if the original numbers are uncorrelated, this will hold for the sequence of digits, too. On the other hand, if we can prove that something is wrong with the digit sequence, this does not shed a good light on the original numbers.
What have we gained by mapping numbers to digits ? Basically this mapping plays the role of the “bins” discussed on page § and §. As discussed there, this mapping is used to make the problem manageable by drastically reducing the number of possible values for each number. Now we can easily count the occurrences of each different digit and perform correlation tests on them.
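The extraction step itself can be sketched as follows (a hypothetical helper, shown here with scaling instead of the direct bit manipulation used in the actual test programs):

/* Return the base-2^k digit of x in [0,1) covering the k bits that
   start at position j (counted from 1 after the binary point);
   assumes j + k - 1 <= 31.  For k = 3 and j = 4 this reproduces
   the example above. */
unsigned int extract_digit(double x, int j, int k)
{
    unsigned long bits = (unsigned long) (x * (double) (1UL << (j - 1 + k)));

    return (unsigned int) (bits & ((1UL << k) - 1));   /* mask the k bits */
}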
As a correlation test Leeb used the idea of the serial test (see p. §) with non-overlapping tuples. To measure the distribution of the tuples, a $\chi^2$ test was used; the resulting value was called the level-1 test statistic. This procedure was repeated 64 times, and all the values were compared to their expected distribution by a two-sided Kolmogorov-Smirnov test, yielding the level-2 test statistic, on which we will focus in the graphics.
Let’s summarize all the parameters which must be specified to turn the abstract idea of the Digit Test into a computer program.
It should be clear by now that the hierarchical design of the complete testing procedure results in huge resource requirements, both computationally and memory-wise, to actually run the test. See Table 4.1 for the actual parameters used, and Table 4.2 for the generators tested. The computations were carried out using the pLab [57] PRNG testing framework, which is based on the author's generator library.
For the graphics, the level-2 value was transformed according to its expected distribution to yield a value which should be asymptotically equidistributed. Thus all points in the graphics on the left hand side should vary freely between 0 and 1. Values close to 0 signify a distribution of the tuples which is much too well-balanced, whereas values close to 1 indicate gross irregularities. The right hand graphics depict the KS statistic, for which the critical region at the chosen level of significance is indicated. Any generator which features high values there (especially when reaching the cut-off point 2 in the graphics) fails the test.
Like Leeb in [58], we present here only a selection of the results, namely those for one fixed dimension. For each generator, the remaining test parameters were varied.
Interpretation: The digit test seems to be sensitive to intrinsic properties of the LCG, as even the best one (FISH) fails the test for certain parameters. Leeb conjectures in [58] that the digit test is sensitive to grid structures or long-range correlations, as these are two features which are present in all LCGs but are proven to be absent in inversive generators. Especially the lattice quality parameter seems to correlate with the digit test results: the better the lattice, the more extreme the test parameters must be to uncover deficiencies in the generator.
The overlapping serial test, first proposed by Marsaglia [66] as the “M-tuple test”, is in its basic setup quite similar to the digit test described above. The main difference is to use overlapping tuples. This modification is rather small in the implementation, but requires some statistical work to calculate the expected distribution as the tuples are no longer independent.
We will describe here the empirical tests done by Wegenkittl [85, 86, 61]. As the testing procedure was very similar to the digit test setup, we will only list the differences. Whereas Leeb in the digit test used only one way to extract bits from the stream of pseudorandom numbers, Wegenkittl used two different methods.
This time these blocks of bits were used to generate a variable number of tuples. Whereas in the digit test the sample size was always tuned to the subsequent test, Wegenkittl generated up to a maximal number of tuples. From these tuples, a modified $\chi^2$ statistic was computed, resulting in the test statistic. Whereas the normal $\chi^2$ test statistic does not converge to a $\chi^2$ distribution for overlapping tuples due to the correlations, this modified one does (see [85, p. 57] for details). This procedure was repeated 32 times and the resulting empirical distribution of the values was compared to the theoretical one using a KS test.
The following graphics depict the results for all the generators used in the digit test, as well as for an additional generator labeled EICG7.
The left hand diagrams show the values of the calculated test statistics as the lightness of each small rectangle. A white square signifies a low value, meaning perfect equidistribution of the tuples, whereas a black one signifies extreme deviations. The transformation to lightness was chosen in such a way that each gray-scale level should be equally likely. Unfortunately, these diagrams are not available for all parameters.
In the right hand diagrams these values were distilled into one single KS value representing the quality of their distribution; the critical region at the chosen level of significance is indicated.
Interpretation: Like the digit test, the overlapping serial test does uncover deficiencies in the linear generators. Whereas the setup of the digit test focused on the number of bits one can take from each number while still getting good distributions, Wegenkittl turned his attention to the number of tuples one can generate before the PRNG fails. The results are basically the same: even the best LCG has a limited load capability; if you start to do heavy-duty computations using many PRN you must be careful not to run into the breaking point of the LCG. The traditional rule of thumb concerning which percentage of the period length one can safely use (up to the square root of the period) seems to be sound for the LCG.
The inversive generators are not perfect either; even they tend to fail the test at some point. But their load capability is much better. Even when taking a high number of samples, their distribution quality decreases quite slowly.
A completely different kind of empirical test is the run test. The basic idea behind this test is to check whether the occurrence of runs conforms to its expected distribution. There is a variety of different ways to implement this idea, but we will only describe the basic idea and Entacher's implementation [24].
First of all, what do we mean by "runs" ? In a binary context, that is, in a sequence consisting of only two symbols, a run is defined as a maximal subsequence consisting of only one symbol. For example, consider the sequence
$$00 \;|\; 1 \;|\; 0 \;|\; 111 \;|\; 0,$$
in which the bars indicate the runs.
As the common pseudorandom sequences are far from being binary, one has to transform them first. A straightforward way of doing this is to set
$$b_n := \begin{cases} 1 & \text{if } x_n < x_{n+1} \\ 0 & \text{otherwise,} \end{cases}$$
which is binary, as equal consecutive values do not occur for the PRNGs we consider. A run of 1s of length $l$ then corresponds to a monotonically increasing subsequence of length $l+1$ in the original sequence, called a "run up". The distribution properties of these runs in a sequence of "really random numbers" are well known.
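The counting step of the test can be sketched as follows; this only tallies the lengths of ascending runs and is not Entacher's complete test statistic, which additionally needs the covariance matrix from [63]:

#define MAXRUN 7   /* runs of length >= MAXRUN go into the last bin */

/* Count the "runs up" in x[0..n-1] by length. */
void count_runs_up(const double *x, int n, long count[MAXRUN])
{
    int i, len = 1;   /* length of the current ascending subsequence */

    for (i = 1; i < n; i++) {
        if (x[i] > x[i - 1]) {
            len++;                                  /* run continues */
        } else {
            count[(len < MAXRUN ? len : MAXRUN) - 1]++;
            len = 1;                                /* new run starts */
        }
    }
    count[(len < MAXRUN ? len : MAXRUN) - 1]++;     /* close last run */
}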
Following Wolfowitz [63] (see also Knuth [50, p. 68]), Entacher constructed an asymptotically $\chi^2$-distributed test statistic which involves counts of ascending runs up to a fixed maximal length. The calculation done by Entacher involved evaluating this statistic 100 times for each generator and testing the resulting empirical distribution against the expected one using a KS test.
We will focus on Entacher's results concerning the behaviour of LCGs and EICGs. The list of generators tested includes the previously defined generators FISH, ANSI, MINSTD, and RANDU, as well as two other LCGs with a bad lattice structure: LCG5 and LCG6. As examples for the EICG, Entacher used six generators with essentially arbitrarily chosen parameters, which he labeled EICG1 to EICG6.
The following figures depict the results. The sample size was varied over several orders of magnitude, and the height of each square shows the resulting KS statistic. If the square is coloured dark gray, then the KS statistic exceeds the critical value for the significance level 0.01.
Interpretation: The run test seems to be able to distinguish good LCGs from bad ones. All the essentially randomly chosen EICGs pass the test without any problems, showing again that the EICG is not sensitive to the parameter selection.
Another conclusion from this test is that subsequences taken by a leap-frog technique are safe when using EICGs, whereas such sequences taken from an LCG may exhibit a bad lattice and can fail the run test. See the upcoming paper by Entacher [24] for details.
As mentioned on page §, the weighted spectral test is a promising new approach to assess the quality of pseudorandom numbers, see Hellekalek [40, 41], Hellekalek and Niederreiter [46], and Hellekalek and Leeb [44]. One numerical realization of the weighted spectral test is the diaphony, see Hellekalek and Niederreiter [46].
Both the diaphony [90] and the classic spectral test [9] approach the point set from a Fourier point of view, looking for any disturbances in the spectrum. If the point set has a lattice structure (as in the case of LCGs), the first wavelength which yields a non-zero Fourier coefficient corresponds to the largest distance between hyperplanes. Whereas the classic spectral test targets just this wavelength, the diaphony tries to include more information, namely a weighted sum over all possible wavelengths. Large wavelengths (which correspond to low frequencies) in the point set indicate a large scale imbalance, whereas high frequencies target fine structures in the set. As these fine structures are unavoidable in any finite point set, higher frequencies are considered less important and thus contribute little to the diaphony.
So the $s$-dimensional diaphony is essentially a weighted sum over the correlation coefficients of the point set, which are the Weyl sums we encountered in Section 3.3; see [41] for details.
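For reference, the classical diaphony of the points $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in [0,1)^s$ can be written in the following standard form (the normalization may differ slightly from the one used in [90]):
$$F_N = \left( \sum_{\mathbf{h} \in \mathbb{Z}^s \setminus \{\mathbf{0}\}} \frac{1}{r(\mathbf{h})^2} \left| \frac{1}{N} \sum_{n=0}^{N-1} e^{2\pi i\, \mathbf{h} \cdot \mathbf{x}_n} \right|^2 \right)^{1/2}, \qquad r(\mathbf{h}) = \prod_{i=1}^{s} \max(1, |h_i|).$$
The weight $r(\mathbf{h})^{-2}$ implements exactly the damping of the high frequencies described above.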
There are close ties between the discrepancy and the diaphony. One can bound one in terms of the other (see Stegbuchner [79]), one can interpret it, too, as integration error (see James, Hoogland and Kleiss [47]), and it is possible to derive Lemmata similar to those in section 3.3.2 (see [41]).
Whereas it is virtually impossible to do any reasonable empirical studies with the discrepancy due to its computational complexity of $O(N^s)$, the diaphony only needs $O(N^2)$ steps to calculate. Thus it is possible to conduct empirical tests using this figure of merit. In the following we will describe the results obtained by Hellekalek [40].
The test procedure was as follows: for a given generator, a given dimension $s$, and a given sample size $N$, the diaphony was evaluated for 20 samples of non-overlapping $s$-tuples. As the expected value of the squared diaphony is proportional to $1/N$, Hellekalek rescaled the results to get the same expected value for all sample sizes; see Leeb [60] for the theoretical distribution.
In the graphics, the abscissa shows the sample size and the ordinate the average value over the 20 samples. For each generator from the now familiar set (see Table 4.2) the calculation was done in dimensions 2 through 6.
Frank Härtel [37] implemented an impressive array of PRNGs and compared their performance under a variety of statistical tests. Unfortunately he used only one EICG, whose modulus is far too small to be able to compete with generators featuring a period length of about $2^{31}$. We will therefore not elaborate on his results.
Both Eichenauer-Herrmann and L'Ecuyer presented results of empirical tests concerning the minimal distance between vectors of PRN at the MC&QMC'96 conference in Salzburg. Although the results have not been published yet, one can summarize them as follows: due to their lattice structure, LCGs are not able to simulate the correct behaviour of this test statistic, whereas EICGs with the same period length pass this test with flying colors. I refer to the upcoming proceedings for details.
This section discusses the implementation of the EICG pseudorandom number generator using a standard procedural programming language. We will use C syntax for the code printed here.
The algorithms presented here were used to write a generic and portable PRNG library which implements not only the EICG, but other congruential generators, too. This library (written in ANSI C) is available on the Internet from the pLab WWW server at http://random.mat.sbg.ac.at/.
If we look at the definition of the EICG (see p. 1.3.4), we see that generating the numbers is a two step procedure. First we have to compute $y_n$, a calculation operating in the finite field $\mathbb{Z}_p$ which can be done by standard integer arithmetic modulo $p$. The second step is the scaling operation $x_n = y_n / p$, which we will implement as a straightforward floating-point division.
We will thus focus on the first step, which includes the following three operations: modular inversion, modular multiplication, and modular addition.
How they are best implemented depends on how numbers are represented in the computer. We use integer arithmetic based on the native integer format of the computer. Most current workstations use 32-bit integers, which can hold numbers from $-2^{31}$ to $2^{31}-1$. This limits the choice of the modulus to values smaller than $2^{31}$; a common choice is the Mersenne prime $p = 2^{31}-1$. Large values for $p$ give the resulting PRN a fine resolution, but one has to pay for this with an increase in calculation time. If this resolution, as well as the period length it implies, are not adequate, one can use the technique of combining generators (see p. §) to generate even better pseudorandom numbers.
First we turn to the problem of modular inversion (denoted throughout this text by overlining ($\overline{c}$) the operand), which is defined as
$$\overline{c} := \begin{cases} c^{-1} & \text{for } c \in \mathbb{Z}_p^* \\ 0 & \text{for } c = 0, \end{cases}$$
where $c^{-1}$ is the uniquely defined element of $\mathbb{Z}_p$ such that $c\, c^{-1} = 1$.
The special case $\overline{0} = 0$ is easily handled; what remains is to find $c^{-1}$ for $c \in \mathbb{Z}_p^*$. There are two different ways to compute the inverse. One of them is to utilize the fact that
$$c^{\varphi(p)} \equiv 1 \pmod p$$
by the well-known theorem of Euler–Fermat, and thus $c^{-1} = c^{\varphi(p)-1}$ holds. In our case the modulus $p$ is prime, thus $\varphi(p)$, Euler's totient function, is equal to $p-1$, which gives us
$$c^{-1} = c^{p-2} \pmod p.$$
Evaluating $c^{p-2}$ is a well-known exercise in computational number theory (e.g. in the RSA cryptosystem). Using square-and-multiply exponentiation it can be solved in logarithmic time [8, p. 829], but the intermediate products exceed the domain $\{0, \ldots, p-1\}$. This fact renders an implementation difficult because of the limited integers available on a computer.
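A sketch of the square-and-multiply idea; mult_mod stands for an overflow-free modular multiplication such as the decomposition method developed later in this chapter.

long mult_mod(long a, long s, long p);   /* (a*s) mod p without overflow */

/* Binary exponentiation: c^e mod p in O(log e) multiplications.
   The inverse of c modulo a prime p is then pow_mod(c, p-2, p). */
long pow_mod(long c, long e, long p)
{
    long result = 1;

    while (e > 0) {
        if (e & 1)                        /* current bit of e is set */
            result = mult_mod(result, c, p);
        c = mult_mod(c, c, p);            /* square the base */
        e >>= 1;
    }
    return result;
}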
A different approach is to use an extended version of Euclid's algorithm. This algorithm is usually used to calculate the greatest common divisor of two numbers, but it can also be used to calculate integers $x$ and $y$ which fulfill the linear diophantine equation $\gcd(u, v) = xu + yv$. By substituting $p$ for $u$ and $c$ for $v$, and observing that $\gcd(p, c) = 1$, one gets
$$1 = xp + yc,$$
which can be rewritten as
$$yc \equiv 1 \pmod p, \quad \text{that is,} \quad \overline{c} = y \bmod p.$$
The algorithm to calculate $x$ and $y$ is based on the following recursion: the division with remainder is used to calculate $q$ and $r$ with $u = qv + r$. If we can find $x'$ and $y'$ which fulfill $\gcd(v, r) = x'v + y'r$, then $x = y'$ and $y = x' - qy'$ will satisfy $\gcd(u, v) = xu + yv$. Since $\gcd(u, v) = \gcd(v, r)$ and $r < v$, the question how to find $x$ and $y$ quickly leads to a trivial case.
Figure 5.1 shows a straightforward recursive implementation of the extended Euclid's algorithm. Figure 5.2 demonstrates how the modular inversion can be based on rec_eeuclid.
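A minimal sketch of what these two routines might look like (the library versions differ in details such as error handling):

/* Extended Euclid: returns gcd(u,v) and sets *x, *y such that
   gcd(u,v) == (*x)*u + (*y)*v. */
long rec_eeuclid(long u, long v, long *x, long *y)
{
    long x1, y1, g, q;

    if (v == 0) {              /* trivial case: gcd(u,0) = 1*u + 0*0 */
        *x = 1;
        *y = 0;
        return u;
    }
    q = u / v;                 /* division with remainder */
    g = rec_eeuclid(v, u - q * v, &x1, &y1);
    *x = y1;                   /* back-substitution as derived above */
    *y = x1 - q * y1;
    return g;
}

/* Modular inversion based on rec_eeuclid: from 1 == x*p + y*c we
   get that y (mod p) is the inverse of c. */
long inverse(long c, long p)
{
    long x, y;

    if (c == 0)                /* special case: 0 maps to 0 */
        return 0;
    rec_eeuclid(p, c, &x, &y);
    return (y < 0) ? y + p : y;
}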
This recursive implementation is not very efficient due to the overhead caused by the repeated function calls. Although rec_eeuclid is not tail-recursive, it is possible to rewrite it as an iterative function [50]. Figure 5.3 is a C implementation of the modular inversion using this method. A further optimization [77, p. 521] is to unroll the loop twice to avoid unnecessary swapping of the variables in each iteration.
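An iterative formulation (a sketch, without the loop unrolling mentioned above) might look like this:

/* Iterative modular inversion of c modulo the prime p. */
long inverse_iter(long c, long p)
{
    long u = p, v = c;
    long x0 = 0, x1 = 1;       /* running coefficients of c */
    long q, t;

    if (c == 0)
        return 0;
    while (v != 1) {           /* gcd(p,c) = 1, so v reaches 1 */
        q = u / v;
        t = u - q * v;   u = v;   v = t;      /* Euclidean step */
        t = x0 - q * x1; x0 = x1; x1 = t;     /* coefficient update */
    }
    return (x1 < 0) ? x1 + p : x1;
}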
Gordon [35] describes a modification which uses shift operations to avoid multiplications and divisions. This does pay off on certain computers; for example, on SPARC, R4000, or Alpha AXP based systems this approach is faster. On the other hand, the division on the i486 is comparatively fast, thus the original version runs faster there.
The number of recursive calls in Euclid's algorithm is of the order $O(\log v)$, see [8, p. 810]. An equivalent statement is that the arguments of these calls decrease exponentially. Another way to put this is that rec_eeuclid will need about the same number of recursive calls to get from its initial arguments down to their square root as it needs from there to the end of the recursion.
This observation leads to another technique to speed up the computation of the multiplicative inverse: for all arguments smaller than some threshold we use a table of precomputed values for $x$ and $y$ instead of continuing the recursion. This way some recursive calls can be avoided.
To test the advantages of this approach, we compared it with the optimized iterative implementation. Table 5.1 lists the timings of our test case, that is, calculating the multiplicative inverses of 2147483 uniformly distributed numbers modulo the Mersenne prime $2^{31}-1$.
As long as the threshold is not larger than 256, a byte is enough to hold an element of the table. A threshold of 512 forces the program to resort to 16-bit integers, which doubles the memory requirements. It is hard to make a general statement as to which threshold is best; too much depends on the cache size, the memory access speed, and the amount of memory available. We have set the default to 256, which seems to result in a reasonable tradeoff between memory and speed.
Our experience has shown that there is no single “best” algorithm, too much depends on the relative execution speed of various elementary operations. Thus our implementation includes three different algorithms as well as a profiling program which can be used to select the one which is running fastest on the user’s computer.
The problem of evaluating $a \cdot s \bmod p$ lies in the limited range of the integers available in common programming languages. The intermediate result $a \cdot s$ of the straightforward implementation is very likely not to fit into machine-size integers, so one has to devise an algorithm for calculating $a \cdot s \bmod p$ in which all intermediate results are representable on a $k$-bit computer.
One approach is the following algorithm due to Bratley, Fox, and Schrage [4, Sec. 6.5.2], which can compute $a \cdot s \bmod p$ if $r \le q$ for the quantities defined below. The idea is to factor the modulus; but since this is not possible with primes, one has to deal with remainders, too. Let
$$p = aq + r,$$
where
$$q = \lfloor p/a \rfloor \quad \text{and} \quad r = p \bmod a.$$
Then one can rewrite $a \cdot s \bmod p$ as
$$a \cdot s \bmod p = a(s \bmod q) - r \lfloor s/q \rfloor + p\,\delta(s), \qquad \delta(s) \in \{0, 1\}.$$
If $r \le q$, which is a direct consequence of $a \le \sqrt{p}$, evaluating this expression does not pose a numerical problem, for
$$0 \le a(s \bmod q) \le a(q-1) < p$$
and
$$0 \le r \lfloor s/q \rfloor \le q \lfloor s/q \rfloor \le s < p,$$
and thus the difference lies strictly between $-p$ and $p$. Evaluating $\delta(s)$ is unnecessary, because $\delta(s) = 1$ iff $a(s \bmod q) - r\lfloor s/q \rfloor < 0$ and $\delta(s) = 0$ iff it is non-negative. This leads to the following algorithm:
$q$ and $r$ can be precomputed, and calculating $s \bmod q$ and $\lfloor s/q \rfloor$ requires on some computers only one instruction, so this is a very efficient algorithm.
For the usual choice of $p = 2^{31}-1$, the above can be applied for all $a \le 46340 = \lfloor \sqrt{p} \rfloor$. If $p$ is smaller than that, the limitation that all intermediate results should lie between $-p$ and $p$ is unnecessarily tight. On a $k$-bit computer all integers between $-2^{k-1}$ and $2^{k-1}$ (exclusive) are representable. If we loosen the restriction on $a$ from $a \le \sqrt{p}$ to $a \le \sqrt{2^{k-1}}$, the term $r\lfloor s/q \rfloor$ is no longer bounded by $p$. But it can be shown that $a^2$ is an upper bound:
$$r \lfloor s/q \rfloor \le a \cdot a = a^2 < 2^{k-1}.$$
We can now conclude that all intermediate results lie in the representable range $(-2^{k-1}, 2^{k-1})$ and that the final difference can undershoot $0$ by at most $a^2$. Thus one possible algorithm is this:
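A sketch of this variant, under the bound derived above:

/* Loosened Schrage variant: the intermediate result may now undershoot
   zero by up to a*a, i.e. by a small multiple of p, so p is added in a
   loop until the value is back in [0, p). */
long mult_schrage_ext(long a, long s, long p)
{
    long q = p / a;
    long r = p % a;
    long result = a * (s % q) - r * (s / q);

    while (result < 0)
        result += p;
    return result;
}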
The while loop can execute at most $\lceil a^2/p \rceil$ times. L'Ecuyer and Côté [56] observed that in the average case very few iterations are executed, thus this algorithm is efficient.
If $a$ is not restricted in any way (except of course by the requirement $0 < a < p$), one can use decomposition to reduce the multiplication to cases for which we already have solutions. This can be achieved by writing $a$ in base $2^w$, where $w$ is usually 15 on current 32-bit computers:
$$a = a_1 2^w + a_0$$
with $0 \le a_0, a_1 < 2^w$. Then
$$a \cdot s \bmod p = \Bigl( \bigl( (a_1 s \bmod p)\, 2^w \bigr) \bmod p + a_0 s \bmod p \Bigr) \bmod p.$$
In all the products modulo $p$ one of the factors is bounded by $2^w$, so the previously discussed algorithm can be applied. Figure 5.4 shows a C implementation of this method. It is used whenever it is not possible to resort to a simpler algorithm.
Modular addition is simple to solve, though it is not trivial. A straightforward implementation might look like this:
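/* Straightforward modular addition of a, b in [0, p) -- a sketch,
   correct only as long as a + b cannot overflow the integer type. */
long add_naive(long a, long b, long p)
{
    return (a + b) % p;
}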
This is correct as long as the sum $a + b$ is still fully representable in the data type used. Assuming that the usual signed integer type is used, an overflow manifests itself in a negative value of $a + b$. It cannot happen that $a + b$ is positive in spite of an overflow, since $a$ and $b$ are both smaller than $2^{k-1}$ (assuming a $k$-bit computer) and the sum thus wraps to a value no smaller than $-2^{k-1}$. If we detect an overflow, subtracting $p$ will bring the result back into the interval $[0, p)$.
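Thus the following sketch is a correct implementation on such machines:

/* Overflow-aware modular addition for 0 <= a, b < p.  On a two's
   complement machine a wrapped-around sum shows up as a negative
   number; subtracting p then yields the correct representative.
   (Strictly speaking, signed overflow is undefined in ANSI C, so
   this relies on the machine's wrap-around behaviour.) */
long add_mod(long a, long b, long p)
{
    long sum = a + b;

    if (sum < 0)               /* overflow detected */
        sum -= p;
    return sum % p;
}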
Let us quickly summarize the main points of this thesis.
Let $p$ be a (large) prime and $a \in \mathbb{Z}_p^*$, $b, n_0 \in \mathbb{Z}_p$. The explicit inversive congruential generator (abbreviated as "EICG") with parameters $p$, $a$, $b$, and $n_0$ defines a sequence $(y_n)_{n \ge 0}$ in $\mathbb{Z}_p$ by
$$y_n := \overline{a(n_0+n)+b} \qquad (n \ge 0)$$
and a sequence of pseudorandom numbers $(x_n)_{n \ge 0}$ in $[0,1)$ by
$$x_n := \frac{y_n}{p},$$
where the over-line denotes the multiplicative inverse modulo $p$, extended by $\overline{0} := 0$.
Over the full period, the discrepancy of the resulting $s$-tuples is of order $O\bigl(p^{-1/2}(\log p)^s\bigr)$. For parts of the period we get similar results: as an upper bound we have $D_N = O\bigl(N^{-1} p^{1/2} (\log p)^{s+1}\bigr)$, and we can show the existence of EICGs with $D_N \ge c\, N^{-1/2}$ for some constant $c > 0$.
[1] S.L. Anderson. Random number generators on vector supercomputers and other advanced architectures. SIAM Rev., 32:221–251, 1990.
[2] D.A. André, G.L. Mullen, and H. Niederreiter. Figures of merit for digital multistep pseudorandom numbers. Math. Comp., 54:737–748, 1990.
[3] K. Binder and D.W. Heermann. Monte Carlo Simulation in Statistical Physics. An Introduction. 2nd corr. ed. Springer-Verlag Heidelberg New York, 1992.
[4] P. Bratley, B. L. Fox, and L. E. Schrage. A Guide to Simulation. Springer Verlag, second edition, 1987.
[5] G.J. Chaitin. Randomness and mathematical proof. Sci. Amer., 232:47–52, 1975.
[6] Wun-Seng Chou. On inversive maximal period polynomials over finite fields. Appl. Algebra Engrg. Comm. Comput., 6:245–250, 1995.
[7] T. Cochrane. On a trigonometric inequality of Vinogradov. J. Number Th., 27:9–16, 1987.
[8] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. The MIT Press, first edition, 1989.
[9] R.R. Coveyou and R.D. MacPherson. Fourier analysis of uniform random number generators. J. Assoc. Comput. Mach., 14:100–119, 1967.
[10] A. De Matteis and S. Pagnutti. Parallelization of random number generators and long-range correlations. Numer. Math., 53:595–608, 1988.
[11] G. Dueck and T. Scheuer. Threshold Accepting: A General Purpose Algorithm Appearing Superior to Simulated Annealing. Journal of Computational Physics, pages 161–175, 1990.
[12] W.F. Eddy. Random number generators for parallel processors. J. Comp. Appl. Math., 31:63–71, 1990.
[13] J. Eichenauer, H. Grothe, and J. Lehn. Marsaglia’s lattice test and non-linear congruential pseudo random number generators. Metrika, 35:241–250, 1988.
[14] J. Eichenauer and J. Lehn. A non-linear congruential pseudo random number generator. Statist. Papers, 27:315–326, 1986.
[15] J. Eichenauer-Herrmann. Statistical independence of a new class of inversive congruential pseudorandom numbers. Math. Comp., 60:375–384, 1993.
[16] J. Eichenauer-Herrmann. Nonoverlapping pairs of explicit inversive congruential pseudorandom numbers. Monatsh. Math., 119:49–61, 1995.
[17] J. Eichenauer-Herrmann. Modified explicit inversive congruential pseudorandom numbers with power of 2 modulus. Statistics and Computing, 6:31–36, 1996.
[18] J. Eichenauer-Herrmann and F. Emmerich. A review of compound methods for pseudorandom number generation. In P. Hellekalek, G. Larcher, and P. Zinterhof, editors, Proceedings of the 1st Salzburg Minisymposium on Pseudorandom Number Generation and Quasi-Monte Carlo Methods, Salzburg, Nov 18, 1994, volume ACPC/TR 95-4 of Technical Report Series, pages 5–14. ACPC – Austrian Center for Parallel Computation, University of Vienna, Austria, 1995.
[19] J. Eichenauer-Herrmann and F. Emmerich. Compound inversive congruential numbers: an average-case analysis. Math. Comp., 65:215–225, 1996.
[20] J. Eichenauer-Herrmann and H. Grothe. A remark on long-range correlations in multiplicative congruential pseudo random number generators. Numer. Math., 56:609–611, 1989.
[21] J. Eichenauer-Herrmann and E. Herrmann. A survey of quadratic and inversive congruential pseudorandom numbers. Submitted to Proceedings of the Second International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Salzburg, July 9–12, 1996.
[22] J. Eichenauer-Herrmann and K. Ickstadt. Explicit inversive congruential pseudorandom numbers with power of two modulus. Math. Comp., 62:787–797, 1994.
[23] J. Eichenauer-Herrmann and H. Niederreiter. Bounds for exponential sums and their applications to pseudorandom numbers. Acta Arith., 67:269–281, 1994.
[24] K. Entacher. Selected random number generators in run tests. Preprint, Institut für Mathematik, Universität Salzburg, Austria, 1996.
[25] K. Entacher and P. Hellekalek. Parallel stochastic simulation: inversive pseudorandom number generators. In G. De Pietro, A. Giordano, M. Vajteršic, and P. Zinterhof, editors, Proceedings of the International Workshop Parallel Numerics 95, Sorrento, Italy, September 27–29, 1995, pages 1–14. IRSIP Institute of the NRC of Italy, Naples, 1995.
[26] K. Entacher and H. Leeb. Inversive pseudorandom number generators: empirical results. In Proceedings of the Conference Parallel Numerics 95, Sorrento, Italy, September 27–29, 1995, 1995.
[27] G.S. Fishman. Multiplicative congruential random number generators with modulus $2^\beta$: an exhaustive analysis for $\beta = 32$ and a partial analysis for $\beta = 48$. Math. Comp., 54:331–344, 1990.
[28] G.S. Fishman and L.R. Moore. A statistical evaluation of multiplicative congruential random number generators with modulus $2^{31}-1$. J. Amer. Statist. Assoc., 77:129–136, 1982.
[29] G.S. Fishman and L.R. Moore. An exhaustive analysis of multiplicative congruential random number generators with modulus $2^{31}-1$. SIAM J. Sci. Statist. Comput. (see also the Erratum, ibid. 7(1986), p. 1058), 7:24–45, 1986.
[30] M. Flahive and H. Niederreiter. On inversive congruential generators for pseudorandom numbers. In G.L. Mullen and P.J.-S. Shiue, editors, Finite Fields, Coding Theory, and Advances in Communications and Computing, pages 75–80. Dekker, New York, 1992.
[31] Mark Fleischer. Simulated Annealing: Past, Present, and Future. In Proceedings of the 1995 Winter Simulation Conference, pages 155–161, 1995.
[32] J. Foley, A. van Dam, S. Feiner, and J. Hughes. Computer Graphics: Principles and Practice. Addison-Wesley, second edition, 1990.
[33] I. Goldberg and D. Wagner. Netscape SSL implementation cracked! Available on the WWW at http://tezcat.com/web/security/items/ssl-news.txt.
[34] T. Gonzales, S. Sahni, and W.R. Franta. An efficient algorithm for the Kolmogorov-Smirnov and Lillefors tests. ACM Trans. Mathem. Softw., 3:60–64, 1977.
[35] J. Gordon. Fast Multiplicative Inverse in Modular Arithmetic. In H. J. Beker and F. S. Piper, editors, Cryptography and Coding. Oxford Clarendon Press, 1989.
[36] J.H. Halton. Pseudo-random trees: multiple independent sequence generators for parallel and branching computations. J. Comp. Physics, 84:1–56, 1989.
[37] F. Härtel. Zufallszahlen für Simulationsmodelle. PhD thesis, Hochschule St. Gallen für Wirtschafts-, Rechts- und Sozialwissenschaften, St. Gallen, 1994.
[38] S. Heinrich. Efficient algorithms for computing the discrepancy. Interner Bericht, Fachbereich Informatik, Universität Kaiserslautern, 1995.
[39] P. Hellekalek. Study of algorithms for primitive polynomials. Report D5H-1, CEI-PACT Project, WP5.1.2.1.2, Research Institute for Software Technology, University of Salzburg, Austria, 1994.
[40] P. Hellekalek. Correlations between pseudorandom numbers: theory and numerical practice. In P. Hellekalek, G. Larcher, and P. Zinterhof, editors, Proceedings of the 1st Salzburg Minisymposium on Pseudorandom Number Generation and Quasi-Monte Carlo Methods, Salzburg, Nov 18, 1994, volume ACPC/TR 95-4 of Technical Report Series, pages 43–73. ACPC – Austrian Center for Parallel Computation, University of Vienna, Austria, 1995.
[41] P. Hellekalek. On correlation analysis of pseudorandom numbers. Submitted to Proceedings of the Second International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Salzburg, July 9–12, 1996, 1996.
[42] P. Hellekalek and K. Entacher. Revised implementation and testing of the algorithms for IMP-polynomials. Report D5H-3, CEI-PACT Project, WP5.1.2.1.2, Research Institute for Software Technology, University of Salzburg, Austria, 1995.
[43] P. Hellekalek and K. Entacher. Tables of IMP-polynomials. Report D5H-4, CEI-PACT Project, WP5.1.2.1.2, Research Institute for Software Technology, University of Salzburg, Austria, 1995.
[44] P. Hellekalek and H. Leeb. Dyadic diaphony. To appear in Acta Arithmetica, 1996.
[45] P. Hellekalek, M. Mayer, and A. Weingartner. Implementation of algorithms for IMP-polynomials. Report D5H-2, CEI-PACT Project, WP5.1.2.1.2, Research Institute for Software Technology, University of Salzburg, Austria, 1994.
[46] P. Hellekalek and H. Niederreiter. The weighted spectral test: diaphony. In preparation, 1996.
[47] F. James, J. Hoogland, and R. Kleiss. Multidimensional sampling for simulation and integration: measures, discrepancies, and quasi-random numbers. Preprint submitted to Computer Physics Communications, 1996.
[48] M.H. Kalos and P.A. Whitlock. Monte Carlo Methods, Volume I: Basics. Wiley, New York, 1986.
[49] J. Kiefer. On large deviations of the empiric d.f. of vector chance variables and a law of the iterated logarithm. Pacific J. Math., 11:649–660, 1961.
[50] D. E. Knuth. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. Addison-Wesley, Reading, MA, second edition, 1981.
[51] L. Kuipers and H. Niederreiter. Uniform Distribution of Sequences. Wiley and Sons, New York London Sydney Toronto, 1974.
[52] J. C. Lagarias. Pseudorandom numbers. Statistical Science, 8:31–39, 1993.
[53] P. L’Ecuyer. Random numbers for simulation. Comm. ACM, 33:85–97, 1990.
[54] P. L’Ecuyer. Testing random number generators. In J.J. Swain et al., editor, Proc. 1992 Winter Simulation Conference (Arlington, Va., 1992), pages 305–313. IEEE Press, Piscataway, N.J., 1992.
[55] P. L’Ecuyer. Uniform random number generation. Ann. Oper. Res., 53:77–120, 1994.
[56] P. L’Ecuyer and S. Côté. Implementing a Random Number Package with Splitting Facilities. ACM Transactions on Mathematical Software, 17(1):98–111, March 1991.
[57] H. Leeb. pLab – a system for testing random numbers. In M. Vajteršic and P. Zinterhof, editors, Proceedings of the International Workshop on Parallel Numerics ’94, Smolenice, Sept. 19–21, pages 89–99. Slovak Academy of Sciences, Institute for Informatics, 1994. Available on the internet at http://random.mat.sbg.ac.at.
[58] H. Leeb. On the digit test. In P. Hellekalek, G. Larcher, and P. Zinterhof, editors, Proceedings of the 1st Salzburg Minisymposium on Pseudorandom Number Generation and Quasi-Monte Carlo Methods, Salzburg, Nov 18, 1994, volume ACPC/TR 95-4 of Technical Report Series, pages 109–121. ACPC – Austrian Center for Parallel Computation, University of Vienna, Austria, 1995.
[59] H. Leeb. Random Numbers for Computer Simulation. Master’s thesis, Institut für Mathematik, Universität Salzburg, Austria, 1995.
[60] H. Leeb. A weak law for diaphony. Rist++ 13, Research Institute for Software Technology, University of Salzburg, 1996.
[61] H. Leeb and S. Wegenkittl. Inversive and linear congruential pseudorandom number generators in empirical tests. submitted to ACM Trans. Modeling and Computer Simulation, 1996.
[62] V. F. Lev. On two versions of $L^2$-discrepancy and geometrical interpretation of diaphony. Acta Math. Hungar., 69:281–300, 1995.
[63] H. Levene and J. Wolfowitz. The covariance matrix of runs up and down. Annals Math. Stat., 15:59–69, 1944.
[64] R. Lidl and H. Niederreiter. Finite Fields. Addison-Wesley, Reading, Mass., 1983.
[65] G. Marsaglia. Random numbers fall mainly in the planes. Proc. Nat. Acad. Sci., 61:25–28, 1968.
[66] G. Marsaglia. A current view of random number generators. In L. Billard, editor, Computer Science and Statistics: The Interface, pages 3–10, Amsterdam, 1985. Elsevier Science Publishers B.V. (North Holland).
[67] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Texts and Monographs in Computer Science. Springer-Verlag, New York, 1993.
[68] C.J. Moreno and O. Moreno. Exponential sums and Goppa codes: I. Proc. Amer. Math. Soc., 111:523–531, 1991.
[69] Netscape Communications Corporation. Potential Vulnerability in Netscape Products. Available on the WWW at http://www.netscape.com/newsref/std/random_seed_security.html.
[70] H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia, USA, 1992.
[71] H. Niederreiter. On a new class of pseudorandom numbers for simulation methods. J. Comput. Appl. Math., 56:159–167, 1994.
[72] H. Niederreiter. New developments in uniform pseudorandom number and vector generation. In H. Niederreiter and P.J.-S. Shiue, editors, Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, volume 106 of Lecture Notes in Statistics. Springer-Verlag, Heidelberg New York, 1995.
[73] S.K. Park and K.W. Miller. Random number generators: good ones are hard to find. Comm. ACM, 31:1192–1201, 1988.
[74] C. A. Pickover. Random number generators: pretty good ones are easy to find. The Visual Computer, 11:369–377, 1995.
[75] B. D. Ripley. Stochastic Simulation. John Wiley, New York, 1987.
[76] R. A. Rueppel. Stream ciphers. In G.J. Simmons, editor, Contemporary Cryptology: The Science of Information Integrity, chapter 2. IEEE Press, 1992.
[77] B. Schneier. Applied Cryptography. John Wiley & Sons, Inc., first edition, 1993.
[78] I.M. Sobol. Die Monte-Carlo-Methode. VEB Deutscher Verlag der Wissenschaften, 1983.
[79] H. Stegbuchner. Zur quantitativen Theorie der Gleichverteilung mod 1. Arbeitsberichte, Mathematisches Institut der Universität Salzburg, Salzburg, Austria, 1980.
[80] O. Strauch. L² discrepancy. Math. Slovaca, 44:601–632, 1994.
[81] R.C. Tausworthe. Random numbers generated by linear recurrence modulo two. Math. Comp., 19:201–209, 1965.
[82] S. Tezuka. Uniform Random Numbers: Theory and Practice. Kluwer Academic Publ., 1995.
[83] S. Tezuka and M. Fushimi. Calculation of Fibonacci polynomials for GFSR sequences with low discrepancies. Math. Comp., 60:763–770, 1993.
[84] J.F. Traub and H. Woźniakowski. The Monte Carlo algorithm with a pseudorandom generator. Math. Comp., 58:323–339, 1992.
[85] S. Wegenkittl. Empirical Testing of Pseudorandom Number Generators. Master’s thesis, Institut für Mathematik, Universität Salzburg, Austria, 1995.
[86] S. Wegenkittl. On empirical testing of pseudorandom number generators. In G. De Pietro, A. Giordano, M. Vajteršic, and P. Zinterhof, editors, Proceedings of the International Workshop Parallel Numerics 95, Sorrento, Italy, September 27–29, 1995, pages 113–123. IRSIP Institute of the NRC of Italy, Naples, 1995.
[87] A. Weingartner. Nonlinear congruential pseudorandom number generators. Master’s thesis, Universität Salzburg, Austria, 1994.
[88] B.A. Wichmann and I.D. Hill. An efficient and portable pseudo-random number generator. Appl. Statist., 31:188–190, 1982. Corrections, ibid. 33, 123 (1984).
[89] P. Winker and K.-T. Fang. Application of threshold accepting to the evaluation of the discrepancy of a set of points. Research report, Universität Konstanz, Germany, 1995.
[90] P. Zinterhof. Über einige Abschätzungen bei der Approximation von Funktionen mit Gleichverteilungsmethoden. Sitzungsber. Österr. Akad. Wiss. Math.-Natur. Kl. II, 185:121–132, 1976.
Name: Otmar Lendl
Date of birth: May 8th, 1970
Place of birth: Salzburg, Austria
Parents: Ingeborg and Wolfgang Lendl
Education:
1976–1980: Volksschule in Salzburg
1980–1988: Gymnasium in Salzburg
1988–1996: University studies (M.Sc. in Mathematics) at the University of Salzburg