Is everything on this planet determined by randomness? This question is open to philosophical debate. What is certain is that every day thousands and thousands of engineers, scientists, business persons, manufacturers, and others are using tools from probability and statistics. The theory and practice of probability and statistics were developed during the last century and are still actively being refined and extended. In this book we will introduce the basic notions and ideas, and in this first chapter we present a diverse collection of examples where randomness plays a role.

1.1 Biometry: iris recognition

Biometry is the art of identifying a person on the basis of his or her personal biological characteristics, such as fingerprints or voice. Recent research suggests that with the human iris one can beat all existing automatic human identification systems. Iris recognition technology is based on the visible qualities of the iris. It converts these, via a video camera, into an “iris code” consisting of just 2048 bits. This is done in such a way that the code is hardly sensitive to the size of the iris or the size of the pupil. However, at different times and in different places the iris code of the same person will not be exactly the same. Thus one has to allow for a certain percentage of mismatching bits when identifying a person. In fact, the system allows about 34% mismatches! How can this lead to a reliable identification system? The miracle is that different persons have very different irides. In particular, over a large collection of different irides each code bit takes the values 0 and 1 about half of the time each. But that is certainly not sufficient: if one bit determined the other 2047, then we could distinguish only two persons. In other words, single bits may be random, but the correlation between bits is also crucial (we will discuss correlation at length in Chapter 10). John Daugman, who developed the iris recognition technology, compared 222 743 pairs of iris codes and concluded that of the 2048 bits 266 may be considered uncorrelated ([6]). He then argues that we may consider an iris code as the result of 266 coin tosses with a fair coin. This implies that if we compare two such codes from different persons, then there is an astronomically small probability that these two differ in fewer than 34% of the bits; almost all pairs will differ in about 50% of the bits. This is illustrated in Figure 1.1, which originates from [6] and was kindly provided by John Daugman. The iris code data consist of numbers between 0 and 1, each a Hamming distance (the fraction of mismatches) between two iris codes. The data have been summarized in two histograms, that is, two graphs that show the number of Hamming distances falling in a certain interval. We will encounter histograms and other summaries of data in Chapter 15. One sees from the figure that for codes from the same iris (left side) the mismatch fraction is only about 0.09, while for different irides (right side) it is about 0.46.
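The fair-coin argument can be checked with a short computation. The following is a minimal sketch, assuming (as in the argument attributed to Daugman above) that the comparison of two codes from different persons behaves like 266 independent bits, each mismatching with probability 1/2. The function name `prob_mismatch_fraction_below` is chosen here for illustration and does not come from [6].

```python
from fractions import Fraction
from math import comb


def prob_mismatch_fraction_below(n_bits: int, threshold: Fraction) -> Fraction:
    """Exact P(mismatch fraction < threshold) for n_bits independent fair bits.

    Two independent fair bits disagree with probability 1/2, so the number
    of mismatches is modelled as Binomial(n_bits, 1/2).
    """
    # Count the outcomes with k mismatches for every k whose fraction
    # k / n_bits lies strictly below the threshold.
    favourable = sum(
        comb(n_bits, k)
        for k in range(n_bits + 1)
        if Fraction(k, n_bits) < threshold
    )
    # Each of the 2**n_bits mismatch patterns is equally likely.
    return Fraction(favourable, 2 ** n_bits)


# 266 uncorrelated bits and the 34% threshold are both taken from the text.
p = prob_mismatch_fraction_below(266, Fraction(34, 100))
print(f"P(fewer than 34% mismatches) = {float(p):.2e}")
```

Exact rational arithmetic is used so that no rounding enters until the final conversion to a floating-point number for printing.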

[Figure: “Decision Environment for Iris Recognition”. Two histograms of counts against Hamming distance (horizontal axis from 0.0 to 1.0). Left: 546 comparisons of same iris pairs, mean = 0.089, standard deviation = 0.042. Right: 222,743 comparisons of different iris pairs, mean = 0.456, standard deviation = 0.018. d' = 11.36. Theoretical curves: binomial family. Theoretical cross-over rate: 1 in 1.2 million.]

Fig. 1.1. Comparison of same and different iris pairs.

Source: J. Daugman. Second IMA Conference on Image Processing: Mathematical Methods, Algorithms and Applications, 2000. Ellis Horwood Publishing Limited.
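To make the notions of Hamming distance and histogram concrete, here is a minimal sketch that uses simulated data, not Daugman's. It assumes the fair-coin model with 266 independent bits per code, so it only mimics the histogram for different irides and will not reproduce the exact numbers in Figure 1.1. The helper names, the 20 intervals, and the 1000 simulated pairs are arbitrary choices made for illustration.

```python
import random


def hamming_fraction(code_a, code_b):
    """Fraction of positions at which two equally long bit sequences differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def histogram_counts(values, n_bins=20):
    """Count how many values in [0, 1] fall in each of n_bins equal intervals."""
    counts = [0] * n_bins
    for v in values:
        # The value 1.0 is placed in the last interval.
        index = min(int(v * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


rng = random.Random(0)  # fixed seed, so the run is repeatable
N_BITS, N_PAIRS = 266, 1000  # 266 comes from the text; 1000 is arbitrary


def random_code():
    """One simulated code: N_BITS independent fair coin tosses (an assumption)."""
    return [rng.randint(0, 1) for _ in range(N_BITS)]


distances = [hamming_fraction(random_code(), random_code()) for _ in range(N_PAIRS)]
counts = histogram_counts(distances)
for i, count in enumerate(counts):
    print(f"[{i / 20:.2f}, {(i + 1) / 20:.2f}): {count}")
```

In such a run almost all simulated distances land in the intervals around 0.5, which is the behavior the fair-coin model predicts for codes from different persons.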

You may still wonder how it is possible that irides distinguish people so well. What about twins, for instance? The surprising thing is that although the color of eyes is hereditary, many features of iris patterns...