Description
Book Synopsis
Features a broad introduction to recent research on Turing's formula and presents modern applications in statistics, probability, information theory, and other areas of modern data science.
Turing's formula is, perhaps, the only known method for estimating the underlying distributional characteristics beyond the range of observed data without making any parametric or semiparametric assumptions. This book presents a clear introduction to Turing's formula and its connections to statistics. Topics with relevance to a variety of different fields of study are included, such as information theory; statistics; probability; computer science, including artificial intelligence and machine learning; big data; biology; ecology; and genetics. The author examines many core statistical issues within modern data science from Turing's perspective. A systematic approach to long-standing problems such as entropy and mutual information estimation, diversity index estimat
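For readers unfamiliar with the formula the synopsis refers to, the following is a minimal sketch rather than anything taken from the book. In its basic form, Turing's formula estimates the total probability of all categories not yet seen in a sample as N1 / n, where N1 is the number of categories observed exactly once and n is the sample size. The function name `turing_missing_mass` and the toy sample are illustrative assumptions.

```python
from collections import Counter


def turing_missing_mass(sample):
    """Estimate the total probability of unseen categories as N1 / n.

    N1 is the number of categories that occur exactly once in the sample
    and n is the sample size. The name is hypothetical, not from the book.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    # Count the singletons: categories observed exactly once.
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n


# Toy sample: a appears 3 times, b twice, c once, d twice, e once.
# Two singletons (c and e) out of nine observations.
sample = list("aaabbcdde")
print(turing_missing_mass(sample))
```

The estimate uses no parametric model of the underlying distribution, only the observed frequency counts, which is the property the synopsis emphasizes.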
Table of Contents
Preface
1 Turing’s Formula
1.1 Turing’s Formula
1.2 Univariate Normal Laws
1.3 Multivariate Normal Laws
1.4 Turing’s Formula Augmented
1.5 Goodness-of-Fit by Counting Zeros
1.6 Remarks
1.7 Exercises
2 Estimation of Simpson’s Indices
2.1 Generalized Simpson’s Indices
2.2 Estimation of Simpson’s Indices
2.3 Normal Laws
2.4 Illustrative Examples
2.5 Remarks
2.6 Exercises
3 Estimation of Shannon’s Entropy
3.1 A Brief Overview
3.2 The Plug-In Entropy Estimator
3.2.1 When K Is Finite
3.2.2 When K Is Countably Infinite
3.3 Entropy Estimator in Turing’s Perspective
3.3.1 When K Is Finite
3.3.2 When K Is Countably Infinite
3.4 Appendix
3.4.1 Proof of Lemma 3.2
3.4.2 Proof of Lemma 3.5
3.4.3 Proof of Corollary 3.5
3.4.4 Proof of Lemma 3.14
3.4.5 Proof of Lemma 3.18
3.5 Remarks
3.6 Exercises
4 Estimation of Diversity Indices
4.1 A Unified Perspective on Diversity Indices
4.2 Estimation of Linear Diversity Indices
4.3 Estimation of Rényi’s Entropy
4.4 Remarks
4.5 Exercises
5 Estimation of Information
5.1 Introduction
5.2 Estimation of Mutual Information
5.2.1 The Plug-In Estimator
5.2.2 Estimation in Turing’s Perspective
5.2.3 Estimation of Standardized Mutual Information
5.2.4 An Illustrative Example
5.3 Estimation of Kullback–Leibler Divergence
5.3.1 The Plug-In Estimator
5.3.2 Properties of the Augmented Plug-In Estimator
5.3.3 Estimation in Turing’s Perspective
5.3.4 Symmetrized Kullback–Leibler Divergence
5.4 Tests of Hypotheses
5.5 Appendix
5.5.1 Proof of Theorem 5.12
5.6 Exercises
6 Domains of Attraction on Countable Alphabets
6.1 Introduction
6.2 Domains of Attraction
6.3 Examples and Remarks
6.4 Appendix
6.4.1 Proof of Lemma 6.3
6.4.2 Proof of Theorem 6.2
6.4.3 Proof of Lemma 6.6
6.5 Exercises
7 Estimation of Tail Probability
7.1 Introduction
7.2 Estimation of Pareto Tail
7.3 Statistical Properties of AMLE
7.4 Remarks
7.5 Appendix
7.5.1 Proof of Lemma 7.7
7.5.2 Proof of Lemma 7.9
7.6 Exercises
References
Author Index
Subject Index