Benford’s law applied to hydrology data: results and relevance to other geophysical data.

*(English)* Zbl 1155.86322

Summary: Benford’s law gives the expected frequencies of the digits in tabulated data and asserts that the lower digits (1, 2, and 3) are expected to occur more frequently than the higher digits. This study tested whether the law applied to two large earth science data sets. The first test analyzed streamflow statistics and the finding was a close conformity to Benford’s law. The second test analyzed the sizes of lakes and wetlands, and the finding was that the data did not conform to Benford’s law. Further analysis showed that the lake and wetland data followed a power law. The expected digit frequencies for data following a power law were derived, and the lake data had a close fit to these expected digit frequencies. The use of Benford’s law could serve as a quality check for streamflow data subsets, perhaps related to time or geographical area. Also, with the importance of lakes as essential components of the water cycle, either Benford’s law or the expected digit frequencies of data following a power law could be used as an authenticity and validity check on future databases dealing with water bodies.
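As a minimal sketch of the kind of conformity check the summary describes, the Python below computes the Benford first-digit probabilities, log10(1 + 1/d) for d = 1, ..., 9, and compares them with the observed first-digit frequencies of a sample. The function names, the use of the mean absolute deviation as the measure of fit, and the powers-of-two sample are illustrative assumptions; they are not the statistics or the data used by the authors.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first significant digit of a nonzero finite number."""
    # repr() of a float is its shortest round-trip decimal form, so the
    # first character in 1..9 is the leading significant digit.
    for ch in repr(abs(float(x))):
        if ch in "123456789":
            return int(ch)
    raise ValueError(f"no significant digit in {x!r}")


def first_digit_frequencies(values):
    """Observed relative frequency of each first digit 1..9."""
    digits = [first_digit(v) for v in values]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}


def mean_absolute_deviation(observed, expected):
    """Average absolute gap between observed and expected frequencies."""
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9


# Hypothetical sample: powers of two, a sequence known to follow Benford's law.
sample = [2.0 ** k for k in range(1000)]
mad = mean_absolute_deviation(
    first_digit_frequencies(sample), benford_first_digit_probs()
)
print(f"mean absolute deviation from Benford: {mad:.5f}")
```

A small deviation suggests conformity. What threshold counts as "close" is a judgment the sketch leaves open.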

We give several applications and avenues for future research, including an assessment of whether the digit frequencies of data could be used to derive the power law exponent, and whether the digit frequencies could be used to verify the range over which a power law applies. Our results indicate that data related to water bodies should conform to Benford’s Law and that nonconformity could indicate (a) an incomplete data set, (b) a sample that is not representative of the population, (c) excessive rounding of the data, (d) data errors, inconsistencies, or anomalies, and/or (e) conformity to a power law with a large exponent.
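To illustrate how digit frequencies relate to a power law exponent, the sketch below assumes a density proportional to x^(-alpha) over whole decades [10^k, 10^(k+1)). Integrating that density over [d, d+1) and dividing by the integral over [1, 10) gives the first-digit probability ((d+1)^(1-alpha) - d^(1-alpha)) / (10^(1-alpha) - 1), which reduces to Benford’s law as alpha approaches 1. This parameterization, the function names, and the bisection-based estimator are assumptions made for illustration; the paper may define the exponent differently and does not necessarily use this procedure.

```python
import math


def power_law_first_digit_probs(alpha):
    """First-digit probabilities for a density ~ x**(-alpha) over whole decades."""
    s = 1.0 - alpha
    if abs(s) < 1e-12:
        # Limit alpha -> 1 is Benford's law.
        return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    # expm1 avoids cancellation when s is close to zero.
    norm = math.expm1(s * math.log(10))
    return {
        d: (math.expm1(s * math.log(d + 1)) - math.expm1(s * math.log(d))) / norm
        for d in range(1, 10)
    }


def estimate_alpha_from_first_digit(p1, lo=-5.0, hi=10.0, tol=1e-10):
    """Hypothetical estimator: invert P(first digit = 1) for alpha by bisection.

    Relies on P(1) increasing with alpha under the assumed density.
    """
    if not power_law_first_digit_probs(lo)[1] < p1 < power_law_first_digit_probs(hi)[1]:
        raise ValueError("p1 is outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if power_law_first_digit_probs(mid)[1] < p1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for alpha in (1.0, 1.5, 2.0):
    probs = power_law_first_digit_probs(alpha)
    print(alpha, [round(probs[d], 4) for d in range(1, 10)])
```

Under this assumed density, a larger exponent concentrates more mass on the digit 1, which is consistent with item (e) above: a power law with a large exponent departs visibly from Benford’s law.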

##### MSC:

86A32 Geostatistics

##### Keywords:

data integrity; hydrographic statistics; hydrometric statistics; streamflow analysis; power law exponent

*M. J. Nigrini* and *S. J. Miller*, Math. Geol. 39, No. 5, 469–490 (2007; Zbl 1155.86322)

