Published on Hootsgo (https://hootsgo.org)


More about Benford's Law

The challenge this month is to find out more about Benford's law.

The math puzzle Are All Digits Equal? [1] introduced Benford’s law, which states that in many collections of data the first digits are not evenly distributed among 1 through 9. Rather, the digit 1 is the most common first digit, and the others follow in order of decreasing frequency.


Can you find more examples?


This content has been re-published with permission from SEED. Copyright © 2025 Schlumberger Excellence in Education Development (SEED), Inc.

Course: 

  • Math [2]
  • Probability [3]
Result/Solution(s)

Solution: More About Benford's Law Math Puzzle

Here are a few more examples of Benford's law.

Juan Calvarado's Solution

Juan Calvarado looked at the sizes of the files in the /system32/ directory of his computer.


First digit   Number of files with that first digit   Percentage of total
1             587                                     29
2             311                                     15
3             236                                     12
4             169                                     8
5             185                                     9
6             250                                     12
7             115                                     6
8             112                                     5
9             85                                      4


The most common first digit is 1, with the others following in almost decreasing order of frequency. The exceptions are 5 and 6, which both occur more often than 4.
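To see how close Juan's counts come to the law, they can be set beside the shares that Benford's law is usually stated to predict, log10(1 + 1/d) for first digit d. That formula is not given in the puzzle text; it is brought in here as the standard statement of the law. The function name benford_share is made up for this sketch.

```python
import math

# Number of files whose size starts with each digit, from the table above.
counts = {1: 587, 2: 311, 3: 236, 4: 169, 5: 185, 6: 250, 7: 115, 8: 112, 9: 85}
total = sum(counts.values())

def benford_share(d):
    """Share of values expected to start with digit d (standard Benford formula)."""
    return math.log10(1 + 1 / d)

for d in range(1, 10):
    observed = 100 * counts[d] / total
    predicted = 100 * benford_share(d)
    print(f"{d}: observed {observed:4.1f}%   predicted {predicted:4.1f}%")
```

Printing both columns side by side makes it easy to spot which digits sit above or below the predicted share.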

Of the 235 countries listed on the GeoHive [4] Web site, 64 have population figures that begin with 1. That’s 27%.

The finfacts [5] Web site shows the performance of Japan’s Nikkei stock market index for the 91 years from 1914–2004 inclusive. The end-of-year value of the index began with a 1 in 30 of those years. That’s 33%.

José Navarro's Solution

José Navarro offers this explanation of Benford’s law:

I think that the simplest explanations come in two or three parts. First, show that any measure that changes in a multiplicative manner (like stock values) obeys Benford's law. Then, show that any measure that is scale invariant (such as anything that has arbitrary units, like length in km or weight in pounds) is also a multiplicative measure and hence obeys Benford's law. And finally, show that this effect is more universal than it first appears.

1. Multiplication
This is the easiest to understand intuitively. It refers to measures that change in a multiplicative manner from the last value, such as stock values from day to day that change on a percentage basis regardless of their actual value.

Malcolm Browne said in a New York Times article, Following Benford's Law, or Looking Out for No. 1 [6] (published August 4, 1998):

Most numbers we see every day are not random quantities in and of themselves. They're usually computed qualities with some aspect of multiplication to them.

Consider, for example, any property that grows on a percentage basis, such as the Dow Jones Industrial Average. It typically grows a few percent a year. Suppose, just to pick a rate, that on average the DJIA grows at 7% a year. At that rate, it doubles about every ten years. Suppose that the DJIA is at 10000. After ten years of having 1 as the leading digit, it finally gets to 20000. Ten more years go by, but in those ten years it doubles to 40000, not 30000. Therefore, those ten years were spent about half starting with 2 and about half starting with 3. Ten more years go by, and it doubles again to 80000. Now 4, 5, 6 and 7 have all been the leading digit within only ten years. Eventually we get up to 100000, and spend another ten years starting with 1. Pick a random date and you'd expect that the DJIA on that day would be about twice as likely to start with 1 as with 2, and about four times as likely to start with 1 as with 5.
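The growth argument can be tried out with a short simulation. This is only a sketch under the passage's own assumptions (a start value of 10000 and steady 7% growth every year); the 1000-year run length and the names leading_digit and simulate_growth are choices made for the illustration.

```python
from collections import Counter

def leading_digit(x):
    """Return the first significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def simulate_growth(start=10000.0, rate=0.07, years=1000):
    """Count how many year-start values begin with each digit."""
    value = start
    tally = Counter()
    for _ in range(years):
        tally[leading_digit(value)] += 1
        value *= 1 + rate  # grow by a fixed percentage each year
    return tally

tally = simulate_growth()
for d in range(1, 10):
    print(d, tally[d])
```

Running it over many doublings smooths out the rough "about half" estimates in the passage, since the index actually spends a little longer on 2 than on 3 within each doubling.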

2. Scale Invariance
Imagine you have a measure like length of rivers in kilometers. When you look at the first digit of all the numbers, you get a particular distribution of the numbers 1 through 9. Now convert the lengths to another unit, say, miles, by multiplying by 0.62, and in general you may get a different first-digit distribution. But because either unit is arbitrary, you should have roughly the same distribution of first digits in either unit system. Requiring this scale invariance in the first-digit distribution is the same as requiring invariance under multiplication. A rigorous approach is found in many places, for example MathWorld [7].
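A small experiment can illustrate the scale invariance idea. The sketch below uses made-up data, not real river lengths: one set spread evenly on a logarithmic scale over four powers of ten (which follows Benford's law), and one set spread evenly between 1 and 10 (where every first digit is equally likely). Both are multiplied by 0.62, the kilometers-to-miles factor from the text. All function names are invented for the example.

```python
import random

def first_digit(x):
    """Return the first significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def digit_shares(values):
    """Fraction of values starting with each digit 1..9."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts[1:]]

def largest_shift(values, factor):
    """Largest change in any digit's share after multiplying by factor."""
    before = digit_shares(values)
    after = digit_shares([factor * v for v in values])
    return max(abs(a - b) for a, b in zip(before, after))

rng = random.Random(1)  # fixed seed so the run is repeatable
# Made-up "lengths in km", spread evenly on a log scale over four decades.
log_spread = [10 ** rng.uniform(0, 4) for _ in range(100_000)]
# Made-up lengths spread evenly between 1 and 10.
flat = [rng.uniform(1, 10) for _ in range(100_000)]

print("log-spread data:", round(largest_shift(log_spread, 0.62), 3))
print("flat data:      ", round(largest_shift(flat, 0.62), 3))
```

The log-spread set should barely change when the unit changes, while the flat set shifts noticeably, which is the sense in which only a Benford-like distribution can be independent of the unit chosen.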

3. Central-Limit, or Distribution-of-Distributions
It is also true that even if you start with a uniform distribution in first digit, after enough multiplications or scale changes you end up with a Benford distribution. Think of it as the limiting distribution after sufficient multiplications, regardless of which distribution you started with. There are also references to this effect in MathWorld [7].
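This limiting behavior can also be checked numerically. The sketch below is an illustration under assumed settings, not José's own calculation: it starts with numbers spread evenly between 1 and 10, so each first digit is about equally common, then multiplies every number by a random factor between 0.5 and 2 twenty times. The factor range, the number of rounds, and the sample size are arbitrary choices.

```python
import random

def first_digit(x):
    """Return the first significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def share_of_ones(values):
    """Fraction of values whose first digit is 1."""
    return sum(1 for v in values if first_digit(v) == 1) / len(values)

rng = random.Random(42)  # fixed seed so the run is repeatable
values = [rng.uniform(1, 10) for _ in range(50_000)]
start_share = share_of_ones(values)

# Repeatedly multiply each value by its own random factor.
for _ in range(20):
    values = [v * rng.uniform(0.5, 2.0) for v in values]
end_share = share_of_ones(values)

print(f"share starting with 1 before: {start_share:.3f}")
print(f"share starting with 1 after:  {end_share:.3f}")
```

The share of leading 1s should start near one ninth and end near the 30% that Benford's law is usually quoted as predicting.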

Another Solution

Finally, here’s a case that seems to contradict the scale invariance principle that José describes above. Large Lakes of the World [8] lists the 35 largest lakes:


Name and location                                             sq. mi.    sq. km
Caspian Sea, Azerbaijan-Russia-Kazakhstan-Turkmenistan-Iran   152,239   394,299
Superior, U.S.-Canada                                          31,820    82,414
Victoria, Tanzania-Uganda                                      26,828    69,485
Huron, U.S.-Canada                                             23,010    59,596
Michigan, U.S.                                                 22,400    58,016
Aral, Kazakhstan-Uzbekistan                                    13,000    33,800
Tanganyika, Tanzania-Congo                                     12,700    32,893
Baikal, Russia                                                 12,162    31,500
Great Bear, Canada                                             12,000    31,080
Nyasa, Malawi-Mozambique-Tanzania                              11,600    30,044
Great Slave, Canada                                            11,170    28,930
Chad, Chad-Niger-Nigeria                                        9,946    25,760
Erie, U.S.-Canada                                               9,930    25,719
Winnipeg, Canada                                                9,094    23,553
Ontario, U.S.-Canada                                            7,520    19,477
Balkhash, Kazakhstan                                            7,115    18,428
Ladoga, Russia                                                  7,000    18,130
Onega, Russia                                                   3,819     9,891
Titicaca, Bolivia-Peru                                          3,141     8,135
Nicaragua, Nicaragua                                            3,089     8,001
Athabaska, Canada                                               3,058     7,920
Rudolf, Kenya                                                   2,473     6,405
Reindeer, Canada                                                2,444     6,330
Eyre, South Australia                                           2,400     6,216
Issyk-Kul, Kyrgyzstan                                           2,394     6,200
Urmia, Iran                                                     2,317     6,001
Torrens, South Australia                                        2,200     5,698
Vänern, Sweden                                                  2,141     5,545
Winnipegosis, Canada                                            2,086     5,403
Mobutu Sese Seko, Uganda                                        2,046     5,299
Nettilling, Baffin Island, Canada                               1,950     5,051
Nipigon, Canada                                                 1,870     4,843
Manitoba, Canada                                                1,817     4,706
Great Salt, U.S.                                                1,800     4,662
Kioga, Uganda                                                   1,700     4,403


If the areas of the lakes are measured in square miles, 12 of them, or 34%, begin with the digit 1. But if they are measured in square kilometers, only 3 of them, less than 9%, begin with 1. Why? A footnote to the table gives us an answer: Only lakes with an area greater than 1,700 sq mi (4,400 sq km) are included. This means that the five smallest lakes, which all begin with 1 when measured in square miles, move into the 4,000s and 5,000s when measured in square kilometers.

If the cutoff point had been 1,700 sq km (about 656 sq mi), the result might be very different. Maybe you can track down that data.
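The first-digit counts for the lake table can be reproduced with a few lines of code. The areas below are copied from the table above (names shortened to the lake name only); the helper count_leading_ones is a name invented for this sketch.

```python
# (lake, area in sq. mi., area in sq. km), copied from the table above
lakes = [
    ("Caspian Sea", 152239, 394299), ("Superior", 31820, 82414),
    ("Victoria", 26828, 69485), ("Huron", 23010, 59596),
    ("Michigan", 22400, 58016), ("Aral", 13000, 33800),
    ("Tanganyika", 12700, 32893), ("Baikal", 12162, 31500),
    ("Great Bear", 12000, 31080), ("Nyasa", 11600, 30044),
    ("Great Slave", 11170, 28930), ("Chad", 9946, 25760),
    ("Erie", 9930, 25719), ("Winnipeg", 9094, 23553),
    ("Ontario", 7520, 19477), ("Balkhash", 7115, 18428),
    ("Ladoga", 7000, 18130), ("Onega", 3819, 9891),
    ("Titicaca", 3141, 8135), ("Nicaragua", 3089, 8001),
    ("Athabaska", 3058, 7920), ("Rudolf", 2473, 6405),
    ("Reindeer", 2444, 6330), ("Eyre", 2400, 6216),
    ("Issyk-Kul", 2394, 6200), ("Urmia", 2317, 6001),
    ("Torrens", 2200, 5698), ("Vänern", 2141, 5545),
    ("Winnipegosis", 2086, 5403), ("Mobutu Sese Seko", 2046, 5299),
    ("Nettilling", 1950, 5051), ("Nipigon", 1870, 4843),
    ("Manitoba", 1817, 4706), ("Great Salt", 1800, 4662),
    ("Kioga", 1700, 4403),
]

def count_leading_ones(areas):
    """Count how many whole-number areas start with the digit 1."""
    return sum(1 for a in areas if str(a)[0] == "1")

ones_sq_mi = count_leading_ones(area_mi for _, area_mi, _ in lakes)
ones_sq_km = count_leading_ones(area_km for _, _, area_km in lakes)

print(f"sq. mi.: {ones_sq_mi} of {len(lakes)} start with 1")
print(f"sq. km:  {ones_sq_km} of {len(lakes)} start with 1")
```

The same helper could be pointed at a longer list of lakes with a different cutoff, if that data can be found.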

  • Probability puzzle [9]
  • Math Puzzle [10]
Copyright © 2018 Hootsgo. All Rights Reserved. Hootsgo is a registered 501 (c) (3) non-profit organization.
Donated by Dev2Source I.T. Services Ltd.

Source URL: https://hootsgo.org/?q=more-about-benfords-law

Links
[1] https://hootsgo.org/node/6524
[2] https://hootsgo.org/?q=taxonomy/term/50
[3] https://hootsgo.org/?q=course/probability
[4] http://www.geohive.com/
[5] http://www.finfacts.com/Private/curency/nikkei225performance.htm
[6] http://www.nytimes.com/1998/08/04/science/following-benford-s-law-or-looking-out-for-no-1.html
[7] http://mathworld.wolfram.com/BenfordsLaw.html
[8] http://www.infoplease.com/ipa/A0001777.html
[9] https://hootsgo.org/?q=tags/probability-puzzle
[10] https://hootsgo.org/?q=tags/math-puzzle