Philosophical musings on a diverse variety of subjects.

"Chenango" is an old Indian word allegedly meaning "land of the bullthistle." Or so the traditional story has it. The bullthistle (Cirsium vulgare) is not native to North America; it was probably brought over from Europe. Nevertheless, we in Chenango County, New York, use it as our county logo. I am a Bullthistle Birder, a Bullthistle Botanizer, and a Bullthistle Hiker. With this blog I am now a Bullthistle Blogger.



Monday, September 4, 2017

BENFORD'S LAW HAS ITS ORIGIN IN THE PARTITION OF INTEGERS



Donald A. Windsor


Benford’s Law states that 1 tends to be the most frequent leading digit in compilations of measured data (1).

In the partition of any integer, 1 is the most frequent number. The partitioning of integers is a basic property of our system of numbers and could be the origin of Benford’s Law.

An integer is partitioned by listing every way of writing it as a sum of positive integers, without regard to the order of the terms. For example, the partition of 5 consists of 7 partition sets.

5
4+1
3+2
3+1+1
2+2+1
2+1+1+1
1+1+1+1+1

The frequency distribution of the 20 numbers in the partition sets is:

5
4
3 3
2 2 2 2
1 1 1 1 1 1 1 1 1 1 1 1
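
The tally above can be reproduced with a short script. This is only a minimal sketch in Python; the function names partitions and part_frequencies are my own labels for illustration, not part of any published method.

```python
from collections import Counter


def partitions(n, max_part=None):
    """Yield each partition of n as a tuple of parts in non-increasing order."""
    if max_part is None:
        max_part = n
    if n == 0:
        # The empty tuple is the only partition of 0.
        yield ()
        return
    # Choose the largest part first, then partition what remains
    # using parts no larger than the one just chosen.
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def part_frequencies(n):
    """Count how often each number appears across all partitions of n."""
    counts = Counter()
    for p in partitions(n):
        counts.update(p)
    return counts


# List the partition sets of 5, one per line.
for p in partitions(5):
    print("+".join(map(str, p)))

# Tally the numbers that appear in those sets, largest first.
freq = part_frequencies(5)
for part in sorted(freq, reverse=True):
    print(part, freq[part])
```

For 5 this lists the 7 partition sets in the order shown above and reports twelve 1s, four 2s, two 3s, one 4, and one 5, for 20 numbers in all.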

This highest frequency of 1s holds true for all integers, because the lowest partition state of any integer is all 1s. Therefore, 1 is the most frequent number; 2 is second, and so forth.
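
The ordering can be checked for many integers without listing every partition set. The sketch below, again in Python with function names of my own choosing, relies on a counting identity: the number of times k appears across all partitions of n equals p(n - k) + p(n - 2k) + ..., where p(m) is the number of partitions of m. It checks only a finite range, so it illustrates the claim rather than proving it. Note that ties are possible: in the partition of 3, the numbers 2 and 3 each appear once.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)], where p(m) counts the partitions of m."""
    p = [1] + [0] * limit
    # Allow parts of size 1, then 2, and so on, so that each partition
    # is counted once regardless of the order of its terms.
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def part_frequencies(n):
    """Map each k in 1..n to the number of times k appears in the partitions of n."""
    p = partition_numbers(n)
    return {
        k: sum(p[n - j * k] for j in range(1, n // k + 1))
        for k in range(1, n + 1)
    }


def is_non_increasing(n):
    """True if 1 is at least as frequent as 2, 2 as 3, and so on, for n."""
    freq = part_frequencies(n)
    return all(freq[k] >= freq[k + 1] for k in range(1, n))


print(part_frequencies(5))
print(all(is_non_increasing(n) for n in range(1, 31)))
```

For 5 the frequencies agree with the hand tally (12, 4, 2, 1, 1), and the ordering holds for every integer from 1 through 30.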

I have been using the partitioning of integers as a non-probabilistic standard for modeling phylogenetic, bibliometric, ecological, and economic distributions (2). The reason is that partitions provide a standard, immutable frequency distribution against which to compare the wavering frequency distributions found in nature, which are based on randomness and probabilities.

I can simulate many natural frequency distributions by using a simple urn model driven by a random number generator. Some of these Monte Carlo simulations resemble the partition distributions. The biggest departure is at the top, among the highest-value numbers. The highest value in the partition of 5 has to be 5. In a simulation the highest value could be several times 5. This work continues, albeit at a slow pace because of my advancing age and my obligations as caregiver for my disabled wife.

I did not appreciate the importance of relating partitions to Benford’s Law until I read the article by Brooks (3) in which he asks, “Why on earth should Benford’s law exist?” That was when I realized that the partition of integers could be the origin of Benford’s law, not just another example, because partitions are a basic property of our system of numbers.

I suspect that if Benford had known about my partition model, he might have based his Law upon it.


References cited:
1. Benford, Frank. The law of anomalous numbers. Proceedings of the American Philosophical Society 1938 March 31; 78(4): 551-572.

2. Windsor, Donald A. Integer partitions result in skewed rank-frequency distributions. Journal of the American Society for Information Science and Technology 2002 December; 53(14): 1276.

3. Brooks, Michael. Benford’s law. New Scientist 2017 August 26; 235(3140): 38-39.
