Talk:Benford's law
This is the talk page for discussing improvements to the Benford's law article. This is not a forum for general discussion of the article's subject.
This article is rated B-class on Wikipedia's content assessment scale.
Regarding the "Multiplicative Fluctuations" section
Why does the log-normal distribution prove Benford's law? NadaB04 (talk) 16:14, 26 July 2023 (UTC)
- It doesn't. It explains Benford's law under the assumption that measurements of many natural processes seem to be distributed uniformly on a log scale. Constant314 (talk) 17:11, 26 July 2023 (UTC)
- (First of all, thanks for explaining it to me :)
- So just to make sure: we aren't really interested in the log-normal distribution here, but rather in any broad distribution that will be roughly uniform?
- Also, I believe there's an error in this section in the part about increasing variance: I couldn't find any version of the central limit theorem that isn't about fixed variance. 2A10:8012:F:64D9:A9D3:6666:1CD0:DEA4 (talk) 21:50, 26 July 2023 (UTC)
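For anyone following this thread, the point about uniformity on a log scale can be checked numerically. The sketch below is only an illustration, not something taken from the article: it assumes log10 of the data is uniform over a whole number of decades, and the function names `benford_first_digit` and `simulate_log_uniform` are made up for this example.

```python
import math
import random
from collections import Counter


def benford_first_digit(d):
    """Benford probability that the leading digit equals d (d in 1..9)."""
    return math.log10(1 + 1 / d)


def simulate_log_uniform(n_samples, decades=6, seed=0):
    """Leading-digit frequencies for samples whose log10 is uniform.

    Assumption for this sketch: log10(x) is uniform on [0, decades],
    where decades is a whole number.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        log_x = rng.uniform(0, decades)
        # The integer part of log10(x) only moves the decimal point,
        # so the leading digit depends on the fractional part alone.
        mantissa = 10 ** (log_x % 1.0)
        counts[int(mantissa)] += 1
    return {d: counts[d] / n_samples for d in range(1, 10)}


if __name__ == "__main__":
    freqs = simulate_log_uniform(200_000)
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(benford_first_digit(d), 4))
```

No log-normal distribution appears anywhere in the sketch: the only ingredient is that the logarithm of the data is spread evenly, which is the assumption described in the reply above.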
Discarded zeros.
Zero is a digit and a number can start with it.
"Zero", AKA "0", is a digit, as is stated in the article, here:
"[...] in a given base with a fixed number of digits 0, 1, ..., n, ..., [...]"
and here:
"Four digits is often enough to assume a uniform distribution of 10% as "0" appears 10.0176% of the time in the fourth digit, while "9" appears 9.9824% of the time."
Numbers *can* start with the digit zero, as is also stated in the article, here:
"Numbers satisfying this include 3.14159..., 314285.7... and 0.00314465... ."
Too little too late about discarded zeros.
The role of zeros is perhaps somewhat neglected by the article, to the detriment of its accessibility. It's not obvious what role zero plays in Benford's law, and it is not always clear that zeros are implicitly being excluded.
In fact, this fundamental point is not touched on in the lede, and is touched on explicitly only twice in the body of the article, in passing and literally in parentheses each time.
It's easily missed, I think. It's not very accessible to most nonmathematicians either. I think using an ellipsis is a false economy here: it makes it harder to notice that there are just *nine* digits there, and that zero is not among them. It would be far better to just write all the digits out.
The first time the discarding of "zero" or "0" is touched on in the body of the article is here, but it's not explicit and easily goes unnoticed:
"A set of numbers is said to satisfy Benford's law if the leading digit d (d ∈ {1, ..., 9}) occurs with probability [...]"
The first explicit reference to the discarding of zeros is quite far down in the article:
"For example, the first (non-zero) digit on the aforementioned list of lengths should have the same distribution whether the unit of measurement is feet or yards."
The second explicit reference to it is:
"It is possible to extend the law to digits beyond the first. In particular, for any given number of digits, the probability of encountering a number starting with the string of digits n of that length – discarding leading zeros – is given by [...]"
Possible improvements.
The first sentence of the lede is:
"Benford's law, also known as the Newcomb–Benford law, the law of anomalous numbers, or the first-digit law, is an observation that in many real-life sets of numerical data, the leading digit is likely to be small."
Maybe "the leading digit" should instead be "the leading digit (discarding leading zeros)", "the leading nonzero digit", "the leading digit of the normalized significand", or "the leading significant digit", to make it clear that a number starting with zero, "0.998", say, does not count as a number starting with a small digit.
Also, how about some explanation of *why* leading zeros are discarded? As Dale Carnegie once said, "I keep stating the obvious, because the obvious is what people need to be told." Polar Apposite (talk) 19:53, 17 September 2023 (UTC)
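To make the convention discussed above concrete, here is a minimal sketch of what "leading significant digit" means in practice, together with a check of the fourth-digit figures quoted from the article. It is an illustration only: the helper names `leading_significant_digit` and `nth_digit_probability` are hypothetical, and the second function assumes the usual generalisation of Benford's law to later digits (summing log10(1 + 1/m) over every integer m of the given length that ends in the digit in question).

```python
import math
from decimal import Decimal


def leading_significant_digit(x):
    """Return the first nonzero digit of x, ignoring sign and leading zeros."""
    value = Decimal(str(x))
    if not value.is_finite() or value == 0:
        raise ValueError("zero and non-finite values have no leading significant digit")
    # as_tuple().digits holds the coefficient's digits, most significant first.
    for digit in value.as_tuple().digits:
        if digit != 0:
            return digit


def nth_digit_probability(position, digit):
    """Probability that the digit at `position` (1 = leading) equals `digit`.

    Assumes the standard extension of Benford's law to later digits.
    """
    if position == 1:
        # Zero cannot be the leading significant digit by definition.
        return 0.0 if digit == 0 else math.log10(1 + 1 / digit)
    lo = 10 ** (position - 2)
    hi = 10 ** (position - 1)
    # k runs over every possible block of preceding significant digits.
    return sum(math.log10(1 + 1 / (10 * k + digit)) for k in range(lo, hi))


if __name__ == "__main__":
    for sample in ("3.14159", "314285.7", "0.00314465", "0.998"):
        print(sample, leading_significant_digit(sample))
    for d in (0, 9):
        print(d, f"{100 * nth_digit_probability(4, d):.4f}%")
```

Under this convention "0.998" counts as a number starting with 9, not with 0, which is why zero never appears among the possible first digits but does appear as a possible fourth digit. The fourth-digit percentages printed at the end can be compared against the 10.0176% and 9.9824% quoted from the article.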
There's no such term as a "high burglary".
The WP article contains this:
"Television crime drama NUMB3RS used Benford's law in the 2006 episode "The Running Man" to help solve a series of high burglaries.[30]"
I don't think "high burglary" is a real term (Google has never heard of it), and have no idea what it could mean. A burglary that is a high crime? A high altitude burglary? A burglary committed while intoxicated? A burglary of a mansion? The link does not contain the term, and the burglary referred to in the link is a fictional one in an episode of "Numb3rs", a break-in at a university laboratory that is equipped with the latest high tech anti-burglary security equipment (the burglars are nevertheless successful in defeating the security equipment).
https://numb3rs.fandom.com/wiki/The_Running_Man contains this:
"He has a past selling high-end break-in tools. Some of the tech that the robbers would have had to get past are after his time. He suggests going to look for somebody else and for the police to stop bothering him."
So the word "high" seems to have broken off from "high-end" and somehow got attached to the front of "burglaries", for no apparent reason.
I therefore propose deleting the word "high" from the sentence. Polar Apposite (talk) 20:06, 17 September 2023 (UTC)
- Absolutely - go for it! - DavidWBrooks (talk) 20:27, 17 September 2023 (UTC)
Reverted edit
@Constant314: I'm aware; what I was saying was that it's obvious information that did not need to be included, especially as an entire sentence in the lead. Snowmanonahoe (talk · contribs · typos) 05:37, 11 April 2024 (UTC)
- I missed the implication of the sarcasm. The 11% needs to be there to contrast with the 30% and 5% in the previous sentence. But the one out of nine is redundant. I will fix it. Constant314 (talk) 20:48, 11 April 2024 (UTC)
First to Apply Benford's Law to Election Forensics
In the election data section, the article states: "Walter Mebane, a political scientist and statistician at the University of Michigan, was the first to apply the second-digit Benford's law-test (2BL-test) in election forensics." I'm writing a paper on this field currently, and from my research, I don't believe this is true. I'm pretty sure the first paper to apply Benford's Law to detecting election fraud was Pericchi and Torres: https://urru.org/papers/2004_varios/pericchi-torres.pdf
It's not widely credited as it's in Spanish, but it's even cited in Mebane's original paper. Matthewuzhere (talk) 01:22, 1 June 2024 (UTC)