The Classical Occupancy Distribution: Computation and Approximation

Ben O’Neill*

*Corresponding author for this work

    Research output: Contribution to journalArticlepeer-review

    11 Citations (Scopus)

    Abstract

    We examine the discrete distributional form that arises from the “classical occupancy problem,” which looks at the behavior of the number of occupied bins when we allocate a given number of balls uniformly at random to a given number of bins. We review the mass function and moments of the classical occupancy distribution and derive exact and asymptotic results for the mean, variance, skewness and kurtosis. We develop an algorithm to compute a cubic array of log-probabilities from the classical occupancy distribution. This algorithm allows the computation of large blocks of values while avoiding underflow problems in computation. Using this algorithm, we compute the classical occupancy distribution for a large block of values of balls and bins, and we measure the accuracy of its asymptotic approximation using the normal distribution. We analyze the accuracy of the normal approximation with respect to the variance, skewness and kurtosis of the distribution. Based on this analysis, we give some practical guidance on the feasibility of computing large blocks of values from the occupancy distribution, and when approximation is required.

    Original languageEnglish
    Pages (from-to)364-375
    Number of pages12
    JournalAmerican Statistician
    Volume75
    Issue number4
    DOIs
    Publication statusPublished - 2021

    Fingerprint

    Dive into the research topics of 'The Classical Occupancy Distribution: Computation and Approximation'. Together they form a unique fingerprint.

    Cite this