Uniform distribution in helpers.arrayElements results #1765

@mjomble

Clear and concise description of the problem

I want faker.helpers.arrayElements to always return each element with the same probability.
However, in certain situations, some indexes are picked far more often than others.

For example, when sampling 10 elements from an array with 1000 elements, indexes ending in 9 are picked ~20 times more often than indexes ending in 1. And when sampling 9 or fewer elements, indexes ending in 1 seem never to be picked at all.

Suggested solution

The root cause of the problem is that arrayElements picks array indices using faker.number.float() with the default precision of 0.01, so each draw can produce only 100 distinct values (0, 0.01, ..., 0.99).
This works fine with an array of up to 100 elements, but as the length grows, anomalies begin to appear.
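
To make the limitation concrete, here is a minimal sketch that does not call faker at all. It assumes the index is computed as Math.floor(length * r), where r is a multiple of the precision. Both the mapping and the helper name reachableIndices are illustrative assumptions, not faker's actual code.

```javascript
// Hypothetical helper (not part of faker): lists every index that
// Math.floor(length * r) can produce when r is restricted to multiples
// of `precision` between 0 and `max`.
function reachableIndices(length, precision = 0.01, max = 0.99) {
  const steps = Math.round(max / precision); // 99 steps for the defaults
  const reachable = new Set();
  for (let k = 0; k <= steps; k++) {
    const r = k * precision; // 0, 0.01, 0.02, ..., 0.99
    reachable.add(Math.floor(length * r));
  }
  return reachable;
}

// With only 100 possible random values, at most 100 of the 1000
// indices can be chosen in any single draw.
console.log(reachableIndices(1000).size, 'of 1000 indices reachable');
```

Under these assumptions, roughly nine out of ten indices are unreachable on any given draw, which is the kind of gap that shows up as skewed counts in the reproduction below.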

The simplest solution would be to replace this.faker.number.float({ max: 0.99 }) with Math.random(). This would, however, break some deterministic test cases.

Alternative

We could also use something like this.faker.number.float({ max: 0.999999999, precision: 0.000000001 }), but I'm not sure what the best number of digits is.
The precision could conceivably also be derived from the length of the given array.
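
Another way to avoid choosing a precision at all would be to draw an integer index directly from a full-precision seeded generator. The following is only a sketch of that idea, not faker's implementation: sampleElements and the mulberry32 stand-in generator are hypothetical names introduced here, and the partial Fisher-Yates loop is an assumption about how the sampling could be structured.

```javascript
// Small seeded PRNG (mulberry32), used here only as a stand-in for a
// seeded generator so that results stay reproducible across runs.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical sampler: a partial Fisher-Yates shuffle that moves
// `count` randomly chosen elements to the end of a copy of the array.
function sampleElements(array, count, random = Math.random) {
  const copy = array.slice();
  const n = Math.min(Math.max(count, 0), copy.length);
  const min = copy.length - n;
  for (let i = copy.length - 1; i >= min; i--) {
    // Every index in [0, i] is reachable, whatever the array length.
    const index = Math.floor(random() * (i + 1));
    [copy[index], copy[i]] = [copy[i], copy[index]];
  }
  return copy.slice(min);
}

const ids = Array.from({ length: 1000 }, (_, i) => i);
console.log(sampleElements(ids, 10, mulberry32(42)));
```

Passing the generator as a parameter keeps the sketch deterministic for a fixed seed, which addresses the concern about breaking deterministic test cases, although the concrete values produced would still differ from those of the current implementation.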

If I get some suggestions from maintainers, I may be able to submit a PR.

Additional context

Code to reproduce issue:

const { faker } = require('@faker-js/faker/locale/en')

const arrayLength = 1000
const ids = Array(arrayLength)

for (let i = 0; i < arrayLength; i++) {
    ids[i] = i
}

const countByMod = new Map()

for (let i = 0; i < 1000; i++) {
    for (const id of faker.helpers.arrayElements(ids, 10)) {
        const mod = id % 10
        const count = countByMod.get(mod) ?? 0
        countByMod.set(mod, count + 1)
    }
}

console.log('nines:', countByMod.get(9), 'ones:', countByMod.get(1))

Some outputs:

nines: 2793 ones: 116
nines: 2760 ones: 126
nines: 2755 ones: 107

Both should be much closer to 1000, which is the case when the same code is run on a fixed version of arrayElements:

nines: 990 ones: 1098
nines: 1008 ones: 1013
nines: 1023 ones: 943

Labels

c: bug (Something isn't working)
m: helpers (Something is referring to the helpers module)
p: 1-normal (Nothing urgent)
s: accepted (Accepted feature / Confirmed bug)
