As of Python 3.6, the random module has the function choices, which lets you assign a weight to each value in a population.
from random import choices
population = [1, 7, 15]
weights = [40, 30, 30]
samples = choices(population, weights, k=100)
The code above will generate 100 random values following the defined weights.
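As a quick sanity check, the sketch below draws a larger sample and counts the outcomes with collections.Counter. The seed and the sample size of 10,000 are arbitrary values picked for illustration, and the dedicated Random instance is only there to make the run repeatable.

```python
from collections import Counter
from random import Random

population = [1, 7, 15]
weights = [40, 30, 30]

# A dedicated generator with a fixed (arbitrary) seed keeps the run repeatable.
rng = Random(42)

# choices() samples with replacement, so k may exceed len(population).
samples = rng.choices(population, weights, k=10_000)

counts = Counter(samples)
for value in population:
    # Observed relative frequency; it should land near 0.40, 0.30 and 0.30.
    print(value, counts[value] / len(samples))
```

The weights are relative, so [40, 30, 30] and [0.4, 0.3, 0.3] describe the same distribution.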
Based on "Random draw, but with different odds":
The easiest way to implement this is to build a structure that already encodes the probabilities. Since the number 1 must have a 40% probability, while the numbers 7 and 15 must have 30% each, you can build a list of 10 elements in which the number 1 appears four times and the numbers 7 and 15 appear three times each.
v = [1, 1, 1, 1, 7, 7, 7, 15, 15, 15]
So, by calling random.choice(v), the desired probabilities will be respected.
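A minimal sketch of that idea, comparing the share of slots each value occupies in the list with the frequencies actually drawn. The seed and the number of draws are illustrative assumptions.

```python
from collections import Counter
from random import Random

# Ten slots: four for 1, three for 7 and three for 15.
v = [1, 1, 1, 1, 7, 7, 7, 15, 15, 15]

rng = Random(0)  # fixed, arbitrary seed so the run is repeatable

# Every slot is equally likely, so each value's probability is its share of slots.
draws = [rng.choice(v) for _ in range(10_000)]

expected = {value: count / len(v) for value, count in Counter(v).items()}
observed = {value: count / len(draws) for value, count in Counter(draws).items()}

print(expected)  # {1: 0.4, 7: 0.3, 15: 0.3}
print(observed)  # close to the expected shares, but not identical
```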
To implement this in a generic and fairly simple way, you can define a function that builds this list whenever you need it. For example:
from itertools import chain, repeat
from math import isclose
from random import choice
from typing import Any, List, Tuple

def sample_space(*values: Tuple[Any, float]) -> List:
    # Compare with a tolerance, since a sum of floats is not always exactly 1.0.
    if not isclose(sum(probability for _, probability in values), 1.0):
        raise ValueError('The probabilities do not sum to 1.0')
    # Multiply by 10 until every probability is a whole number; rounding
    # removes the floating-point drift introduced by repeated multiplication.
    while not all(probability.is_integer() for _, probability in values):
        values = [(value, round(probability * 10, 10)) for value, probability in values]
    # Repeat each value as many times as its scaled weight.
    return list(chain.from_iterable(repeat(value, int(count)) for value, count in values))
The sample_space function builds a list that represents exactly the sample space you want: just pass it a series of tuples, each holding a value and its probability. For the data presented in the question, it would look like:
>>> space = sample_space((1, 0.4), (7, 0.3), (15, 0.3))
>>> print(space)
[1, 1, 1, 1, 7, 7, 7, 15, 15, 15]
If you draw 100 numbers from this sample space and count how often each one appears (you can use collections.Counter for this), you'll see that the observed frequencies tend to follow the probabilities:
>>> from collections import Counter
>>> samples = [choice(space) for _ in range(100)]
>>> print(Counter(samples))
Counter({1: 40, 7: 32, 15: 28})
It works for any probabilities, as long as they sum to 1.0. For example, a sample space that is 99% True and 1% False would be:
>>> space = sample_space((True, 0.99), (False, 0.01))
However, this generates a list with 100 values, 99 True and 1 False; so with this solution, given its simplicity, take care not to build lists so large that they affect the application's memory. The more decimal places the probabilities have, the more elements must be held in memory.
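When the expanded list would grow too large, one option is to fall back on choices from the beginning of this answer, which takes the weights directly and needs no expanded list. A minimal sketch, assuming the same 99%/1% split; the seed and sample size are arbitrary:

```python
from collections import Counter
from random import Random

rng = Random(1)  # arbitrary seed, only to make the run repeatable

# The weights are passed as-is, so memory use does not grow with
# the number of decimal places in the probabilities.
samples = rng.choices([True, False], weights=[0.99, 0.01], k=10_000)

# Roughly 99% of the draws should be True.
print(Counter(samples))
```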