问题:如何创建无重复的随机数列表?
我尝试使用random.randint(0, 100)
,但一些数字相同。有没有一种方法/模块来创建唯一的随机数列表?
注意:以下代码基于答案,并在发布答案后添加。这不是问题的一部分。这是解决方案。
def getScores():
# open files to read and write
f1 = open("page.txt", "r");
p1 = open("pgRes.txt", "a");
gScores = [];
bScores = [];
yScores = [];
# run 50 tests of 40 random queries to implement "bootstrapping" method
for i in range(50):
# get 40 random queries from the 50
lines = random.sample(f1.readlines(), 40);
I tried using random.randint(0, 100)
, but some numbers were the same. Is there a method/module to create a list unique random numbers?
Note: The following code is based on an answer and has been added after the answer was posted. It is not a part of the question; it is the solution.
def getScores():
# open files to read and write
f1 = open("page.txt", "r");
p1 = open("pgRes.txt", "a");
gScores = [];
bScores = [];
yScores = [];
# run 50 tests of 40 random queries to implement "bootstrapping" method
for i in range(50):
# get 40 random queries from the 50
lines = random.sample(f1.readlines(), 40);
回答 0
这将返回从0到99范围内选择的10个数字的列表,没有重复。
import random
random.sample(range(100), 10)
参考您的特定代码示例,您可能希望一次从文件中读取所有行,然后从内存中的已保存列表中选择随机行。例如:
all_lines = f1.readlines()
for i in range(50):
lines = random.sample(all_lines, 40)
这样,您只需要在循环之前实际从文件中读取一次即可。与返回文件的开头并f1.readlines()
为每次循环迭代再次调用相比,这样做的效率要高得多。
This will return a list of 10 numbers selected from the range 0 to 99, without duplicates.
import random
random.sample(range(100), 10)
With reference to your specific code example, you probably want to read all the lines from the file once and then select random lines from the saved list in memory. For example:
all_lines = f1.readlines()
for i in range(50):
lines = random.sample(all_lines, 40)
This way, you only need to actually read from the file once, before your loop. It’s much more efficient to do this than to seek back to the start of the file and call f1.readlines()
again for each loop iteration.
回答 1
您可以像下面这样使用随机模块中的随机播放功能:
import random
my_list = list(xrange(1,100)) # list of integers from 1 to 99
# adjust this boundaries to fit your needs
random.shuffle(my_list)
print my_list # <- List of unique random numbers
请注意,在这里shuffle方法不会像预期的那样返回任何列表,它只会对通过引用传递的列表进行随机排序。
You can use the shuffle function from the random module like this:
import random
my_list = list(xrange(1,100)) # list of integers from 1 to 99
# adjust this boundaries to fit your needs
random.shuffle(my_list)
print my_list # <- List of unique random numbers
Note here that the shuffle method doesn’t return any list as one may expect, it only shuffle the list passed by reference.
回答 2
You can first create a list of numbers from a
to b
, where a
and b
are respectively the smallest and greatest numbers in your list, then shuffle it with Fisher-Yates algorithm or using the Python’s random.shuffle
method.
回答 3
此答案中给出的解决方案有效,但如果样本量很小但总数很大(例如random.sample(insanelyLargeNumber, 10)
),则可能会在内存问题上产生问题。
为了解决这个问题,我可以这样做:
answer = set()
sampleSize = 10
answerSize = 0
while answerSize < sampleSize:
r = random.randint(0,100)
if r not in answer:
answerSize += 1
answer.add(r)
# answer now contains 10 unique, random integers from 0.. 100
The solution presented in this answer works, but it could become problematic with memory if the sample size is small, but the population is huge (e.g. random.sample(insanelyLargeNumber, 10)
).
To fix that, I would go with this:
answer = set()
sampleSize = 10
answerSize = 0
while answerSize < sampleSize:
r = random.randint(0,100)
if r not in answer:
answerSize += 1
answer.add(r)
# answer now contains 10 unique, random integers from 0.. 100
回答 4
线性同余伪随机数生成器
O(1)记忆
O(k)运算
这个问题可以用一个简单的线性同余发生器来解决。这需要恒定的内存开销(8个整数)和最多2 *(序列长度)的计算。
所有其他解决方案都使用更多的内存和更多的计算资源!如果只需要几个随机序列,则此方法将便宜得多。对于大小范围N
,如果要按N
唯一k
序列或更大的顺序生成,我建议使用内置方法接受的解决方案,random.sample(range(N),k)
因为它已在python中进行了速度优化。
码
# Return a randomized "range" using a Linear Congruential Generator
# to produce the number sequence. Parameters are the same as for
# python builtin "range".
# Memory -- storage for 8 integers, regardless of parameters.
# Compute -- at most 2*"maximum" steps required to generate sequence.
#
def random_range(start, stop=None, step=None):
import random, math
# Set a default values the same way "range" does.
if (stop == None): start, stop = 0, start
if (step == None): step = 1
# Use a mapping to convert a standard range into the desired range.
mapping = lambda i: (i*step) + start
# Compute the number of numbers in this range.
maximum = (stop - start) // step
# Seed range with a random integer.
value = random.randint(0,maximum)
#
# Construct an offset, multiplier, and modulus for a linear
# congruential generator. These generators are cyclic and
# non-repeating when they maintain the properties:
#
# 1) "modulus" and "offset" are relatively prime.
# 2) ["multiplier" - 1] is divisible by all prime factors of "modulus".
# 3) ["multiplier" - 1] is divisible by 4 if "modulus" is divisible by 4.
#
offset = random.randint(0,maximum) * 2 + 1 # Pick a random odd-valued offset.
multiplier = 4*(maximum//4) + 1 # Pick a multiplier 1 greater than a multiple of 4.
modulus = int(2**math.ceil(math.log2(maximum))) # Pick a modulus just big enough to generate all numbers (power of 2).
# Track how many random numbers have been returned.
found = 0
while found < maximum:
# If this is a valid value, yield it in generator fashion.
if value < maximum:
found += 1
yield mapping(value)
# Calculate the next value in the sequence.
value = (value*multiplier + offset) % modulus
用法
此函数“ random_range”的用法与任何生成器(例如“ range”)相同。一个例子:
# Show off random range.
print()
for v in range(3,6):
v = 2**v
l = list(random_range(v))
print("Need",v,"found",len(set(l)),"(min,max)",(min(l),max(l)))
print("",l)
print()
样品结果
Required 8 cycles to generate a sequence of 8 values.
Need 8 found 8 (min,max) (0, 7)
[1, 0, 7, 6, 5, 4, 3, 2]
Required 16 cycles to generate a sequence of 9 values.
Need 9 found 9 (min,max) (0, 8)
[3, 5, 8, 7, 2, 6, 0, 1, 4]
Required 16 cycles to generate a sequence of 16 values.
Need 16 found 16 (min,max) (0, 15)
[5, 14, 11, 8, 3, 2, 13, 1, 0, 6, 9, 4, 7, 12, 10, 15]
Required 32 cycles to generate a sequence of 17 values.
Need 17 found 17 (min,max) (0, 16)
[12, 6, 16, 15, 10, 3, 14, 5, 11, 13, 0, 1, 4, 8, 7, 2, ...]
Required 32 cycles to generate a sequence of 32 values.
Need 32 found 32 (min,max) (0, 31)
[19, 15, 1, 6, 10, 7, 0, 28, 23, 24, 31, 17, 22, 20, 9, ...]
Required 64 cycles to generate a sequence of 33 values.
Need 33 found 33 (min,max) (0, 32)
[11, 13, 0, 8, 2, 9, 27, 6, 29, 16, 15, 10, 3, 14, 5, 24, ...]
Linear Congruential Pseudo-random Number Generator
O(1) Memory
O(k) Operations
This problem can be solved with a simple Linear Congruential Generator. This requires constant memory overhead (8 integers) and at most 2*(sequence length) computations.
All other solutions use more memory and more compute! If you only need a few random sequences, this method will be significantly cheaper. For ranges of size N
, if you want to generate on the order of N
unique k
-sequences or more, I recommend the accepted solution using the builtin methods random.sample(range(N),k)
as this has been optimized in python for speed.
Code
# Return a randomized "range" using a Linear Congruential Generator
# to produce the number sequence. Parameters are the same as for
# python builtin "range".
# Memory -- storage for 8 integers, regardless of parameters.
# Compute -- at most 2*"maximum" steps required to generate sequence.
#
def random_range(start, stop=None, step=None):
import random, math
# Set a default values the same way "range" does.
if (stop == None): start, stop = 0, start
if (step == None): step = 1
# Use a mapping to convert a standard range into the desired range.
mapping = lambda i: (i*step) + start
# Compute the number of numbers in this range.
maximum = (stop - start) // step
# Seed range with a random integer.
value = random.randint(0,maximum)
#
# Construct an offset, multiplier, and modulus for a linear
# congruential generator. These generators are cyclic and
# non-repeating when they maintain the properties:
#
# 1) "modulus" and "offset" are relatively prime.
# 2) ["multiplier" - 1] is divisible by all prime factors of "modulus".
# 3) ["multiplier" - 1] is divisible by 4 if "modulus" is divisible by 4.
#
offset = random.randint(0,maximum) * 2 + 1 # Pick a random odd-valued offset.
multiplier = 4*(maximum//4) + 1 # Pick a multiplier 1 greater than a multiple of 4.
modulus = int(2**math.ceil(math.log2(maximum))) # Pick a modulus just big enough to generate all numbers (power of 2).
# Track how many random numbers have been returned.
found = 0
while found < maximum:
# If this is a valid value, yield it in generator fashion.
if value < maximum:
found += 1
yield mapping(value)
# Calculate the next value in the sequence.
value = (value*multiplier + offset) % modulus
Usage
The usage of this function “random_range” is the same as for any generator (like “range”). An example:
# Show off random range.
print()
for v in range(3,6):
v = 2**v
l = list(random_range(v))
print("Need",v,"found",len(set(l)),"(min,max)",(min(l),max(l)))
print("",l)
print()
Sample Results
Required 8 cycles to generate a sequence of 8 values.
Need 8 found 8 (min,max) (0, 7)
[1, 0, 7, 6, 5, 4, 3, 2]
Required 16 cycles to generate a sequence of 9 values.
Need 9 found 9 (min,max) (0, 8)
[3, 5, 8, 7, 2, 6, 0, 1, 4]
Required 16 cycles to generate a sequence of 16 values.
Need 16 found 16 (min,max) (0, 15)
[5, 14, 11, 8, 3, 2, 13, 1, 0, 6, 9, 4, 7, 12, 10, 15]
Required 32 cycles to generate a sequence of 17 values.
Need 17 found 17 (min,max) (0, 16)
[12, 6, 16, 15, 10, 3, 14, 5, 11, 13, 0, 1, 4, 8, 7, 2, ...]
Required 32 cycles to generate a sequence of 32 values.
Need 32 found 32 (min,max) (0, 31)
[19, 15, 1, 6, 10, 7, 0, 28, 23, 24, 31, 17, 22, 20, 9, ...]
Required 64 cycles to generate a sequence of 33 values.
Need 33 found 33 (min,max) (0, 32)
[11, 13, 0, 8, 2, 9, 27, 6, 29, 16, 15, 10, 3, 14, 5, 24, ...]
回答 5
If the list of N numbers from 1 to N is randomly generated, then yes, there is a possibility that some numbers may be repeated.
If you want a list of numbers from 1 to N in a random order, fill an array with integers from 1 to N, and then use a Fisher-Yates shuffle or Python’s random.shuffle()
.
回答 6
如果需要采样非常大的数字,则不能使用 range
random.sample(range(10000000000000000000000000000000), 10)
因为它抛出:
OverflowError: Python int too large to convert to C ssize_t
另外,如果random.sample
由于范围太小而无法生成所需数量的物品
random.sample(range(2), 1000)
它抛出:
ValueError: Sample larger than population
此函数解决了两个问题:
import random
def random_sample(count, start, stop, step=1):
def gen_random():
while True:
yield random.randrange(start, stop, step)
def gen_n_unique(source, n):
seen = set()
seenadd = seen.add
for i in (i for i in source() if i not in seen and not seenadd(i)):
yield i
if len(seen) == n:
break
return [i for i in gen_n_unique(gen_random,
min(count, int(abs(stop - start) / abs(step))))]
大量使用:
print('\n'.join(map(str, random_sample(10, 2, 10000000000000000000000000000000))))
样本结果:
7822019936001013053229712669368
6289033704329783896566642145909
2473484300603494430244265004275
5842266362922067540967510912174
6775107889200427514968714189847
9674137095837778645652621150351
9969632214348349234653730196586
1397846105816635294077965449171
3911263633583030536971422042360
9864578596169364050929858013943
范围小于所请求项目数的用法:
print(', '.join(map(str, random_sample(100000, 0, 3))))
样本结果:
2, 0, 1
它还适用于负范围和步骤:
print(', '.join(map(str, random_sample(10, 10, -10, -2))))
print(', '.join(map(str, random_sample(10, 5, -5, -2))))
样本结果:
2, -8, 6, -2, -4, 0, 4, 10, -6, 8
-3, 1, 5, -1, 3
If you need to sample extremely large numbers, you cannot use range
random.sample(range(10000000000000000000000000000000), 10)
because it throws:
OverflowError: Python int too large to convert to C ssize_t
Also, if random.sample
cannot produce the number of items you want due to the range being too small
random.sample(range(2), 1000)
it throws:
ValueError: Sample larger than population
This function resolves both problems:
import random
def random_sample(count, start, stop, step=1):
def gen_random():
while True:
yield random.randrange(start, stop, step)
def gen_n_unique(source, n):
seen = set()
seenadd = seen.add
for i in (i for i in source() if i not in seen and not seenadd(i)):
yield i
if len(seen) == n:
break
return [i for i in gen_n_unique(gen_random,
min(count, int(abs(stop - start) / abs(step))))]
Usage with extremely large numbers:
print('\n'.join(map(str, random_sample(10, 2, 10000000000000000000000000000000))))
Sample result:
7822019936001013053229712669368
6289033704329783896566642145909
2473484300603494430244265004275
5842266362922067540967510912174
6775107889200427514968714189847
9674137095837778645652621150351
9969632214348349234653730196586
1397846105816635294077965449171
3911263633583030536971422042360
9864578596169364050929858013943
Usage where the range is smaller than the number of requested items:
print(', '.join(map(str, random_sample(100000, 0, 3))))
Sample result:
2, 0, 1
It also works with with negative ranges and steps:
print(', '.join(map(str, random_sample(10, 10, -10, -2))))
print(', '.join(map(str, random_sample(10, 5, -5, -2))))
Sample results:
2, -8, 6, -2, -4, 0, 4, 10, -6, 8
-3, 1, 5, -1, 3
回答 7
您可以使用Numpy库进行快速解答,如下所示-
给定的代码段列出了0到5之间的6个唯一数字。您可以根据需要调整参数。
import numpy as np
import random
a = np.linspace( 0, 5, 6 )
random.shuffle(a)
print(a)
输出量
[ 2. 1. 5. 3. 4. 0.]
它不把任何约束,因为我们在random.sample看到称为这里。
希望这个对你有帮助。
You can use Numpy library for quick answer as shown below –
Given code snippet lists down 6 unique numbers between the range of 0 to 5. You can adjust the parameters for your comfort.
import numpy as np
import random
a = np.linspace( 0, 5, 6 )
random.shuffle(a)
print(a)
Output
[ 2. 1. 5. 3. 4. 0.]
It doesn’t put any constraints as we see in random.sample as referred here.
Hope this helps a bit.
回答 8
此处提供的答案在时间和内存方面都非常有效,但由于它使用了诸如yield的高级python构造,因此有点复杂。在简单的答案行之有效的做法,但与回答的问题是,它可以实际构建所需的集之前产生许多虚假的整数。使用人口大小= 1000,样本大小= 999进行尝试。理论上,它有可能不会终止。
下面的答案解决了这两个问题,因为它是确定性的,虽然目前还不如其他两个效率高,但它还是有一定效率的。
def randomSample(populationSize, sampleSize):
populationStr = str(populationSize)
dTree, samples = {}, []
for i in range(sampleSize):
val, dTree = getElem(populationStr, dTree, '')
samples.append(int(val))
return samples, dTree
函数getElem,percolateUp的定义如下
import random
def getElem(populationStr, dTree, key):
msd = int(populationStr[0])
if not key in dTree.keys():
dTree[key] = range(msd + 1)
idx = random.randint(0, len(dTree[key]) - 1)
key = key + str(dTree[key][idx])
if len(populationStr) == 1:
dTree[key[:-1]].pop(idx)
return key, (percolateUp(dTree, key[:-1]))
newPopulation = populationStr[1:]
if int(key[-1]) != msd:
newPopulation = str(10**(len(newPopulation)) - 1)
return getElem(newPopulation, dTree, key)
def percolateUp(dTree, key):
while (dTree[key] == []):
dTree[key[:-1]].remove( int(key[-1]) )
key = key[:-1]
return dTree
最后,对于较大的n值,平均时间约为15毫秒,如下所示:
In [3]: n = 10000000000000000000000000000000
In [4]: %time l,t = randomSample(n, 5)
Wall time: 15 ms
In [5]: l
Out[5]:
[10000000000000000000000000000000L,
5731058186417515132221063394952L,
85813091721736310254927217189L,
6349042316505875821781301073204L,
2356846126709988590164624736328L]
The answer provided here works very well with respect to time
as well as memory but a bit more complicated as it uses advanced python
constructs such as yield. The simpler answer works well in practice but, the issue with that
answer is that it may generate many spurious integers before actually constructing
the required set. Try it out with populationSize = 1000, sampleSize = 999.
In theory, there is a chance that it doesn’t terminate.
The answer below addresses both issues, as it is deterministic and somewhat efficient
though currently not as efficient as the other two.
def randomSample(populationSize, sampleSize):
populationStr = str(populationSize)
dTree, samples = {}, []
for i in range(sampleSize):
val, dTree = getElem(populationStr, dTree, '')
samples.append(int(val))
return samples, dTree
where the functions getElem, percolateUp are as defined below
import random
def getElem(populationStr, dTree, key):
msd = int(populationStr[0])
if not key in dTree.keys():
dTree[key] = range(msd + 1)
idx = random.randint(0, len(dTree[key]) - 1)
key = key + str(dTree[key][idx])
if len(populationStr) == 1:
dTree[key[:-1]].pop(idx)
return key, (percolateUp(dTree, key[:-1]))
newPopulation = populationStr[1:]
if int(key[-1]) != msd:
newPopulation = str(10**(len(newPopulation)) - 1)
return getElem(newPopulation, dTree, key)
def percolateUp(dTree, key):
while (dTree[key] == []):
dTree[key[:-1]].remove( int(key[-1]) )
key = key[:-1]
return dTree
Finally, the timing on average was about 15ms for a large value of n as shown below,
In [3]: n = 10000000000000000000000000000000
In [4]: %time l,t = randomSample(n, 5)
Wall time: 15 ms
In [5]: l
Out[5]:
[10000000000000000000000000000000L,
5731058186417515132221063394952L,
85813091721736310254927217189L,
6349042316505875821781301073204L,
2356846126709988590164624736328L]
回答 9
为了获得确定性,高效且使用基本编程结构构建的,生成无重复值的随机值列表的程序,请考虑以下extractSamples
定义的函数,
def extractSamples(populationSize, sampleSize, intervalLst) :
import random
if (sampleSize > populationSize) :
raise ValueError("sampleSize = "+str(sampleSize) +" > populationSize (= " + str(populationSize) + ")")
samples = []
while (len(samples) < sampleSize) :
i = random.randint(0, (len(intervalLst)-1))
(a,b) = intervalLst[i]
sample = random.randint(a,b)
if (a==b) :
intervalLst.pop(i)
elif (a == sample) : # shorten beginning of interval
intervalLst[i] = (sample+1, b)
elif ( sample == b) : # shorten interval end
intervalLst[i] = (a, sample - 1)
else :
intervalLst[i] = (a, sample - 1)
intervalLst.append((sample+1, b))
samples.append(sample)
return samples
基本思想是跟踪intervalLst
从中选择所需元素的可能值的间隔。从确定的意义上说,这是确定性的,我们可以保证在固定数量的步骤(仅取决于populationSize
和sampleSize
)内生成样本。
要使用上述功能生成我们所需的列表,
In [3]: populationSize, sampleSize = 10**17, 10**5
In [4]: %time lst1 = extractSamples(populationSize, sampleSize, [(0, populationSize-1)])
CPU times: user 289 ms, sys: 9.96 ms, total: 299 ms
Wall time: 293 ms
我们还可以将其与较早的解决方案进行比较(以更低的种群数量值)
In [5]: populationSize, sampleSize = 10**8, 10**5
In [6]: %time lst = random.sample(range(populationSize), sampleSize)
CPU times: user 1.89 s, sys: 299 ms, total: 2.19 s
Wall time: 2.18 s
In [7]: %time lst1 = extractSamples(populationSize, sampleSize, [(0, populationSize-1)])
CPU times: user 449 ms, sys: 8.92 ms, total: 458 ms
Wall time: 442 ms
请注意,我降低了populationSize
数值,因为使用random.sample
解决方案时,它会为更高的值产生“内存错误” (也在此处和此处的先前答案中提到)。对于上述值,我们还可以观察到其extractSamples
性能优于该random.sample
方法。
PS:尽管核心方法与我之前的回答类似,但是在实现和方法上都进行了实质性修改,同时还提高了清晰度。
In order to obtain a program that generates a list of random values without duplicates that is deterministic, efficient and built with basic programming constructs consider the function extractSamples
defined below,
def extractSamples(populationSize, sampleSize, intervalLst) :
import random
if (sampleSize > populationSize) :
raise ValueError("sampleSize = "+str(sampleSize) +" > populationSize (= " + str(populationSize) + ")")
samples = []
while (len(samples) < sampleSize) :
i = random.randint(0, (len(intervalLst)-1))
(a,b) = intervalLst[i]
sample = random.randint(a,b)
if (a==b) :
intervalLst.pop(i)
elif (a == sample) : # shorten beginning of interval
intervalLst[i] = (sample+1, b)
elif ( sample == b) : # shorten interval end
intervalLst[i] = (a, sample - 1)
else :
intervalLst[i] = (a, sample - 1)
intervalLst.append((sample+1, b))
samples.append(sample)
return samples
The basic idea is to keep track of intervals intervalLst
for possible values from which to select our required elements from. This is deterministic in the sense that we are guaranteed to generate a sample within a fixed number of steps (solely dependent on populationSize
and sampleSize
).
To use the above function to generate our required list,
In [3]: populationSize, sampleSize = 10**17, 10**5
In [4]: %time lst1 = extractSamples(populationSize, sampleSize, [(0, populationSize-1)])
CPU times: user 289 ms, sys: 9.96 ms, total: 299 ms
Wall time: 293 ms
We may also compare with an earlier solution (for a lower value of populationSize)
In [5]: populationSize, sampleSize = 10**8, 10**5
In [6]: %time lst = random.sample(range(populationSize), sampleSize)
CPU times: user 1.89 s, sys: 299 ms, total: 2.19 s
Wall time: 2.18 s
In [7]: %time lst1 = extractSamples(populationSize, sampleSize, [(0, populationSize-1)])
CPU times: user 449 ms, sys: 8.92 ms, total: 458 ms
Wall time: 442 ms
Note that I reduced populationSize
value as it produces Memory Error for higher values when using the random.sample
solution (also mentioned in previous answers here and here). For above values, we can also observe that extractSamples
outperforms the random.sample
approach.
P.S. : Though the core approach is similar to my earlier answer, there are substantial modifications in implementation as well as approach alongwith improvement in clarity.
回答 10
一个非常简单的功能,也可以解决您的问题
from random import randint
data = []
def unique_rand(inicial, limit, total):
data = []
i = 0
while i < total:
number = randint(inicial, limit)
if number not in data:
data.append(number)
i += 1
return data
data = unique_rand(1, 60, 6)
print(data)
"""
prints something like
[34, 45, 2, 36, 25, 32]
"""
A very simple function that also solves your problem
from random import randint
data = []
def unique_rand(inicial, limit, total):
data = []
i = 0
while i < total:
number = randint(inicial, limit)
if number not in data:
data.append(number)
i += 1
return data
data = unique_rand(1, 60, 6)
print(data)
"""
prints something like
[34, 45, 2, 36, 25, 32]
"""
回答 11
基于集合的方法(“如果返回值中有随机值,请重试”)的问题是,由于冲突(需要另一次“重试”迭代),不确定它们的运行时间,尤其是当返回大量随机值时从范围。
以下是不倾向于这种不确定性运行时的替代方法:
import bisect
import random
def fast_sample(low, high, num):
""" Samples :param num: integer numbers in range of
[:param low:, :param high:) without replacement
by maintaining a list of ranges of values that
are permitted.
This list of ranges is used to map a random number
of a contiguous a range (`r_n`) to a permissible
number `r` (from `ranges`).
"""
ranges = [high]
high_ = high - 1
while len(ranges) - 1 < num:
# generate a random number from an ever decreasing
# contiguous range (which we'll map to the true
# random number).
# consider an example with low=0, high=10,
# part way through this loop with:
#
# ranges = [0, 2, 3, 7, 9, 10]
#
# r_n :-> r
# 0 :-> 1
# 1 :-> 4
# 2 :-> 5
# 3 :-> 6
# 4 :-> 8
r_n = random.randint(low, high_)
range_index = bisect.bisect_left(ranges, r_n)
r = r_n + range_index
for i in xrange(range_index, len(ranges)):
if ranges[i] <= r:
# as many "gaps" we iterate over, as much
# is the true random value (`r`) shifted.
r = r_n + i + 1
elif ranges[i] > r_n:
break
# mark `r` as another "gap" of the original
# [low, high) range.
ranges.insert(i, r)
# Fewer values possible.
high_ -= 1
# `ranges` happens to contain the result.
return ranges[:-1]
The problem with the set based approaches (“if random value in return values, try again”) is that their runtime is undetermined due to collisions (which require another “try again” iteration), especially when a large amount of random values are returned from the range.
An alternative that isn’t prone to this non-deterministic runtime is the following:
import bisect
import random
def fast_sample(low, high, num):
""" Samples :param num: integer numbers in range of
[:param low:, :param high:) without replacement
by maintaining a list of ranges of values that
are permitted.
This list of ranges is used to map a random number
of a contiguous a range (`r_n`) to a permissible
number `r` (from `ranges`).
"""
ranges = [high]
high_ = high - 1
while len(ranges) - 1 < num:
# generate a random number from an ever decreasing
# contiguous range (which we'll map to the true
# random number).
# consider an example with low=0, high=10,
# part way through this loop with:
#
# ranges = [0, 2, 3, 7, 9, 10]
#
# r_n :-> r
# 0 :-> 1
# 1 :-> 4
# 2 :-> 5
# 3 :-> 6
# 4 :-> 8
r_n = random.randint(low, high_)
range_index = bisect.bisect_left(ranges, r_n)
r = r_n + range_index
for i in xrange(range_index, len(ranges)):
if ranges[i] <= r:
# as many "gaps" we iterate over, as much
# is the true random value (`r`) shifted.
r = r_n + i + 1
elif ranges[i] > r_n:
break
# mark `r` as another "gap" of the original
# [low, high) range.
ranges.insert(i, r)
# Fewer values possible.
high_ -= 1
# `ranges` happens to contain the result.
return ranges[:-1]
回答 12
import random
sourcelist=[]
resultlist=[]
for x in range(100):
sourcelist.append(x)
for y in sourcelist:
resultlist.insert(random.randint(0,len(resultlist)),y)
print (resultlist)
import random
sourcelist=[]
resultlist=[]
for x in range(100):
sourcelist.append(x)
for y in sourcelist:
resultlist.insert(random.randint(0,len(resultlist)),y)
print (resultlist)
回答 13
如果希望确保添加的数字唯一,则可以使用Set对象
(如果使用2.7或更高版本),或者如果未使用,则导入sets模块。
正如其他人提到的那样,这意味着数字并不是真正的随机数。
If you wish to ensure that the numbers being added are unique, you could use a Set object
if using 2.7 or greater, or import the sets module if not.
As others have mentioned, this means the numbers are not truly random.
回答 14
采样整数而不在minval
和之间进行替换maxval
:
import numpy as np
minval, maxval, n_samples = -50, 50, 10
generator = np.random.default_rng(seed=0)
samples = generator.permutation(np.arange(minval, maxval))[:n_samples]
# or, if minval is 0,
samples = generator.permutation(maxval)[:n_samples]
与贾克斯:
import jax
minval, maxval, n_samples = -50, 50, 10
key = jax.random.PRNGKey(seed=0)
samples = jax.random.shuffle(key, jax.numpy.arange(minval, maxval))[:n_samples]
to sample integers without replacement between minval
and maxval
:
import numpy as np
minval, maxval, n_samples = -50, 50, 10
generator = np.random.default_rng(seed=0)
samples = generator.permutation(np.arange(minval, maxval))[:n_samples]
# or, if minval is 0,
samples = generator.permutation(maxval)[:n_samples]
with jax:
import jax
minval, maxval, n_samples = -50, 50, 10
key = jax.random.PRNGKey(seed=0)
samples = jax.random.shuffle(key, jax.numpy.arange(minval, maxval))[:n_samples]
回答 15
从win xp中的CLI:
python -c "import random; print(sorted(set([random.randint(6,49) for i in range(7)]))[:6])"
在加拿大,我们有6/49乐透。我只是将上面的代码包装在lotto.bat中,然后运行C:\home\lotto.bat
或just C:\home\lotto
。
因为random.randint
经常重复一个数字,所以我使用set
,range(7)
然后将其缩短为6。
有时,如果数字重复超过2倍,则结果列表长度将小于6。
编辑:但是,这random.sample(range(6,49),6)
是正确的方法。
From the CLI in win xp:
python -c "import random; print(sorted(set([random.randint(6,49) for i in range(7)]))[:6])"
In Canada we have the 6/49 Lotto. I just wrap the above code in lotto.bat and run C:\home\lotto.bat
or just C:\home\lotto
.
Because random.randint
often repeats a number, I use set
with range(7)
and then shorten it to a length of 6.
Occasionally if a number repeats more than 2 times the resulting list length will be less than 6.
EDIT: However, random.sample(range(6,49),6)
is the correct way to go.
回答 16
import random
result=[]
for i in range(1,50):
rng=random.randint(1,20)
result.append(rng)
import random
result=[]
for i in range(1,50):
rng=random.randint(1,20)
result.append(rng)