Tag archive: random-seed

Differences between numpy.random and random.random in Python

Question: Differences between numpy.random and random.random in Python

I have a big script in Python. I took inspiration from other people’s code, so I ended up using the numpy.random module for some things (for example, creating an array of random numbers drawn from a binomial distribution) and the built-in random module (random.random) in other places.

Can someone please tell me the major differences between the two? Looking at the doc webpage for each of the two it seems to me that numpy.random just has more methods, but I am unclear about how the generation of the random numbers is different.

The reason I am asking is that I need to seed my main program for debugging purposes. But that only works if I use the same random number generator in all the modules I am importing. Is this correct?

Also, I read in another post a discussion about NOT using numpy.random.seed(), but I didn’t really understand why it was such a bad idea. I would really appreciate it if someone could explain why this is the case.


Answer 0

You have made many correct observations already!

Unless you’d like to seed both of the random generators, it’s probably simpler in the long run to choose one generator or the other. But if you do need to use both, then yes, you’ll also need to seed them both, because they generate random numbers independently of each other.

For numpy.random.seed(), the main difficulty is that it is not thread-safe. That is, it’s not safe to use if you have many different threads of execution, because it’s not guaranteed to work if two different threads are executing the function at the same time. If you’re not using threads, and if you can reasonably expect that you won’t need to rewrite your program this way in the future, numpy.random.seed() should be fine. If there’s any reason to suspect that you may need threads in the future, it’s much safer in the long run to do as suggested and make a local instance of the numpy.random.RandomState class (or, in NumPy 1.17 and later, a Generator obtained from numpy.random.default_rng()). As far as I can tell, random.seed() is thread-safe (or at least, I haven’t found any evidence to the contrary).
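As a rough sketch of the "local instance" advice, the snippet below gives each generator its own private object instead of seeding the shared global state. It assumes NumPy 1.17 or later, where numpy.random.default_rng() is available (its default bit generator is PCG64, not Mersenne Twister); the helper name make_rngs is made up for this illustration.

```python
import random

import numpy as np


def make_rngs(seed):
    """Return a (stdlib, NumPy) pair of locally owned generators (hypothetical helper)."""
    py_rng = random.Random(seed)          # private instance of the stdlib generator
    np_rng = np.random.default_rng(seed)  # private NumPy Generator (NumPy >= 1.17)
    return py_rng, np_rng


py_a, np_a = make_rngs(1234)
py_b, np_b = make_rngs(1234)

# Same seed, separate instances: each pair replays the same sequence.
same_py = [py_a.random() for _ in range(3)] == [py_b.random() for _ in range(3)]
same_np = np.array_equal(
    np_a.binomial(10, 0.5, size=5),
    np_b.binomial(10, 0.5, size=5),
)
print(same_py, same_np)
```

Passing such objects around explicitly means no other module (or thread) can reseed them behind your back, which is the usual argument for this pattern.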

The numpy.random library contains a few extra probability distributions commonly used in scientific research, as well as a couple of convenience functions for generating arrays of random data. The built-in random module is a little more lightweight, and should be fine if you’re not doing scientific research or other kinds of statistical work.

Otherwise, both the random module and the legacy numpy.random functions (numpy.random.seed(), RandomState) use the Mersenne Twister algorithm to generate their random numbers, and both are completely deterministic. That is, if you know a few key bits of information, it’s possible to predict with absolute certainty what number will come next. For this reason, neither numpy.random nor random is suitable for any serious cryptographic use. But because the sequence is so very long, both are fine for generating random numbers in cases where you aren’t worried about people trying to reverse-engineer your data. This is also why the seed matters: if you start in the same place each time, you’ll always get the same sequence of random numbers!

As a side note, if you do need cryptographic level randomness, you should use the secrets module, or something like Crypto.Random if you’re using a Python version earlier than Python 3.6.
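For reference, a minimal sketch of what using the secrets module (Python 3.6+) looks like. The variable names and the color list are only illustrative.

```python
import secrets

token = secrets.token_hex(16)      # 16 random bytes, rendered as 32 hex characters
roll = secrets.randbelow(6) + 1    # unpredictable integer from 1 to 6
pick = secrets.choice(["red", "green", "blue"])
print(token, roll, pick)
```

Note that secrets has no seed function at all: its output is meant to be unrepeatable, which is the opposite of what you want when debugging.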


Answer 1

From Python for Data Analysis, the module numpy.random supplements the Python random with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.

By contrast, Python’s built-in random module only samples one value at a time, while numpy.random can generate very large samples much faster. Using the IPython magic function %timeit, one can see which module performs faster:

In [1]: from random import normalvariate
In [2]: import numpy as np
In [3]: N = 1000000

In [4]: %timeit samples = [normalvariate(0, 1) for _ in xrange(N)]  # Python 2; use range() on Python 3
1 loop, best of 3: 963 ms per loop

In [5]: %timeit np.random.normal(size=N)
10 loops, best of 3: 38.5 ms per loop

Answer 2

The source of the seed and the distribution used are going to affect the outputs. If you are looking for cryptographic randomness, seeding from os.urandom() will get you nearly truly random bytes gathered from device chatter (e.g. Ethernet or disk activity; this is /dev/random on BSD).

This avoids supplying a fixed seed and so generating deterministic random numbers. The random calls then let you fit the numbers to a distribution (what I call scientific randomness). In the end, all you want is a bell-curve distribution of random numbers, and numpy is best at delivering this.

So yes, stick with one generator, but decide what kind of random you want: random but definitely drawn from a distribution curve, or as random as you can get without a quantum device.
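One possible way to combine the two ideas (operating-system entropy plus a distribution) with only the standard library is random.SystemRandom, which draws from os.urandom() and ignores seeding. This is a sketch of that option, not necessarily what the answer above had in mind.

```python
import random

sys_rng = random.SystemRandom()   # backed by os.urandom(); seed() has no effect on it

uniform_value = sys_rng.random()        # float in [0.0, 1.0)
bell_value = sys_rng.gauss(0.0, 1.0)    # normal distribution, mean 0, std dev 1
die = sys_rng.randint(1, 6)             # integer from 1 to 6, inclusive
print(uniform_value, bell_value, die)
```

Because there is no seed, a run like this cannot be replayed, so it is a poor fit for the debugging scenario in the question.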


random.seed(): What does it do?

Question: random.seed(): What does it do?

I am a bit confused about what random.seed() does in Python. For example, why do the trials below do what they do (consistently)?

>>> import random
>>> random.seed(9001)
>>> random.randint(1, 10)
1
>>> random.randint(1, 10)
3
>>> random.randint(1, 10)
6
>>> random.randint(1, 10)
6
>>> random.randint(1, 10)
7

I couldn’t find good documentation on this.


Answer 0

Pseudo-random number generators work by performing some operation on a value. Generally this value is the previous number generated by the generator. However, the first time you use the generator, there is no previous value.

Seeding a pseudo-random number generator gives it its first “previous” value. Each seed value will correspond to a sequence of generated values for a given random number generator. That is, if you provide the same seed twice, you get the same sequence of numbers twice.

Generally, you want to seed your random number generator with some value that will change each execution of the program. For instance, the current time is a frequently-used seed. The reason why this doesn’t happen automatically is so that if you want, you can provide a specific seed to get a known sequence of numbers.
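To make the "previous value" idea concrete, here is a toy linear congruential generator. It is only an illustration: Python’s random module uses the far more elaborate Mersenne Twister, and the class name ToyLCG is invented for this sketch (the multiplier and increment are the well-known Numerical Recipes constants).

```python
class ToyLCG:
    """Toy linear congruential generator; for illustration only, not what Python uses."""

    def __init__(self, seed):
        self.state = seed  # the seed acts as the first "previous" value

    def next(self):
        # Each output is computed from the previous state.
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return self.state


a = ToyLCG(42)
b = ToyLCG(42)
c = ToyLCG(43)
seq_a = [a.next() for _ in range(5)]
seq_b = [b.next() for _ in range(5)]
seq_c = [c.next() for _ in range(5)]
print(seq_a == seq_b, seq_a == seq_c)  # True False
```

Two generators started from the same seed walk through identical states, while a different seed starts the walk somewhere else.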


Answer 1

All the other answers don’t seem to explain the use of random.seed(). Here is a simple example (source):

import random

random.seed(3)
print("Random number with seed 3:", random.random())  # generates a random number

# To get the same random number again later in the program, reseed with the same value
random.seed(3)
print(random.random())  # same random number as before

Answer 2

>>> import random
>>> random.seed(9001)
>>> random.randint(1, 10)
1
>>> random.seed(9001)
>>> random.randint(1, 10)
1
>>> random.seed(9001)
>>> random.randint(1, 10)
1
>>> random.seed(9001)
>>> random.randint(1, 10)
1
>>> random.seed(9002)
>>> random.randint(1, 10)
3

Try this.

Let’s say random.seed() gives a value to the random value generator (random.randint()), which generates its values on the basis of this seed. One essential property of pseudo-random numbers is that they should be reproducible. When you supply the same seed, you get the same pattern of random numbers, because you are generating them right from the start each time. Give it a different seed and it starts from a different initial value (3 above).

Given a seed, it will generate random numbers between 1 and 10 one after another. So you can expect one fixed sequence of numbers for each seed value.
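The same point can be checked for a whole sequence rather than a single value. The sketch below uses a private random.Random instance so the global generator is left alone; the function name first_draws is made up for this example, and the actual numbers printed depend on your Python version.

```python
import random


def first_draws(seed, count=5):
    """Return the first `count` randint(1, 10) results for a given seed."""
    rng = random.Random(seed)  # private generator, leaves the global one untouched
    return [rng.randint(1, 10) for _ in range(count)]


run_1 = first_draws(9001)
run_2 = first_draws(9001)
other = first_draws(9002)
print(run_1 == run_2)   # same seed, same sequence
print(run_1, other)     # compare against a different seed
```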


Answer 3

A random number is generated by some operation on the previous value.

If there is no previous value, then one is chosen automatically: Python uses the current system time, or a randomness source from the operating system if one is available. We can provide this initial value ourselves using random.seed(x), where x could be a number, a string, etc.

Hence random.random() is not actually a perfectly random number; it can be predicted from random.seed(x).

import random

random.seed(45)            # seed = 45
random.random()            # 1st value: 0.2718754143840908
random.random()            # 2nd value: 0.48802820785090784

random.seed(45)            # reassign seed = 45
random.random()            # 0.2718754143840908, matches the 1st value
random.random()            # 0.48802820785090784, matches the 2nd value

Hence, generating a random number is not actually random, because it runs on an algorithm. An algorithm always gives the same output for the same input, so the result depends on the value of the seed. To make the output vary from run to run, the time (or operating-system randomness) is used as the seed automatically when none is given.


Answer 4

seed() can be used to reproduce the same output later.

Example:
>>> import numpy as np
>>> np.random.seed(12)
>>> np.random.rand(4)
array([0.15416284, 0.7400497 , 0.26331502, 0.53373939])
>>> np.random.seed(10)
>>> np.random.rand(4)
array([0.77132064, 0.02075195, 0.63364823, 0.74880388])
>>> np.random.seed(12)  # the same seed as before gives the same random output as before
>>> np.random.rand(4)
array([0.15416284, 0.7400497 , 0.26331502, 0.53373939])
>>> np.random.seed(10)
>>> np.random.rand(4)
array([0.77132064, 0.02075195, 0.63364823, 0.74880388])

Answer 5

# Simple Python program to understand random.seed() importance

import random

random.seed(10)

for i in range(5):    
    print(random.randint(1, 100))

Execute the above program multiple times...

1st attempt: prints 5 random integers in the range 1 to 100.

2nd attempt: prints the same 5 random numbers that appeared in the first execution.

3rd attempt: same again.

...and so on.

Explanation: Every time we run the above program we set the seed to 10, and the random generator takes this as its reference value. It then applies a predefined formula to generate each random number.

Hence setting the seed to 10 in the next execution sets the reference value to 10 again, and the same behavior starts over.

As soon as we reset the seed value, it gives the same sequence.

Note: Change the seed value and run the program, and you’ll see a different random sequence from the previous one.


Answer 6

In this case, random is actually pseudo-random. Given a seed, it will generate numbers with a uniform distribution. But with the same seed, it will generate the same number sequence every time. If you want it to change, you’ll have to change your seed. A lot of people like to generate a seed based on the current time or something similar.


Answer 7

IMHO, it is used to reproduce the same sequence of random results when you call random.seed() with the same value again.

In [47]: random.randint(7,10)

Out[47]: 9


In [48]: random.randint(7,10)

Out[48]: 9


In [49]: random.randint(7,10)

Out[49]: 7


In [50]: random.randint(7,10)

Out[50]: 10


In [51]: random.seed(5)


In [52]: random.randint(7,10)

Out[52]: 9


In [53]: random.seed(5)


In [54]: random.randint(7,10)

Out[54]: 9

Answer 8

Call seed(x) before generating a set of random numbers, and use the same seed to generate the same set of random numbers. This is useful when reproducing issues.

>>> from random import *
>>> seed(20)
>>> randint(1,100)
93
>>> randint(1,100)
88
>>> randint(1,100)
99
>>> seed(20)
>>> randint(1,100)
93
>>> randint(1,100)
88
>>> randint(1,100)
99
>>> 

Answer 9

Here is my understanding. Every time we set a seed value, a “label” or “reference” is generated. The next random function call is attached to this “label”, so the next time you use the same seed value and the same random function, it will give you the same result.

import numpy as np

np.random.seed(3)
print(np.random.randn()) # output: 1.7886284734303186

np.random.seed( 3 )
print(np.random.rand()) # different function. output: 0.5507979025745755

np.random.seed( 5 )
print(np.random.rand()) # different seed value. output: 0.22199317108973948

Answer 10

Here is a small test that demonstrates that feeding the seed() method with the same argument will cause the same pseudo-random result:

# testing random.seed()

import random

def equalityCheck(l):
    state=None
    x=l[0]
    for i in l:
        if i!=x:
            state=False
            break
        else:
            state=True
    return state


l=[]

for i in range(1000):
    random.seed(10)
    l.append(random.random())

print("All elements in l are equal?", equalityCheck(l))

Answer 11

random.seed(a, version) in Python is used to initialize the pseudo-random number generator (PRNG).

A PRNG is an algorithm that generates a sequence of numbers approximating the properties of random numbers. These random numbers can be reproduced using the seed value: if you provide a seed, the PRNG starts from a state determined by that seed.

The argument a is the seed value. If a is None, then by default the current system time is used (or a randomness source from the operating system, if one is available).

version is an integer specifying how to convert the a parameter into an integer. The default value is 2.

import random
random.seed(9001)
random.randint(1, 10) #this gives output of 1
# 1

If you want the same random number to be reproduced, provide the same seed again:

random.seed(9001)
random.randint(1, 10) # this will give the same output of 1
# 1

If you don’t provide the seed, it generates a different number, not 1 as before:

random.randint(1, 10) # this gives 7 without providing seed
# 7

If you provide a different seed than before, it will give you a different random number:

random.seed(9002)
random.randint(1, 10) # this gives you 5 not 1
# 5

So, in summary, if you want the same random number to be reproduced, provide the seed. Specifically, the same seed.
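The a and version parameters described above can be tried out with a short sketch like the one below. The seed string "experiment-1" is arbitrary, and the exact numbers produced are not shown because they are not the point.

```python
import random

random.seed("experiment-1", version=2)   # strings are accepted as seeds
first = [random.random() for _ in range(3)]

random.seed("experiment-1", version=2)   # same string, same sequence
second = [random.random() for _ in range(3)]

random.seed()                            # no argument: reseed from OS entropy or the clock
unseeded = [random.random() for _ in range(3)]

print(first == second)
```

Calling random.seed() with no argument is how you get back to varying output after a debugging session that used a fixed seed.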