Question: Standard deviation of a list
I want to find the mean and standard deviation of the 1st, 2nd, … numbers of several (Z) lists. For example, I have
A_rank=[0.8,0.4,1.2,3.7,2.6,5.8]
B_rank=[0.1,2.8,3.7,2.6,5,3.4]
C_rank=[1.2,3.4,0.5,0.1,2.5,6.1]
# etc (up to Z_rank )...
Now I want to take the mean and std of *_rank[0], the mean and std of *_rank[1], etc.
(i.e. the mean and std of the 1st number from all the (A..Z)_rank lists; the mean and std of the 2nd number from all the (A..Z)_rank lists; the mean and std of the 3rd number…; etc.).
Answer 0
Since Python 3.4 / PEP 450 there is a statistics module in the standard library, which has a function stdev for calculating the sample standard deviation of iterables like yours:
>>> A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
>>> import statistics
>>> statistics.stdev(A_rank)
2.0634114147853952
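The call above handles a single list. To get one mean and one standard deviation per position across all the lists, as the question asks, a minimal sketch might look like the following. It assumes the lists have equal length and are collected in a list named `ranks` (a name introduced here for illustration); only the three lists shown in the question are included.

```python
import statistics

# The three lists from the question; further lists can be appended to `ranks`.
A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
B_rank = [0.1, 2.8, 3.7, 2.6, 5, 3.4]
C_rank = [1.2, 3.4, 0.5, 0.1, 2.5, 6.1]
ranks = [A_rank, B_rank, C_rank]  # hypothetical container name

# zip(*ranks) yields one tuple per position:
# (A_rank[0], B_rank[0], C_rank[0]), (A_rank[1], B_rank[1], C_rank[1]), ...
means = [statistics.mean(column) for column in zip(*ranks)]

# statistics.stdev is the sample standard deviation (divides by n - 1);
# use statistics.pstdev instead if the lists are the whole population.
sample_stds = [statistics.stdev(column) for column in zip(*ranks)]

for position, (m, s) in enumerate(zip(means, sample_stds)):
    print(position, m, s)
```

Note that `zip` stops at the shortest list, so lists of unequal length would be silently truncated.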
Answer 1
I would put A_rank et al. into a 2D NumPy array, and then use numpy.mean() and numpy.std() to compute the means and the standard deviations:
In [17]: import numpy
In [18]: arr = numpy.array([A_rank, B_rank, C_rank])
In [20]: numpy.mean(arr, axis=0)
Out[20]:
array([ 0.7 , 2.2 , 1.8 , 2.13333333, 3.36666667,
5.1 ])
In [21]: numpy.std(arr, axis=0)
Out[21]:
array([ 0.45460606, 1.29614814, 1.37355985, 1.50628314, 1.15566239,
1.2083046 ])
Answer 2
Here's some pure-Python code you can use to calculate the mean and standard deviation.
All code below is based on the statistics module in Python 3.4+.
def mean(data):
    """Return the sample arithmetic mean of data."""
    n = len(data)
    if n < 1:
        raise ValueError('mean requires at least one data point')
    return sum(data)/n # in Python 2 use sum(data)/float(n)

def _ss(data):
    """Return sum of square deviations of sequence data."""
    c = mean(data)
    ss = sum((x-c)**2 for x in data)
    return ss

def stddev(data, ddof=0):
    """Calculates the population standard deviation
    by default; specify ddof=1 to compute the sample
    standard deviation."""
    n = len(data)
    if n < 2:
        raise ValueError('variance requires at least two data points')
    ss = _ss(data)
    pvar = ss/(n-ddof)
    return pvar**0.5
Note: for improved accuracy when summing floats, the statistics module uses a custom function _sum rather than the built-in sum which I've used in its place.
Now we have, for example:
>>> mean([1, 2, 3])
2.0
>>> stddev([1, 2, 3]) # population standard deviation
0.816496580927726
>>> stddev([1, 2, 3], ddof=1) # sample standard deviation
1.0
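If the accuracy of float summation matters and you still want to avoid the statistics module, one option is the standard library's `math.fsum`, which tracks partial sums to reduce rounding error. This is a stand-in, not what the statistics module does internally; the names `mean_fsum` and `stddev_fsum` are made up for this sketch.

```python
import math

def mean_fsum(data):
    """Arithmetic mean, summing with math.fsum instead of sum."""
    n = len(data)
    if n < 1:
        raise ValueError('mean requires at least one data point')
    return math.fsum(data) / n

def stddev_fsum(data, ddof=0):
    """Population std by default; ddof=1 gives the sample std."""
    n = len(data)
    if n <= ddof:
        raise ValueError('need more data points than ddof')
    c = mean_fsum(data)
    ss = math.fsum((x - c) ** 2 for x in data)  # sum of squared deviations
    return (ss / (n - ddof)) ** 0.5

# The built-in sum accumulates rounding error; math.fsum does not here.
print(sum([0.1] * 10))        # 0.9999999999999999
print(math.fsum([0.1] * 10))  # 1.0
print(stddev_fsum([1, 2, 3], ddof=1))  # 1.0
```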
Answer 3
In Python 2.7.1, you may calculate the standard deviation using numpy.std() for:
- Population std: just use numpy.std() with no additional arguments besides your data list.
- Sample std: you need to pass ddof (i.e. Delta Degrees of Freedom) set to 1, as in the following example:
numpy.std(< your-list >, ddof=1)
The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
With ddof=1 it calculates the sample std rather than the population std.
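To make the effect of ddof concrete, here is a small sketch on a three-element list (the data is chosen purely for illustration). The two results differ only in the divisor, so they are related by a factor of sqrt(N / (N - 1)).

```python
import numpy

data = [1, 2, 3]  # illustrative data, not from the question
n = len(data)

population_std = numpy.std(data)      # ddof=0: divides by N
sample_std = numpy.std(data, ddof=1)  # ddof=1: divides by N - 1

print(population_std)  # roughly 0.8165
print(sample_std)      # 1.0, up to floating-point rounding

# The two values differ only by the divisor under the square root.
ratio = (n / (n - 1)) ** 0.5
print(abs(sample_std - population_std * ratio) < 1e-12)
```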
Answer 5
Using Python, here are a few methods:
import math
import statistics as st
n = int(input())  # number of values; assumed to equal len(data)
data = list(map(int, input().split()))
Approach 1: using a function
stdev = st.pstdev(data)
Approach 2: calculate the variance and take its square root
variance = st.pvariance(data)
devia = math.sqrt(variance)
Approach 3: using basic math
mean = sum(data)/n
variance = sum((x - mean) ** 2 for x in data) / n
stddev = variance ** 0.5
print("{0:0.1f}".format(stddev))
Note:
- variance calculates the variance of a sample
- pvariance calculates the variance of an entire population
- similar differences between stdev and pstdev
Answer 6
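Pure Python code:
from functools import reduce  # reduce is a builtin in Python 2; Python 3 needs this import
from math import sqrt
def stddev(lst):
    # population standard deviation (divides by len(lst))
    mean = float(sum(lst)) / len(lst)
    return sqrt(float(reduce(lambda x, y: x + y, map(lambda x: (x - mean) ** 2, lst))) / len(lst))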
Answer 7
The other answers cover how to compute the standard deviation in Python sufficiently, but no one explains how to do the bizarre traversal you've described.
I'm going to assume A-Z is the entire population. If not, see Ome's answer on how to infer from a sample.
So to get the standard deviation/mean of the first number of every list you would need something like this:
#standard deviation
numpy.std([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])
#mean
numpy.mean([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])
To shorten the code and generalize this to any nth number, use the following function I generated for you:
def getAllNthRanks(n):
    return [A_rank[n], B_rank[n], C_rank[n], D_rank[n], E_rank[n], F_rank[n], G_rank[n], H_rank[n], I_rank[n], J_rank[n], K_rank[n], L_rank[n], M_rank[n], N_rank[n], O_rank[n], P_rank[n], Q_rank[n], R_rank[n], S_rank[n], T_rank[n], U_rank[n], V_rank[n], W_rank[n], X_rank[n], Y_rank[n], Z_rank[n]]
Now you can simply get the std and mean of all the nth places from A-Z like this:
#standard deviation
numpy.std(getAllNthRanks(n))
#mean
numpy.mean(getAllNthRanks(n))
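Spelling out 26 variable names by hand is error-prone. A possible alternative, sketched here under the assumption that you can store the lists in one dictionary instead of separate variables, is to loop over the container. The name `ranks` and the helper `get_all_nth_ranks` are hypothetical, and only the three lists given in the question are filled in.

```python
import numpy

# Hypothetical container replacing the separate A_rank ... Z_rank variables.
ranks = {
    "A": [0.8, 0.4, 1.2, 3.7, 2.6, 5.8],
    "B": [0.1, 2.8, 3.7, 2.6, 5, 3.4],
    "C": [1.2, 3.4, 0.5, 0.1, 2.5, 6.1],
    # add the remaining lists here
}

def get_all_nth_ranks(n):
    """Return the n-th number from every list stored in `ranks`."""
    return [values[n] for values in ranks.values()]

# numpy.std defaults to the population std, matching the assumption above.
first_std = numpy.std(get_all_nth_ranks(0))
first_mean = numpy.mean(get_all_nth_ranks(0))
print(first_mean, first_std)
```

Adding or removing a list then only changes the dictionary, not the function.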