标签归档:fft

Python / SciPy的峰值发现算法

问题:Python / SciPy的峰值发现算法

我可以通过找到一阶导数的零交叉点或类似的东西自己写点东西,但是它似乎包含在标准库中,具有足够的通用性。有人知道吗?

我的特定应用是2D阵列,但通常将其用于查找FFT等中的峰值。

具体来说,在这些类型的问题中,有多个强峰值,然后是由噪声引起的许多较小的“峰值”,应将其忽略。这些仅仅是示例;不是我的实际数据:

一维峰:

二维峰:

峰值查找算法将找到这些峰的位置(而不仅仅是它们的值),理想情况下,可能会使用二次插值或其他方法找到真正的样本间峰,而不仅仅是具有最大值的索引。

通常,您只关心几个强峰,因此选择它们是因为它们高于某个阈值,或者因为它们是有序列表的前n个峰(按振幅排序)。

正如我说的,我自己会写这样的东西。我只是问是否有一个已知的运作良好的功能或软件包。

更新:

翻译了一个MATLAB脚本,它在1-D情况下工作得很好,但可能会更好。

更新的更新:

sixtenbe 为一维案例创建了更好的版本

I can write something myself by finding zero-crossings of the first derivative or something, but it seems like a common-enough function to be included in standard libraries. Anyone know of one?

My particular application is a 2D array, but usually it would be used for finding peaks in FFTs, etc.

Specifically, in these kinds of problems, there are multiple strong peaks, and then lots of smaller “peaks” that are just caused by noise that should be ignored. These are just examples; not my actual data:

1-dimensional peaks:

2-dimensional peaks:

The peak-finding algorithm would find the location of these peaks (not just their values), and ideally would find the true inter-sample peak, not just the index with maximum value, probably using quadratic interpolation or something.

Typically you only care about a few strong peaks, so they’d either be chosen because they’re above a certain threshold, or because they’re the first n peaks of an ordered list, ranked by amplitude.

As I said, I know how to write something like this myself. I’m just asking if there’s a pre-existing function or package that’s known to work well.

Update:

I translated a MATLAB script and it works decently for the 1-D case, but could be better.

Updated update:

sixtenbe created a better version for the 1-D case.


回答 0

scipy.signal.find_peaks顾名思义,该功能对此有用。但是,要理解以及它的参数是非常重要的widththresholddistance 和高于一切prominence,以获得良好的峰值提取。

根据我的测试和文档,突出的概念是“有用的概念”,用于保持良好的峰值,并丢弃嘈杂的峰值。

什么是(地形)突出?它是“从山顶下降到更高地形所需的最低高度”,如下所示:

这个想法是:

突出程度越高,峰越“重要”。

测试:

我故意使用了一个(嘈杂的)频率变化正弦曲线,因为它显示了很多困难。我们可以看到该width参数在这里不是很有用,因为如果您将最小值设置width得太高,则它将无法跟踪高频部分中非常接近的峰值。如果设置width得太低,则信号左侧会出现许多不需要的峰值。同样的问题distancethreshold仅与直接邻居比较,在这里没有用。prominence是提供最佳解决方案的一种。请注意,您可以结合使用许多这些参数!

码:

import numpy as np
import matplotlib.pyplot as plt 
from scipy.signal import find_peaks

x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15
peaks, _ = find_peaks(x, distance=20)
peaks2, _ = find_peaks(x, prominence=1)      # BEST!
peaks3, _ = find_peaks(x, width=20)
peaks4, _ = find_peaks(x, threshold=0.4)     # Required vertical distance to its direct neighbouring samples, pretty useless
plt.subplot(2, 2, 1)
plt.plot(peaks, x[peaks], "xr"); plt.plot(x); plt.legend(['distance'])
plt.subplot(2, 2, 2)
plt.plot(peaks2, x[peaks2], "ob"); plt.plot(x); plt.legend(['prominence'])
plt.subplot(2, 2, 3)
plt.plot(peaks3, x[peaks3], "vg"); plt.plot(x); plt.legend(['width'])
plt.subplot(2, 2, 4)
plt.plot(peaks4, x[peaks4], "xk"); plt.plot(x); plt.legend(['threshold'])
plt.show()

The function scipy.signal.find_peaks, as its name suggests, is useful for this. But it’s important to understand well its parameters width, threshold, distance and above all prominence to get a good peak extraction.

According to my tests and the documentation, the concept of prominence is “the useful concept” to keep the good peaks, and discard the noisy peaks.

What is (topographic) prominence? It is “the minimum height necessary to descend to get from the summit to any higher terrain”, as it can be seen here:

The idea is:

The higher the prominence, the more “important” the peak is.

Test:

I used a (noisy) frequency-varying sinusoid on purpose because it shows many difficulties. We can see that the width parameter is not very useful here because if you set a minimum width too high, then it won’t be able to track very close peaks in the high frequency part. If you set width too low, you would have many unwanted peaks in the left part of the signal. Same problem with distance. threshold only compares with the direct neighbours, which is not useful here. prominence is the one that gives the best solution. Note that you can combine many of these parameters!

Code:

import numpy as np
import matplotlib.pyplot as plt 
from scipy.signal import find_peaks

x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15
peaks, _ = find_peaks(x, distance=20)
peaks2, _ = find_peaks(x, prominence=1)      # BEST!
peaks3, _ = find_peaks(x, width=20)
peaks4, _ = find_peaks(x, threshold=0.4)     # Required vertical distance to its direct neighbouring samples, pretty useless
plt.subplot(2, 2, 1)
plt.plot(peaks, x[peaks], "xr"); plt.plot(x); plt.legend(['distance'])
plt.subplot(2, 2, 2)
plt.plot(peaks2, x[peaks2], "ob"); plt.plot(x); plt.legend(['prominence'])
plt.subplot(2, 2, 3)
plt.plot(peaks3, x[peaks3], "vg"); plt.plot(x); plt.legend(['width'])
plt.subplot(2, 2, 4)
plt.plot(peaks4, x[peaks4], "xk"); plt.plot(x); plt.legend(['threshold'])
plt.show()

回答 1

我正在寻找一个类似的问题,并且我发现一些最佳参考来自化学(来自质谱数据中的峰)。有关峰发现算法的详尽综述,请阅读本章。这是我所遇到的关于峰发现技术的最清晰的评论之一。(小波最适合在嘈杂的数据中找到此类峰。)。

看来您的峰清晰地定义了,并且没有隐藏在噪音中。在这种情况下,我建议您使用平滑的savtizky-golay导数来查找峰(如果仅区分上面的数据,则会有一些误报。)。这是一种非常有效的技术,非常容易实现(您确实需要带有基本操作的矩阵类)。如果您只是找到一阶SG导数的零交叉,我想您会很高兴的。

I’m looking at a similar problem, and I’ve found some of the best references come from chemistry (from peaks finding in mass-spec data). For a good thorough review of peaking finding algorithms read this. This is one of the best clearest reviews of peak finding techniques that I’ve run across. (Wavelets are the best for finding peaks of this sort in noisy data.).

It looks like your peaks are clearly defined and aren’t hidden in the noise. That being the case I’d recommend using smooth savtizky-golay derivatives to find the peaks (If you just differentiate the data above you’ll have a mess of false positives.). This is a very effective technique and is pretty easy to implemented (you do need a matrix class w/ basic operations). If you simply find the zero crossing of the first S-G derivative I think you’ll be happy.


回答 2

scipy中有一个名为的功能scipy.signal.find_peaks_cwt,听起来像很适合您的需求,但是我没有经验,所以我不推荐。

http://docs.scipy.org/doc/scipy/reference/generation/scipy.signal.find_peaks_cwt.html

There is a function in scipy named scipy.signal.find_peaks_cwt which sounds like is suitable for your needs, however I don’t have experience with it so I cannot recommend..

http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html


回答 3

对于那些不确定在Python中使用哪种峰值查找算法的人,这里是替代方法的快速概述:https : //github.com/MonsieurV/py-findpeaks

想要自己等同于MatLab findpeaks函数,我发现Marcos Duarte 的detect_peaks函数是一个不错的选择。

相当容易使用:

import numpy as np
from vector import vector, plot_peaks
from libs import detect_peaks
print('Detect peaks with minimum height and distance filters.')
indexes = detect_peaks.detect_peaks(vector, mph=7, mpd=2)
print('Peaks are: %s' % (indexes))

这会给你:

For those not sure about which peak-finding algorithms to use in Python, here a rapid overview of the alternatives: https://github.com/MonsieurV/py-findpeaks

Wanting myself an equivalent to the MatLab findpeaks function, I’ve found that the detect_peaks function from Marcos Duarte is a good catch.

Pretty easy to use:

import numpy as np
from vector import vector, plot_peaks
from libs import detect_peaks
print('Detect peaks with minimum height and distance filters.')
indexes = detect_peaks.detect_peaks(vector, mph=7, mpd=2)
print('Peaks are: %s' % (indexes))

Which will give you:


回答 4

以可靠的方式检测频谱中的峰值已经进行了很多研究,例如80年代对音乐/音频信号的正弦建模的所有工作。在文献中查找“正弦建模”。

如果您的信号像示例一样干净,那么简单的“给我振幅大于N个邻居的东西”应该可以正常工作。如果您有嘈杂的信号,一种简单而有效的方法就是及时查看峰值并进行跟踪:然后检测频谱线而不是频谱峰值。IOW,您可以在信号的滑动窗口上计算FFT,以获得时间上的一组频谱(也称为频谱图)。然后,您可以查看频谱峰值随时间的变化(即在连续的窗口中)。

Detecting peaks in a spectrum in a reliable way has been studied quite a bit, for example all the work on sinusoidal modelling for music/audio signals in the 80ies. Look for “Sinusoidal Modeling” in the literature.

If your signals are as clean as the example, a simple “give me something with an amplitude higher than N neighbours” should work reasonably well. If you have noisy signals, a simple but effective way is to look at your peaks in time, to track them: you then detect spectral lines instead of spectral peaks. IOW, you compute the FFT on a sliding window of your signal, to get a set of spectrum in time (also called spectrogram). You then look at the evolution of the spectral peak in time (i.e. in consecutive windows).


回答 5

我认为您所寻找的不是SciPy提供的。在这种情况下,我将自己编写代码。

scipy.interpolate的样条曲线插值和平滑效果非常好,可能对拟合峰然后找到最大值的位置很有帮助。

I do not think that what you are looking for is provided by SciPy. I would write the code myself, in this situation.

The spline interpolation and smoothing from scipy.interpolate are quite nice and might be quite helpful in fitting peaks and then finding the location of their maximum.


回答 6

有一些标准的统计功能和方法可以找到数据的异常值,这可能是第一种情况。使用导数将解决您的第二个问题。但是,我不确定是否可以解决连续函数和采样数据的方法。

There are standard statistical functions and methods for finding outliers to data, which is probably what you need in the first case. Using derivatives would solve your second. I’m not sure for a method which solves both continuous functions and sampled data, however.


回答 7

首先,如果没有进一步说明,“峰值”的定义是模糊的。例如,对于以下系列,您将5-4-5称为一个峰还是两个峰?

1-2-1-2-1-1-5-4-5-1-1-5-1

在这种情况下,您至少需要两个阈值:1)仅在高阈值之上,极值才能注册为峰值;2)较低的阈值,以使极小值被其以下的小数值分隔开将成为两个峰值。

峰值检测是极值理论文献中一个经过充分研究的主题,也称为“极值的聚类”。它的典型应用包括基于连续读取环境变量来识别危险事件,例如分析风速以检测风暴事件。

First things first, the definition of “peak” is vague if without further specifications. For example, for the following series, would you call 5-4-5 one peak or two?

1-2-1-2-1-1-5-4-5-1-1-5-1

In this case, you’ll need at least two thresholds: 1) a high threshold only above which can an extreme value register as a peak; and 2) a low threshold so that extreme values separated by small values below it will become two peaks.

Peak detection is a well-studied topic in Extreme Value Theory literature, also known as “declustering of extreme values”. Its typical applications include identifying hazard events based on continuous readings of environmental variables e.g. analysing wind speed to detect storm events.


在Python中绘制快速傅立叶变换

问题:在Python中绘制快速傅立叶变换

我可以访问NumPy和SciPy,并希望为数据集创建一个简单的FFT。我有两个列表,一个是y值,另一个是这些y值的时间戳。

将这些列表输入SciPy或NumPy方法并绘制所得FFT的最简单方法是什么?

我查看了示例,但是它们都依赖于创建具有一定数量的数据点和频率等的伪造数据集,而并没有真正展示如何仅使用一组数据和相应的时间戳来做到这一点。 。

我尝试了以下示例:

from scipy.fftpack import fft

# Number of samplepoints
N = 600

# Sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.grid()
plt.show()

但是,当我更改fft数据集的参数并将其绘制成图时,会得到极其奇怪的结果,而且看起来频率的比例可能不正确。我不确定

这是我尝试FFT的数据的粘贴框

http://pastebin.com/0WhjjMkb http://pastebin.com/ksM4FvZS

当我fft()在整个事情上使用时,它的峰值只有零,而没有别的。

这是我的代码:

## Perform FFT with SciPy
signalFFT = fft(yInterp)

## Get power spectral density
signalPSD = np.abs(signalFFT) ** 2

## Get frequencies corresponding to signal PSD
fftFreq = fftfreq(len(signalPSD), spacing)

## Get positive half of frequencies
i = fftfreq>0

##
plt.figurefigsize = (8, 4));
plt.plot(fftFreq[i], 10*np.log10(signalPSD[i]));
#plt.xlim(0, 100);
plt.xlabel('Frequency [Hz]');
plt.ylabel('PSD [dB]')

间距等于xInterp[1]-xInterp[0]

I have access to NumPy and SciPy and want to create a simple FFT of a data set. I have two lists, one that is y values and the other is timestamps for those y values.

What is the simplest way to feed these lists into a SciPy or NumPy method and plot the resulting FFT?

I have looked up examples, but they all rely on creating a set of fake data with some certain number of data points, and frequency, etc. and don’t really show how to do it with just a set of data and the corresponding timestamps.

I have tried the following example:

from scipy.fftpack import fft

# Number of samplepoints
N = 600

# Sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.grid()
plt.show()

But when I change the argument of fft to my data set and plot it, I get extremely odd results, and it appears the scaling for the frequency may be off. I am unsure.

Here is a pastebin of the data I am attempting to FFT

http://pastebin.com/0WhjjMkb http://pastebin.com/ksM4FvZS

When I use fft() on the whole thing it just has a huge spike at zero and nothing else.

Here is my code:

## Perform FFT with SciPy
signalFFT = fft(yInterp)

## Get power spectral density
signalPSD = np.abs(signalFFT) ** 2

## Get frequencies corresponding to signal PSD
fftFreq = fftfreq(len(signalPSD), spacing)

## Get positive half of frequencies
i = fftfreq>0

##
plt.figurefigsize = (8, 4));
plt.plot(fftFreq[i], 10*np.log10(signalPSD[i]));
#plt.xlim(0, 100);
plt.xlabel('Frequency [Hz]');
plt.ylabel('PSD [dB]')

Spacing is just equal to xInterp[1]-xInterp[0].


回答 0

因此,我在IPython笔记本中运行了功能等效的代码形式:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

fig, ax = plt.subplots()
ax.plot(xf, 2.0/N * np.abs(yf[:N//2]))
plt.show()

我得到了我认为非常合理的输出。

自从我在工学院开始考虑信号处理以来,这个时间就比我想承认的要长,但是峰值在50和80正是我所期望的。那是什么问题呢?

响应发布的原始数据和评论

这里的问题是您没有定期数据。您应该始终检查输入任何算法的数据,以确保它是适当的。

import pandas
import matplotlib.pyplot as plt
#import seaborn
%matplotlib inline

# the OP's data
x = pandas.read_csv('http://pastebin.com/raw.php?i=ksM4FvZS', skiprows=2, header=None).values
y = pandas.read_csv('http://pastebin.com/raw.php?i=0WhjjMkb', skiprows=2, header=None).values
fig, ax = plt.subplots()
ax.plot(x, y)

So I run a functionally equivalent form of your code in an IPython notebook:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

fig, ax = plt.subplots()
ax.plot(xf, 2.0/N * np.abs(yf[:N//2]))
plt.show()

I get what I believe to be very reasonable output.

It’s been longer than I care to admit since I was in engineering school thinking about signal processing, but spikes at 50 and 80 are exactly what I would expect. So what’s the issue?

In response to the raw data and comments being posted

The problem here is that you don’t have periodic data. You should always inspect the data that you feed into any algorithm to make sure that it’s appropriate.

import pandas
import matplotlib.pyplot as plt
#import seaborn
%matplotlib inline

# the OP's data
x = pandas.read_csv('http://pastebin.com/raw.php?i=ksM4FvZS', skiprows=2, header=None).values
y = pandas.read_csv('http://pastebin.com/raw.php?i=0WhjjMkb', skiprows=2, header=None).values
fig, ax = plt.subplots()
ax.plot(x, y)


回答 1

关于fft的重要一点是,它只能应用于时间戳统一的数据(时间上的统一采样,如上面所示)。

如果采样不均匀,请使用函数拟合数据。有几种教程和功能可供选择:

https://github.com/tiagopereira/python_tips/wiki/Scipy%3A-curve-fitting http://docs.scipy.org/doc/numpy/reference/generation/numpy.polyfit.html

如果无法选择拟合,则可以直接使用某种形式的插值将数据插值为统一采样:

https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html

当您有统一的样本时,您只需要担心t[1] - t[0]样本的时间增量()。在这种情况下,您可以直接使用fft函数

Y    = numpy.fft.fft(y)
freq = numpy.fft.fftfreq(len(y), t[1] - t[0])

pylab.figure()
pylab.plot( freq, numpy.abs(Y) )
pylab.figure()
pylab.plot(freq, numpy.angle(Y) )
pylab.show()

这应该可以解决您的问题。

The important thing about fft is that it can only be applied to data in which the timestamp is uniform (i.e. uniform sampling in time, like what you have shown above).

In case of non-uniform sampling, please use a function for fitting the data. There are several tutorials and functions to choose from:

https://github.com/tiagopereira/python_tips/wiki/Scipy%3A-curve-fitting http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html

If fitting is not an option, you can directly use some form of interpolation to interpolate data to a uniform sampling:

https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html

When you have uniform samples, you will only have to wory about the time delta (t[1] - t[0]) of your samples. In this case, you can directly use the fft functions

Y    = numpy.fft.fft(y)
freq = numpy.fft.fftfreq(len(y), t[1] - t[0])

pylab.figure()
pylab.plot( freq, numpy.abs(Y) )
pylab.figure()
pylab.plot(freq, numpy.angle(Y) )
pylab.show()

This should solve your problem.


回答 2

您的高尖峰信号是由于信号的DC(不变,即freq = 0)部分引起的。这是规模问题。如果要查看非DC频率内容,则为了可视化,可能需要从偏移量1而不是信号FFT的偏移量0绘制。

修改@PaulH上面给出的示例

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 10 + np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

plt.subplot(2, 1, 1)
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.subplot(2, 1, 2)
plt.plot(xf[1:], 2.0/N * np.abs(yf[0:N/2])[1:])

输出图:

另一种方法是以对数刻度可视化数据:

使用:

plt.semilogy(xf, 2.0/N * np.abs(yf[0:N/2]))

将会呈现:

The high spike that you have is due to the DC (non-varying, i.e. freq = 0) portion of your signal. It’s an issue of scale. If you want to see non-DC frequency content, for visualization, you may need to plot from the offset 1 not from offset 0 of the FFT of the signal.

Modifying the example given above by @PaulH

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 10 + np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

plt.subplot(2, 1, 1)
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.subplot(2, 1, 2)
plt.plot(xf[1:], 2.0/N * np.abs(yf[0:N/2])[1:])

The output plots:

Another way, is to visualize the data in log scale:

Using:

plt.semilogy(xf, 2.0/N * np.abs(yf[0:N/2]))

Will show:


回答 3

作为对已经给出的答案的补充,我想指出的是,经常需要考虑FFT的bin大小。测试一堆值并选择对您的应用程序更有意义的值将是有意义的。通常,它与样本数量的大小相同。给出的大多数答案都假定了这一点,并且产生了很好且合理的结果。如果有人想探索一下,这是我的代码版本:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

fig = plt.figure(figsize=[14,4])
N = 600           # Number of samplepoints
Fs = 800.0
T = 1.0 / Fs      # N_samps*T (#samples x sample period) is the sample spacing.
N_fft = 80        # Number of bins (chooses granularity)
x = np.linspace(0, N*T, N)     # the interval
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)   # the signal

# removing the mean of the signal
mean_removed = np.ones_like(y)*np.mean(y)
y = y - mean_removed

# Compute the fft.
yf = scipy.fftpack.fft(y,n=N_fft)
xf = np.arange(0,Fs,Fs/N_fft)

##### Plot the fft #####
ax = plt.subplot(121)
pt, = ax.plot(xf,np.abs(yf), lw=2.0, c='b')
p = plt.Rectangle((Fs/2, 0), Fs/2, ax.get_ylim()[1], facecolor="grey", fill=True, alpha=0.75, hatch="/", zorder=3)
ax.add_patch(p)
ax.set_xlim((ax.get_xlim()[0],Fs))
ax.set_title('FFT', fontsize= 16, fontweight="bold")
ax.set_ylabel('FFT magnitude (power)')
ax.set_xlabel('Frequency (Hz)')
plt.legend((p,), ('mirrowed',))
ax.grid()

##### Close up on the graph of fft#######
# This is the same histogram above, but truncated at the max frequence + an offset. 
offset = 1    # just to help the visualization. Nothing important.
ax2 = fig.add_subplot(122)
ax2.plot(xf,np.abs(yf), lw=2.0, c='b')
ax2.set_xticks(xf)
ax2.set_xlim(-1,int(Fs/6)+offset)
ax2.set_title('FFT close-up', fontsize= 16, fontweight="bold")
ax2.set_ylabel('FFT magnitude (power) - log')
ax2.set_xlabel('Frequency (Hz)')
ax2.hold(True)
ax2.grid()

plt.yscale('log')

输出图:

Just as a complement to the answers already given, I would like to point out that often it is important to play with the size of the bins for the FFT. It would make sense to test a bunch of values and pick the one that makes more sense to your application. Often, it is in the same magnitude of the number of samples. This was as assumed by most of the answers given, and produces great and reasonable results. In case one wants to explore that, here is my code version:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

fig = plt.figure(figsize=[14,4])
N = 600           # Number of samplepoints
Fs = 800.0
T = 1.0 / Fs      # N_samps*T (#samples x sample period) is the sample spacing.
N_fft = 80        # Number of bins (chooses granularity)
x = np.linspace(0, N*T, N)     # the interval
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)   # the signal

# removing the mean of the signal
mean_removed = np.ones_like(y)*np.mean(y)
y = y - mean_removed

# Compute the fft.
yf = scipy.fftpack.fft(y,n=N_fft)
xf = np.arange(0,Fs,Fs/N_fft)

##### Plot the fft #####
ax = plt.subplot(121)
pt, = ax.plot(xf,np.abs(yf), lw=2.0, c='b')
p = plt.Rectangle((Fs/2, 0), Fs/2, ax.get_ylim()[1], facecolor="grey", fill=True, alpha=0.75, hatch="/", zorder=3)
ax.add_patch(p)
ax.set_xlim((ax.get_xlim()[0],Fs))
ax.set_title('FFT', fontsize= 16, fontweight="bold")
ax.set_ylabel('FFT magnitude (power)')
ax.set_xlabel('Frequency (Hz)')
plt.legend((p,), ('mirrowed',))
ax.grid()

##### Close up on the graph of fft#######
# This is the same histogram above, but truncated at the max frequence + an offset. 
offset = 1    # just to help the visualization. Nothing important.
ax2 = fig.add_subplot(122)
ax2.plot(xf,np.abs(yf), lw=2.0, c='b')
ax2.set_xticks(xf)
ax2.set_xlim(-1,int(Fs/6)+offset)
ax2.set_title('FFT close-up', fontsize= 16, fontweight="bold")
ax2.set_ylabel('FFT magnitude (power) - log')
ax2.set_xlabel('Frequency (Hz)')
ax2.hold(True)
ax2.grid()

plt.yscale('log')

the output plots:


回答 4

我建立了一个函数,用于绘制真实信号的FFT。与先前的答案相比,我的功能额外的好处是您可以获得信号的实际幅度。

另外,由于假设是真实信号,因此FFT是对称的,因此我们只能绘制x轴的正向:

import matplotlib.pyplot as plt
import numpy as np
import warnings


def fftPlot(sig, dt=None, plot=True):
    # Here it's assumes analytic signal (real signal...) - so only half of the axis is required

    if dt is None:
        dt = 1
        t = np.arange(0, sig.shape[-1])
        xLabel = 'samples'
    else:
        t = np.arange(0, sig.shape[-1]) * dt
        xLabel = 'freq [Hz]'

    if sig.shape[0] % 2 != 0:
        warnings.warn("signal preferred to be even in size, autoFixing it...")
        t = t[0:-1]
        sig = sig[0:-1]

    sigFFT = np.fft.fft(sig) / t.shape[0]  # Divided by size t for coherent magnitude

    freq = np.fft.fftfreq(t.shape[0], d=dt)

    # Plot analytic signal - right half of frequence axis needed only...
    firstNegInd = np.argmax(freq < 0)
    freqAxisPos = freq[0:firstNegInd]
    sigFFTPos = 2 * sigFFT[0:firstNegInd]  # *2 because of magnitude of analytic signal

    if plot:
        plt.figure()
        plt.plot(freqAxisPos, np.abs(sigFFTPos))
        plt.xlabel(xLabel)
        plt.ylabel('mag')
        plt.title('Analytic FFT plot')
        plt.show()

    return sigFFTPos, freqAxisPos


if __name__ == "__main__":
    dt = 1 / 1000

    # Build a signal within Nyquist - the result will be the positive FFT with actual magnitude
    f0 = 200  # [Hz]
    t = np.arange(0, 1 + dt, dt)
    sig = 1 * np.sin(2 * np.pi * f0 * t) + \
        10 * np.sin(2 * np.pi * f0 / 2 * t) + \
        3 * np.sin(2 * np.pi * f0 / 4 * t) +\
        7.5 * np.sin(2 * np.pi * f0 / 5 * t)

    # Result in frequencies
    fftPlot(sig, dt=dt)
    # Result in samples (if the frequencies axis is unknown)
    fftPlot(sig)

I’ve built a function that deals with plotting FFT of real signals. The extra bonus in my function relative to the messages above is that you get the ACTUAL amplitude of the signal. Also, because of the assumption of a real signal, the FFT is symmetric so we can plot only the positive side of the x axis:

import matplotlib.pyplot as plt
import numpy as np
import warnings


def fftPlot(sig, dt=None, plot=True):
    # here it's assumes analytic signal (real signal...)- so only half of the axis is required

    if dt is None:
        dt = 1
        t = np.arange(0, sig.shape[-1])
        xLabel = 'samples'
    else:
        t = np.arange(0, sig.shape[-1]) * dt
        xLabel = 'freq [Hz]'

    if sig.shape[0] % 2 != 0:
        warnings.warn("signal prefered to be even in size, autoFixing it...")
        t = t[0:-1]
        sig = sig[0:-1]

    sigFFT = np.fft.fft(sig) / t.shape[0]  # divided by size t for coherent magnitude

    freq = np.fft.fftfreq(t.shape[0], d=dt)

    # plot analytic signal - right half of freq axis needed only...
    firstNegInd = np.argmax(freq < 0)
    freqAxisPos = freq[0:firstNegInd]
    sigFFTPos = 2 * sigFFT[0:firstNegInd]  # *2 because of magnitude of analytic signal

    if plot:
        plt.figure()
        plt.plot(freqAxisPos, np.abs(sigFFTPos))
        plt.xlabel(xLabel)
        plt.ylabel('mag')
        plt.title('Analytic FFT plot')
        plt.show()

    return sigFFTPos, freqAxisPos


if __name__ == "__main__":
    dt = 1 / 1000

    # build a signal within nyquist - the result will be the positive FFT with actual magnitude
    f0 = 200  # [Hz]
    t = np.arange(0, 1 + dt, dt)
    sig = 1 * np.sin(2 * np.pi * f0 * t) + \
        10 * np.sin(2 * np.pi * f0 / 2 * t) + \
        3 * np.sin(2 * np.pi * f0 / 4 * t) +\
        7.5 * np.sin(2 * np.pi * f0 / 5 * t)

    # res in freqs
    fftPlot(sig, dt=dt)
    # res in samples (if freqs axis is unknown)
    fftPlot(sig)


回答 5

此页面上已经有不错的解决方案,但是所有人都假定数据集是统一/均匀采样/分布的。我将尝试提供一个更一般的随机采样数据示例。我还将使用此MATLAB教程作为示例:

添加所需的模块:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
import scipy.signal

生成样本数据:

N = 600 # Number of samples
t = np.random.uniform(0.0, 1.0, N) # Assuming the time start is 0.0 and time end is 1.0
S = 1.0 * np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t)
X = S + 0.01 * np.random.randn(N) # Adding noise

排序数据集:

order = np.argsort(t)
ts = np.array(t)[order]
Xs = np.array(X)[order]

重采样:

T = (t.max() - t.min()) / N # Average period
Fs = 1 / T # Average sample rate frequency
f = Fs * np.arange(0, N // 2 + 1) / N; # Resampled frequency vector
X_new, t_new = scipy.signal.resample(Xs, N, ts)

绘制数据和重新采样的数据:

plt.xlim(0, 0.1)
plt.plot(t_new, X_new, label="resampled")
plt.plot(ts, Xs, label="org")
plt.legend()
plt.ylabel("X")
plt.xlabel("t")

现在计算FFT:

Y = scipy.fftpack.fft(X_new)
P2 = np.abs(Y / N)
P1 = P2[0 : N // 2 + 1]
P1[1 : -2] = 2 * P1[1 : -2]

plt.ylabel("Y")
plt.xlabel("f")
plt.plot(f, P1)

PS我终于有时间实施一个更规范的算法来获得不均匀分布数据的傅立叶变换。您可以在此处查看代码,说明和示例Jupyter笔记本。

There are already great solutions on this page, but all have assumed the dataset is uniformly/evenly sampled/distributed. I will try to provide a more general example of randomly sampled data. I will also use this MATLAB tutorial as an example:

Adding the required modules:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
import scipy.signal

Generating sample data:

N = 600 # number of samples
t = np.random.uniform(0.0, 1.0, N) # assuming the time start is 0.0 and time end is 1.0
S = 1.0 * np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t) 
X = S + 0.01 * np.random.randn(N) # adding noise

Sorting the data set:

order = np.argsort(t)
ts = np.array(t)[order]
Xs = np.array(X)[order]

Resampling:

T = (t.max() - t.min()) / N # average period 
Fs = 1 / T # average sample rate frequency
f = Fs * np.arange(0, N // 2 + 1) / N; # resampled frequency vector
X_new, t_new = scipy.signal.resample(Xs, N, ts)

plotting the data and resampled data:

plt.xlim(0, 0.1)
plt.plot(t_new, X_new, label="resampled")
plt.plot(ts, Xs, label="org")
plt.legend()
plt.ylabel("X")
plt.xlabel("t")

now calculating the fft:

Y = scipy.fftpack.fft(X_new)
P2 = np.abs(Y / N)
P1 = P2[0 : N // 2 + 1]
P1[1 : -2] = 2 * P1[1 : -2]

plt.ylabel("Y")
plt.xlabel("f")
plt.plot(f, P1)

P.S. I finally got time to implement a more canonical algorithm to get a Fourier transform of unevenly distributed data. You may see the code, description, and example Jupyter notebook here.


回答 6

我写了这个额外的答案,以解释使用FFT时尖峰扩散的根源,特别是讨论scipy.fftpack教程,我在某些时候对此表示不同意。

在此示例中,记录时间tmax=N*T=0.75。信号是sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)。频率信号应包含两个尖峰,其频率5080幅度分别为10.5。但是,如果所分析的信号没有整数周期,则由于信号的截断会出现扩散:

  • 派克1:50*tmax=37.5=>频率50不是的倍数1/tmax=>存在扩散的由于在该频率信号截断。
  • 派克2:80*tmax=60=>频率80是的倍数1/tmax=>无扩散由于在该频率信号截断。

这是一段代码,它分析的信号与教程(sin(50*2*pi*x) + 0.5*sin(80*2*pi*x))中的信号相同,但略有不同:

  1. 原始的scipy.fftpack示例。
  2. 原始scipy.fftpack示例具有整数个信号周期(tmax=1.0而不是0.75为了避免截断扩散)。
  3. 原始scipy.fftpack示例,其中包含整数个信号周期,并且日期和频率均取自FFT理论。

编码:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# 1. Linspace
N = 600
# Sample spacing
tmax = 3/4
T = tmax / N # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 2. Integer number of periods
tmax = 1
T = tmax / N # Sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 3. Correct positioning of dates relatively to FFT theory ('arange' instead of 'linspace')
tmax = 1
T = tmax / N # Sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]

fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positioning of dates')
plt.legend()
plt.grid()
plt.show()

输出:

就像这里可能的那样,即使使用整数周期,仍然会有一些扩散。此行为是由于scipy.fftpack教程中日期和频率的位置不正确造成的。因此,在离散傅立叶变换理论中:

  • 信号应t=0,T,...,(N-1)*T在T为采样周期且信号总持续时间为的日期进行评估tmax=N*T。请注意,我们在停下来tmax-T
  • 相关联的频率f=0,df,...,(N-1)*df,其中df=1/tmax=1/(N*T)是采样频率。信号的所有谐波应为采样频率的倍数,以避免扩散。

在上面的示例中,您可以看到使用arange代替linspace可以避免频谱中的额外扩散。此外,使用该linspace版本还会导致尖峰的偏移,该尖峰的频率略高于应有的尖峰,这在第一张图片中可以看到,在这些图片中,尖峰稍微位于频率50和的右边80

我将得出结论,用法示例应替换为以下代码(在我看来,这不太容易引起误解):

import numpy as np
from scipy.fftpack import fft

# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

输出(第二个峰值不再扩散):

我认为此答案仍会带来一些有关如何正确应用离散傅立叶变换的附加说明。显然,我的答案太长了,总是有其他要说的东西(例如,关于“别名”的简短讨论,关于“窗口化”的说法很多),所以我将停止。

我认为,在应用离散傅里叶变换时,深刻理解离散傅里叶变换的原理非常重要,因为我们都知道很多人在应用离散傅里叶变换时都会在其中添加因数以获得自己想要的东西。

I write this additionnal answer to explain the origins of the diffusion of the spikes when using fft and especially discuss the scipy.fftpack tutorial with which I disagree at some point.

In this example, the recording time tmax=N*T=0.75. The signal is sin(50*2*pi*x)+0.5*sin(80*2*pi*x). The frequency signal should contain 2 spikes at frequencies 50 and 80 with amplitudes 1 and 0.5. However, if the analysed signal does not have a integer number of periods diffusion can appear due to the truncation of the signal:

  • Pike 1: 50*tmax=37.5 => frequency 50 is not a multiple of 1/tmax => Presence of diffusion due to signal truncation at this frequency.
  • Pike 2: 80*tmax=60 => frequency 80 is a multiple of 1/tmax => No diffusion due to signal truncation at this frequency.

Here is a code that analyses the same signal as in the tutorial (sin(50*2*pi*x)+0.5*sin(80*2*pi*x)) but with the slight differences:

  1. The original scipy.fftpack example.
  2. The original scipy.fftpack example with an integer number of signal periods (tmax=1.0 instead of 0.75 to avoid truncation diffusion).
  3. The original scipy.fftpack example with an integer number of signal periods and where the dates and frequencies are taken from the FFT theory.

The code:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# 1. Linspace
N = 600
# sample spacing
tmax = 3/4
T = tmax / N # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 2. Integer number of periods
tmax = 1
T = tmax / N # sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 3. Correct positionning of dates relatively to FFT theory (arange instead of linspace)
tmax = 1
T = tmax / N # sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]

fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positionning of dates')
plt.legend()
plt.grid()
plt.show()

Output:

As it can be here, even with using an integer number of periods some diffusion still remains. This behaviour is due to a bad positionning of dates and frequencies in the scipy.fftpack tutorial. Hence, in the theory of discrete Fourier transforms:

  • the signal should be evaluated at dates t=0,T,...,(N-1)*T where T is the sampling period and the total duration of the signal is tmax=N*T. Note that we stop at tmax-T.
  • the associated frequencies are f=0,df,...,(N-1)*df where df=1/tmax=1/(N*T) is the sampling frequency. All harmonics of the signal should be multiple of the sampling frequency to avoid diffusion.

In the example above, you can see that the use of arange instead of linspace enables to avoid additional diffusion in the frequency spectrum. Moreover, using the linspace version also leads to an offset of the spikes that are located at slightly higher frequencies than what they should be as it can be seen in the first picture where the spikes are a little bit at the right of the frequencies 50 and 80.

I’ll just conclude that the example of usage should be replace by the following code (which is less misleading in my opinion):

import numpy as np
from scipy.fftpack import fft
# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

Output (the second spike is not diffused anymore):

I think this answer still bring some additional explanations on how to apply correctly discrete Fourier transform. Obviously, my answer is too long and there is always additional things to say (@ewerlopes talked briefly about aliasing for instance and a lot can be said about windowing) so I’ll stop. I think that it is very important to understand deeply the principles of discrete Fourier transform when applying it because we all know so much people adding factors here and there when applying it in order to obtain what they want.


使用快速傅立叶变换分析音频

问题:使用快速傅立叶变换分析音频

我正在尝试在python中创建图形频谱分析仪。

我目前正在读取16位双通道,44,100 Hz采样率音频流的1024个字节,并将两个通道的幅度平均在一起。因此,现在我有一系列256条带符号的短裤。现在,我想使用numpy之类的模块在该阵列上执行fft,并使用结果创建图形频谱分析仪,其开始时只有32条。

我已经阅读了有关快速傅立叶变换和离散傅立叶变换的维基百科文章,但是我仍然不清楚结果数组代表什么。这是我使用numpy在数组上执行fft后的数组外观:

   [ -3.37260500e+05 +0.00000000e+00j   7.11787022e+05 +1.70667403e+04j
   4.10040193e+05 +3.28653370e+05j   9.90933073e+04 +1.60555003e+05j
   2.28787050e+05 +3.24141951e+05j   2.09781047e+04 +2.31063376e+05j
  -2.15941453e+05 +1.63773851e+05j  -7.07833051e+04 +1.52467334e+05j
  -1.37440802e+05 +6.28107674e+04j  -7.07536614e+03 +5.55634993e+03j
  -4.31009964e+04 -1.74891657e+05j   1.39384348e+05 +1.95956947e+04j
   1.73613033e+05 +1.16883207e+05j   1.15610357e+05 -2.62619884e+04j
  -2.05469722e+05 +1.71343186e+05j  -1.56779748e+04 +1.51258101e+05j
  -2.08639913e+05 +6.07372799e+04j  -2.90623668e+05 -2.79550838e+05j
  -1.68112214e+05 +4.47877871e+04j  -1.21289916e+03 +1.18397979e+05j
  -1.55779104e+05 +5.06852464e+04j   1.95309737e+05 +1.93876325e+04j
  -2.80400414e+05 +6.90079265e+04j   1.25892113e+04 -1.39293422e+05j
   3.10709174e+04 -1.35248953e+05j   1.31003438e+05 +1.90799303e+05j...

我想知道这些数字究竟代表什么,以及如何将这些数字转换为32个条形图的每个高度的百分比。另外,我应该将两个通道平均在一起吗?

I am trying to create a graphical spectrum analyzer in python.

I am currently reading 1024 bytes of a 16 bit dual channel 44,100 Hz sample rate audio stream and averaging the amplitude of the 2 channels together. So now I have an array of 256 signed shorts. I now want to preform a fft on that array, using a module like numpy, and use the result to create the graphical spectrum analyzer, which, to start will just be 32 bars.

I have read the wikipedia articles on Fast Fourier Transform and Discrete Fourier Transform but I am still unclear of what the resulting array represents. This is what the array looks like after I preform an fft on my array using numpy:

   [ -3.37260500e+05 +0.00000000e+00j   7.11787022e+05 +1.70667403e+04j
   4.10040193e+05 +3.28653370e+05j   9.90933073e+04 +1.60555003e+05j
   2.28787050e+05 +3.24141951e+05j   2.09781047e+04 +2.31063376e+05j
  -2.15941453e+05 +1.63773851e+05j  -7.07833051e+04 +1.52467334e+05j
  -1.37440802e+05 +6.28107674e+04j  -7.07536614e+03 +5.55634993e+03j
  -4.31009964e+04 -1.74891657e+05j   1.39384348e+05 +1.95956947e+04j
   1.73613033e+05 +1.16883207e+05j   1.15610357e+05 -2.62619884e+04j
  -2.05469722e+05 +1.71343186e+05j  -1.56779748e+04 +1.51258101e+05j
  -2.08639913e+05 +6.07372799e+04j  -2.90623668e+05 -2.79550838e+05j
  -1.68112214e+05 +4.47877871e+04j  -1.21289916e+03 +1.18397979e+05j
  -1.55779104e+05 +5.06852464e+04j   1.95309737e+05 +1.93876325e+04j
  -2.80400414e+05 +6.90079265e+04j   1.25892113e+04 -1.39293422e+05j
   3.10709174e+04 -1.35248953e+05j   1.31003438e+05 +1.90799303e+05j...

I am wondering what exactly these numbers represent and how I would convert these numbers into a percentage of a height for each of the 32 bars. Also, should I be averaging the 2 channels together?


回答 0

您要显示的阵列是音频信号的傅立叶变换系数。这些系数可用于获取音频的频率内容。FFT是为复数值输入函数定义的,因此即使您输入的都是实数值,您得出的系数也将是虚数。为了获得每个频率的功率量,您需要计算每个频率的FFT系数的大小。这不仅是系数的实部,还需要计算其实部和虚部的平方和的平方根。也就是说,如果您的系数为a + b * j,则其大小为sqrt(a ^ 2 + b ^ 2)。

一旦计算了每个FFT系数的幅度,就需要弄清楚每个FFT系数属于哪个音频。N点FFT将为您提供从0开始的N个等间隔频率的信号频率内容。因为您的采样频率为44100个样本/秒。并且FFT中的点数为256,则您的频率间隔为44100/256 = 172 Hz(大约)

数组中的第一个系数将是0频率系数。这基本上是所有频率的平均功率水平。其余的系数将从0开始以172 Hz的倍数递增,直到达到128。在FFT中,您最多只能测量一半采样点的频率。阅读有关奈奎斯特频率如果您是惩罚的嘴,并且需要知道为什么,请奈奎斯特-香农采样定理,但基本的结果是,您的低频将被复制或混叠在高频频段中。因此,频率将从0开始,对每个系数增加172 Hz,直到N / 2系数,然后降低172 Hz,直到N-1系数。

那应该是足够的信息来帮助您入门。如果您想对FFT进行比维基百科更平易近人的介绍,则可以尝试了解数字信号处理:第二版。。这对我很有帮助。

这就是这些数字所代表的含义。可以通过将每个频率分量幅度乘以所有分量幅度的总和来转换为高度百分比。虽然,这只能为您提供相对频率分布的表示,而不是每个频率的实际功率。您可以尝试按频率分量的最大幅度进行缩放,但我不确定该显示效果是否很好。找到可行的比例因子的最快方法是对响亮和柔和的音频信号进行实验,以找到正确的设置。

最后,如果要整体显示整个音频信号的频率内容,则应将两个通道平均在一起。您正在将立体声音频混合为单声道音频并显示组合的频率。如果您想要左右两个频率分别显示,那么您将需要在每个通道上分别执行傅立叶变换。

The array you are showing is the Fourier Transform coefficients of the audio signal. These coefficients can be used to get the frequency content of the audio. The FFT is defined for complex valued input functions, so the coefficients you get out will be imaginary numbers even though your input is all real values. In order to get the amount of power in each frequency, you need to calculate the magnitude of the FFT coefficient for each frequency. This is not just the real component of the coefficient, you need to calculate the square root of the sum of the square of its real and imaginary components. That is, if your coefficient is a + b*j, then its magnitude is sqrt(a^2 + b^2).

Once you have calculated the magnitude of each FFT coefficient, you need to figure out which audio frequency each FFT coefficient belongs to. An N point FFT will give you the frequency content of your signal at N equally spaced frequencies, starting at 0. Because your sampling frequency is 44100 samples / sec. and the number of points in your FFT is 256, your frequency spacing is 44100 / 256 = 172 Hz (approximately)

The first coefficient in your array will be the 0 frequency coefficient. That is basically the average power level for all frequencies. The rest of your coefficients will count up from 0 in multiples of 172 Hz until you get to 128. In an FFT, you only can measure frequencies up to half your sample points. Read these links on the Nyquist Frequency and Nyquist-Shannon Sampling Theorem if you are a glutton for punishment and need to know why, but the basic result is that your lower frequencies are going to be replicated or aliased in the higher frequency buckets. So the frequencies will start from 0, increase by 172 Hz for each coefficient up to the N/2 coefficient, then decrease by 172 Hz until the N – 1 coefficient.

That should be enough information to get you started. If you would like a much more approachable introduction to FFTs than is given on Wikipedia, you could try Understanding Digital Signal Processing: 2nd Ed.. It was very helpful for me.

So that is what those numbers represent. Converting to a percentage of height could be done by scaling each frequency component magnitude by the sum of all component magnitudes. Although, that would only give you a representation of the relative frequency distribution, and not the actual power for each frequency. You could try scaling by the maximum magnitude possible for a frequency component, but I’m not sure that that would display very well. The quickest way to find a workable scaling factor would be to experiment on loud and soft audio signals to find the right setting.

Finally, you should be averaging the two channels together if you want to show the frequency content of the entire audio signal as a whole. You are mixing the stereo audio into mono audio and showing the combined frequencies. If you want two separate displays for right and left frequencies, then you will need to perform the Fourier Transform on each channel separately.


回答 1

尽管此线程已有多年历史,但我发现它很有帮助。我只想将我的意见提供给发现此问题并试图创建类似内容的任何人。

至于条形划分,这不应该像antti所建议的那样进行,而是根据条形数将数据均分。最有用的是将数据分成八度,每个八度是前一个频率的两倍。(即100hz是50hz之上的一个八度,这是25hz之上的一个八度)。

根据所需的小节,将整个范围划分为1 / X八度范围。根据横条上给定的中心频率A,可以从以下项获得横条的上限和下限:

upper limit = A * 2 ^ ( 1 / 2X )
lower limit = A / 2 ^ ( 1 / 2X )

要计算下一个相邻的中心频率,请使用类似的计算方法:

next lower =  A / 2 ^ ( 1 / X )
next higher = A * 2 ^ ( 1 / X )

然后,您可以对适合这些范围的数据取平均值,以获取每个条形图的幅度。

例如:我们想要划分为1/3个八度音程,并且我们从1khz的中心频率开始。

Upper limit = 1000 * 2 ^ ( 1 / ( 2 * 3 ) ) = 1122.5
Lower limit = 1000 / 2 ^ ( 1 / ( 2 * 3 ) ) =  890.9

给定44100hz和1024个样本(每个数据点之间为43hz),我们应该取平均值21到26。(890.9 / 43 = 20.72〜21和1122.5 / 43 = 26.10〜26)

(1/3八度音阶将使您在〜40hz和〜20khz之间大约30个音阶)。如您现在所知道的,随着我们的提高,我们将平均更大范围的数字。低条通常仅包含1个或少量数据点。而较高的柱可以是数百个点的平均值。原因是86hz比43hz高八度…而10086hz的声音与10043hz几乎相同。

Although this thread is years old, I found it very helpful. I just wanted to give my input to anyone who finds this and are trying to create something similar.

As for the division into bars this should not be done as antti suggest, by dividing the data equally based on the number of bars. The most useful would be to divide the data into octave parts, each octave being double the frequency of the previous. (ie. 100hz is one octave above 50hz, which is one octave above 25hz).

Depending on how many bars you want, you divide the whole range into 1/X octave ranges. Based on a given center frequency of A on the bar, you get the upper and lower limits of the bar from:

upper limit = A * 2 ^ ( 1 / 2X )
lower limit = A / 2 ^ ( 1 / 2X )

To calculate the next adjoining center frequency you use a similar calculation:

next lower =  A / 2 ^ ( 1 / X )
next higher = A * 2 ^ ( 1 / X )

You then average the data that fits into these ranges to get the amplitude for each bar.

For example: We want to divide into 1/3 octaves ranges and we start with a center frequency of 1khz.

Upper limit = 1000 * 2 ^ ( 1 / ( 2 * 3 ) ) = 1122.5
Lower limit = 1000 / 2 ^ ( 1 / ( 2 * 3 ) ) =  890.9

Given 44100hz and 1024 samples (43hz between each data point) we should average out values 21 through 26. ( 890.9 / 43 = 20.72 ~ 21 and 1122.5 / 43 = 26.10 ~ 26 )

(1/3 octave bars would get you around 30 bars between ~40hz and ~20khz). As you can figure out by now, as we go higher we will average a larger range of numbers. Low bars typically only include 1 or a small number of data points. While the higher bars can be the average of hundreds of points. The reason being that 86hz is an octave above 43hz… while 10086hz sounds almost the same as 10043hz.


回答 2

您所拥有的是一个时间长度为256/44100 = 0.00580499秒的样本。这意味着您的频率分辨率为1 / 0.00580499 = 172 Hz。从Python中获得的256个值基本上对应于从86 Hz到255 * 172 + 86 Hz = 43946 Hz的频率。您得到的数字是复数(因此,第二个数字的末尾是“ j”)。

编辑:固定错误信息

您需要通过计算sqrt(i 2 + j 2)将复数转换为幅度,其中i和j是实部和虚部。

如果要有32条,就我所知,应该取四个连续振幅的平均值,得到256/4 = 32条。

what you have is a sample whose length in time is 256/44100 = 0.00580499 seconds. This means that your frequency resolution is 1 / 0.00580499 = 172 Hz. The 256 values you get out from Python correspond to the frequencies, basically, from 86 Hz to 255*172+86 Hz = 43946 Hz. The numbers you get out are complex numbers (hence the “j” at the end of every second number).

EDITED: FIXED WRONG INFORMATION

You need to convert the complex numbers into amplitude by calculating the sqrt(i2 + j2) where i and j are the real and imaginary parts, resp.

If you want to have 32 bars, you should as far as I understand take the average of four successive amplitudes, getting 256 / 4 = 32 bars as you want.


回答 3

FFT返回N个复数值,您可以计算其中的一个module=sqrt(real_part^2+imaginary_part^2)。要获得每个频段的值,您必须对频段内所有谐波的模块求和。您可以在下面看到有关10 bar频谱分析仪的示例。必须包装C代码以获得pyd python模块。

float *samples_vett;
float *out_filters_vett;
int Nsamples;
float band_power = 0.0;
float harmonic_amplitude=0.0;
int i, out_index;

out_index=0;


for (i = 0; i < Nsamples / 2 + 1; i++)       
        {
            if (i == 1 || i == 2 || i == 4 || i == 8 || i == 17 || i == 33 || i == 66 || i == 132 || i == 264 || i == 511)
            {
                out_filters_vett[out_index] = band_power; 
                band_power = 0; 
                out_index++;  
            }

            harmonic_amplitude = sqrt(pow(ttfr_out_vett[i].r, 2) + pow(ttfr_out_vett[i].i, 2));
            band_power += harmonic_amplitude;

        }

我用Python设计并制作了整个10 led条形频谱分析仪。取而代之的是使用nunmpy库(太大而没有用,无法仅获取FFT),而是创建了一个python pyd模块(仅27KB)来获取FFT并将整个音频频谱拆分为多个频段。

此外,要读取输出音频,还创建了回送WASapi portaudio pyd模块。您可以在图像10BarsSpectrumAnalyzerWithWASapi.jpg中看到项目(框图)。

刚刚在我的YouTube频道上添加了一个教程视频:如何设计和制作非常聪明的Python Spectrum Analyzer 10 LED条形图

FFT return N complex values which of you can compute the module=sqrt(real_part^2+imaginary_part^2). To get the value for each band you have to sum the modules about all harmonics inside the band. Below you can see an example about a 10 bars spectrum analyzer. The c code has to be wrapped to get a pyd python module.

float *samples_vett;
float *out_filters_vett;
int Nsamples;
float band_power = 0.0;
float harmonic_amplitude=0.0;
int i, out_index;

out_index=0;


for (i = 0; i < Nsamples / 2 + 1; i++)       
        {
            if (i == 1 || i == 2 || i == 4 || i == 8 || i == 17 || i == 33 || i == 66 || i == 132 || i == 264 || i == 511)
            {
                out_filters_vett[out_index] = band_power; 
                band_power = 0; 
                out_index++;  
            }

            harmonic_amplitude = sqrt(pow(ttfr_out_vett[i].r, 2) + pow(ttfr_out_vett[i].i, 2));
            band_power += harmonic_amplitude;

        }

I designed and made a whole 10 led bar spectrum analyzer by Python. Instead to use the nunmpy library (too big and useless to get just the FFT) a python pyd module (just 27KB) to get the FFT and to split the entire audio spectrum to bands was created.

In addition, to read the output audio a loopback WASapi portaudio pyd module was created. You can see the project (block diagram) in the image 10BarsSpectrumAnalyzerWithWASapi.jpg

Just added a tutorial video on my YouTube channel: how to design and make a very smart Python Spectrum Analyzer 10 Led Bar