Python 实用宝典

Question 1

我可以访问NumPy和SciPy，并希望为数据集创建一个简单的FFT。我有两个列表，一个是y值，另一个是这些y值的时间戳。

将这些列表输入SciPy或NumPy方法并绘制所得FFT的最简单方法是什么？

我查看了示例，但是它们都依赖于创建具有一定数量的数据点和频率等的伪造数据集，而并没有真正展示如何仅使用一组数据和相应的时间戳来做到这一点。。

我尝试了以下示例：

from scipy.fftpack import fft

# Number of samplepoints
N = 600

# Sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.grid()
plt.show()

但是，当我更改fft数据集的参数并将其绘制成图时，会得到极其奇怪的结果，而且看起来频率的比例可能不正确。我不确定

这是我尝试FFT的数据的粘贴框

http://pastebin.com/0WhjjMkb http://pastebin.com/ksM4FvZS

当我fft()在整个事情上使用时，它的峰值只有零，而没有别的。

这是我的代码：

## Perform FFT with SciPy
signalFFT = fft(yInterp)

## Get power spectral density
signalPSD = np.abs(signalFFT) ** 2

## Get frequencies corresponding to signal PSD
fftFreq = fftfreq(len(signalPSD), spacing)

## Get positive half of frequencies
i = fftfreq>0

##
plt.figurefigsize = (8, 4));
plt.plot(fftFreq[i], 10*np.log10(signalPSD[i]));
#plt.xlim(0, 100);
plt.xlabel('Frequency [Hz]');
plt.ylabel('PSD [dB]')

间距等于xInterp[1]-xInterp[0]。

Question 2

I have access to NumPy and SciPy and want to create a simple FFT of a data set. I have two lists, one that is y values and the other is timestamps for those y values.

What is the simplest way to feed these lists into a SciPy or NumPy method and plot the resulting FFT?

I have looked up examples, but they all rely on creating a set of fake data with some certain number of data points, and frequency, etc. and don’t really show how to do it with just a set of data and the corresponding timestamps.

I have tried the following example:

from scipy.fftpack import fft

# Number of samplepoints
N = 600

# Sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.grid()
plt.show()

But when I change the argument of fft to my data set and plot it, I get extremely odd results, and it appears the scaling for the frequency may be off. I am unsure.

Here is a pastebin of the data I am attempting to FFT

http://pastebin.com/0WhjjMkb http://pastebin.com/ksM4FvZS

When I use fft() on the whole thing it just has a huge spike at zero and nothing else.

Here is my code:

## Perform FFT with SciPy
signalFFT = fft(yInterp)

## Get power spectral density
signalPSD = np.abs(signalFFT) ** 2

## Get frequencies corresponding to signal PSD
fftFreq = fftfreq(len(signalPSD), spacing)

## Get positive half of frequencies
i = fftfreq>0

##
plt.figurefigsize = (8, 4));
plt.plot(fftFreq[i], 10*np.log10(signalPSD[i]));
#plt.xlim(0, 100);
plt.xlabel('Frequency [Hz]');
plt.ylabel('PSD [dB]')

Spacing is just equal to xInterp[1]-xInterp[0].

Question 3

因此，我在IPython笔记本中运行了功能等效的代码形式：

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

fig, ax = plt.subplots()
ax.plot(xf, 2.0/N * np.abs(yf[:N//2]))
plt.show()

我得到了我认为非常合理的输出。

自从我在工学院开始考虑信号处理以来，这个时间就比我想承认的要长，但是峰值在50和80正是我所期望的。那是什么问题呢？

响应发布的原始数据和评论

这里的问题是您没有定期数据。您应该始终检查输入任何算法的数据，以确保它是适当的。

import pandas
import matplotlib.pyplot as plt
#import seaborn
%matplotlib inline

# the OP's data
x = pandas.read_csv('http://pastebin.com/raw.php?i=ksM4FvZS', skiprows=2, header=None).values
y = pandas.read_csv('http://pastebin.com/raw.php?i=0WhjjMkb', skiprows=2, header=None).values
fig, ax = plt.subplots()
ax.plot(x, y)

Question 4

So I run a functionally equivalent form of your code in an IPython notebook:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

fig, ax = plt.subplots()
ax.plot(xf, 2.0/N * np.abs(yf[:N//2]))
plt.show()

I get what I believe to be very reasonable output.

It’s been longer than I care to admit since I was in engineering school thinking about signal processing, but spikes at 50 and 80 are exactly what I would expect. So what’s the issue?

In response to the raw data and comments being posted

The problem here is that you don’t have periodic data. You should always inspect the data that you feed into any algorithm to make sure that it’s appropriate.

import pandas
import matplotlib.pyplot as plt
#import seaborn
%matplotlib inline

# the OP's data
x = pandas.read_csv('http://pastebin.com/raw.php?i=ksM4FvZS', skiprows=2, header=None).values
y = pandas.read_csv('http://pastebin.com/raw.php?i=0WhjjMkb', skiprows=2, header=None).values
fig, ax = plt.subplots()
ax.plot(x, y)

Question 5

关于fft的重要一点是，它只能应用于时间戳统一的数据（即时间上的统一采样，如上面所示）。

如果采样不均匀，请使用函数拟合数据。有几种教程和功能可供选择：

https://github.com/tiagopereira/python_tips/wiki/Scipy%3A-curve-fitting http://docs.scipy.org/doc/numpy/reference/generation/numpy.polyfit.html

如果无法选择拟合，则可以直接使用某种形式的插值将数据插值为统一采样：

https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html

当您有统一的样本时，您只需要担心t[1] - t[0]样本的时间增量（）。在这种情况下，您可以直接使用fft函数

Y    = numpy.fft.fft(y)
freq = numpy.fft.fftfreq(len(y), t[1] - t[0])

pylab.figure()
pylab.plot( freq, numpy.abs(Y) )
pylab.figure()
pylab.plot(freq, numpy.angle(Y) )
pylab.show()

这应该可以解决您的问题。

Question 6

The important thing about fft is that it can only be applied to data in which the timestamp is uniform (i.e. uniform sampling in time, like what you have shown above).

In case of non-uniform sampling, please use a function for fitting the data. There are several tutorials and functions to choose from:

https://github.com/tiagopereira/python_tips/wiki/Scipy%3A-curve-fitting http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html

If fitting is not an option, you can directly use some form of interpolation to interpolate data to a uniform sampling:

https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/interpolate.html

When you have uniform samples, you will only have to wory about the time delta (t[1] - t[0]) of your samples. In this case, you can directly use the fft functions

Y    = numpy.fft.fft(y)
freq = numpy.fft.fftfreq(len(y), t[1] - t[0])

pylab.figure()
pylab.plot( freq, numpy.abs(Y) )
pylab.figure()
pylab.plot(freq, numpy.angle(Y) )
pylab.show()

This should solve your problem.

Question 7

您的高尖峰信号是由于信号的DC（不变，即freq = 0）部分引起的。这是规模问题。如果要查看非DC频率内容，则为了可视化，可能需要从偏移量1而不是信号FFT的偏移量0绘制。

修改@PaulH上面给出的示例

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 10 + np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

plt.subplot(2, 1, 1)
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.subplot(2, 1, 2)
plt.plot(xf[1:], 2.0/N * np.abs(yf[0:N/2])[1:])

输出图：

另一种方法是以对数刻度可视化数据：

使用：

plt.semilogy(xf, 2.0/N * np.abs(yf[0:N/2]))

将会呈现：

Question 8

The high spike that you have is due to the DC (non-varying, i.e. freq = 0) portion of your signal. It’s an issue of scale. If you want to see non-DC frequency content, for visualization, you may need to plot from the offset 1 not from offset 0 of the FFT of the signal.

Modifying the example given above by @PaulH

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = np.linspace(0.0, N*T, N)
y = 10 + np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = scipy.fftpack.fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N/2)

plt.subplot(2, 1, 1)
plt.plot(xf, 2.0/N * np.abs(yf[0:N/2]))
plt.subplot(2, 1, 2)
plt.plot(xf[1:], 2.0/N * np.abs(yf[0:N/2])[1:])

The output plots:

Another way, is to visualize the data in log scale:

Using:

plt.semilogy(xf, 2.0/N * np.abs(yf[0:N/2]))

Will show:

Question 9

作为对已经给出的答案的补充，我想指出的是，经常需要考虑FFT的bin大小。测试一堆值并选择对您的应用程序更有意义的值将是有意义的。通常，它与样本数量的大小相同。给出的大多数答案都假定了这一点，并且产生了很好且合理的结果。如果有人想探索一下，这是我的代码版本：

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

fig = plt.figure(figsize=[14,4])
N = 600           # Number of samplepoints
Fs = 800.0
T = 1.0 / Fs      # N_samps*T (#samples x sample period) is the sample spacing.
N_fft = 80        # Number of bins (chooses granularity)
x = np.linspace(0, N*T, N)     # the interval
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)   # the signal

# removing the mean of the signal
mean_removed = np.ones_like(y)*np.mean(y)
y = y - mean_removed

# Compute the fft.
yf = scipy.fftpack.fft(y,n=N_fft)
xf = np.arange(0,Fs,Fs/N_fft)

##### Plot the fft #####
ax = plt.subplot(121)
pt, = ax.plot(xf,np.abs(yf), lw=2.0, c='b')
p = plt.Rectangle((Fs/2, 0), Fs/2, ax.get_ylim()[1], facecolor="grey", fill=True, alpha=0.75, hatch="/", zorder=3)
ax.add_patch(p)
ax.set_xlim((ax.get_xlim()[0],Fs))
ax.set_title('FFT', fontsize= 16, fontweight="bold")
ax.set_ylabel('FFT magnitude (power)')
ax.set_xlabel('Frequency (Hz)')
plt.legend((p,), ('mirrowed',))
ax.grid()

##### Close up on the graph of fft#######
# This is the same histogram above, but truncated at the max frequence + an offset. 
offset = 1    # just to help the visualization. Nothing important.
ax2 = fig.add_subplot(122)
ax2.plot(xf,np.abs(yf), lw=2.0, c='b')
ax2.set_xticks(xf)
ax2.set_xlim(-1,int(Fs/6)+offset)
ax2.set_title('FFT close-up', fontsize= 16, fontweight="bold")
ax2.set_ylabel('FFT magnitude (power) - log')
ax2.set_xlabel('Frequency (Hz)')
ax2.hold(True)
ax2.grid()

plt.yscale('log')

输出图：

Question 10

Just as a complement to the answers already given, I would like to point out that often it is important to play with the size of the bins for the FFT. It would make sense to test a bunch of values and pick the one that makes more sense to your application. Often, it is in the same magnitude of the number of samples. This was as assumed by most of the answers given, and produces great and reasonable results. In case one wants to explore that, here is my code version:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

fig = plt.figure(figsize=[14,4])
N = 600           # Number of samplepoints
Fs = 800.0
T = 1.0 / Fs      # N_samps*T (#samples x sample period) is the sample spacing.
N_fft = 80        # Number of bins (chooses granularity)
x = np.linspace(0, N*T, N)     # the interval
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)   # the signal

# removing the mean of the signal
mean_removed = np.ones_like(y)*np.mean(y)
y = y - mean_removed

# Compute the fft.
yf = scipy.fftpack.fft(y,n=N_fft)
xf = np.arange(0,Fs,Fs/N_fft)

##### Plot the fft #####
ax = plt.subplot(121)
pt, = ax.plot(xf,np.abs(yf), lw=2.0, c='b')
p = plt.Rectangle((Fs/2, 0), Fs/2, ax.get_ylim()[1], facecolor="grey", fill=True, alpha=0.75, hatch="/", zorder=3)
ax.add_patch(p)
ax.set_xlim((ax.get_xlim()[0],Fs))
ax.set_title('FFT', fontsize= 16, fontweight="bold")
ax.set_ylabel('FFT magnitude (power)')
ax.set_xlabel('Frequency (Hz)')
plt.legend((p,), ('mirrowed',))
ax.grid()

##### Close up on the graph of fft#######
# This is the same histogram above, but truncated at the max frequence + an offset. 
offset = 1    # just to help the visualization. Nothing important.
ax2 = fig.add_subplot(122)
ax2.plot(xf,np.abs(yf), lw=2.0, c='b')
ax2.set_xticks(xf)
ax2.set_xlim(-1,int(Fs/6)+offset)
ax2.set_title('FFT close-up', fontsize= 16, fontweight="bold")
ax2.set_ylabel('FFT magnitude (power) - log')
ax2.set_xlabel('Frequency (Hz)')
ax2.hold(True)
ax2.grid()

plt.yscale('log')

the output plots:

Question 11

我建立了一个函数，用于绘制真实信号的FFT。与先前的答案相比，我的功能额外的好处是您可以获得信号的实际幅度。

另外，由于假设是真实信号，因此FFT是对称的，因此我们只能绘制x轴的正向：

import matplotlib.pyplot as plt
import numpy as np
import warnings


def fftPlot(sig, dt=None, plot=True):
    # Here it's assumes analytic signal (real signal...) - so only half of the axis is required

    if dt is None:
        dt = 1
        t = np.arange(0, sig.shape[-1])
        xLabel = 'samples'
    else:
        t = np.arange(0, sig.shape[-1]) * dt
        xLabel = 'freq [Hz]'

    if sig.shape[0] % 2 != 0:
        warnings.warn("signal preferred to be even in size, autoFixing it...")
        t = t[0:-1]
        sig = sig[0:-1]

    sigFFT = np.fft.fft(sig) / t.shape[0]  # Divided by size t for coherent magnitude

    freq = np.fft.fftfreq(t.shape[0], d=dt)

    # Plot analytic signal - right half of frequence axis needed only...
    firstNegInd = np.argmax(freq < 0)
    freqAxisPos = freq[0:firstNegInd]
    sigFFTPos = 2 * sigFFT[0:firstNegInd]  # *2 because of magnitude of analytic signal

    if plot:
        plt.figure()
        plt.plot(freqAxisPos, np.abs(sigFFTPos))
        plt.xlabel(xLabel)
        plt.ylabel('mag')
        plt.title('Analytic FFT plot')
        plt.show()

    return sigFFTPos, freqAxisPos


if __name__ == "__main__":
    dt = 1 / 1000

    # Build a signal within Nyquist - the result will be the positive FFT with actual magnitude
    f0 = 200  # [Hz]
    t = np.arange(0, 1 + dt, dt)
    sig = 1 * np.sin(2 * np.pi * f0 * t) + \
        10 * np.sin(2 * np.pi * f0 / 2 * t) + \
        3 * np.sin(2 * np.pi * f0 / 4 * t) +\
        7.5 * np.sin(2 * np.pi * f0 / 5 * t)

    # Result in frequencies
    fftPlot(sig, dt=dt)
    # Result in samples (if the frequencies axis is unknown)
    fftPlot(sig)

Question 12

I’ve built a function that deals with plotting FFT of real signals. The extra bonus in my function relative to the messages above is that you get the ACTUAL amplitude of the signal. Also, because of the assumption of a real signal, the FFT is symmetric so we can plot only the positive side of the x axis:

import matplotlib.pyplot as plt
import numpy as np
import warnings


def fftPlot(sig, dt=None, plot=True):
    # here it's assumes analytic signal (real signal...)- so only half of the axis is required

    if dt is None:
        dt = 1
        t = np.arange(0, sig.shape[-1])
        xLabel = 'samples'
    else:
        t = np.arange(0, sig.shape[-1]) * dt
        xLabel = 'freq [Hz]'

    if sig.shape[0] % 2 != 0:
        warnings.warn("signal prefered to be even in size, autoFixing it...")
        t = t[0:-1]
        sig = sig[0:-1]

    sigFFT = np.fft.fft(sig) / t.shape[0]  # divided by size t for coherent magnitude

    freq = np.fft.fftfreq(t.shape[0], d=dt)

    # plot analytic signal - right half of freq axis needed only...
    firstNegInd = np.argmax(freq < 0)
    freqAxisPos = freq[0:firstNegInd]
    sigFFTPos = 2 * sigFFT[0:firstNegInd]  # *2 because of magnitude of analytic signal

    if plot:
        plt.figure()
        plt.plot(freqAxisPos, np.abs(sigFFTPos))
        plt.xlabel(xLabel)
        plt.ylabel('mag')
        plt.title('Analytic FFT plot')
        plt.show()

    return sigFFTPos, freqAxisPos


if __name__ == "__main__":
    dt = 1 / 1000

    # build a signal within nyquist - the result will be the positive FFT with actual magnitude
    f0 = 200  # [Hz]
    t = np.arange(0, 1 + dt, dt)
    sig = 1 * np.sin(2 * np.pi * f0 * t) + \
        10 * np.sin(2 * np.pi * f0 / 2 * t) + \
        3 * np.sin(2 * np.pi * f0 / 4 * t) +\
        7.5 * np.sin(2 * np.pi * f0 / 5 * t)

    # res in freqs
    fftPlot(sig, dt=dt)
    # res in samples (if freqs axis is unknown)
    fftPlot(sig)

Question 13

此页面上已经有不错的解决方案，但是所有人都假定数据集是统一/均匀采样/分布的。我将尝试提供一个更一般的随机采样数据示例。我还将使用此MATLAB教程作为示例：

添加所需的模块：

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
import scipy.signal

生成样本数据：

N = 600 # Number of samples
t = np.random.uniform(0.0, 1.0, N) # Assuming the time start is 0.0 and time end is 1.0
S = 1.0 * np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t)
X = S + 0.01 * np.random.randn(N) # Adding noise

排序数据集：

order = np.argsort(t)
ts = np.array(t)[order]
Xs = np.array(X)[order]

重采样：

T = (t.max() - t.min()) / N # Average period
Fs = 1 / T # Average sample rate frequency
f = Fs * np.arange(0, N // 2 + 1) / N; # Resampled frequency vector
X_new, t_new = scipy.signal.resample(Xs, N, ts)

绘制数据和重新采样的数据：

plt.xlim(0, 0.1)
plt.plot(t_new, X_new, label="resampled")
plt.plot(ts, Xs, label="org")
plt.legend()
plt.ylabel("X")
plt.xlabel("t")

现在计算FFT：

Y = scipy.fftpack.fft(X_new)
P2 = np.abs(Y / N)
P1 = P2[0 : N // 2 + 1]
P1[1 : -2] = 2 * P1[1 : -2]

plt.ylabel("Y")
plt.xlabel("f")
plt.plot(f, P1)

PS我终于有时间实施一个更规范的算法来获得不均匀分布数据的傅立叶变换。您可以在此处查看代码，说明和示例Jupyter笔记本。

Question 14

There are already great solutions on this page, but all have assumed the dataset is uniformly/evenly sampled/distributed. I will try to provide a more general example of randomly sampled data. I will also use this MATLAB tutorial as an example:

Adding the required modules:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
import scipy.signal

Generating sample data:

N = 600 # number of samples
t = np.random.uniform(0.0, 1.0, N) # assuming the time start is 0.0 and time end is 1.0
S = 1.0 * np.sin(50.0 * 2 * np.pi * t) + 0.5 * np.sin(80.0 * 2 * np.pi * t) 
X = S + 0.01 * np.random.randn(N) # adding noise

Sorting the data set:

order = np.argsort(t)
ts = np.array(t)[order]
Xs = np.array(X)[order]

Resampling:

T = (t.max() - t.min()) / N # average period 
Fs = 1 / T # average sample rate frequency
f = Fs * np.arange(0, N // 2 + 1) / N; # resampled frequency vector
X_new, t_new = scipy.signal.resample(Xs, N, ts)

plotting the data and resampled data:

plt.xlim(0, 0.1)
plt.plot(t_new, X_new, label="resampled")
plt.plot(ts, Xs, label="org")
plt.legend()
plt.ylabel("X")
plt.xlabel("t")

now calculating the fft:

Y = scipy.fftpack.fft(X_new)
P2 = np.abs(Y / N)
P1 = P2[0 : N // 2 + 1]
P1[1 : -2] = 2 * P1[1 : -2]

plt.ylabel("Y")
plt.xlabel("f")
plt.plot(f, P1)

P.S. I finally got time to implement a more canonical algorithm to get a Fourier transform of unevenly distributed data. You may see the code, description, and example Jupyter notebook here.

Question 15

我写了这个额外的答案，以解释使用FFT时尖峰扩散的根源，特别是讨论scipy.fftpack教程，我在某些时候对此表示不同意。

在此示例中，记录时间tmax=N*T=0.75。信号是sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)。频率信号应包含两个尖峰，其频率50和80幅度分别为1和0.5。但是，如果所分析的信号没有整数周期，则由于信号的截断会出现扩散：

派克1：50*tmax=37.5=>频率50不是的倍数1/tmax=>存在扩散的由于在该频率信号截断。
派克2：80*tmax=60=>频率80是的倍数1/tmax=>无扩散由于在该频率信号截断。

这是一段代码，它分析的信号与教程（sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)）中的信号相同，但略有不同：

原始的scipy.fftpack示例。
原始scipy.fftpack示例具有整数个信号周期（tmax=1.0而不是0.75为了避免截断扩散）。
原始scipy.fftpack示例，其中包含整数个信号周期，并且日期和频率均取自FFT理论。

编码：

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# 1. Linspace
N = 600
# Sample spacing
tmax = 3/4
T = tmax / N # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 2. Integer number of periods
tmax = 1
T = tmax / N # Sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 3. Correct positioning of dates relatively to FFT theory ('arange' instead of 'linspace')
tmax = 1
T = tmax / N # Sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]

fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positioning of dates')
plt.legend()
plt.grid()
plt.show()

输出：

就像这里可能的那样，即使使用整数周期，仍然会有一些扩散。此行为是由于scipy.fftpack教程中日期和频率的位置不正确造成的。因此，在离散傅立叶变换理论中：

信号应t=0,T,...,(N-1)*T在T为采样周期且信号总持续时间为的日期进行评估tmax=N*T。请注意，我们在停下来tmax-T。
相关联的频率f=0,df,...,(N-1)*df，其中df=1/tmax=1/(N*T)是采样频率。信号的所有谐波应为采样频率的倍数，以避免扩散。

在上面的示例中，您可以看到使用arange代替linspace可以避免频谱中的额外扩散。此外，使用该linspace版本还会导致尖峰的偏移，该尖峰的频率略高于应有的尖峰，这在第一张图片中可以看到，在这些图片中，尖峰稍微位于频率50和的右边80。

我将得出结论，用法示例应替换为以下代码（在我看来，这不太容易引起误解）：

import numpy as np
from scipy.fftpack import fft

# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

输出（第二个峰值不再扩散）：

我认为此答案仍会带来一些有关如何正确应用离散傅立叶变换的附加说明。显然，我的答案太长了，总是有其他要说的东西（例如，关于“别名”的简短讨论，关于“窗口化”的说法很多），所以我将停止。

我认为，在应用离散傅里叶变换时，深刻理解离散傅里叶变换的原理非常重要，因为我们都知道很多人在应用离散傅里叶变换时都会在其中添加因数以获得自己想要的东西。

Question 16

I write this additionnal answer to explain the origins of the diffusion of the spikes when using fft and especially discuss the scipy.fftpack tutorial with which I disagree at some point.

In this example, the recording time tmax=N*T=0.75. The signal is sin(50*2*pi*x)+0.5*sin(80*2*pi*x). The frequency signal should contain 2 spikes at frequencies 50 and 80 with amplitudes 1 and 0.5. However, if the analysed signal does not have a integer number of periods diffusion can appear due to the truncation of the signal:

Pike 1: 50*tmax=37.5 => frequency 50 is not a multiple of 1/tmax => Presence of diffusion due to signal truncation at this frequency.
Pike 2: 80*tmax=60 => frequency 80 is a multiple of 1/tmax => No diffusion due to signal truncation at this frequency.

Here is a code that analyses the same signal as in the tutorial (sin(50*2*pi*x)+0.5*sin(80*2*pi*x)) but with the slight differences:

The original scipy.fftpack example.
The original scipy.fftpack example with an integer number of signal periods (tmax=1.0 instead of 0.75 to avoid truncation diffusion).
The original scipy.fftpack example with an integer number of signal periods and where the dates and frequencies are taken from the FFT theory.

The code:

import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack

# 1. Linspace
N = 600
# sample spacing
tmax = 3/4
T = tmax / N # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 2. Integer number of periods
tmax = 1
T = tmax / N # sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)

# 3. Correct positionning of dates relatively to FFT theory (arange instead of linspace)
tmax = 1
T = tmax / N # sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]

fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positionning of dates')
plt.legend()
plt.grid()
plt.show()

Output:

As it can be here, even with using an integer number of periods some diffusion still remains. This behaviour is due to a bad positionning of dates and frequencies in the scipy.fftpack tutorial. Hence, in the theory of discrete Fourier transforms:

the signal should be evaluated at dates t=0,T,...,(N-1)*T where T is the sampling period and the total duration of the signal is tmax=N*T. Note that we stop at tmax-T.
the associated frequencies are f=0,df,...,(N-1)*df where df=1/tmax=1/(N*T) is the sampling frequency. All harmonics of the signal should be multiple of the sampling frequency to avoid diffusion.

In the example above, you can see that the use of arange instead of linspace enables to avoid additional diffusion in the frequency spectrum. Moreover, using the linspace version also leads to an offset of the spikes that are located at slightly higher frequencies than what they should be as it can be seen in the first picture where the spikes are a little bit at the right of the frequencies 50 and 80.

I’ll just conclude that the example of usage should be replace by the following code (which is less misleading in my opinion):

import numpy as np
from scipy.fftpack import fft
# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

Output (the second spike is not diffused anymore):

I think this answer still bring some additional explanations on how to apply correctly discrete Fourier transform. Obviously, my answer is too long and there is always additional things to say (@ewerlopes talked briefly about aliasing for instance and a lot can be said about windowing) so I’ll stop. I think that it is very important to understand deeply the principles of discrete Fourier transform when applying it because we all know so much people adding factors here and there when applying it in order to obtain what they want.

Python 实用宝典

在Python中绘制快速傅立叶变换

问题：在Python中绘制快速傅立叶变换

回答 0

响应发布的原始数据和评论

In response to the raw data and comments being posted

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

有趣好用的Python教程