I’m looking for a way to use something like the modulus operator in django. What I am trying to do is to add a classname to every fourth element in a loop.
With modulus it would look like this:
{% for p in posts %}
    <div class="post width1 height2 column {% if forloop.counter0 % 4 == 0 %}first{% endif %}">
        <div class="preview">
        </div>
        <div class="overlay">
        </div>
        <h2>{{ p.title }}</h2>
    </div>
{% endfor %}
Of course this doesn’t work because % is a reserved character. Is there any other way to do this?
Currently I am using an app built in Python. When I run it on my personal computer, it works without problems.
However, when I move it to a production server, it keeps showing me the error below.
I’ve done some research and learned that this happens when the end user’s browser closes the connection while the server is still busy sending data.
I wonder why this happens and what the root cause is that prevents the app from running properly on the production server while it works on my personal computer. Any advice is appreciated.
Exception happened during processing of request from ('127.0.0.1', 34226)
Traceback (most recent call last):
File "/usr/lib/python2.7/SocketServer.py", line 284, in
_handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 310, in process_request
self.finish_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.7/SocketServer.py", line 641, in __init__
self.finish()
File "/usr/lib/python2.7/SocketServer.py", line 694, in finish
self.wfile.flush()
File "/usr/lib/python2.7/socket.py", line 303, in flush
self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe
Your server process has received a SIGPIPE writing to a socket. This usually happens when you write to a socket fully closed on the other (client) side. This might be happening when a client program doesn’t wait till all the data from the server is received and simply closes a socket (using close function).
In a C program you would normally either ignore the SIGPIPE signal or install a dummy signal handler for it. In that case, a write to a closed socket simply returns an error. In your case, Python raises an exception that can be handled as a premature disconnect of the client.
It depends on how you tested it, and possibly on differences in the TCP stack implementation of the personal computer and the server.
For example, if your sendall always completes immediately (or very quickly) on the personal computer, the connection may simply never have broken during sending. This is very likely if your browser is running on the same machine (since there is no real network latency).
In general, you just need to handle the case where a client disconnects before you’re finished, by handling the exception.
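As a minimal sketch of that exception handling, assuming Python 3 on a POSIX system (the traceback above comes from Python 2.7, where the same condition surfaces as socket.error with errno 32): the helper name send_safely is hypothetical, and socket.socketpair() only stands in for a real client connection.

```python
import socket

def send_safely(sock, payload):
    """Send payload; return False if the peer has already disconnected."""
    try:
        sock.sendall(payload)
        return True
    except (BrokenPipeError, ConnectionResetError):
        # errno 32 (EPIPE) or a reset: the client went away before we finished.
        return False

# Simulate a client that closes its end before the server writes.
server_side, client_side = socket.socketpair()
client_side.close()
delivered = send_safely(server_side, b"response body")
server_side.close()
```

Whether an early disconnect should be logged, ignored, or counted is an application decision; the sketch only shows where the exception is caught.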
Remember that TCP communications are asynchronous, but this is much more obvious on physically remote connections than on local ones, so conditions like this can be hard to reproduce on a local workstation. Specifically, loopback connections on a single machine are often almost synchronous.
The broken pipe error usually occurs when your request is blocked or takes too long: after the request-side timeout, the client closes the connection, and when the responding side (the server) then tries to write to the socket, it raises a broken pipe error.
Answer 3
This may be because you are inserting the data into the database using two methods, which slows the site down.
def add_subscriber(request, email=None):
    if request.method == 'POST':
        email = request.POST['email_field']
        e = Subscriber.objects.create(email=email).save()  <====
        return HttpResponseRedirect('/')
    else:
        return HttpResponseRedirect('/')
In the function above, the error is where the arrow points. The correct implementation is:
def add_subscriber(request, email=None):
    if request.method == 'POST':
        email = request.POST['email_field']
        e = Subscriber.objects.create(email=email)
        return HttpResponseRedirect('/')
    else:
        return HttpResponseRedirect('/')
I need an algorithm that can give me positions around a sphere for N points (less than 20, probably) that vaguely spreads them out. There’s no need for “perfection”, but I just need it so none of them are bunched together.
This question provided good code, but I couldn’t find a way to make this uniform, as this seemed 100% randomized.
The recommended blog post had two ways of taking the number of points on the sphere as input, but the Saff and Kuijlaars algorithm isn’t exactly in pseudocode I could transcribe, and the code example I found contained “node[k]”, which I couldn’t see explained and which ruined that possibility. The second blog example was the Golden Section Spiral, which gave me strange, bunched-up results, with no clear way to define a constant radius.
This algorithm from this question seems like it could possibly work, but I can’t piece together what’s on that page into pseudocode or anything.
A few other question threads I came across spoke of randomized uniform distribution, which adds a level of complexity I’m not concerned about. I apologize that this is such a silly question, but I wanted to show that I’ve truly looked hard and still come up short.
So, what I’m looking for is simple pseudocode to evenly distribute N points around a unit sphere, that either returns in spherical or Cartesian coordinates. Even better if it can even distribute with a bit of randomization (think planets around a star, decently spread out, but with room for leeway).
In this example code, node[k] is just the kth node. You are generating an array of N points, and node[k] is the kth one (from 0 to N-1). If that is all that is confusing you, hopefully you can use that now.
(In other words, node is an array of size N that is defined before the code fragment starts, and which contains a list of the points.)
Alternatively, building on the other answer here (and using Python):
> cat ll.py
from math import asin, degrees
nx = 4; ny = 5
for x in range(nx):
    lon = 360 * ((x+0.5) / nx)
    for y in range(ny):
        midpt = (y+0.5) / ny
        # asin returns radians, so convert to degrees to get a latitude in [-90, 90]
        lat = degrees(asin(2*(midpt-0.5)))
        print lon,lat
> python2.7 ll.py
45.0 -53.1301023542
45.0 -23.5781784782
45.0 0.0
45.0 23.5781784782
45.0 53.1301023542
135.0 -53.1301023542
135.0 -23.5781784782
135.0 0.0
135.0 23.5781784782
135.0 53.1301023542
225.0 -53.1301023542
225.0 -23.5781784782
225.0 0.0
225.0 23.5781784782
225.0 53.1301023542
315.0 -53.1301023542
315.0 -23.5781784782
315.0 0.0
315.0 23.5781784782
315.0 53.1301023542
If you plot that, you’ll see that the vertical spacing is larger near the poles so that each point is situated in about the same total area of space (near the poles there’s less space “horizontally”, so it gives more “vertically”).
This isn’t the same as all points having about the same distance to their neighbours (which is what I think your links are talking about), but it may be sufficient for what you want and improves on simply making a uniform lat/lon grid.
import math

def fibonacci_sphere(samples=1):
    points = []
    phi = math.pi * (3. - math.sqrt(5.))  # golden angle in radians
    for i in range(samples):
        y = 1 - (i / float(samples - 1)) * 2  # y goes from 1 to -1
        radius = math.sqrt(1 - y * y)  # radius at y
        theta = phi * i  # golden angle increment
        x = math.cos(theta) * radius
        z = math.sin(theta) * radius
        points.append((x, y, z))
    return points
You said you couldn’t get the golden spiral method to work and that’s a shame because it’s really, really good. I would like to give you a complete understanding of it so that maybe you can understand how to keep this away from being “bunched up.”
So here’s a fast, non-random way to create a lattice that is approximately correct; as discussed above, no lattice will be perfect, but this may be good enough. It is compared to other methods e.g. at BendWavy.org but it just has a nice and pretty look as well as a guarantee about even spacing in the limit.
Primer: sunflower spirals on the unit disk
To understand this algorithm, I first invite you to look at the 2D sunflower spiral algorithm. This is based on the fact that the most irrational number is the golden ratio (1 + sqrt(5))/2 and if one emits points by the approach “stand at the center, turn a golden ratio of whole turns, then emit another point in that direction,” one naturally constructs a spiral which, as you get to higher and higher numbers of points, nevertheless refuses to have well-defined ‘bars’ that the points line up on. (Note 1.)
The algorithm for even spacing on a disk is,
from numpy import pi, cos, sin, sqrt, arange
import matplotlib.pyplot as pp
num_pts = 100
indices = arange(0, num_pts, dtype=float) + 0.5
r = sqrt(indices/num_pts)
theta = pi * (1 + 5**0.5) * indices
pp.scatter(r*cos(theta), r*sin(theta))
pp.show()
and it produces results that look like (n=100 and n=1000):
Spacing the points radially
The key strange thing is the formula r = sqrt(indices / num_pts); how did I come to that one? (Note 2.)
Well, I am using the square root here because I want these to have even-area spacing around the disk. That is the same as saying that in the limit of large N I want a little region R ∈ (r, r + dr), Θ ∈ (θ, θ + dθ) to contain a number of points proportional to its area, which is r dr dθ. Now if we pretend that we are talking about a random variable here, this has a straightforward interpretation as saying that the joint probability density for (R, Θ) is just c r for some constant c. Normalization on the unit disk would then force c = 1/π.
Now let me introduce a trick. It comes from probability theory where it’s known as sampling the inverse CDF: suppose you wanted to generate a random variable with a probability density f(z) and you have a random variable U ~ Uniform(0, 1), just like comes out of random() in most programming languages. How do you do this?
First, turn your density into a cumulative distribution function or CDF, which we will call F(z). A CDF, remember, increases monotonically from 0 to 1 with derivative f(z).
Then calculate the CDF’s inverse function F⁻¹(z).
You will find that Z = F⁻¹(U) is distributed according to the target density. (Note 3.)
Now the golden-ratio spiral trick spaces the points out in a nicely even pattern for θ so let’s integrate that out; for the unit disk we are left with F(r) = r². So the inverse function is F⁻¹(u) = u^(1/2), and therefore we would generate random points on the disk in polar coordinates with r = sqrt(random()); theta = 2 * pi * random().
Now instead of randomly sampling this inverse function we’re uniformly sampling it, and the nice thing about uniform sampling is that our results about how points are spread out in the limit of large N will behave as if we had randomly sampled it. This combination is the trick. Instead of random() we use (arange(0, num_pts, dtype=float) + 0.5)/num_pts, so that, say, if we want to sample 10 points the uniform values are u = 0.05, 0.15, 0.25, ... 0.95 and the radii are r = sqrt(u). We uniformly sample u to get equal-area spacing, and we use the sunflower increment to avoid awful “bars” of points in the output.
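To make the equal-area claim concrete, here is a small standard-library sketch of the same idea without NumPy. The function name sunflower_disk and the radius-0.5 check are illustrative assumptions, not part of the original answer.

```python
from math import sqrt, pi, cos, sin

def sunflower_disk(num_pts):
    """Return (x, y) points on the unit disk with equal-area radial spacing."""
    golden_turn = pi * (1 + sqrt(5))   # same angular increment as in the answer
    points = []
    for k in range(num_pts):
        u = (k + 0.5) / num_pts        # evenly spaced stand-in for random()
        r = sqrt(u)                    # inverse CDF of F(r) = r^2
        theta = golden_turn * (k + 0.5)
        points.append((r * cos(theta), r * sin(theta)))
    return points

pts = sunflower_disk(1000)
# The inner disk of radius 0.5 covers a quarter of the area,
# so it should hold about a quarter of the points.
inner_fraction = sum(1 for x, y in pts if x * x + y * y < 0.25) / len(pts)
```

If r were taken as u directly instead of sqrt(u), half of the points would land inside radius 0.5, which is exactly the bunching at the center that the square root avoids.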
Now doing the sunflower on a sphere
The changes that we need to make to dot the sphere with points merely involve switching out the polar coordinates for spherical coordinates. The radial coordinate of course doesn’t enter into this because we’re on a unit sphere. To keep things a little more consistent here, even though I was trained as a physicist I’ll use mathematicians’ coordinates where 0 ≤ φ ≤ π is latitude coming down from the pole and 0 ≤ θ ≤ 2π is longitude. So the difference from above is that we are basically replacing the variable r with φ.
Our area element, which was r dr dθ, now becomes the not-much-more-complicated sin(φ) dφ dθ. So our joint density for uniform spacing is sin(φ)/4π. Integrating out θ, we find f(φ) = sin(φ)/2, thus F(φ) = (1 − cos(φ))/2. Inverting this we can see that a uniform random variable would look like acos(1 – 2 u), but we sample uniformly instead of randomly, so we instead use φk = acos(1 − 2 (k + 0.5)/N). And the rest of the algorithm is just projecting this onto the x, y, and z coordinates:
from numpy import pi, cos, sin, arccos, arange
import mpl_toolkits.mplot3d
import matplotlib.pyplot as pp
num_pts = 1000
indices = arange(0, num_pts, dtype=float) + 0.5
phi = arccos(1 - 2*indices/num_pts)
theta = pi * (1 + 5**0.5) * indices
x, y, z = cos(theta) * sin(phi), sin(theta) * sin(phi), cos(phi);
pp.figure().add_subplot(111, projection='3d').scatter(x, y, z);
pp.show()
Again for n=100 and n=1000 the results look like:
Further research
I wanted to give a shout out to Martin Roberts’s blog. Note that above I created an offset of my indices by adding 0.5 to each index. This was just visually appealing to me, but it turns out that the choice of offset matters a lot and is not constant over the interval and can mean getting as much as 8% better accuracy in packing if chosen correctly. There should also be a way to get his R2 sequence to cover a sphere and it would be interesting to see if this also produced a nice even covering, perhaps as-is but perhaps needing to be, say, taken from only a half of the unit square cut diagonally or so and stretched around to get a circle.
Notes
Those “bars” are formed by rational approximations to a number, and the best rational approximations to a number come from its continued fraction expression, z + 1/(n_1 + 1/(n_2 + 1/(n_3 + ...))) where z is an integer and n_1, n_2, n_3, ... is either a finite or infinite sequence of positive integers:
from math import floor

def continued_fraction(r):
    while True:
        n = floor(r)
        yield n
        if r == n:  # no fractional part left, so the expansion ends here
            return
        r = 1/(r - n)
Since the fraction part 1/(...) is always between zero and one, a large integer in the continued fraction allows for a particularly good rational approximation: “one divided by something between 100 and 101” is better than “one divided by something between 1 and 2.” The most irrational number is therefore the one which is 1 + 1/(1 + 1/(1 + ...)) and has no particularly good rational approximations; one can solve φ = 1 + 1/φ by multiplying through by φ to get the formula for the golden ratio.
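A quick numerical illustration of that point, as a sketch: the helper leading_terms is a hypothetical name, and because it works on floats only the first handful of terms can be trusted.

```python
from math import floor, sqrt, pi

def leading_terms(value, count):
    """First `count` continued-fraction terms of a positive float."""
    terms = []
    for _ in range(count):
        n = floor(value)
        terms.append(n)
        remainder = value - n
        if remainder == 0:
            break
        value = 1 / remainder
    return terms

golden = (1 + sqrt(5)) / 2
golden_terms = leading_terms(golden, 10)  # all ones: no good rational shortcut
pi_terms = leading_terms(pi, 5)           # includes a large term, hence 355/113
```

The large term in the expansion of pi is what makes 355/113 such an unusually good approximation; the golden ratio never offers a term larger than 1.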
For folks who are not so familiar with NumPy — all of the functions are “vectorized,” so that sqrt(array) is the same as what other languages might write map(sqrt, array). So this is a component-by-component sqrt application. The same also holds for division by a scalar or addition with scalars — those apply to all components in parallel.
The proof is simple once you know that this is the result. If you ask what’s the probability that z < Z < z + dz, this is the same as asking what’s the probability that z < F⁻¹(U) < z + dz, apply F to all three expressions noting that it is a monotonically increasing function, hence F(z) < U < F(z + dz), expand the right hand side out to find F(z) + f(z) dz, and since U is uniform this probability is just f(z) dz as promised.
This is known as packing points on a sphere, and there is no (known) general, perfect solution. However, there are plenty of imperfect solutions. The three most popular seem to be:
Create a simulation. Treat each point as an electron constrained to a sphere, then run a simulation for a certain number of steps. The electrons’ repulsion will naturally tend the system to a more stable state, where the points are about as far away from each other as they can get.
Hypercube rejection. This fancy-sounding method is actually really simple: you uniformly choose points (much more than n of them) inside of the cube surrounding the sphere, then reject the points outside of the sphere. Treat the remaining points as vectors, and normalize them. These are your “samples” – choose n of them using some method (randomly, greedy, etc).
Spiral approximations. You trace a spiral around a sphere, and evenly-distribute the points around the spiral. Because of the mathematics involved, these are more complicated to understand than the simulation, but much faster (and probably involving less code). The most popular seems to be by Saff, et al.
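The first option in this list can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions (fixed step size, fixed number of steps, inverse-square repulsion, re-projection onto the sphere after every step); the names relax_on_sphere and normalize are made up for this sketch.

```python
import random
from math import sqrt

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def relax_on_sphere(n, steps=200, step_size=0.01, seed=0):
    """Push points apart repeatedly, re-projecting them onto the unit sphere."""
    rng = random.Random(seed)
    pts = [normalize([rng.gauss(0, 1) for _ in range(3)]) for _ in range(n)]
    for _ in range(steps):
        moved = []
        for i, p in enumerate(pts):
            force = [0.0, 0.0, 0.0]
            for j, q in enumerate(pts):
                if i == j:
                    continue
                diff = [a - b for a, b in zip(p, q)]
                dist2 = sum(c * c for c in diff) + 1e-12  # avoid division by zero
                for axis in range(3):
                    # inverse-square push directed away from q
                    force[axis] += diff[axis] / (dist2 * sqrt(dist2))
            moved.append(normalize([a + step_size * f for a, f in zip(p, force)]))
        pts = moved
    return pts

relaxed = relax_on_sphere(12)
```

A real simulation would adapt the step size and stop on convergence; for fewer than 20 points even this naive O(n²) loop is cheap.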
A lot more information about this problem can be found here
What you are looking for is called a spherical covering. The spherical covering problem is very hard and solutions are unknown except for small numbers of points. One thing that is known for sure is that given n points on a sphere, there always exist two points of distance d = (4-csc^2(\pi n/6(n-2)))^(1/2) or closer.
If you want a probabilistic method for generating points uniformly distributed on a sphere, it’s easy: generate points in space from a Gaussian distribution (it’s built into Java, and the code is not hard to find for other languages). So in 3-dimensional space, you need something like
Random r = new Random();
double[] p = { r.nextGaussian(), r.nextGaussian(), r.nextGaussian() };
Then project the point onto the sphere by normalizing its distance from the origin.
The Gaussian distribution in n dimensions is spherically symmetric so the projection onto the sphere is uniform.
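For readers not using Java, a rough Python equivalent of the two steps (draw a Gaussian triple, then normalize) might look like this. The function name random_unit_vector and the fixed seed are assumptions made for the sketch.

```python
import random
from math import sqrt

def random_unit_vector(rng):
    """Draw one point uniformly on the unit sphere via a normalized Gaussian triple."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = sqrt(x * x + y * y + z * z)
        if norm > 1e-12:  # guard against the vanishingly rare near-zero vector
            return (x / norm, y / norm, z / norm)

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
sample = [random_unit_vector(rng) for _ in range(5)]
```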
Of course, there’s no guarantee that the distance between any two points in a collection of uniformly generated points will be bounded below, so you can use rejection to enforce any such conditions that you might have: probably it’s best to generate the whole collection and then reject the whole collection if necessary. (Or use “early rejection” to reject the whole collection you’ve generated so far; just don’t keep some points and drop others.) You can use the formula for d given above, minus some slack, to determine the min distance between points below which you will reject a set of points. You’ll have to calculate n choose 2 distances, and the probability of rejection will depend on the slack; it’s hard to say how, so run a simulation to get a feel for the relevant statistics.
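The whole-collection rejection described here could be sketched as follows. The threshold 0.3, the retry cap, and the names spread_points and unit_vector are illustrative assumptions; the formula for d given earlier is not evaluated in this sketch.

```python
import random
from itertools import combinations
from math import sqrt

def unit_vector(rng):
    """Normalized Gaussian triple, i.e. a uniform random point on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def distance(a, b):
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def spread_points(n, min_dist, max_tries=10000, seed=1):
    """Regenerate the whole set until every pair is at least min_dist apart."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        pts = [unit_vector(rng) for _ in range(n)]
        # n choose 2 distance checks; one failure rejects the entire set
        if all(distance(a, b) >= min_dist for a, b in combinations(pts, 2)):
            return pts
    return None  # threshold too strict for this n within max_tries

accepted = spread_points(5, 0.3)
```

Keeping or dropping the set as a whole, rather than replacing individual points, follows the advice above about not keeping some points while dropping others.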
This answer is based on the same ‘theory’ that is outlined well by this answer
I’m adding this answer because:
– None of the other options meets the ‘uniformity’ need spot-on (or at least not obviously so). (To get the planet-like distribution specifically wanted in the original question, just reject points at random from the finite list of k uniformly created points, i.e. random with respect to the index in the list of k items.)
– The closest other implementation forces you to decide ‘N’ per angular axis, instead of giving just one value of N across both angular axes (which at low counts of N makes it very tricky to know what matters; e.g. if you want ‘5’ points, have fun).
– Furthermore, it’s very hard to ‘grok’ how to differentiate between the other options without any imagery, so here’s what this option looks like (below), and the ready-to-run implementation that goes with it.
from math import cos, sin, pi, sqrt

def GetPointsEquiAngularlyDistancedOnSphere(numberOfPoints=45):
    """ each point you get will be of form 'x, y, z'; in cartesian coordinates
        eg. the 'l2 distance' from the origin [0., 0., 0.] for each point will be 1.0
        ------------
        converted from: http://web.archive.org/web/20120421191837/http://www.cgafaq.info/wiki/Evenly_distributed_points_on_sphere )
    """
    dlong = pi*(3.0-sqrt(5.0))  # ~2.39996323
    dz = 2.0/numberOfPoints
    long = 0.0
    z = 1.0 - dz/2.0
    ptsOnSphere = []
    for k in range(0, numberOfPoints):
        r = sqrt(1.0-z*z)
        ptNew = (cos(long)*r, sin(long)*r, z)
        ptsOnSphere.append(ptNew)
        z = z - dz
        long = long + dlong
    return ptsOnSphere

if __name__ == '__main__':
    ptsOnSphere = GetPointsEquiAngularlyDistancedOnSphere(80)
    #toggle True/False to print them
    if (True):
        for pt in ptsOnSphere: print(pt)
    #toggle True/False to plot them
    if (True):
        from numpy import *
        import pylab as p
        import mpl_toolkits.mplot3d.axes3d as p3
        fig = p.figure()
        ax = p3.Axes3D(fig)
        x_s = []; y_s = []; z_s = []
        for pt in ptsOnSphere:
            x_s.append(pt[0]); y_s.append(pt[1]); z_s.append(pt[2])
        ax.scatter3D(array(x_s), array(y_s), array(z_s))
        ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
        p.show()
    #end
Tested at low counts (N in 2, 5, 7, 13, etc.) and it seems to work nicely.
Answer 6
Try:
function sphere ( N:float,k:int):Vector3 {
    var inc = Mathf.PI * (3 - Mathf.Sqrt(5));
    var off = 2 / N;
    var y = k * off - 1 + (off / 2);
    var r = Mathf.Sqrt(1 - y*y);
    var phi = k * inc;
    return Vector3((Mathf.Cos(phi)*r), y, Mathf.Sin(phi)*r);
};
The above function should be called in a loop of N iterations in total, with k as the current iteration index.
It is based on a sunflower seeds pattern, except the sunflower seeds are curved around into a half dome, and again into a sphere.
It’s probably overkill, but maybe after looking at it you’ll realize some of its other nice properties are interesting to you. It’s way more than just a function that outputs a point cloud.
I landed here trying to find it again; the name “healpix” doesn’t exactly evoke spheres…
Answer 8
With a small number of points you could run a simulation:
from random import random,randint
r = 10
n = 20
best_closest_d = 0
best_points = []
points = [(r,0,0) for i in range(n)]
for simulation in range(10000):
    x = random()*r
    y = random()*r
    z = r-(x**2+y**2)**0.5
    if randint(0,1):
        x = -x
    if randint(0,1):
        y = -y
    if randint(0,1):
        z = -z
    closest_dist = (2*r)**2
    closest_index = None
    for i in range(n):
        for j in range(n):
            if i==j:
                continue
            p1,p2 = points[i],points[j]
            x1,y1,z1 = p1
            x2,y2,z2 = p2
            d = (x1-x2)**2+(y1-y2)**2+(z1-z2)**2
            if d < closest_dist:
                closest_dist = d
                closest_index = i
    if simulation % 100 == 0:
        print simulation,closest_dist
    if closest_dist > best_closest_d:
        best_closest_d = closest_dist
        best_points = points[:]
    points[closest_index]=(x,y,z)
print best_points
>>> best_points
[(9.921692138442777,-9.930808529773849,4.037839326088124),(5.141893371460546,1.7274947332807744,-4.575674650522637),(-4.917695758662436,-1.090127967097737,-4.9629263893193745),(3.6164803265540666,7.004158551438312,-2.1172868271109184),(-9.550655088997003,-9.580386054762917,3.5277052594769422),(-0.062238110294250415,6.803105171979587,3.1966101417463655),(-9.600996012203195,9.488067284474834,-3.498242301168819),(-8.601522086624803,4.519484132245867,-0.2834204048792728),(-1.1198210500791472,-2.2916581379035694,7.44937337008726),(7.981831370440529,8.539378431788634,1.6889099589074377),(0.513546008372332,-2.974333486904779,-6.981657873262494),(-4.13615438946178,-6.707488383678717,2.1197605651446807),(2.2859494919024326,-8.14336582650039,1.5418694699275672),(-7.241410895247996,9.907335206038226,2.271647103735541),(-9.433349952523232,-7.999106443463781,-2.3682575660694347),(3.704772125650199,1.0526567864085812,6.148581714099761),(-3.5710511242327048,5.512552040316693,-3.4318468250897647),(-7.483466337225052,-1.506434920354559,2.36641535124918),(7.73363824231576,-8.460241422163824,-1.4623228616326003),(10,0,0)]
Take the two largest factors of your N, if N==20 then the two largest factors are {5,4}, or, more generally {a,b}. Calculate
dlat = 180/(a+1)
dlong = 360/(b+1)
Put your first point at {90-dlat/2,(dlong/2)-180}, your second at {90-dlat/2,(3*dlong/2)-180}, your 3rd at {90-dlat/2,(5*dlong/2)-180}, until you’ve tripped round the world once, by which time you’ve got to about {75,150} when you go next to {90-3*dlat/2,(dlong/2)-180}.
Obviously I’m working this in degrees on the surface of the spherical earth, with the usual conventions for translating +/- to N/S or E/W. And obviously this gives you a completely non-random distribution, but it is uniform and the points are not bunched together.
To add some degree of randomness, you could generate two normally distributed random numbers (with mean 0 and standard deviations of dlat/3 and dlong/3, respectively) and add them to the coordinates of each uniformly distributed point.
Your language typically has a uniform random number primitive. For example in python you can use random.random() to return a number in the range [0,1). You can multiply this number by k to get a random number in the range [0,k). Thus in python, uniform([0,2pi)) would mean random.random()*2*math.pi.
Proof
Now we can’t assign θ uniformly, otherwise we’d get clumping at the poles. We wish to assign probabilities proportional to the surface area of the spherical wedge (the θ in this diagram is actually φ):
An angular displacement dφ at the equator will result in a displacement of dφ*r. What will that displacement be at an arbitrary azimuth θ? Well, the radius from the z-axis is r*sin(θ), so the arclength of that “latitude” intersecting the wedge is dφ * r*sin(θ). Thus we calculate the cumulative distribution of the area to sample from it, by integrating the area of the slice from the south pole to the north pole.
OR… to place 20 points, compute the centers of the icosahedral faces. For 12 points, find the vertices of the icosahedron. For 30 points, take the midpoints of the edges of the icosahedron. You can do the same thing with the tetrahedron, cube, dodecahedron and octahedron: one set of points is on the vertices, another on the centers of the faces and another on the centers of the edges. They cannot be mixed, however.
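For the 12-point case, a short Python sketch can build the vertices directly. It relies on the standard coordinates (0, ±1, ±φ) and their cyclic permutations, with φ the golden ratio; the function name icosahedron_vertices is an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt

def icosahedron_vertices():
    """The 12 vertices of a regular icosahedron, scaled onto the unit sphere."""
    phi = (1 + sqrt(5)) / 2
    scale = sqrt(1 + phi * phi)  # length of (0, 1, phi)
    verts = []
    for a in (-1, 1):
        for b in (-phi, phi):
            # cyclic permutations of (0, +-1, +-phi)
            verts += [(0, a, b), (a, b, 0), (b, 0, a)]
    return [tuple(c / scale for c in v) for v in verts]

vertices = icosahedron_vertices()
# Smallest pairwise distance, i.e. the edge length on the unit sphere
closest = min(sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))
              for u, v in combinations(vertices, 2))
```

Face centers (20 points) and edge midpoints (30 points) can be derived from these vertices by averaging and renormalizing.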
@robert king It’s a really nice solution but has some sloppy bugs in it. I know it helped me a lot though, so never mind the sloppiness. :)
Here is a cleaned up version….
from math import pi, asin, sin, degrees
halfpi, twopi = .5 * pi, 2 * pi
sphere_area = lambda R=1.0: 4 * pi * R ** 2
lat_dist = lambda lat, R=1.0: R*(1-sin(lat))
#A = 2*pi*R^2(1-sin(lat))
def sphere_latarea(lat, R=1.0):
    if -halfpi > lat or lat > halfpi:
        raise ValueError("lat must be between -halfpi and halfpi")
    return 2 * pi * R ** 2 * (1-sin(lat))
sphere_lonarea = lambda lon, R=1.0: \
    4 * pi * R ** 2 * lon / twopi
#A = 2*pi*R^2 |sin(lat1)-sin(lat2)| |lon1-lon2|/360
#  = (pi/180)R^2 |sin(lat1)-sin(lat2)| |lon1-lon2|
sphere_rectarea = lambda lat0, lat1, lon0, lon1, R=1.0: \
    (sphere_latarea(lat0, R)-sphere_latarea(lat1, R)) * (lon1-lon0) / twopi
def test_sphere(n_lats=10, n_lons=19, radius=540.0):
    total_area = 0.0
    for i_lons in range(n_lons):
        lon0 = twopi * float(i_lons) / n_lons
        lon1 = twopi * float(i_lons+1) / n_lons
        for i_lats in range(n_lats):
            lat0 = asin(2 * float(i_lats) / n_lats - 1)
            lat1 = asin(2 * float(i_lats+1)/n_lats - 1)
            area = sphere_rectarea(lat0, lat1, lon0, lon1, radius)
            print("{:} {:}: {:9.4f} to {:9.4f}, {:9.4f} to {:9.4f} => area {:10.4f}"
                  .format(i_lats, i_lons
                          , degrees(lat0), degrees(lat1)
                          , degrees(lon0), degrees(lon1)
                          , area))
            total_area += area
    print("total_area = {:10.4f} (difference of {:10.4f})"
          .format(total_area, abs(total_area) - sphere_area(radius)))
test_sphere()
Answer 14
This works, and it’s very simple. As many points as you want:
private function moveTweets():void {
    var newScale:Number = Scale(meshes.length, 50, 500, 6, 2);
    trace("new scale:" + newScale);
    var l:Number = this.meshes.length;
    var tweetMeshInstance:TweetMesh;
    var destx:Number;
    var desty:Number;
    var destz:Number;
    for (var i:Number = 0; i < this.meshes.length; i++) {
        tweetMeshInstance = meshes[i];
        var phi:Number = Math.acos(-1 + (2 * i) / l);
        var theta:Number = Math.sqrt(l * Math.PI) * phi;
        tweetMeshInstance.origX = (sphereRadius + 5) * Math.cos(theta) * Math.sin(phi);
        tweetMeshInstance.origY = (sphereRadius + 5) * Math.sin(theta) * Math.sin(phi);
        tweetMeshInstance.origZ = (sphereRadius + 5) * Math.cos(phi);
        destx = sphereRadius * Math.cos(theta) * Math.sin(phi);
        desty = sphereRadius * Math.sin(theta) * Math.sin(phi);
        destz = sphereRadius * Math.cos(phi);
        tweetMeshInstance.lookAt(new Vector3D());
        TweenMax.to(tweetMeshInstance, 1, {scaleX:newScale, scaleY:newScale, x:destx, y:desty, z:destz, onUpdate:onLookAtTween, onUpdateParams:[tweetMeshInstance]});
    }
}
private function onLookAtTween(theMesh:TweetMesh):void {
    theMesh.lookAt(new Vector3D());
}
You normally can’t get instance attributes given just a class, at least not without instantiating the class. You can get instance attributes given an instance, though, or class attributes given a class. See the ‘inspect’ module. You can’t get a list of instance attributes because instances really can have anything as attribute, and — as in your example — the normal way to create them is to just assign to them in the __init__ method.
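A small Python 3 sketch of that distinction; the class Greeter and its attribute names are invented purely for illustration.

```python
class Greeter:
    default_greeting = "hello"      # class attribute, visible on the class itself

    def __init__(self, name):
        self.name = name            # instance attributes only exist after __init__ runs
        self.count = 0

g = Greeter("world")

# Instance attributes: available once you have an instance.
instance_attrs = vars(g)

# Class attributes: available from the class alone (dunder entries and methods skipped).
class_attrs = [k for k, v in vars(Greeter).items()
               if not k.startswith("__") and not callable(v)]
```

Nothing in vars(Greeter) mentions name or count, which is why the class alone cannot tell you what its instances will carry.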
An exception is if your class uses slots, which is a fixed list of attributes that the class allows instances to have. Slots are explained in http://www.python.org/2.2.3/descrintro.html, but there are various pitfalls with slots; they affect memory layout, so multiple inheritance may be problematic, and inheritance in general has to take slots into account, too.
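As a sketch of the slots case in Python 3 (the class Point is hypothetical): with __slots__ the allowed instance attribute names can be read from the class without creating an instance.

```python
class Point:
    __slots__ = ("x", "y")          # the only attribute names instances may have

    def __init__(self, x, y):
        self.x = x
        self.y = y

declared = Point.__slots__          # readable from the class, no instance needed

p = Point(1, 2)
try:
    p.z = 3                         # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

has_dict = hasattr(p, "__dict__")   # slotted instances have no per-instance dict
```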
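Answer 3
The vars() and dict methods will both work for the example the OP posted, but they won't work for "loosely" defined objects like:
class foo:
    a = 'foo'
    b = 'bar'
To print all non-callable attributes, you can use the following function:
def printVars(object):
    # Python 2 syntax (print and exec are statements)
    for i in [v for v in dir(object) if not callable(getattr(object, v))]:
        print '\n%s:' % i
        exec('print object.%s\n\n' % i)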
Your example shows “instance variables”, not really class variables.
Look in hi_obj.__class__.__dict__.items() for the class variables, along with other class members like member functions and the containing module.
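To make the distinction concrete, here is a small sketch. The class `Hi` and its two attributes are hypothetical stand-ins for the asker's code; only the name `hi_obj` comes from the answer above. `vars()` on the instance shows the instance variables, while the class `__dict__` holds the class variables alongside methods and bookkeeping entries, which are filtered out here.

```python
class Hi(object):
    class_var = 'shared'           # class attribute, stored on the class

    def __init__(self):
        self.instance_var = 'own'  # instance attribute, stored on the object


hi_obj = Hi()

# Instance variables live in the instance's own __dict__.
instance_names = sorted(vars(hi_obj))

# Class variables live in the class __dict__, next to methods and dunder
# entries such as __module__; skip those to keep only plain data attributes.
class_names = sorted(
    name for name, value in hi_obj.__class__.__dict__.items()
    if not name.startswith('__') and not callable(value)
)

print(instance_names, class_names)
```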
Although not directly an answer to the OP question, there is a pretty sweet way of finding out what variables are in scope in a function. Take a look at this code:
The func_code attribute has all kinds of interesting things in it. It allows you to do some cool stuff. Here is an example of how I have used this:
def exec_command(self, cmd, msg, sig):
    def message(msg):
        a = self.link.process(self.link.recieved_message(msg))
        self.exec_command(*a)
    def error(msg):
        self.printer.printInfo(msg)
    def set_usrlist(msg):
        self.client.connected_users = msg
    def chatmessage(msg):
        self.printer.printInfo(msg)
    if not locals().has_key(cmd): return
    cmd = locals()[cmd]
    try:
        if 'sig' in cmd.func_code.co_varnames and \
                'msg' in cmd.func_code.co_varnames:
            cmd(msg, sig)
        elif 'msg' in cmd.func_code.co_varnames:
            cmd(msg)
        else:
            cmd()
    except Exception, e:
        print '\n-----------ERROR-----------'
        print 'error: ', e
        print 'Error proccessing: ', cmd.__name__
        print 'Message: ', msg
        print 'Sig: ', sig
        print '-----------ERROR-----------\n'
I built on dmark’s answer to get the following, which is useful if you want the equivalent of sprintf and hopefully will help someone…
def sprint(object):
    result = ''
    for i in [v for v in dir(object) if not callable(getattr(object, v)) and v[0] != '_']:
        result += '\n%s:' % i + str(getattr(object, i, ''))
    return result
Answer 9
Sometimes you want to filter the list based on public/private variables. For example:
def pub_vars(self):
    """Gives the variable names of our instance we want to expose
    """
    return [k for k in vars(self) if not k.startswith('_')]
I have a date column in a MySQL table. I want to insert a datetime.datetime() object into this column. What should I be using in the execute statement?
I have tried:
now = datetime.datetime(2009,5,5)
cursor.execute("INSERT INTO table (name, id, datecolumn) VALUES (%s, %s, %s)",("name", 4,now))
I am getting an error as: "TypeError: not all arguments converted during string formatting"
What should I use instead of %s?
MySQL recognizes DATETIME and TIMESTAMP values in these formats:
As a string in either ‘YYYY-MM-DD HH:MM:SS’ or ‘YY-MM-DD HH:MM:SS’
format. A “relaxed” syntax is permitted here, too: Any punctuation
character may be used as the delimiter between date parts or time
parts. For example, ‘2012-12-31 11:30:45’, ‘2012^12^31 11+30+45’,
‘2012/12/31 11*30*45’, and ‘2012@12@31 11^30^45’ are equivalent.
The only delimiter recognized between a date and time part and a
fractional seconds part is the decimal point.
The date and time parts can be separated by T rather than a space. For
example, ‘2012-12-31 11:30:45’ and ‘2012-12-31T11:30:45’ are equivalent.
As a string with no delimiters in either ‘YYYYMMDDHHMMSS’ or
‘YYMMDDHHMMSS’ format, provided that the string makes sense as a date.
For example, ‘20070523091528’ and ‘070523091528’ are interpreted as
‘2007-05-23 09:15:28’, but ‘071122129015’ is illegal (it has a
nonsensical minute part) and becomes ‘0000-00-00 00:00:00’.
As a number in either YYYYMMDDHHMMSS or YYMMDDHHMMSS format, provided
that the number makes sense as a date. For example, 19830905132800 and
830905132800 are interpreted as ‘1983-09-05 13:28:00’.
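For illustration, the string formats listed above can be produced from a Python datetime with strftime. This is only a sketch of the formatting step, using the date from the question; it does not touch a database, and the variable names are made up.

```python
import datetime

now = datetime.datetime(2009, 5, 5)

# 'YYYY-MM-DD HH:MM:SS', the delimited string format from the list above
as_mysql_string = now.strftime('%Y-%m-%d %H:%M:%S')

# 'YYYYMMDDHHMMSS', the undelimited string format from the list above
compact = now.strftime('%Y%m%d%H%M%S')

print(as_mysql_string)
print(compact)
```

Either string could then be passed as the parameter for the date column instead of the datetime object itself.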
What database are you connecting to? I know Oracle can be picky about date formats and likes ISO 8601 format.
Note: Oops, I just read you are on MySQL. Just format the date and try it as a separate direct SQL call to test.
In Python, you can get an ISO date like
now.isoformat()
For instance, Oracle likes dates like
insert into x values(99, '31-may-09');
Depending on your database, if it is Oracle you might need to TO_DATE it:
insert into x
values(99, to_date('2009/05/31:12:00:00AM', 'yyyy/mm/dd:hh:mi:ssam'));
The general usage of TO_DATE is:
TO_DATE(<string>, '<format>')
If using another database (I saw the cursor and thought Oracle; I could be wrong) then check their date format tools. For MySQL it is DATE_FORMAT() and SQL Server it is CONVERT.
Also using a tool like SQLAlchemy will remove differences like these and make your life easy.
If you’re just using a python datetime.date (not a full datetime.datetime), just cast the date as a string. This is very simple and works for me (mysql, python 2.7, Ubuntu). The column published_date is a MySQL date field, the python variable publish_date is datetime.date.
# make the record for the passed link info
sql_stmt = "INSERT INTO snippet_links (" + \
    "link_headline, link_url, published_date, author, source, coco_id, link_id)" + \
    "VALUES(%s, %s, %s, %s, %s, %s, %s) ;"
sql_data = ( title, link, str(publish_date), \
             author, posted_by, \
             str(coco_id), str(link_id) )
try:
    dbc.execute(sql_stmt, sql_data )
except Exception, e:
    ...
The “Python Distribute” guide (was at python-distribute.org, but that registration has lapsed) tells me to include doc/txt files and .py files are excluded in MANIFEST.in file
The sourcedist documentation tells me only sdist uses MANIFEST.in and only includes file you specify and to include .py files. It also tells me to use: python setup.py sdist --manifest-only to generate a MANIFEST, but python tells me this doesn’t exist
I appreciate these are from different versions of python and the distribution system is in a complete mess, but assuming I am using python 3 and setuptools (the new one that includes distribute but now called setuptools, not the old setuptools that was deprecated for distribute tools only to be brought back into distribute and distribute renamed to setuptools…..) and I’m following the ‘standard’ folder structure and setup.py file,
Do I need a MANIFEST.in ?
What should be in it ?
When will all these different package systems and methods be made into one single simple process ?
No, you do not have to use MANIFEST.in. Both distutils and setuptools include in the source
distribution package all the files mentioned in setup.py: modules, package python files,
README.txt and test/test*.py. If this is all you want to have in the distribution package, you do
not have to use MANIFEST.in.
If you want to manipulate (add or remove) default files to include, you have to use MANIFEST.in.
Re: What should be in it?
The procedure is simple:
Make sure, in your setup.py you include (by means of setup arguments) all the files you feel important for the program to run (modules, packages, scripts …)
Clarify, if there are some files to add or some files to exclude. If neither is needed, then there is no need for using MANIFEST.in.
If MANIFEST.in is needed, create it. Usually, you add there tests*/*.py files, README.rst if you do not use README.txt, docs files and possibly some data files for test suite, if necessary.
For example:
include README.rst
include COPYING.txt
To test it, run python setup.py sdist, and examine the tarball created under dist/.
When will all these different package systems …
Comparing the situation today with 2 years ago, things are much, much better: setuptools is the way to go. You can ignore the fact that distutils is a bit broken and is the low-level base for setuptools, as setuptools should take care of hiding these things from you.
EDIT: Last few projects I use pbr for building distribution packages with three line setup.py and rest being in setup.cfg and requirements.txt. No need to care about MANIFEST.in and other strange stuff. Even though the package would deserve a bit more documentation. See http://docs.openstack.org/developer/pbr/
No, you don’t need MANIFEST.in. However, to get setuptools to do what you (usually) mean, you do need to use the setuptools_scm, which takes the role of MANIFEST.in in 2 key places:
It ensures all relevant files are packaged when running the sdist command (where all relevant files is defined as “all files under source control”)
When using include_package_data to include package data as part of the build or bdist_wheel. (again: files under source control)
The historical understanding of MANIFEST.in is: when you don’t have a source control system, you need some other mechanism to distinguish between “source files” and “files that happen to be in your working directory”. However, your project is under source control (right??) so there’s no need for MANIFEST.in. More info in this article.
When I try to run app.py (Python 3.3, PyCrypto 2.6) my virtualenv keeps returning the error listed above. My import statement is just from Crypto.Cipher import AES. I looked for duplicates and you might say that there are some, but I tried the solutions (although most are not even solutions) and nothing worked.
You can see what the files are like for PyCrypto below:
I ran into this on Mac as well, and it seems to be related to having an unfortunately similarly named “crypto” module (not sure what that is for) installed alongside of pycrypto via pip.
The fix seems to be removing both crypto and pycrypto with pip.
As you can read on this page, the usage of pycrypto is not safe anymore:
Pycrypto is vulnerable to a heap-based buffer overflow in the ALGnew function in block_templace.c. It allows remote attackers to execute arbitrary code in the python application. It was assigned the CVE-2013-7459 number.
Pycrypto didn’t release any fix to that vulnerability and no commit was made to the project since Jun 20, 2014.
Make sure to uninstall other versions of crypto or pycrypto first because both packages install under the same folder Crypto where also pycryptodome will be installed.
Best practice: virtual environments
In order to avoid problems with pip packages in different versions or packages that install under the same folder (i.e. pycrypto and pycryptodome) you can make use of a so called virtual environment. There, the installed pip packages can be managed for every single project individually.
To install a virtual environment and setup everything, use the following commands:
# install python3 and pip3
sudo apt update
sudo apt upgrade
sudo apt install python3
sudo apt install python3-pip
# install virtualenv
pip3 install virtualenv
# install and create a virtual environment in your target folder
mkdir target_folder
cd target_folder
python3 -m virtualenv .
# now activate your venv and install pycryptodome
source bin/activate
pip3 install pycryptodome
# check if everything worked:
# start the interactive python console and import the Crypto module
# when there is no import error then it worked
python
>>> from Crypto.Cipher import AES
>>> exit()
# don't forget to deactivate your venv again
deactivate
I’ve had the same problem 'ImportError: No module named Crypto.Cipher', since using GoogleAppEngineLauncher (version > 1.8.X) with GAE Boilerplate on OSX 10.8.5 (Mountain Lion). In Google App Engine SDK with python 2.7 runtime, pyCrypto 2.6 is the suggested version.
The solution that worked for me was…
1) Download pycrypto2.6 source extract it somewhere(~/Downloads/pycrypto26)
To date, I was still having the same issue with from Crypto.Cipher import AES, even after installing and reinstalling pycrypto a few times. It turned out that pip defaulted to python3.
~ pip --version
pip 18.0 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
Installing pycrypto with pip2 should solve this issue.
This problem can be fixed by installing the C++ compiler (python27 or python26). Download it from Microsoft https://www.microsoft.com/en-us/download/details.aspx?id=44266 and re-run the command : pip install pycrypto to run the gui web access when you kill the process of easy_install.exe.
Well this might appear weird but after installing pycrypto or pycryptodome , we need to update the directory name crypto to Crypto in lib/site-packages
I’m with 3.7. The issue remains after I try to install crypto. And pycrypto just fails in my case. So in the end my build passed via package below:
pip install pycryptodome
>>> s = '12abcd405'
>>> result = ''.join([i for i in s if not i.isdigit()])
>>> result
'abcd'
This makes use of a list comprehension, and what is happening here is similar to this structure:
no_digits = []
# Iterate through the string, adding non-numbers to the no_digits list
for i in s:
    if not i.isdigit():
        no_digits.append(i)
# Now join all elements of the list with '',
# which puts all of the characters together.
result = ''.join(no_digits)
As @AshwiniChaudhary and @KirkStrauser point out, you actually do not need to use the brackets in the one-liner, making the piece inside the parentheses a generator expression (more efficient than a list comprehension). Even if this doesn’t fit the requirements for your assignment, it is something you should read about eventually :) :
>>> s = '12abcd405'
>>> result = ''.join(i for i in s if not i.isdigit())
>>> result
'abcd'
Answer 1
And, just to throw it in the mix, there is the oft-forgotten str.translate, which works a lot faster than looping or regular expressions:
For Python 2:
from string import digits
s = 'abc123def456ghi789zero0'
res = s.translate(None, digits)
# 'abcdefghizero'
For Python 3:
from string import digits
s = 'abc123def456ghi789zero0'
remove_digits = str.maketrans('', '', digits)
res = s.translate(remove_digits)
# 'abcdefghizero'
as mentioned above.
But my guess that you need something very simple
so say s is your string
and st_res is a string without digits, then here is your code
l = ['0','1','2','3','4','5','6','7','8','9']
st_res=""
for ch in s:
    if ch not in l:
        st_res+=ch
Answer 6
I’d love to use regex to accomplish this, but since you can only use lists, loops, functions, etc..
here’s what I came up with:
stringWithNumbers="I have 10 bananas for my 5 monkeys!"
stringWithoutNumbers=''.join(c if c not in map(str,range(0,10)) else "" for c in stringWithNumbers)
print(stringWithoutNumbers) #I have bananas for my monkeys!
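For readers who are not bound by the "lists, loops and functions only" restriction, the regex variant alluded to above might look like the sketch below. It reuses the variable names from the answer, and the character class `[0-9]` is an assumption that only ASCII digits need removing.

```python
import re

stringWithNumbers = "I have 10 bananas for my 5 monkeys!"

# Replace every ASCII digit with the empty string.
stringWithoutNumbers = re.sub(r'[0-9]', '', stringWithNumbers)

print(stringWithoutNumbers)
```

As with the join-based version, the spaces that surrounded each number are left in place, so the result contains double spaces where the numbers used to be.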
If I understand your question right, one way to do it is to break the string down into characters and check each one in a loop to see whether it is a letter or a digit. If it is not a digit, append it to a result variable, and once the loop has finished, display that result to the user.
You can’t do exactly what you want in Python (if I read you correctly). You need to put values in for each element of the list (or as you called it, array).
But, try this:
a = [0 for x in range(N)] # N = size of list you want
a[i] = 5 # as long as i < N, you're okay
For lists of other types, use something besides 0. None is often a good choice as well.
If you (or other searchers of this question) were actually interested in creating a contiguous array to fill with integers, consider bytearray and memoryview:
import array
x = 10  # number of elements (example value)
a = array.array('i', x * [0])
a[3] = 5
try:
    a[5] = 'a'
except TypeError:
    print('integers only allowed')
Note that there’s no concept of un-initialized variable in python. A variable is a name that is bound to a value, so that value must have something. In the example above the array is initialized with zeros.
However, this is uncommon in python, unless you actually need it for low-level stuff. In most cases, you are better-off using an empty list or empty numpy array, as other answers suggest.
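The bytearray and memoryview combination mentioned above could be sketched as follows. This assumes Python 3.3 or later, where `memoryview.cast` is available; the element count `n` is a made-up example value, and the buffer length has to be a multiple of the C int size for the cast to succeed.

```python
import struct

n = 10                            # hypothetical number of integers
item_size = struct.calcsize('i')  # bytes per native C int on this platform

# A zero-filled byte buffer, viewed as n native C ints.
ints = memoryview(bytearray(n * item_size)).cast('i')

ints[3] = 5
try:
    ints[5] = 'a'                 # non-integer values are rejected
except TypeError:
    rejected = True
else:
    rejected = False

print(len(ints), ints[3], rejected)
```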
I’m looking for the fastest way to check for the occurrence of NaN (np.nan) in a NumPy array X. np.isnan(X) is out of the question, since it builds a boolean array of shape X.shape, which is potentially gigantic.
I tried np.nan in X, but that seems not to work because np.nan != np.nan. Is there a fast and memory-efficient way to do this at all?
(To those who would ask “how gigantic”: I can’t tell. This is input validation for library code.)
Ray’s solution is good. However, on my machine it is about 2.5x faster to use numpy.sum in place of numpy.min:
In [13]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 244 us per loop
In [14]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 97.3 us per loop
Unlike min, sum doesn’t require branching, which on modern hardware tends to be pretty expensive. This is probably the reason why sum is faster.
edit The above test was performed with a single NaN right in the middle of the array.
It is interesting to note that min is slower in the presence of NaNs than in their absence. It also seems to get slower as NaNs get closer to the start of the array. On the other hand, sum’s throughput seems constant regardless of whether there are NaNs and where they’re located:
In [40]: x = np.random.rand(100000)
In [41]: %timeit np.isnan(np.min(x))
10000 loops, best of 3: 153 us per loop
In [42]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.9 us per loop
In [43]: x[50000] = np.nan
In [44]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 239 us per loop
In [45]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.8 us per loop
In [46]: x[0] = np.nan
In [47]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 326 us per loop
In [48]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.9 us per loop
Even though there is an accepted answer, I’d like to demonstrate the following (with Python 2.7.2 and Numpy 1.6.0 on Vista):
In []: x= rand(1e5)
In []: %timeit isnan(x.min())
10000 loops, best of 3: 200 us per loop
In []: %timeit isnan(x.sum())
10000 loops, best of 3: 169 us per loop
In []: %timeit isnan(dot(x, x))
10000 loops, best of 3: 134 us per loop
In []: x[5e4]= NaN
In []: %timeit isnan(x.min())
100 loops, best of 3: 4.47 ms per loop
In []: %timeit isnan(x.sum())
100 loops, best of 3: 6.44 ms per loop
In []: %timeit isnan(dot(x, x))
10000 loops, best of 3: 138 us per loop
Thus, the really efficient way might be heavily dependent on the operating system. Anyway dot(.) based seems to be the most stable one.
Apply some cumulative operation that preserves nans (like sum) and check its result.
While the first approach is certainly the cleanest, the heavy optimization of some of the cumulative operations (particularly the ones that are executed in BLAS, like dot) can make those quite fast. Note that dot, like some other BLAS operations, is multithreaded under certain conditions. This explains the difference in speed between different machines.
If you’re comfortable with numba, it lets you create a fast short-circuiting function (one that stops as soon as a NaN is found):
import numba as nb
import math
@nb.njit
def anynan(array):
    array = array.ravel()
    for i in range(array.size):
        if math.isnan(array[i]):
            return True
    return False
If there is no NaN the function might actually be slower than np.min, I think that’s because np.min uses multiprocessing for large arrays:
import numpy as np
array = np.random.random(2000000)
%timeit anynan(array) # 100 loops, best of 3: 2.21 ms per loop
%timeit np.isnan(array.sum()) # 100 loops, best of 3: 4.45 ms per loop
%timeit np.isnan(array.min()) # 1000 loops, best of 3: 1.64 ms per loop
But if there is a NaN in the array, especially at a low index, then it’s much faster:
array = np.random.random(2000000)
array[100] = np.nan
%timeit anynan(array) # 1000000 loops, best of 3: 1.93 µs per loop
%timeit np.isnan(array.sum()) # 100 loops, best of 3: 4.57 ms per loop
%timeit np.isnan(array.min()) # 1000 loops, best of 3: 1.65 ms per loop
Similar results may be achieved with Cython or a C extension. These are a bit more complicated (or easily available as bottleneck.anynan) but ultimately do the same as my anynan function.
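Answer 6
Related to this is the question of how to find the first occurrence of NaN. This is the fastest way to handle that that I know of:
index = next((i for (i, n) in enumerate(iterable) if n != n), None)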
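A small usage sketch of the one-liner above, wrapped in a function whose name (`first_nan_index`) and sample data are made up for illustration. It relies on the fact, already noted in the question, that NaN does not compare equal to itself, and it falls back to None when no NaN is present.

```python
def first_nan_index(iterable):
    # NaN is the only float value for which n != n is true.
    return next((i for i, n in enumerate(iterable) if n != n), None)


values = [1.0, 2.5, float('nan'), 4.0, float('nan')]

print(first_nan_index(values))
print(first_nan_index([1.0, 2.0]))
```

Because the generator is consumed lazily, the scan stops at the first NaN rather than visiting the whole sequence.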