from functools import reduce

def factors(n):
    return set(reduce(list.__add__,
                ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0)))
This will return all of the factors, very quickly, of a number n.
Why square root as the upper limit?
sqrt(x) * sqrt(x) = x. So if the two factors are the same, they’re both the square root. If you make one factor bigger, you have to make the other factor smaller. This means that one of the two will always be less than or equal to sqrt(x), so you only have to search up to that point to find one of the two matching factors. You can then use x / fac1 to get fac2.
The reduce(list.__add__, ...) is taking the little lists of [fac1, fac2] and joining them together in one long list.
The [i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0 returns a pair of factors whenever the remainder of dividing n by the smaller one is zero (it doesn’t need to check the larger one too; it just gets that by dividing n by the smaller one).
The set(...) on the outside is getting rid of duplicates, which only happens for perfect squares. For n = 4, this will return 2 twice, so set gets rid of one of them.
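To make the pairing concrete, here is a small sketch that exposes the intermediate [fac1, fac2] pairs for n = 36. The helper name factor_pairs is only an illustration and does not appear in the answer above.

```python
from functools import reduce

def factor_pairs(n):
    """Return the [small, large] pairs found by trial division up to sqrt(n)."""
    return [[i, n // i] for i in range(1, int(n**0.5) + 1) if n % i == 0]

def factors(n):
    # Same idea as the one-liner: join the pairs into one list, then let
    # set() drop the duplicate that appears for perfect squares.
    return set(reduce(list.__add__, factor_pairs(n)))

print(factor_pairs(36))     # [[1, 36], [2, 18], [3, 12], [4, 9], [6, 6]]
print(sorted(factors(36)))  # [1, 2, 3, 4, 6, 9, 12, 18, 36]
```

The last pair, [6, 6], is the duplicate that set() removes.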
The solution presented by @agf is great, but one can achieve ~50% faster run time for an arbitrary odd number by checking for parity. Since the factors of an odd number are always odd themselves, there is no need to test even candidates when n is odd.
I’ve just started solving Project Euler puzzles myself. In some problems, a divisor check is called inside two nested for loops, and the performance of this function is thus essential.
Combining this fact with agf’s excellent solution, I’ve ended up with this function:
from functools import reduce
from math import sqrt

def factors(n):
    step = 2 if n%2 else 1
    return set(reduce(list.__add__,
                ([i, n//i] for i in range(1, int(sqrt(n))+1, step) if n % i == 0)))
However, on small numbers (~ < 100), the extra overhead from this alteration may cause the function to take longer.
I ran some tests in order to check the speed. Below is the code used. To produce the different plots, I altered the X = range(1,100,1) accordingly.
import timeit
from functools import reduce
from math import sqrt
from matplotlib.pyplot import plot, legend, show

def factors_1(n):
    step = 2 if n%2 else 1
    return set(reduce(list.__add__,
                ([i, n//i] for i in range(1, int(sqrt(n))+1, step) if n % i == 0)))

def factors_2(n):
    return set(reduce(list.__add__,
                ([i, n//i] for i in range(1, int(sqrt(n)) + 1) if n % i == 0)))

X = range(1,100000,1000)
Y = []
for i in X:
    f_1 = timeit.timeit('factors_1({})'.format(i), setup='from __main__ import factors_1', number=10000)
    f_2 = timeit.timeit('factors_2({})'.format(i), setup='from __main__ import factors_2', number=10000)
    Y.append(f_1/f_2)
plot(X,Y, label='Running time with/without parity check')
legend()
show()
[Plot: X = range(1,100,1)]
No significant difference here, but with bigger numbers, the advantage is obvious:
[Plot: X = range(1,100000,1000) (only odd numbers)]
[Plot: X = range(2,100000,100) (only even numbers)]
[Plot: X = range(1,100000,1001) (alternating parity)]
Answer 2
agf’s answer is really quite cool. I wanted to see if I could rewrite it to avoid using reduce(). This is what I came up with:
import itertools
flatten_iter = itertools.chain.from_iterable

def factors(n):
    return set(flatten_iter((i, n//i)
                for i in range(1, int(n**0.5)+1) if n % i == 0))
I also tried a version that uses tricky generator functions:
def factors(n):
    return set(x for tup in ([i, n//i]
                for i in range(1, int(n**0.5)+1) if n % i == 0) for x in tup)
I timed it by computing:
start = 10000000
end = start + 40000
for n in range(start, end):
    factors(n)
I ran it once to let Python compile it, then ran it under the time(1) command three times and kept the best time.
reduce version: 11.58 seconds
itertools version: 11.49 seconds
tricky version: 11.12 seconds
Note that the itertools version is building a tuple and passing it to flatten_iter(). If I change the code to build a list instead, it slows down slightly:
itertools (list) version: 11.62 seconds
I believe that the tricky generator functions version is the fastest possible in Python. But it’s not really much faster than the reduce version, roughly 4% faster based on my measurements.
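For anyone who wants to repeat the comparison without the time(1) command, here is a rough sketch using timeit. The function names, the default range and the repeat count are my own choices, not the ones used for the measurements above, so the absolute numbers will differ.

```python
import itertools
import timeit
from functools import reduce

def factors_reduce(n):
    return set(reduce(list.__add__,
                      ([i, n // i] for i in range(1, int(n**0.5) + 1) if n % i == 0)))

def factors_chain(n):
    return set(itertools.chain.from_iterable(
        (i, n // i) for i in range(1, int(n**0.5) + 1) if n % i == 0))

def factors_genexp(n):
    return set(x for pair in ([i, n // i] for i in range(1, int(n**0.5) + 1) if n % i == 0)
               for x in pair)

VERSIONS = (factors_reduce, factors_chain, factors_genexp)

def time_versions(start=10_000, count=200, repeat=3):
    """Return {function name: best wall-clock seconds} over `repeat` runs."""
    results = {}
    for func in VERSIONS:
        timer = timeit.Timer(lambda: [func(n) for n in range(start, start + count)])
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=1))
    return results

if __name__ == "__main__":
    for name, seconds in time_versions().items():
        print(f"{name}: {seconds:.4f} s")
```

Keeping the best of several runs mirrors the "kept the best time" approach described above.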
Answer 3
Another approach to agf’s answer:
def factors(n):
    result = set()
    for i in range(1, int(n ** 0.5) + 1):
        div, mod = divmod(n, i)
        if mod == 0:
            result |= {i, div}
    return result
Here’s an alternative to @agf’s solution which implements the same algorithm in a more pythonic style:
def factors(n):
    return set(
        factor for i in range(1, int(n**0.5) + 1) if n % i == 0
        for factor in (i, n//i)
    )
This solution works in both Python 2 and Python 3 with no imports and is much more readable. I haven’t tested the performance of this approach, but asymptotically it should be the same, and if performance is a serious concern, neither solution is optimal.
This took under a minute. It switches among a cocktail of methods. See the documentation linked above.
Given all the prime factors, all other factors can be built easily.
Note that even if the accepted answer were allowed to run for long enough (i.e. an eternity) to factor the above number, it will fail for some large numbers, such as the following example. This is due to the sloppy int(n**0.5). For example, when n = 10000000000000079**2, we have
>>> int(n**0.5)
10000000000000078L
Since 10000000000000079 is a prime, the accepted answer’s algorithm will never find this factor. Note that it’s not just an off-by-one; for larger numbers it will be off by more. For this reason it’s better to avoid floating-point numbers in algorithms of this sort.
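One way to avoid the floating-point square root is an exact integer square root. The sketch below assumes Python 3.8 or later, where math.isqrt is available. It is a variant of the trial-division approach rather than the accepted answer’s code, and it is still far too slow to actually factor the number above.

```python
import math

def factors(n):
    """All positive divisors of n, using an exact integer square root."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    found = set()
    for i in range(1, math.isqrt(n) + 1):  # math.isqrt needs Python 3.8+
        if n % i == 0:
            found.update((i, n // i))
    return found

p = 10000000000000079
n = p * p
print(int(n ** 0.5) == p)  # False: p is odd and too large to be an exact float
print(math.isqrt(n) == p)  # True: isqrt works on integers only
```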
Answer 6
For n up to 10**16 (maybe even a bit more), here is a fast pure Python 3.6 solution,
from itertools import compress

def primes(n):
    """ Returns a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

def factorization(n):
    """ Returns a list of the prime factorization of n """
    pf = []
    for p in primeslist:
        if p*p > n : break
        count = 0
        while not n % p:
            n //= p
            count += 1
        if count > 0: pf.append((p, count))
    if n > 1: pf.append((n, 1))
    return pf

def divisors(n):
    """ Returns an unsorted list of the divisors of n """
    divs = [1]
    for p, e in factorization(n):
        divs += [x*p**k for k in range(1,e+1) for x in divs]
    return divs

n = 600851475143
primeslist = primes(int(n**0.5)+1)
print(divisors(n))
Further improvement to agf & eryksun’s solution.
The following piece of code returns a sorted list of all the factors without changing run time asymptotic complexity:
def factors(n):
    l1, l2 = [], []
    for i in range(1, int(n ** 0.5) + 1):
        q,r = n//i, n%i     # Alter: divmod() fn can be used.
        if r == 0:
            l1.append(i)
            l2.append(q)    # q's obtained are decreasing.
    if l1[-1] == l2[-1]:    # To avoid duplication of the possible factor sqrt(n)
        l1.pop()
    l2.reverse()
    return l1 + l2
Idea: instead of using list.sort() to get a sorted list, which costs O(n log n), it is much faster to use list.reverse() on l2, which takes O(n). (That’s how Python is made.)
After l2.reverse(), l2 can be appended to l1 to get the sorted list of factors.
Notice that l1 contains the i values, which are increasing, and l2 contains the q values, which are decreasing. That’s the reason behind the idea above.
I’ve tried most of these wonderful answers with timeit to compare their efficiency versus my simple function and yet I constantly see mine outperform those listed here. I figured I’d share it and see what you all think.
def factors(n):
    results = set()
    for i in xrange(1, int(math.sqrt(n)) + 1):
        if n % i == 0:
            results.add(i)
            results.add(int(n/i))
    return results
As it’s written you’ll have to import math to test, but replacing math.sqrt(n) with n**.5 should work just as well. I don’t bother wasting time checking for duplicates because duplicates can’t exist in a set regardless.
Answer 9
Here is another alternative without reduce that works well with large numbers. It uses sum to flatten the list.
def factors(n):
    return set(sum([[i, n//i] for i in xrange(1, int(n**0.5)+1) if not n%i], []))
Be sure to grab the number larger than sqrt(number_to_factor) for unusual numbers like 99 which has 3*3*11 and floor sqrt(99)+1 == 10.
import math

def factor(x):
    if x == 0 or x == 1:
        return None
    res = []
    for i in range(2,int(math.floor(math.sqrt(x)+1))):
        while x % i == 0:
            x //= i  # integer division keeps x an int on Python 3
            res.append(i)
    if x != 1: # Unusual numbers
        res.append(x)
    return res
Answer 11
The simplest way to find the factors of a number:
def factors(x):
    return [i for i in range(1,x+1) if x%i==0]
Answer 12
Here is an example if you want to use a list of prime numbers to go a lot faster. Such lists are easy to find on the internet. I added comments in the code.
# http://primes.utm.edu/lists/small/10000.txt
# First 10000 primes
_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
73, 79, 83, 89, 97, 101, 103, 107, 109, 113,
127, 131, 137, 139, 149, 151, 157, 163, 167, 173,
179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
233, 239, 241, 251, 257, 263, 269, 271, 277, 281,
283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409,
419, 421, 431, 433, 439, 443, 449, 457, 461, 463,
467, 479, 487, 491, 499, 503, 509, 521, 523, 541,
547, 557, 563, 569, 571, 577, 587, 593, 599, 601,
607, 613, 617, 619, 631, 641, 643, 647, 653, 659,
661, 673, 677, 683, 691, 701, 709, 719, 727, 733,
739, 743, 751, 757, 761, 769, 773, 787, 797, 809,
811, 821, 823, 827, 829, 839, 853, 857, 859, 863,
877, 881, 883, 887, 907, 911, 919, 929, 937, 941,
947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013,
    # Missing a lot of primes for the purpose of the example
)
from bisect import bisect_left as _bisect_left
from math import sqrt as _sqrt
def get_factors(n):
    assert isinstance(n, int), "n must be an integer."
    assert n > 0, "n must be greater than zero."
    limit = pow(_PRIMES[-1], 2)
    assert n <= limit, "n is greater than the limit of {0}".format(limit)
    result = set((1, n))
    root = int(_sqrt(n))
    primes = [t for t in get_primes_smaller_than(root + 1) if not n % t]
    result.update(primes)  # Add all the prime factors less than or equal to the square root
    for t in primes:
        # Add all the factors associated with each prime by using the same process.
        # Integer division (//) keeps the argument an int on Python 3.
        result.update(get_factors(n // t))
    return sorted(result)

def get_primes_smaller_than(n):
    return _PRIMES[:_bisect_left(_PRIMES, n)]
A potentially more efficient algorithm than the ones presented here already (especially if there are small prime factors in n). The trick here is to adjust the limit up to which trial division is needed every time prime factors are found:
from math import sqrt

def factors(n):
    '''
    return prime factors and multiplicity of n
    n = p0^e0 * p1^e1 * ... * pk^ek encoded as
    res = [(p0, e0), (p1, e1), ..., (pk, ek)]
    '''
    res = []
    # get rid of all the factors of 2 using bit shifts
    mult = 0
    while not n & 1:
        mult += 1
        n >>= 1
    if mult != 0:
        res.append((2, mult))
    limit = round(sqrt(n))
    test_prime = 3
    while test_prime <= limit:
        mult = 0
        while n % test_prime == 0:
            mult += 1
            n //= test_prime
        if mult != 0:
            res.append((test_prime, mult))
            if n == 1:              # only useful if ek >= 3 (ek: multiplicity
                break               # of the last prime)
            limit = round(sqrt(n))  # adjust the limit
        test_prime += 2             # will often not be prime...
    if n != 1:
        res.append((n, 1))
    return res
This is of course still trial division and nothing more fancy, and therefore still very limited in its efficiency (especially for big numbers without small divisors).
This is Python 3; the division // should be the only thing you need to adapt for Python 2 (add from __future__ import division).
Answer 14
Using set(...) makes the code slightly slower, and is only really necessary for when you check the square root. Here’s my version:
import math

def factors(num):
    if (num == 1 or num == 0):
        return []
    f = [1]
    sq = int(math.sqrt(num))
    for i in range(2, sq):
        if num % i == 0:
            f.append(i)
            f.append(num // i)  # integer division keeps the factor an int on Python 3
    if sq > 1 and num % sq == 0:
        f.append(sq)
        if sq*sq != num:
            f.append(num // sq)
    return f
The if sq*sq != num: condition is necessary for numbers like 12, where the square root is not an integer, but the floor of the square root is a factor.
Note that this version doesn’t return the number itself, but that is an easy fix if you want it. The output also isn’t sorted.
I timed it running 10000 times on all numbers 1-200 and 100 times on all numbers 1-5000. It outperforms all the other versions I tested, including dansalmo’s, Jason Schorn’s, oxrock’s, agf’s, steveha’s, and eryksun’s solutions, though oxrock’s is by far the closest.
Answer 15
Your largest factor is no more than your number, so, let’s say:
def factors(n):
    factors = []
    for i in range(1, n//2+1):
        if n % i == 0:
            factors.append(i)
    factors.append(n)
    return factors
Use something as simple as the following list comprehension, noting that we do not need to test 1 and the number whose factors we are trying to find:
def factors(n):
    return [x for x in range(2, n//2+1) if n%x == 0]
In reference to the use of square root, say we want to find factors of 10. The integer portion of sqrt(10) is 3, therefore range(1, int(sqrt(10)) + 1) = [1, 2, 3], and testing only those values misses 5 unless the cofactor n // i is recorded as well.
Unless I am missing something I would suggest, if you must do it this way, using int(ceil(sqrt(x))). Of course this produces a lot of unnecessary calls to functions.
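A quick way to check whether a square-root bound loses any factors is to compare it against brute force. The sketch below uses my own helper names and an integer-only bound (i * i <= n), so no ceil or sqrt call is involved. It records the cofactor n // i along with i.

```python
def factors_brute(n):
    """Reference implementation: test every candidate from 1 to n."""
    return {x for x in range(1, n + 1) if n % x == 0}

def factors_sqrt(n):
    """Test candidates only while i * i <= n, recording both i and n // i."""
    found = set()
    i = 1
    while i * i <= n:
        if n % i == 0:
            found.add(i)
            found.add(n // i)
        i += 1
    return found

print(sorted(factors_sqrt(10)))  # [1, 2, 5, 10]
assert all(factors_sqrt(n) == factors_brute(n) for n in range(1, 2000))
```

For 10 the loop stops after i = 3, and 5 is still found as the cofactor of 2.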
I was pretty surprised when I saw this question that no one used numpy, even though numpy is way faster than Python loops. Implementing @agf’s solution with numpy turned out to be on average 8x faster.
I believe that if you implemented some of the other solutions in numpy you could get amazing times.
Here is my function:
import numpy as np

def b(n):
    r = np.arange(1, int(n ** 0.5) + 1)
    x = r[np.mod(n, r) == 0]
    # n // x keeps the cofactors as integers (n / x would produce floats)
    return set(np.concatenate((x, n // x), axis=None))
Notice that the numbers on the x-axis are not the input to the functions. The input to the functions is 2 to the power of the number on the x-axis, minus 1.
So where the x-axis shows ten, the input would be 2**10-1 = 1023.
Answer 20
import 'dart:math';

generateFactorsOfN(N){
  //determine lowest bound divisor range
  final lowerBoundCheck = sqrt(N).toInt();
  var factors = Set<int>(); //stores factors
  /**
   * Lets take 16:
   * 4 = sqrt(16)
   * start from 1 ... 4 inclusive
   * check mod 16%1==0? set[1,(16/1)]
   * check mod 16%2==0? set[1,(16/1),2,(16/2)]
   * check mod 16%3==0? set[1,(16/1),2,(16/2)] -> unchanged
   * check mod 16%4==0? set[1,(16/1),2,(16/2),4,(16/4)]
   *
   * ******************* set is used to remove duplicate
   * ******************* case 4 and (16/4) both equal to 4
   * return factor set<int>.. this isn't ordered
   */
  for(var divisor = 1; divisor <= lowerBoundCheck; divisor++){
    if(N % divisor == 0){
      factors.add(divisor);
      factors.add(N ~/ divisor); // ~/ integer division
    }
  }
  return factors;
}
I’ve read every other google source and SO thread, with nothing working.
Python 2.7.3 32bit installed on Windows 7 64bit. Downloading, extracting, and then trying to install PyCrypto results in "Unable to find vcvarsall.bat".
So I install MinGW and tack that on the install line as the compiler of choice. But then I get the error "RuntimeError: chmod error".
How in the world do I get around this? I’ve tried using pip, which gives the same result. I found a prebuilt PyCrypto 2.3 binary and installed that, but it’s nowhere to be found on the system (not working).
Any ideas?
Answer 0
If you don’t already have a C/C++ development environment installed that is compatible with the Visual Studio binaries distributed by Python.org, then you should stick to installing only pure Python packages or packages for which a Windows binary is available.
Microsoft has recently released a standalone, dedicated Microsoft Visual C++ Compiler for Python 2.7. If you’re using Python 2.7, simply install that compiler and Setuptools 6.0 or later, and most packages with C extensions will now compile readily.
After years and years, Python finally agreed on a binary distribution format called wheel, which allows installing even binary extensions on Windows without a compiler, using a simple pip install packagename. There is a list of popular packages with their status. PyCrypto is not there yet, but lxml, PySide and Scrapy, for example, are.
Edited Nov 2015: pip uninstall pycrypto & pip install pycryptodome. It is a pycrypto fork with new features and it supports wheel. It replaces pycrypto, so existing code will continue to work (see https://pycryptodome.readthedocs.org/en/latest/src/examples.html)
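Because both distributions install under the same import name (Crypto), it can be unclear which one pip actually left on the system. The following is a small sketch for checking. It assumes Python 3.8 or later for importlib.metadata, and the helper name is mine.

```python
from importlib import metadata

def installed_crypto_distributions():
    """Return {distribution name: version} for whichever of the two is installed."""
    found = {}
    for name in ("pycrypto", "pycryptodome"):
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found

if __name__ == "__main__":
    print(installed_crypto_distributions() or "neither pycrypto nor pycryptodome is installed")
```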
vcvarsall.bat is part of the Visual C++ compiler, and you need that compiler to install what you are trying to install. Don’t even try to deal with MinGW if your Python was compiled with the Visual Studio toolchain, and vice versa. Even the version of the Microsoft toolchain is important. Python compiled with VS 2008 won’t work with extensions compiled with VS 2010!
Beware using Visual Studio 2010 or not using Visual Studio 2008
As far as I know the following is still true. This was posted in the link above in June, 2010 referring to trying to build extensions with VS 2010 Express against the Python installers available on python.org.
Be careful if you do this. Python 2.6 and 2.7 from python.org are
built with Visual Studio 2008 compilers. You will need to link with
the same CRT (msvcr90.dll) as Python.
Visual Studio 2010 Express links with the wrong CRT version:
msvcr100.dll.
If you do this, you must also re-build Python with Visual Studio 2010
Express. You cannot use the standard Python binary installer for
Windows. Nor can you use any C/C++ extensions built with a different
compiler than Visual Studio 2010 (Express).
Opinion: This is one reason I abandoned Windows for all serious development work for OSX!
I have managed to get pycrypto to compile by using MinGW32 and MSYS. This presumes that you have pip or easy_install installed.
Here’s how I did it:
1) Install MinGW32. For the sake of this explanation, let’s assume it’s installed in C:\MinGW. When using the installer, which I recommend, select the C++ compiler. MSYS should install with MinGW
2) Add c:\mingw\bin,c:\mingw\mingw32\bin,C:\MinGW\msys\1.0, c:\mingw\msys\1.0\bin and c:\mingw\msys\1.0\sbin to your %PATH%. If you aren’t familiar, this article is very helpful.
3) From the search bar, run msys and the MSYS terminal will open. For those familiar with Cygwin, it works in a similar fashion.
4) From within the MSYS terminal pip install pycrypto should run without error after this.
It’s possible to build PyCrypto using the Windows 7 SDK toolkits. There are two versions of the Windows 7 SDK. The original version (for .Net 3.5) includes the VS 2008 command-line compilers. Both 32- and 64-bit compilers can be installed.
The first step is to compile mpir to provide fast arithmetic. I’ve documented the process I use in the gmpy library. Detailed instructions for building mpir using the SDK compiler can be found at sdk_build
The key steps to use the SDK compilers from a DOS prompt are:
1) Run either vcvars32.bat or vcvars64.bat as appropriate.
2) At the prompt, execute “set MSSdk=1”
3) At the prompt, execute “set DISTUTILS_USE_SDK=1”
This should allow “python setup.py install” to succeed, assuming there are no other issues with the C code. But I vaguely remember that I had to edit a couple of PyCrypto files to enable mpir and to find the mpir libraries, and I don’t have my Windows system up at the moment. It will be a couple of days before I’ll have time to recreate the steps. If you haven’t reported success by then, I’ll post the PyCrypto steps. The steps will assume you were able to compile mpir.
Install setuptools (setuptools 6.0 or later is required for Python to automatically detect this compiler package), either by: pip install setuptools or by downloading the “Setuptools bootstrapping installer” source, saving this file somewhere on your filesystem as “ez_python.py” and installing with: python ez_python.py
Install wheel (wheel is recommended for producing pre-built binary packages). You can install it with: pip install wheel
Open an elevated Windows Command Prompt cmd.exe (with “Run as administrator”) to install “Microsoft Visual C++ Compiler for Python 2.7” for all users. You can use the following command to do so: msiexec /i C:\users\jozko\download\VCForPython27.msi ALLUSERS=1 (just use your own path to the file: msiexec /i <path to MSI> ALLUSERS=1)
Now you should be able to install pycrypto with: pip install pycrypto
If you are on Windows and struggling with installing PyCrypto, just use:
pip install pycryptodome
It works like a miracle and it will make your life much easier than trying to do a lot of configurations and tweaks.
Answer 14
This probably isn’t the optimal solution but you might download and install the free Visual C++ Express package from MS. This will give you the C++ compiler you need to compile the PyCrypto code.
My answer might not be related to the problem mentioned here, but I had the same problem with Python 3.4, where Crypto.Cipher wasn’t a valid import. So I tried installing PyCrypto and ran into problems.
After some research I found that with 3.4 you should use pycryptodome.
I installed pycryptodome using PyCharm and I was good.
Due to weird legal reasons, binaries are not published the normal way. Voidspace is normally the best second source, but its maintainer has not updated it for quite some time.
Use the zip from [https://www.dropbox.com/s/n6rckn0k6u4nqke/pycrypto-2.6.1.zip?dl=0]
Step 1: Install Visual C++ 2010 Express from here. (Do not install Microsoft Visual Studio 2010 Service Pack 1.)
Step 2: Remove all the Microsoft Visual C++ 2010 Redistributable packages from Control Panel\Programs and Features. If you don’t do those then the install is going to fail with an obscure “Fatal error during installation” error.
Step 3: Install offline version of Windows SDK for Visual Studio 2010 (v7.1) from here.
This is required for 64bit extensions. Windows has builtin mounting for ISOs like Pismo.
Step 4: You need to install the ISO file with Pismo File Mount Audit Package. Download Pismo from here
Step 5: Right click the downloaded ISO file and choose mount with Pismo. Thereafter, install the Setup\SDKSetup.exe instead of setup.exe.
Step 6a: Create a vcvars64.bat file in C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64 by changing directory to C:\Program Files (x86)\Microsoft Visual Studio version\VC\ on the command prompt.
Type command on the command prompt:
cd C:\Program Files (x86)\Microsoft Visual Studio version\VC\r
Step 6b:
To configure this Command Prompt window for 64-bit command-line builds that target x86 platforms, at the command prompt, enter:
vcvarsall x86 Click here for more options.
Step 7: At the command prompt, install the PyCrypto by typing:
C:\Python3X>pip install -U your_wh_file
Go to pycharm -> file -> setting -> project interpreter
Click on +
Search for "pycrypto" and install the package
Note: If you don’t have “Microsoft Visual C++ Compiler for Python 2.7” installed then it will prompt for installation, once installation finished try the above steps it should work fine.
...
soup = BeautifulSoup(html, "lxml")
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
The above outputs on my Terminal. I am on Mac OS 10.7.x. I have Python 2.7.1, and followed this tutorial to get Beautiful Soup and lxml, which both installed successfully and work with a separate test file located here. In the Python script that causes this error, I have included this line:
from pageCrawler import comparePages
And in the pageCrawler file I have included the following two lines:
from bs4 import BeautifulSoup
from urllib2 import urlopen
Any help in figuring out what the problem is and how it can be solved would be much appreciated.
I have a suspicion that this is related to the parser that BS will use to read the HTML. The documentation is here, but if you’re like me (on OSX) you might be stuck with something that requires a bit of work:
You’ll notice that in the BS4 documentation page above, they point out that by default BS4 will use the Python built-in HTML parser. Assuming you are in OSX, the Apple-bundled version of Python is 2.7.2 which is not lenient for character formatting. I hit this same problem, so I upgraded my version of Python to work around it. Doing this in a virtualenv will minimize disruption to other projects.
If doing that sounds like a pain, you can switch over to the LXML parser:
pip install lxml
And then try:
soup = BeautifulSoup(html, "lxml")
Depending on your scenario, that might be good enough. I found this annoying enough to warrant upgrading my version of Python. Using virtualenv, you can migrate your packages fairly easily.
BeautifulSoup supports the built-in HTML parser by default.
If you want to use any other third-party Python parser, you need to install that external parser (such as lxml).
soup_object= BeautifulSoup(markup,"html.parser") #Python HTML parser
But if you don’t specify any parser as a parameter, you will get a warning that no parser was specified.
soup_object= BeautifulSoup(markup) #Warning
To use any other external parser you need to install it and then specify it, like:
pip install lxml
soup_object= BeautifulSoup(markup,'lxml') # C dependent parser
External parsers have C and Python dependencies, which may bring some advantages and disadvantages.
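If a script has to run on machines where lxml may or may not be installed, one option is to pick the parser name at run time and fall back to the built-in one. This is only a sketch: pick_parser is a hypothetical helper, not part of Beautiful Soup, and it merely checks whether the parser package can be found.

```python
import importlib.util

def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed third-party parser name, else "html.parser"."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"  # ships with Python, so it is always available

if __name__ == "__main__":
    print(pick_parser())
    # With Beautiful Soup installed, the name can be passed straight through:
    # soup_object = BeautifulSoup(markup, pick_parser())
```

Different parsers can build different trees from the same broken markup, so a silent fallback is best kept to cases where that difference does not matter.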
Answer 7
I encountered the same issue. I found the reason is that I had a slightly-outdated python six package.
>>> import html5lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/html5lib/__init__.py", line 16, in <module>
    from .html5parser import HTMLParser, parse, parseFragment
  File "/usr/local/lib/python2.7/site-packages/html5lib/html5parser.py", line 2, in <module>
    from six import with_metaclass, viewkeys, PY3
ImportError: cannot import name viewkeys
The error comes from the parser you are using. In general, if you have an HTML file or HTML code, use html5lib (documentation can be found here); if you have an XML file or XML data, use lxml (documentation can be found here). You can also use lxml for HTML, but it sometimes gives an error like the one above, so choose the package based on the type of data you have. You can also use html.parser, which is a built-in module, but this sometimes does not work either.
For more details on when to use which package, see here.
Leaving the parser argument blank results in a warning, and the best available parser is used.
soup = BeautifulSoup(html)
UserWarning: No parser was explicitly specified, so I’m using the best available HTML parser for this system (“html5lib”). This usually isn’t a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The input function in Python 2.7 evaluates whatever you enter as a Python expression. If you simply want to read strings, use the raw_input function in Python 2.7, which does not evaluate what it reads.
If you are using Python 3.x, raw_input has been renamed to input. Quoting the Python 3.0 release notes,
raw_input() was renamed to input(). That is, the new input() function reads a line from sys.stdin and returns it with the trailing newline stripped. It raises EOFError if the input is terminated prematurely. To get the old behavior of input(), use eval(input())
In Python 2.7, there are two functions which can be used to accept user inputs. One is input and the other one is raw_input. You can think of the relation between them as follows
input(prompt) = eval(raw_input(prompt))
Consider the following piece of code to understand this better
>>> dude = "thefourtheye"
>>> input_variable = input("Enter your name: ")
Enter your name: dude
>>> input_variable
'thefourtheye'
input accepts a string from the user and evaluates the string in the current Python context. When I type dude as input, it finds that dude is bound to the value thefourtheye and so the result of evaluation becomes thefourtheye and that gets assigned to input_variable.
If I enter something else which is not defined in the current Python context, it will fail with a NameError.
>>> input("Enter your name: ")
Enter your name: dummy
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'dummy' is not defined
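The relationship between the two functions can be reproduced without a terminal. The sketch below is written for Python 3 and uses two hypothetical helpers, `py2_input` and `py2_raw_input`, that mimic the Python 2.7 behaviour on a piece of typed text; it is an illustration, not the actual implementation of either built-in.

```python
def py2_input(typed_text, namespace):
    """Mimic Python 2.7's input(): evaluate the typed text as an expression."""
    return eval(typed_text, {}, namespace)

def py2_raw_input(typed_text):
    """Mimic Python 2.7's raw_input(): return the typed text unchanged."""
    return typed_text

names = {"dude": "thefourtheye"}

print(py2_raw_input("dude"))     # the text itself
print(py2_input("dude", names))  # the value bound to the name dude

try:
    py2_input("dummy", names)    # no such name, so evaluation fails
except NameError as exc:
    print("NameError:", exc)
```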
Security considerations with Python 2.7’s input:
Since whatever the user types is evaluated, it poses security issues as well. For example, if you have already loaded the os module in your program with import os, and then the user types in
os.remove("/etc/hosts")
this will be evaluated as a function call expression by Python and it will be executed. If you are executing Python with elevated privileges, the /etc/hosts file will be deleted. See how dangerous it could be?
To demonstrate this, let’s try to execute input function again.
>>> dude = "thefourtheye"
>>> input("Enter your name: ")
Enter your name: input("Enter your name again: ")
Enter your name again: dude
Now, when input("Enter your name: ") is executed, it waits for the user input and the user input is a valid Python function invocation and so that is also invoked. That is why we are seeing Enter your name again: prompt again.
So, you are better off with raw_input function, like this
input_variable = raw_input("Enter your name: ")
If you need to convert the result to some other type, then you can use appropriate functions to convert the string returned by raw_input. For example, to read inputs as integers, use the int function, like shown in this answer.
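As a sketch of that conversion step, the helper below (a hypothetical `parse_int`, not taken from the linked answer) takes the string that raw_input, or input on Python 3, would return and converts it with int. Invalid text is reported through a default value rather than being evaluated.

```python
def parse_int(text, default=None):
    """Convert user-entered text to an int without evaluating it."""
    try:
        return int(text.strip())
    except ValueError:
        # Not a valid integer literal; fall back instead of crashing
        return default

print(parse_int("42"))
print(parse_int(" 7 "))
print(parse_int("os.remove('/etc/hosts')"))  # treated as plain text, never executed

# Typical use on Python 3 (on Python 2.7, call raw_input instead):
# age = parse_int(input("Enter your age: "))
```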
In python 3.x, there is only one function to get user inputs and that is called input, which is equivalent to Python 2.7’s raw_input.
You should use raw_input because you are using Python 2.7. When you use input() (for example: s = input('Name: ')), Python evaluates what you typed as an expression instead of simply storing the text in the variable (s), and it raises an error if what you typed is not defined.
raw_input() stores exactly what you typed in the variable as a string (for example: f = raw_input('Name : ')) and does not evaluate it, so it cannot raise that kind of error:
input_variable = raw_input('Enter Your Name : ')
print("Your Name Is : " + (input_variable))
For anyone else that may run into this issue, turns out that even if you include #!/usr/bin/env python3 at the beginning of your script, the shebang is ignored if the file isn’t executable.
To determine whether or not your file is executable:
run ./filename.py from the command line
if you get -bash: ./filename.py: Permission denied, run chmod a+x filename.py
run ./filename.py again
If you’ve included import sys; print(sys.version) as Kevin suggested, you’ll now see that the script is being interpreted by python3
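The same check can be scripted. This is a minimal sketch for POSIX systems using only the standard library; `ensure_executable` and `has_shebang` are hypothetical helper names, and the first one does roughly what chmod a+x does.

```python
import os
import stat

def has_shebang(path):
    """Report whether the first line of the file starts with '#!'."""
    with open(path, "rb") as handle:
        return handle.readline().startswith(b"#!")

def ensure_executable(path):
    """Add execute permission for user, group and others if it is missing."""
    if os.access(path, os.X_OK):
        return False  # already executable, nothing changed
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return True

# Example use with a placeholder file name:
# if has_shebang("filename.py"):
#     ensure_executable("filename.py")
```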
Good contributions from the previous answers.
import sys
print(sys.version)

def ingreso(nombre):
    print('Hi ', nombre, type(nombre))

def bienvenida(nombre):
    print("Hi " + nombre + ", bye ")

nombre = raw_input("Enter your name: ")
ingreso(nombre)
bienvenida(nombre)

# Works in Python 2 and 3:
try:
    input = raw_input
except NameError:
    pass
print(input("Your name: "))
The first way is simple and needs no code change: run your script with Python 3. If you still want to run it on Python 2, then
keep the following in mind when you enter the input after starting your script:
if you want to enter a string, type it inside double quotes, like "your input goes here", and it will work in Python 2.7
if you want to enter a character, type it inside single quotes, like 'your input goes here'
if you want to enter a number, simply type the number
The second way involves code changes. Use one of the following:
use the import below, which works with any version of Python
from six.moves import input
use the raw_input() function instead of the input() function in your code, without any import
wrap the call in str(), like str(input()), before assigning the result to a variable (note that on Python 2 the typed text is still evaluated first, so unquoted text still raises a NameError)
As the error implies: name ‘dude’ is not defined.
That is, Python treats ‘dude’ as a variable name here, and since no value has been assigned to it, Python complains. If we define a variable named dude and assign it a value, it will work, but that is not what we want: we don’t know what the user will enter, and we want to capture the user input itself.
Facts about these functions:
input() function: This function takes the value and type of the input you enter as it is, without modifying its type.
raw_input() function: This function explicitly converts the input you give into a string.
Note: The vulnerability of the input() function lies in the fact that any variable or function in the program can be accessed by the user simply by typing its name.
You can change which Python your IDE uses; if you’ve already downloaded Python 3.x, it shouldn’t be too hard to switch. Your script works fine on Python 3.x, though. I would just change
print ("your name is" + input_variable)
to
print ("your name is", input_variable)
because with the comma it prints a space between "your name is" and whatever the user entered. Also, if you’re using 2.7, just use raw_input instead of input.
How do I send multipart/form-data with requests in Python? I understand how to send a file, but I cannot work out how to send form data this way.
Basically, if you specify a files parameter (a dictionary), then requests will send a multipart/form-data POST instead of an application/x-www-form-urlencoded POST. You are not limited to using actual files in that dictionary, however:
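A minimal sketch of that idea, assuming the requests library is installed: the request is only prepared, not sent, so the multipart body can be inspected locally. The URL is the httpbin endpoint used elsewhere in this thread.

```python
import requests

# A plain string value in files is still encoded as a multipart part
req = requests.Request(
    "POST",
    "http://httpbin.org/post",
    files={"foo": "bar"},
)
prepared = req.prepare()  # builds headers and body without any network traffic

print(prepared.headers["Content-Type"])  # multipart/form-data plus a boundary
print(prepared.body.decode("utf-8"))
```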
Better still, you can further control the filename, content type and additional headers for each part by using a tuple instead of a single string or bytes object. The tuple is expected to contain between 2 and 4 elements; the filename, the content, optionally a content type, and an optional dictionary of further headers.
I’d use the tuple form with None as the filename, so that the filename="..." parameter is dropped from the request for those parts:
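The tuple form with None as the filename might look like the sketch below. It assumes the requests library is installed and only prepares the request locally, so nothing is sent.

```python
import requests

req = requests.Request(
    "POST",
    "http://httpbin.org/post",
    # (filename, content): None in the filename position drops filename="..."
    files={"foo": (None, "bar")},
)
prepared = req.prepare()

# The part for "foo" carries a name but no filename parameter
print(prepared.body.decode("utf-8"))
```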
If you specify both files and data, then it depends on the value of data what will be used to create the POST body. If data is a string, only it will be used; otherwise both data and files are used, with the elements in data listed first.
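The ordering for the dictionary case can be checked with a prepared request. This sketch assumes the requests library is installed; the field names and the in-memory file content are placeholders invented for illustration.

```python
import requests

prepared = requests.Request(
    "POST",
    "http://httpbin.org/post",
    data={"field0": "value0"},  # ordinary form field
    files={"file": ("notes.txt", b"hello", "text/plain")},  # (filename, content, type)
).prepare()

body = prepared.body
# The element from data appears before the element from files
print(body.index(b'name="field0"') < body.index(b'name="file"'))
```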
There is also the excellent requests-toolbelt project, which includes advanced Multipart support. It takes field definitions in the same format as the files parameter, but unlike requests, it defaults to not setting a filename parameter. In addition, it can stream the request from open file objects, where requests will first construct the request body in memory:
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

mp_encoder = MultipartEncoder(
    fields={
        'foo': 'bar',
        # plain file object, no filename or mime type produces a
        # Content-Disposition header with just the part name
        'spam': ('spam.txt', open('spam.txt', 'rb'), 'text/plain'),
    }
)
r = requests.post(
    'http://httpbin.org/post',
    data=mp_encoder,  # The MultipartEncoder is posted as data, don't use files=...!
    # The MultipartEncoder provides the content-type header with the boundary:
    headers={'Content-Type': mp_encoder.content_type}
)
Fields follow the same conventions; use a tuple with between 2 and 4 elements to add a filename, part mime-type or extra headers. Unlike the files parameter, no attempt is made to find a default filename value if you don’t use a tuple.
Since the previous answers were written, requests has changed. Have a look at the bug thread on GitHub for more detail and this comment for an example.
In short, the files parameter takes a dict with the key being the name of the form field and the value being either a string or a 2, 3 or 4-length tuple, as described in the section POST a Multipart-Encoded File in the requests quickstart:
def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    ...
    :param files: (optional) Dictionary of ``'name': file-like-objects``
        (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``,
        3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``,
        where ``'content-type'`` is a string
        defining the content type of the given file
        and ``custom_headers`` a dict-like object
        containing additional headers to add for the file.
The relevant part is: file-tuple can be a 2-tuple, 3-tuple or a 4-tuple.
Based on the above, the simplest multipart form request that includes both files to upload and form fields will look like this:
☝ Note the None as the first argument in the tuple for plain text fields — this is a placeholder for the filename field which is only used for file uploads, but for text fields passing None as the first parameter is required in order for the data to be submitted.
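As a sketch of such a request (assuming the requests library is installed; the field names, file name and in-memory file content below are placeholders invented for illustration), the form can be prepared locally and inspected without contacting a server:

```python
import io
import requests

# Stand-in for an opened binary file; any binary file-like object works
fake_file = io.BytesIO(b"example file content")

multipart_form_data = {
    # file upload field: (filename, file object)
    "upload": ("example.bin", fake_file),
    # plain text fields: None in the filename position
    "action": (None, "store"),
    "path": (None, "/path1"),
}

prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=multipart_form_data
).prepare()

print(prepared.headers["Content-Type"])
print(prepared.body.decode("utf-8"))
```

To actually send it, the same dictionary would be passed as files= to requests.post.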
Multiple fields with the same name
If you need to post multiple fields with the same name then instead of a dictionary you can define your payload as a list (or a tuple) of tuples:
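A sketch of the list form, assuming the requests library is installed; the field names and values are placeholders, and the request is prepared locally rather than sent.

```python
import requests

# A list of (name, value) pairs allows the same field name to repeat
multipart_form_data = [
    ("action", (None, "ingest")),
    ("item", (None, "spam")),
    ("item", (None, "sausage")),
    ("item", (None, "eggs")),
]

prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=multipart_form_data
).prepare()

body = prepared.body
print(body.count(b'name="item"'))  # number of parts named "item"
```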
If the above API is not pythonic enough for you, then consider using requests toolbelt (pip install requests_toolbelt) which is an extension of the core requests module that provides support for file upload streaming as well as the MultipartEncoder which can be used instead of files, and which also lets you define the payload as a dictionary, tuple or list.
MultipartEncoder can be used both for multipart requests with or without actual upload fields. It must be assigned to the data parameter.
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

multipart_data = MultipartEncoder(
    fields={
        # a file upload field
        'file': ('file.zip', open('file.zip', 'rb'), 'text/plain'),
        # plain text fields
        'field0': 'value0',
        'field1': 'value1',
    }
)
response = requests.post('http://httpbin.org/post', data=multipart_data,
                         headers={'Content-Type': multipart_data.content_type})
If you need to send multiple fields with the same name, or if the order of form fields is important, then a tuple or a list can be used instead of a dictionary:
For the server side, please check the multer documentation at: https://github.com/expressjs/multer
Here single(‘fieldName’) is used to accept one single file, as in:
I am using the code referred to below to edit a CSV file using Python. The functions called in the code form the upper part of the code.
Problem: I want the code below to start editing the CSV from the 2nd row, excluding the 1st row, which contains the headers. Right now it applies the functions to the 1st row, and my header row is getting changed.
Your reader variable is an iterable; by looping over it you retrieve the rows.
To make it skip one item before your loop, simply call next(reader, None) and ignore the return value.
You can also simplify your code a little; use the opened files as context managers to have them closed automatically:
import csv

# "rb"/"wb" are the Python 2 modes; on Python 3 open in text mode with newline=''
with open("tmob_notcleaned.csv", "rb") as infile, open("tmob_cleaned.csv", "wb") as outfile:
    reader = csv.reader(infile)
    next(reader, None)  # skip the headers
    writer = csv.writer(outfile)
    for row in reader:
        # process each row
        writer.writerow(row)

# no need to close, the files are closed automatically when you get to this point.
If you wanted to write the header to the output file unprocessed, that’s easy too; pass the output of next() to writer.writerow():
headers = next(reader, None)  # returns the headers or `None` if the input is empty
if headers:
    writer.writerow(headers)
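Putting the pieces together, here is a self-contained sketch for Python 3 that copies the header unchanged and processes only the data rows. In-memory text buffers stand in for the real files, and `clean_rows` with its `process` callback is a hypothetical helper, not part of the csv module.

```python
import csv
import io

def clean_rows(infile, outfile, process=lambda row: row):
    """Copy the header as-is, then write each remaining row after processing."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    headers = next(reader, None)  # None if the input is empty
    if headers:
        writer.writerow(headers)
    for row in reader:
        writer.writerow(process(row))

# In-memory stand-ins for open(..., newline='') file objects
source = io.StringIO("name,city\nalice,paris\nbob,rome\n")
target = io.StringIO()

clean_rows(source, target, process=lambda row: [cell.upper() for cell in row])
print(target.getvalue())
```

With real files on Python 3, both would be opened in text mode with newline='' as the csv documentation recommends.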
Another way of solving this is to use the DictReader class, which “skips” the header row and uses it to allow named indexing.
Given “foo.csv” as follows:
FirstColumn,SecondColumn
asdf,1234
qwer,5678
Use DictReader like this:
import csv
with open('foo.csv') as f:
    reader = csv.DictReader(f, delimiter=',')
    for row in reader:
        print(row['FirstColumn'])  # Access by column header instead of column number
        print(row['SecondColumn'])
I’m trying to install Scrapy Python framework in OSX 10.11 (El Capitan) via pip. The installation script downloads the required modules and at some point returns the following error:
OSError: [Errno 1] Operation not permitted: '/tmp/pip-nIfswi-uninstall/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six-1.4.1-py2.7.egg-info'
I’ve tried to deactivate the rootless feature in OSX 10.11 with the command:
sudo nvram boot-args="rootless=0";sudo reboot
but I still get the same error when the machine reboots.
Any clue or idea from my fellow StackExchangers?
If it helps, the full script output is the following:
sudo -s pip install scrapy
Collecting scrapy
Downloading Scrapy-1.0.2-py2-none-any.whl (290kB)
100% |████████████████████████████████| 290kB 345kB/s
Requirement already satisfied (use --upgrade to upgrade): cssselect>=0.9 in /Library/Python/2.7/site-packages (from scrapy)
Requirement already satisfied (use --upgrade to upgrade): queuelib in /Library/Python/2.7/site-packages (from scrapy)
Requirement already satisfied (use --upgrade to upgrade): pyOpenSSL in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from scrapy)
Collecting w3lib>=1.8.0 (from scrapy)
Downloading w3lib-1.12.0-py2.py3-none-any.whl
Collecting lxml (from scrapy)
Downloading lxml-3.4.4.tar.gz (3.5MB)
100% |████████████████████████████████| 3.5MB 112kB/s
Collecting Twisted>=10.0.0 (from scrapy)
Downloading Twisted-15.3.0.tar.bz2 (4.4MB)
100% |████████████████████████████████| 4.4MB 94kB/s
Collecting six>=1.5.2 (from scrapy)
Downloading six-1.9.0-py2.py3-none-any.whl
Requirement already satisfied (use --upgrade to upgrade): zope.interface>=3.6.0 in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from Twisted>=10.0.0->scrapy)
Requirement already satisfied (use --upgrade to upgrade): setuptools in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from zope.interface>=3.6.0->Twisted>=10.0.0->scrapy)
Installing collected packages: six, w3lib, lxml, Twisted, scrapy
Found existing installation: six 1.4.1
DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
Uninstalling six-1.4.1:
Exception:
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/basecommand.py", line 223, in main
status = self.run(options, args)
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/commands/install.py", line 299, in run
root=options.root_path,
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/req/req_set.py", line 640, in install
requirement.uninstall(auto_confirm=True)
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/req/req_install.py", line 726, in uninstall
paths_to_remove.remove(auto_confirm)
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/req/req_uninstall.py", line 125, in remove
renames(path, new_path)
File "/Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg/pip/utils/__init__.py", line 314, in renames
shutil.move(old, new)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 302, in move
copy2(src, real_dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 131, in copy2
copystat(src, dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 103, in copystat
os.chflags(dst, st.st_flags)
OSError: [Errno 1] Operation not permitted: '/tmp/pip-nIfswi-uninstall/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six-1.4.1-py2.7.egg-info'
As the other answers said, it’s because of the new System Integrity Protection, but I believe the other answers are overcomplicated.
If you’re only gonna use that package in the current user, you should be able to install it just fine, without the need to disable the SIP, by using the --user flag. Like this:
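The original command is not reproduced here. As a hedged sketch of a per-user install, the snippet below builds a pip invocation with the --user flag from Python; `user_install_command` is a hypothetical helper, and the actual install call is left commented out because it downloads and installs the package.

```python
import subprocess
import sys

def user_install_command(package):
    """Build a pip command that installs into the per-user site directory."""
    return [sys.executable, "-m", "pip", "install", "--user", package]

command = user_install_command("scrapy")
print(" ".join(command))

# To actually run it (this downloads and installs the package):
# subprocess.run(command, check=True)
```

Because the per-user site directory lives under the home directory, the install does not need to touch the SIP-protected system paths.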
I would suggest very strongly against modifying the system Python on Mac; there are numerous issues that can occur.
Your particular error shows that the installer has issues resolving the dependencies for Scrapy without impacting the current Python installation. The system uses Python for a number of essential tasks, so it’s important to keep the system installation stable and as originally installed by Apple.
I would also exhaust all other possibilities before bypassing built in security.
Package Manager Solutions:
Please look into a Python virtualization tool such as virtualenv first; this will allow you to experiment safely.
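virtualenv itself is a third-party tool. The sketch below swaps in the standard-library venv module from Python 3, which serves a similar purpose, so that it runs without extra installs; the directory name is a placeholder.

```python
import os
import tempfile
import venv

# Create an isolated environment in a throwaway directory
env_dir = os.path.join(tempfile.mkdtemp(), "scrapy-env")
venv.create(env_dir, with_pip=False)  # with_pip=True would also bootstrap pip

# pyvenv.cfg marks the directory as a virtual environment
print(os.path.isfile(os.path.join(env_dir, "pyvenv.cfg")))
```

Packages installed inside such an environment stay in its own directory and leave the system Python untouched.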
Another useful tool to use languages and software without conflicting with your Mac OS is Homebrew. Like MacPorts or Fink, Homebrew is a package manager for Mac, and is useful for safely trying lots of other languages and tools.
“Roll your own” Software Installs:
If you don’t like the package manager approach, you could use the /usr/local path or create an /opt/local directory for installing an alternate Python installation and fix up your paths in your .bashrc. Note that you’ll have to enable root for these solutions.
How to do it anyway:
If you absolutely must disable the security check (and I sincerely hope it’s for something other than messing with the system languages and resources), you can disable it temporarily and re-enable it using some of the techniques in this post on how to Disable System Integrity-Protection.
Restart Mac -> hold down “Command + R” after the startup chime -> Opens OS X Utilities -> Open Terminal and type “csrutil disable” -> Reboot OS X -> Open Terminal and check “csrutil status”
1. Disable SIP (System Integrity Protection): reboot, hold Command + R to enter recovery mode, then select Terminal and run:
csrutil disable
reboot
2. Install scrapy while ignoring the installed six:
sudo C_INCLUDE_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/libxml2:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/libxml2/libxml:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include pip install scrapy --ignore-installed six
3. Then remove the old six and install it again:
sudo rm -rf /Library/Python/2.7/site-packages/six*
sudo rm -rf /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six*
sudo pip install six
Then, you can install things for Python 3.2 with pip-3.2, and install things for Python 2.7 with pip-2.7. The pip command will end up pointing to one of these, but I’m not sure which, so you will have to check.
This worked for me on OS X: (I say this because sometimes it is a pain that the Mac has “its own” version of every open source tool, and you cannot remove it because “its improvements” make it unique for other Apple stuff to work, and if you remove it things start falling apart)
I followed the steps provided by @Lennart Regebro to get pip for Python 3; nevertheless, pip for Python 2 was still first on the path, so what I did was create a symbolic link to Python 3 inside /usr/bin (indeed, I did the same to have my 2 Pythons running in peace):
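One way to do that check is to ask where each command resolves on the current PATH. This is a small sketch using the standard library; `locate_commands` is a hypothetical helper, and the command names are the ones mentioned above.

```python
import shutil

def locate_commands(names):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}

for name, path in locate_commands(["pip", "pip-2.7", "pip-3.2"]).items():
    print(name, "->", path)
```

Running pip --version also reports which Python installation a given pip belongs to.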
I have a browser which sends utf-8 characters to my Python server, but when I retrieve it from the query string, the encoding that Python returns is ASCII. How can I convert the plain string to utf-8?
NOTE: The string passed from the web is already UTF-8 encoded; I just want to make Python treat it as UTF-8, not ASCII.
Translate with ord() and unichr().
Every Unicode character has a number associated with it, something like an index, so Python has a few functions to translate between a character and its number. Below is an example with ñ. Hope it helps.
>>> C = 'ñ'
>>> U = C.decode('utf8')
>>> U
u'\xf1'
>>> ord(U)
241
>>> unichr(241)
u'\xf1'
>>> print unichr(241).encode('utf8')
ñ
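The session above is Python 2. For comparison, here is a sketch of the same round trip on Python 3, where bytes and text are separate types and chr replaces unichr. The percent-encoded query value is an invented example of what a browser might send.

```python
from urllib.parse import unquote

raw = b"\xc3\xb1"           # UTF-8 bytes for n-with-tilde, as received from the network
text = raw.decode("utf-8")  # bytes -> str

print(ascii(text))           # escaped form of the one-character string
print(ord(text))             # code point of the character
print(ascii(chr(241)))       # chr() goes from code point back to character
print(text.encode("utf-8"))  # str -> bytes again

# Percent-encoded query string values are decoded as UTF-8 by default
query_value = unquote("%C3%B1")
print(query_value == text)
```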