There’s another way to enable headless mode. If you need to switch headless mode in Firefox on or off without changing the code, set the environment variable MOZ_HEADLESS to any value when you want Firefox to run headless, and leave it unset otherwise.
This is very useful with, for example, continuous integration: you can run the functional tests headless on the server and still run them in normal mode on your PC.
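For a CI wrapper script written in Python, the same idea can be sketched as follows: copy the current environment, add MOZ_HEADLESS, and pass it to the child process only. The helper name run_with_headless and the stand-in child command (a one-line Python program that echoes the variable) are assumptions for illustration, not part of the original answer.

```python
import os
import subprocess
import sys


def run_with_headless(cmd):
    """Run cmd with MOZ_HEADLESS=1 set only for that child process."""
    env = dict(os.environ, MOZ_HEADLESS="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in child command (hypothetical); in practice this would be
# something like ["python", "manage.py", "test"].
child = [sys.executable, "-c", "import os; print(os.environ.get('MOZ_HEADLESS'))"]
result = run_with_headless(child)
print(result.stdout.strip())
```

Because the variable is set on a copy of the environment, the parent shell and any later commands are unaffected.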
$ MOZ_HEADLESS=1 python manage.py test # testing example in Django with headless Firefox
or
$ export MOZ_HEADLESS=1 # this way you only have to set it once
$ python manage.py test functional/tests/directory
$ unset MOZ_HEADLESS # if you want to disable headless mode
Just a note for people who may have found this later (and want a Java way of achieving this): FirefoxOptions can also enable headless mode:
FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setHeadless(true);
Answer 4
I used the code below to set the driver type, headless or headed, for both Firefox and Chrome:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.firefox.options import Options as FFOption

# Can pass browser type
if brower.lower() == 'chrome':
    driver = webdriver.Chrome(r'..\drivers\chromedriver')
elif brower.lower() == 'headless chrome':
    ch_Options = Options()
    ch_Options.add_argument('--headless')
    ch_Options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(r'..\drivers\chromedriver', options=ch_Options)
elif brower.lower() == 'firefox':
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe')
elif brower.lower() == 'headless firefox':
    ff_option = FFOption()
    ff_option.add_argument('--headless')
    ff_option.add_argument("--disable-gpu")
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=ff_option)
elif brower.lower() == 'ie':
    driver = webdriver.Ie(r'..\drivers\IEDriverServer')
else:
    raise Exception('Invalid Browser Type')
I have a pandas DataFrame that I want to upload to a new CSV file. The problem is that I don’t want to save the file locally before transferring it to s3. Is there any method like to_csv for writing the dataframe to s3 directly? I am using boto3.
Here is what I have so far:
import boto3
import pandas as pd
s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
read_file = s3.get_object(Bucket=Bucket, Key=Key)  # boto3 client methods take keyword arguments
df = pd.read_csv(read_file['Body'])
# Make alterations to DataFrame
# Then export DataFrame to CSV through direct transfer to s3
Answer 0
You can use:
from io import StringIO  # python3; python2: BytesIO
import boto3

bucket = 'my_bucket_name'  # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())
You can directly use the S3 path. I am using Pandas 0.24.1
In [1]: import pandas as pd
In [2]: df = pd.DataFrame( [ [1, 1, 1], [2, 2, 2] ], columns=['a', 'b', 'c'])
In [3]: df
Out[3]:
a b c
0 1 1 1
1 2 2 2
In [4]: df.to_csv('s3://experimental/playground/temp_csv/dummy.csv', index=False)
In [5]: pd.__version__
Out[5]: '0.24.1'
In [6]: new_df = pd.read_csv('s3://experimental/playground/temp_csv/dummy.csv')
In [7]: new_df
Out[7]:
a b c
0 1 1 1
1 2 2 2
pandas now uses s3fs for handling S3 connections. This shouldn’t break any code. However, since s3fs is not a required dependency, you will need to install it separately, like boto in prior versions of pandas. GH11915.
import s3fs
s3 = s3fs.S3FileSystem(anon=False)
# Use 'w' for py3, 'wb' for py2
with s3.open('<bucket-name>/<filename>.csv', 'w') as f:
    df.to_csv(f)
The problem with StringIO is that it will eat away at your memory. With this method, you are streaming the file to s3, rather than converting it to string, then writing it into s3. Holding the pandas dataframe and its string copy in memory seems very inefficient.
If you are working on an EC2 instance, you can give it an IAM role that allows writing to S3, so you don’t need to pass in credentials directly. However, you can also connect to a bucket by passing credentials to the S3FileSystem() function. See the documentation: https://s3fs.readthedocs.io/en/latest/
I have a list of items that likely has some export issues. I would like to get a list of the duplicate items so I can manually compare them. When I try to use the pandas duplicated method, it only returns the first duplicate. Is there a way to get all of the duplicates and not just the first one?
There are a couple of duplicate items. But when I use the above code, I only get the first item. In the API reference, I see how I can get the last item, but I would like to have all of them so I can visually inspect them to see why I am getting the discrepancy. So, in this example I would like to get all three A036 entries and both 11795 entries and any other duplicated entries, instead of just the first one. Any help is most appreciated.
With Pandas version 0.17, you can set ‘keep = False’ in the duplicated function to get all the duplicate items.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame(['a','b','c','d','a','b'])
In [3]: df
Out[3]:
0
0 a
1 b
2 c
3 d
4 a
5 b
In [4]: df[df.duplicated(keep=False)]
Out[4]:
0
0 a
1 b
4 a
5 b
Using an element-wise logical or and setting the take_last argument of the pandas duplicated method to both True and False you can obtain a set from your dataframe that includes all of the duplicates.
sort("ID") no longer seems to work; it appears to be deprecated according to the sort documentation, so use sort_values("ID") instead to sort after the duplicate filter, as follows:
Running (sudo) python setup.py install in iTerm shows:
running install
running bdist_egg running egg_info writing requirements to
pip.egg-info/requires.txt writing pip.egg-info/PKG-INFO writing
top-level names to pip.egg-info/top_level.txt writing dependency_links
to pip.egg-info/dependency_links.txt writing entry points to
pip.egg-info/entry_points.txt warning: manifest_maker: standard file
'setup.py' not found
reading manifest file 'pip.egg-info/SOURCES.txt' writing manifest file
'pip.egg-info/SOURCES.txt' installing library code to
build/bdist.macosx-10.6-intel/egg running install_lib warning:
install_lib: 'build/lib' does not exist -- no Python modules to
install
creating build/bdist.macosx-10.6-intel/egg creating
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/PKG-INFO -> build/bdist.macosx-10.6-intel/egg/EGG-INFO
copying pip.egg-info/SOURCES.txt ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/dependency_links.txt ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/entry_points.txt ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/not-zip-safe ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/requires.txt ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO copying
pip.egg-info/top_level.txt ->
build/bdist.macosx-10.6-intel/egg/EGG-INFO creating
'dist/pip-1.4.1-py2.7.egg' and adding
'build/bdist.macosx-10.6-intel/egg' to it removing
'build/bdist.macosx-10.6-intel/egg' (and everything under it)
Processing pip-1.4.1-py2.7.egg removing
'/Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg'
(and everything under it) creating
/Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Extracting pip-1.4.1-py2.7.egg to
/Users/dl/Library/Python/2.7/lib/python/site-packages pip 1.4.1 is
already the active version in easy-install.pth Installing pip script
to /Users/dl/Library/Python/2.7/bin Installing pip-2.7 script to
/Users/dl/Library/Python/2.7/bin
Installed
/Users/dl/Library/Python/2.7/lib/python/site-packages/pip-1.4.1-py2.7.egg
Processing dependencies for pip==1.4.1 Finished processing
dependencies for pip==1.4.1
Then I entered pip install, and the error message was:
Traceback (most recent call last): File
"/Library/Frameworks/Python.framework/Versions/2.7/bin/pip", line 9,
in <module>
load_entry_point('pip==1.4.1', 'console_scripts', 'pip')() File "build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 357, in
load_entry_point File
"build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2394, in
load_entry_point File
"build/bdist.macosx-10.6-intel/egg/pkg_resources.py", line 2108, in
load ImportError: No module named pip
Has anyone met the same problem before who can give me some tips to solve it?
I ran into this same issue when I attempted to install the nova client.
spencers-macbook-pro:python-novaclient root# python setup.py install
running install
/usr/bin/python: No module named pip
error: /usr/bin/python -m pip.__init__ install 'pbr>=0.5.21,<1.0' 'iso8601>=0.1.4' 'PrettyTable>=0.6,<0.8' 'requests>=1.1' 'simplejson>=2.0.9' 'six' 'Babel>=0.9.6' returned 1
I use homebrew so I worked around the issue with sudo easy_install pip
spencers-macbook-pro:python-novaclient root# brew search pip
aespipe brew-pip lesspipe pipebench pipemeter spiped pipeviewer
If you meant "pip" precisely:
Homebrew provides pip via: `brew install python`. However you will then
have two Pythons installed on your Mac, so alternatively you can:
sudo easy_install pip
spencers-macbook-pro:python-novaclient root# sudo easy_install pip
The commands should be similar if you use macports.
requests library – for retrieving data from web APIs.
This runs the pip module and asks it to find the requests library on PyPI.org (the Python Package Index) and install it in your local system so that it becomes available for you to import
I solved a similar error on Linux by setting PYTHONPATH to the site-packages location. This was after running python get-pip.py --prefix /home/chet/pip.
[chet@rhel1 ~]$ ~/pip/bin/pip -V
Traceback (most recent call last):
File "/home/chet/pip/bin/pip", line 7, in <module>
from pip import main
ImportError: No module named pip
[chet@rhel1 ~]$ export PYTHONPATH=/home/chet/pip/lib/python2.6/site-packages
[chet@rhel1 ~]$ ~/pip/bin/pip -V
pip 9.0.1 from /home/chet/pip/lib/python2.6/site-packages (python 2.6)
anant$ python pip.py --help
Usage: pip.py COMMAND [OPTIONS]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-E DIR, --environment=DIR
virtualenv environment to run pip in (either give the
interpreter or the environment base directory)
-v, --verbose Give more output
-q, --quiet Give less output
--log=FILENAME Log file where a complete (maximum verbosity) record
will be kept
--proxy=PROXY Specify a proxy in the form
user:passwd@proxy.server:port. Note that the
user:password@ is optional and required only if you
are behind an authenticated proxy. If you provide
user@proxy.server:port then you will be prompted for a
password.
--timeout=SECONDS Set the socket timeout (default 15 seconds)
Answer 13
Here are the minimal instructions for upgrading to Python 3 using MacPorts:
sudo port install py37-pip
sudo port select --set pip pip37
sudo port select --set pip3 pip37
sudo pip install numpy scipy matplotlib
On some Linux distributions such as Ubuntu, first run apt-get update and then try installing the python-pip package.
Without apt-get update, you might get an error such as
I am trying to plot a simple graph using pyplot, e.g.:
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()
but the figure does not appear and I get the following message:
UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
I saw in several places that one had to change the configuration of matplotlib using the following:
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
I did this, but then got an error message because it cannot find a module:
ModuleNotFoundError: No module named 'tkinter'
Then, I tried to install “tkinter” using pip install tkinter (inside the virtual environment), but it does not find it:
Collecting tkinter
Could not find a version that satisfies the requirement tkinter (from versions: )
No matching distribution found for tkinter
I should also mention that I am running all this on Pycharm Community Edition IDE using a virtual environment, and that my operating system is Linux/Ubuntu 18.04.
I would like to know how I can solve this problem in order to be able to display the graph.
In my case, the error message was implying that I was working in a headless console. So plt.show() could not work. What worked was calling plt.savefig:
import matplotlib.pyplot as plt
plt.plot([1,2,3], [5,7,4])
plt.savefig("mygraph.png")
I’ve tried your way and it runs without error on my computer; it successfully shows the figure. Maybe that is because PyCharm has tkinter as a system package, so you don’t need to install it. But if you can’t find tkinter, you can go to TkDocs to see how to install it; as it mentions, tkinter is a core package for Python.
After upgrading lots of packages (Spyder 3 to 4, Keras and Tensorflow and lots of their dependencies), I had the same problem today! I cannot figure out what happened, but the (conda-based) virtual environment that kept using Spyder 3 did not have the problem. Although installing tkinter, changing the backend via matplotlib.use('TkAgg') as shown above, or following this nice post on how to change the backend might well resolve the problem, I don’t see these as robust solutions. For me, uninstalling matplotlib and reinstalling it was magic and the problem was solved.
pip uninstall matplotlib
… then, install
pip install matplotlib
From all the above, this could be a package management problem, and BTW, I use both conda and pip, whenever feasible.
When I ran into this error on Spyder, I changed from running my code line by line to highlighting my block of plotting code and running that all at once. Voila, the image appeared.
tkinter is Python version-specific in the sense that sudo apt-get install python3-tk will install tkinter exclusively for your default version of Python. If you have different Python versions within various virtual environments, you will have to install tkinter for the Python version used in that virtual environment, for example sudo apt-get install python3.7-tk. Not doing this will still lead to No module named 'tkinter' errors, even after installing it for the global Python version.
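To see which interpreter a virtual environment is actually using and whether that interpreter can import tkinter, a small check like the following may help. The helper name has_tkinter is an assumption for illustration; the sketch only attempts the import and does not open a window.

```python
import sys


def has_tkinter():
    """Return True if this interpreter can import the tkinter module."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


print(sys.executable)  # the interpreter the environment really runs
print("tkinter available:", has_tkinter())
```

Running it inside each virtual environment shows whether the matching pythonX.Y-tk package still needs to be installed for that Python version.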
Does anyone here have any useful code which uses reduce() function in python? Is there any code other than the usual + and * that we see in the examples?
The usage of reduce that I found in my code involved the situation where I had some class structure for logic expression and I needed to convert a list of these expression objects to a conjunction of the expressions. I already had a function make_and to create a conjunction given two expressions, so I wrote reduce(make_and,l). (I knew the list wasn’t empty; otherwise it would have been something like reduce(make_and,l,make_true).)
This is exactly the reason that (some) functional programmers like reduce (or fold functions, as such functions are typically called). There are often already many binary functions like +, *, min, max, concatenation and, in my case, make_and and make_or. Having a reduce makes it trivial to lift these operations to lists (or trees or whatever you got, for fold functions in general).
Of course, if certain instantiations (such as sum) are often used, then you don’t want to keep writing reduce. However, instead of defining the sum with some for-loop, you can just as easily define it with reduce.
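As a rough illustration of lifting a binary function to a list, here is a sketch that defines a sum with reduce and builds a conjunction of predicates in the same way. The names make_and and make_true stand in for the author's functions, whose real definitions are not shown in the answer, so treat them as hypothetical.

```python
from functools import reduce
import operator


def my_sum(numbers, start=0):
    """Sum an iterable by folding operator.add over it."""
    return reduce(operator.add, numbers, start)


def make_and(p, q):
    """Hypothetical stand-in: combine two predicates into their conjunction."""
    return lambda x: p(x) and q(x)


def make_true(x):
    """Hypothetical identity element for make_and: always true."""
    return True


def is_positive(x):
    return x > 0


def is_even(x):
    return x % 2 == 0


conj = reduce(make_and, [is_positive, is_even], make_true)
print(my_sum([1, 2, 3, 4]))       # 10
print(conj(4), conj(3), conj(-2))  # True False False
```

Passing make_true as the initial value is what makes the empty-list case well defined, as the answer notes.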
Readability, as mentioned by others, is indeed an issue. You could argue, however, that the only reason people find reduce less “clear” is that it is not a function that many people know and/or use.
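Answer 6
Function composition: if you already have a list of functions that you’d like to apply in succession, such as:
color = lambda x: x.replace('brown', 'blue')
speed = lambda x: x.replace('quick', 'slow')
work = lambda x: x.replace('lazy', 'industrious')
fs = [str.lower, color, speed, work, str.title]
Then you can apply them all consecutively with: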
>>> call = lambda s, func: func(s)
>>> s = "The Quick Brown Fox Jumps Over the Lazy Dog"
>>> reduce(call, fs, s)
'The Slow Blue Fox Jumps Over The Industrious Dog'
In this case, method chaining may be more readable. But sometimes it isn’t possible, and this kind of composition may be more readable and maintainable than a f1(f2(f3(f4(x)))) kind of syntax.
Answer 7
You can replace value = json_obj['a']['b']['c']['d']['e'] with:
value = reduce(dict.__getitem__, 'abcde', json_obj)
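A self-contained Python 3 sketch of the same lookup follows. It swaps dict.__getitem__ for operator.getitem so that lists along the path work too; the helper name get_nested and the sample data are assumptions for illustration.

```python
from functools import reduce
import operator


def get_nested(obj, keys):
    """Follow keys one at a time into nested dicts or lists."""
    return reduce(operator.getitem, keys, obj)


json_obj = {"a": {"b": {"c": {"d": {"e": 42}}}}}
print(get_nested(json_obj, "abcde"))             # 42
print(get_nested({"xs": [10, 20]}, ["xs", 1]))   # 20
```

A missing key still raises KeyError, just as the chained subscript form would.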
@Blair Conrad: You could also implement your glob/reduce using sum, like so:
files = sum([glob.glob(f) for f in args], [])
This is less verbose than either of your two examples, is perfectly Pythonic, and is still only one line of code.
So to answer the original question, I personally try to avoid using reduce because it’s never really necessary and I find it to be less clear than other approaches. However, some people get used to reduce and come to prefer it to list comprehensions (especially Haskell programmers). But if you’re not already thinking about a problem in terms of reduce, you probably don’t need to worry about using it.
I’m writing a compose function for a language, so I construct the composed function using reduce along with my apply operator.
In a nutshell, compose takes a list of functions to compose into a single function. If I have a complex operation that is applied in stages, I want to put it all together like so:
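The author's snippet is not present here. As a minimal Python sketch of the general idea (not the author's implementation, which targets another language and its own apply operator), compose can be built on reduce like this:

```python
from functools import reduce


def compose(*funcs):
    """Return a function that applies funcs left to right."""
    def composed(value):
        # The lambda plays the role of an "apply" operator: f applied to acc.
        return reduce(lambda acc, f: f(acc), funcs, value)
    return composed


clean_title = compose(str.strip, str.lower, str.title)
print(clean_title("  hello WORLD "))  # Hello World
```

With no functions at all, the composed function simply returns its argument, which is the identity element one would expect.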
Reduce isn’t limited to scalar operations; it can also be used to sort things into buckets. (This is what I use reduce for most often).
Imagine a case in which you have a list of objects, and you want to re-organize it hierarchically based on properties stored flatly in the object. In the following example, I produce a list of metadata objects related to articles in an XML-encoded newspaper with the articles function. articles generates a list of XML elements, and then maps through them one by one, producing objects that hold some interesting info about them. On the front end, I’m going to want to let the user browse the articles by section/subsection/headline. So I use reduce to take the list of articles and return a single dictionary that reflects the section/subsection/article hierarchy.
from functools import reduce
from lxml import etree
from Reader import Reader

class IssueReader(Reader):
    def articles(self):
        arts = self.q('//div3')  # inherited ... runs an xpath query against the issue
        subsection = etree.XPath('./ancestor::div2/@type')
        section = etree.XPath('./ancestor::div1/@type')
        header_text = etree.XPath('./head//text()')
        return map(lambda art: {
            'text_id': self.id,
            'path': self.getpath(art)[0],
            'subsection': (subsection(art)[0] or '[none]'),
            'section': (section(art)[0] or '[none]'),
            'headline': (''.join(header_text(art)) or '[none]')
        }, arts)

    def by_section(self):
        arts = self.articles()

        def extract(acc, art):  # acc for accumulator
            section = acc.get(art['section'], False)
            if section:
                # look the subsection up inside its section, not in the top-level accumulator
                subsection = section.get(art['subsection'], False)
                if subsection:
                    subsection.append(art)
                else:
                    section[art['subsection']] = [art]
            else:
                acc[art['section']] = {art['subsection']: [art]}
            return acc

        return reduce(extract, arts, {})
I give both functions here because I think it shows how map and reduce can complement each other nicely when dealing with objects. The same thing could have been accomplished with a for loop, … but spending some serious time with a functional language has tended to make me think in terms of map and reduce.
By the way, if anybody has a better way to set properties like I’m doing in extract, where the parents of the property you want to set might not exist yet, please let me know.
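One possible answer to that last question is dict.setdefault, which creates each missing parent on the way down. The following is a sketch with made-up article dictionaries, not the author's Reader class:

```python
from functools import reduce


def by_section(articles):
    """Group article dicts into {section: {subsection: [articles]}}."""
    def extract(acc, art):
        # setdefault returns the existing value or inserts the default first
        acc.setdefault(art["section"], {}).setdefault(art["subsection"], []).append(art)
        return acc
    return reduce(extract, articles, {})


# Sample data invented for illustration only.
arts = [
    {"section": "news", "subsection": "local", "headline": "A"},
    {"section": "news", "subsection": "local", "headline": "B"},
    {"section": "news", "subsection": "world", "headline": "C"},
    {"section": "sport", "subsection": "[none]", "headline": "D"},
]
tree = by_section(arts)
print(sorted(tree))                 # ['news', 'sport']
print(len(tree["news"]["local"]))   # 2
```

collections.defaultdict would work as well; setdefault keeps the result a plain dict.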
At first glance the following projects use reduce()
MoinMoin
Zope
Numeric
ScientificPython
etc. etc. but then these are hardly surprising since they are huge projects.
The functionality of reduce can be done using function recursion which I guess Guido thought was more explicit.
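To make the recursion remark concrete, here is a small sketch of a recursive fold that behaves like reduce with an initial value. The name fold is an assumption, and the slice-based recursion is for illustration only; it is limited by Python's recursion depth on long sequences.

```python
from functools import reduce
import operator


def fold(func, items, initial):
    """Recursive equivalent of functools.reduce(func, items, initial)."""
    if not items:
        return initial
    # Combine the accumulator with the first item, then recurse on the rest.
    return fold(func, items[1:], func(initial, items[0]))


numbers = [1, 2, 3, 4]
print(fold(operator.add, numbers, 0))    # 10
print(reduce(operator.add, numbers, 0))  # 10
```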
Update:
Since Google’s Code Search was discontinued on 15-Jan-2012, besides reverting to regular Google searches, there’s something called Code Snippets Collection that looks promising. A number of other resources are mentioned in answers to this (closed) question: Replacement for Google Code Search?.
Update 2 (29-May-2017):
A good source for Python examples (in open-source code) is the Nullege search engine.
Answer 16
import os
from functools import reduce

files = [
    # full filenames
    "var/log/apache/errors.log",
    "home/kane/images/avatars/crusader.png",
    "home/jane/documents/diary.txt",
    "home/kane/images/selfie.jpg",
    "var/log/abc.txt",
    "home/kane/.vimrc",
    "home/kane/images/avatars/paladin.png",
]

# unfolding of plain filename list to file-tree
fs_tree = ({},  # dict of folders
           [])  # list of files
for full_name in files:
    path, fn = os.path.split(full_name)
    reduce(
        # this function walks deep into path
        # and creates placeholders for subfolders
        lambda d, k: d[0].setdefault(k,          # walk deep
                                     ({}, [])),  # or create subfolder storage
        path.split(os.path.sep),
        fs_tree
    )[1].append(fn)

print(fs_tree)
# ({'home': (
#     {'jane': (
#         {'documents': (
#             {},
#             ['diary.txt']
#         )},
#         []
#     ),
#     'kane': (
#         {'images': (
#             {'avatars': (
#                 {},
#                 ['crusader.png',
#                  'paladin.png']
#             )},
#             ['selfie.jpg']
#         )},
#         ['.vimrc']
#     )},
#     []
# ),
# 'var': (
#     {'log': (
#         {'apache': (
#             {},
#             ['errors.log']
#         )},
#         ['abc.txt']
#     )},
#     [])
# },
# [])
from collections import Counter
from functools import reduce

stat2011 = Counter({"January": 12, "February": 20, "March": 50, "April": 70, "May": 15, "June": 35, "July": 30, "August": 15, "September": 20, "October": 60, "November": 13, "December": 50})
stat2012 = Counter({"January": 36, "February": 15, "March": 50, "April": 10, "May": 90, "June": 25, "July": 35, "August": 15, "September": 20, "October": 30, "November": 10, "December": 25})
stat2013 = Counter({"January": 10, "February": 60, "March": 90, "April": 10, "May": 80, "June": 50, "July": 30, "August": 15, "September": 20, "October": 75, "November": 60, "December": 15})

stat_list = [stat2011, stat2012, stat2013]
print(reduce(lambda x, y: x & y, stat_list))  # MIN
print(reduce(lambda x, y: x | y, stat_list))  # MAX
Let’s say that some yearly statistics are stored in a list of Counters.
We want to find the MIN/MAX values for each month across the different years.
For example, the minimum for January would be 10, and for February it would be 15.
We need to store the results in a new Counter.
Note that it handles edge cases that the popular answer on SO doesn’t. For a more in-depth explanation, I am redirecting you to the original blog post.
Answer 23
Using reduce() to find out if a list of dates are consecutive:
from datetime import date, timedelta
from functools import reduce

def checked(d1, d2):
    """
    We assume the date list is sorted.
    If d2 & d1 are different by 1, everything up to d2 is consecutive, so d2
    can advance to the next reduction.
    If d2 & d1 are not different by 1, returning d1 - 1 for the next reduction
    will guarantee the result produced by reduce() to be something other than
    the last date in the sorted date list.
    Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered consecutive
    Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered not consecutive
    """
    #if (d2 - d1).days == 1 or (d2 - d1).days == 0: # for Definition 1
    if (d2 - d1).days == 1:  # for Definition 2
        return d2
    else:
        return d1 + timedelta(days=-1)

# datelist = [date(2014, 1, 1), date(2014, 1, 3),
#             date(2013, 12, 31), date(2013, 12, 30)]
# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
#             date(2014, 2, 21), date(2014, 2, 22)]
datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]
datelist.sort()
if datelist[-1] == reduce(checked, datelist):
    print("dates are consecutive")
else:
    print("dates are not consecutive")
Besides the find method there is also index. find and index both yield the same result, the position of the first occurrence, but if nothing is found index will raise a ValueError whereas find returns -1. Speedwise, both have the same benchmark results.
s.find(t) #returns: -1, or index where t starts in s
s.index(t) #returns: Same as find, but raises ValueError if t is not in s
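A short sketch of the difference, using a sample string chosen here for illustration:

```python
s = "the dude is a cool dude"

print(s.find("dude"))   # 4
print(s.find("nope"))   # -1

print(s.index("dude"))  # 4
try:
    s.index("nope")
except ValueError as exc:
    # index signals "not found" with an exception instead of -1
    print("index raised ValueError:", exc)
```

find suits code that checks the result against -1, while index suits code where a missing substring is a genuine error.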
Additional knowledge: rfind and rindex:
In general, find and index return the smallest index where the passed-in string starts, and rfind and rindex return the largest index where it starts
Most of the string searching algorithms search from left to right, so functions starting with r indicate that the search happens from right to left.
So if the substring you are searching for is more likely to be near the end of the string than the start, rfind or rindex would be faster.
s.rfind(t) #returns: Same as find, but searched right to left
s.rindex(t) #returns: Same as index, but searches right to left
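The right-to-left variants can be sketched the same way; the sample string is an assumption for illustration:

```python
s = "the dude is a cool dude"

print(s.find("dude"))    # 4, the first occurrence
print(s.rfind("dude"))   # 19, the last occurrence
print(s.rfind("nope"))   # -1

try:
    s.rindex("nope")
except ValueError:
    # rindex mirrors index: it raises instead of returning -1
    print("rindex raised ValueError")
```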
To implement this in an algorithmic way, without using any built-in Python search function, you can write:
def find_pos(string, word):
    for i in range(len(string) - len(word) + 1):
        if string[i:i+len(word)] == word:
            return i
    return 'Not Found'

string = "the dude is a cool dude"
word = 'dude'
print(find_pos(string, word))
# output 4
Answer 3
def find_pos(chaine, x):
    for i in range(len(chaine)):
        if chaine[i] == x:
            return 'yes', i
    return 'no'
I use double quotes here to take the shell out of the equation, but single quotes may be better for some platforms. Also note the escapes for characters that fabric considers delimiters.
You need to pass all Python variables as strings, especially if you are using sub-process to run the scripts, or you will get an error. You will need to convert the variables back to int/boolean types separately.
def print_this(var):
    print str(var)
fab print_this:'hello world'
fab print_this='hello'
fab print_this:'99'
fab print_this='True'
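Since every task argument arrives as a string, the conversion back to int or boolean has to happen inside the task. A minimal sketch of such helpers follows; the names to_bool and to_int and the accepted spellings are assumptions, not part of fabric.

```python
def to_bool(text):
    """Interpret a command-line string as a boolean."""
    return text.strip().lower() in ("1", "true", "yes", "on")


def to_int(text, default=None):
    """Interpret a command-line string as an int, or return default."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("99") + 1)   # 100
print(to_bool("True"))    # True
print(to_bool("False"))   # False
```

Note that bool("False") would be True, because any non-empty string is truthy, which is why an explicit parser is needed.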
The documentation for the round() function states that you pass it a number, and the positions past the decimal to round. Thus it should do this:
n = 5.59
round(n, 1) # 5.6
But, in actuality, good old floating point weirdness creeps in and you get:
5.5999999999999996
For the purposes of UI, I need to display 5.6. I poked around the Internet and found some documentation that this is dependent on my implementation of Python. Unfortunately, this occurs on both my Windows dev machine and each Linux server I’ve tried. See here also.
Short of creating my own round library, is there any way around this?
If you use the Decimal module you can approximate without the use of the ’round’ function. Here is what I’ve been using for rounding especially when writing monetary applications:
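A minimal sketch of Decimal-based rounding, assuming half-up rounding is what the UI needs; the helper name `round_half_up` is hypothetical and not taken from the original answer:

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places=1):
    # Going through str() yields Decimal('5.59') rather than the
    # binary approximation that Decimal(5.59) would capture
    quantum = Decimal(10) ** -places  # places=1 -> Decimal('0.1')
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_half_up(5.59, 1))   # 5.6
print(round_half_up(1.295, 2))  # 1.30
```

The result is a Decimal, so it keeps trailing zeros when printed; convert it with float() only at the point where a float is really required.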
Floating point math is vulnerable to slight, but annoying, precision inaccuracies. If you can work with integer or fixed point, you will be guaranteed precision.
Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.
and
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
Decimal provides the kind of operations that make it easy to write apps that require floating point operations and also need to present those results in a human readable format, e.g., accounting.
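The 1.1 + 2.2 case from the quote can be reproduced directly. This is a small sketch; the variable names are only for illustration:

```python
from decimal import Decimal

float_sum = 1.1 + 2.2
decimal_sum = Decimal("1.1") + Decimal("2.2")

print(float_sum)    # 3.3000000000000003
print(decimal_sum)  # 3.3
```

The Decimal values are built from strings on purpose: Decimal(1.1) would copy the binary approximation of the float instead of the decimal value 1.1.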
from math import ceil
decimal_count = 2
print(ceil(61.295 * 10 ** decimal_count) / 10 ** decimal_count)
print(ceil(1.295 * 10 ** decimal_count) / 10 ** decimal_count)
I would avoid relying on round() at all in this case. Consider
print(round(61.295, 2))
print(round(1.295, 2))
will output
61.3
1.29
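which is not the desired output if you need consistent rounding to the nearest hundredth. To bypass this behavior, go with math.ceil() (or math.floor() if you want to round down), as in the snippet above. Keep in mind that ceil always rounds up, not to the nearest value.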
Here’s where I see round failing. What if you wanted to round these 2 numbers to one decimal place?
23.45
23.55
My education was that from rounding these you should get:
23.4
23.6
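the “rule” being that you should round up if the preceding digit is odd, and not round up if the preceding digit is even.
In my experience, the round function in Python appears to simply truncate the 5. That is usually because a value such as 23.45 has no exact binary representation, so the float that round() receives is not an exact tie.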
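The odd/even rule described here is available in the decimal module as ROUND_HALF_EVEN. A minimal sketch, assuming the numbers are supplied as strings so that they are exact ties; the helper name `round_half_even` is hypothetical:

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(text, places=1):
    # Ties go to the neighbour whose last kept digit is even
    quantum = Decimal(10) ** -places
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(round_half_even("23.45"))  # 23.4
print(round_half_even("23.55"))  # 23.6
```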
The problem arises only when the last digit is 5. E.g. 0.045 is internally stored as 0.044999999999999… You could simply change the last digit to 6 and round off. This will give you the desired results.
import re
from decimal import Decimal

def custom_round(num, precision=0):
    # Get the type of given number
    type_num = type(num)
    # If the given type is not a valid number type, raise TypeError
    if type_num not in [int, float, Decimal]:
        raise TypeError("type {} doesn't define __round__ method".format(type_num.__name__))
    # If passed number is int, there is no rounding off.
    if type_num == int:
        return num
    # Convert number to string.
    str_num = str(num).lower()
    # We will remove negative context from the number and add it back in the end
    negative_number = False
    if num < 0:
        negative_number = True
        str_num = str_num[1:]
    # If number is in format 1e-12 or 2e+13, we have to convert it
    # to a string in standard decimal notation.
    if 'e-' in str_num:
        # For 1.23e-7, e_power = 7
        e_power = int(re.findall(r'e-[0-9]+', str_num)[0][2:])
        # For 1.23e-7, number = 123
        number = ''.join(str_num.split('e-')[0].split('.'))
        zeros = ''
        # Number of zeros = e_power - 1 = 6
        for i in range(e_power - 1):
            zeros = zeros + '0'
        # Scientific notation 1.23e-7 in regular decimal = 0.000000123
        str_num = '0.' + zeros + number
    if 'e+' in str_num:
        # For 1.23e+7, e_power = 7
        e_power = int(re.findall(r'e\+[0-9]+', str_num)[0][2:])
        # For 1.23e+7, number_characteristic = 1
        # characteristic is number left of decimal point.
        number_characteristic = str_num.split('e+')[0].split('.')[0]
        # For 1.23e+7, number_mantissa = 23
        # mantissa is number right of decimal point.
        number_mantissa = str_num.split('e+')[0].split('.')[1]
        # For 1.23e+7, number = 123
        number = number_characteristic + number_mantissa
        zeros = ''
        # Eg: for this condition = 1.23e+7
        if e_power >= len(number_mantissa):
            # Number of zeros = e_power - mantissa length = 5
            for i in range(e_power - len(number_mantissa)):
                zeros = zeros + '0'
            # Scientific notation 1.23e+7 in regular decimal = 12300000.0
            str_num = number + zeros + '.0'
        # Eg: for this condition = 1.23e+1
        if e_power < len(number_mantissa):
            # In this case, we only need to shift the decimal e_power digits to the right
            # So we just copy the digits from mantissa to characteristic and then remove
            # them from mantissa.
            for i in range(e_power):
                number_characteristic = number_characteristic + number_mantissa[i]
            number_mantissa = number_mantissa[e_power:]
            # Scientific notation 1.23e+1 in regular decimal = 12.3
            str_num = number_characteristic + '.' + number_mantissa
    # characteristic is number left of decimal point.
    characteristic_part = str_num.split('.')[0]
    # mantissa is number right of decimal point.
    mantissa_part = str_num.split('.')[1]
    # If number is supposed to be rounded to whole number,
    # check first decimal digit. If it is 5 or more, return
    # characteristic + 1 else return characteristic
    if precision == 0:
        if mantissa_part and int(mantissa_part[0]) >= 5:
            result = type_num(int(characteristic_part) + 1)
        else:
            result = type_num(characteristic_part)
        # If the number was negative, add negative context back
        return -result if negative_number else result
    # Get the precision of the given number.
    num_precision = len(mantissa_part)
    # Rounding off is done only if number precision is
    # greater than requested precision
    if num_precision <= precision:
        return num
    # Replace the last '5' with 6 so that rounding off returns desired results
    if str_num[-1] == '5':
        str_num = re.sub('5$', '6', str_num)
    result = round(type_num(str_num), precision)
    # If the number was negative, add negative context back
    if negative_number:
        result = result * -1
    return result
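To see why a trailing 5 misbehaves, it can help to print the exact value a float stores. A minimal sketch using the decimal module; the helper name `stored_value` is hypothetical:

```python
from decimal import Decimal


def stored_value(x):
    """Return the exact decimal expansion of the binary float x."""
    return Decimal(x)


# 2.675 is stored slightly below the tie, so round() goes down
print(stored_value(2.675))  # 2.67499999999999982236431605997495353221893310546875
print(round(2.675, 2))      # 2.67

# The same inspection works for the 0.045 example
print(stored_value(0.045))
```

Passing the float itself (not a string) to Decimal is deliberate here, since the goal is to expose the binary approximation rather than avoid it.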
Answer 17
Another possible option is:
def hard_round(number, decimal_places=0):
    """
    Function:
    - Rounds a float value to a specified number of decimal places
    - Fixes issues with floating point binary approximation rounding in python
    Requires:
    - `number`:
        - Type: int|float
        - What: The number to round
    Optional:
    - `decimal_places`:
        - Type: int
        - What: The number of decimal places to round to
        - Default: 0
    Example:
    ```
    hard_round(5.6,1)
    ```
    """
    return int(number*(10**decimal_places)+0.5)/(10**decimal_places)
pip install -r requirements.txt fails with the exception below: OSError: [Errno 13] Permission denied: '/usr/local/lib/.... What’s wrong and how do I fix this? (I am trying to set up Django.)
Installing collected packages: amqp, anyjson, arrow, beautifulsoup4, billiard, boto, braintree, celery, cffi, cryptography, Django, django-bower, django-braces, django-celery, django-crispy-forms, django-debug-toolbar, django-disqus, django-embed-video, django-filter, django-merchant, django-pagination, django-payments, django-storages, django-vote, django-wysiwyg-redactor, easy-thumbnails, enum34, gnureadline, idna, ipaddress, ipython, kombu, mock, names, ndg-httpsclient, Pillow, pyasn1, pycparser, pycrypto, PyJWT, pyOpenSSL, python-dateutil, pytz, requests, six, sqlparse, stripe, suds-jurko
Cleaning up...
Exception:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
status = self.run(options, args)
File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 283, in run
requirement_set.install(install_options, global_options, root=options.root_path)
File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1436, in install
requirement.install(install_options, global_options, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/pip/req.py", line 672, in install
self.move_wheel_files(self.source_dir, root=root)
File "/usr/lib/python2.7/dist-packages/pip/req.py", line 902, in move_wheel_files
pycompile=self.pycompile,
File "/usr/lib/python2.7/dist-packages/pip/wheel.py", line 206, in move_wheel_files
clobber(source, lib_dir, True)
File "/usr/lib/python2.7/dist-packages/pip/wheel.py", line 193, in clobber
os.makedirs(destsubdir)
File "/usr/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/amqp-1.4.6.dist-info'
We should really stop advising the use of sudo with pip install. It’s better to first try pip install --user. If this fails then take a look at the top post here.
The reason you shouldn’t use sudo is as follows:
When you run pip with sudo, you are running arbitrary Python code from the Internet as a root user, which is quite a big security risk. If someone puts up a malicious project on PyPI and you install it, you give an attacker root access to your machine.
You are trying to install a package on the system-wide path without having the permission to do so.
In general, you can use sudo to temporarily obtain superuser permissions, at your own risk, in order to install the package on the system-wide path:
Actually, this is a bad idea and there’s no good use case for it, see @wim’s comment.
If you don’t want to make system-wide changes, you can install the package on your per-user path using the --user flag.
All it takes is:
pip install --user -r requirements.txt
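To check which of these locations your account can actually write to, a small diagnostic along these lines may help. It is only a sketch: the helper names are hypothetical, and the "default" directory is the one sysconfig reports, which on some distributions differs from the path pip really uses:

```python
import os
import site
import sys
import sysconfig


def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment, not the base install
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def nearest_existing(path):
    # Walk up until a directory that already exists is found
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            break
        path = parent
    return path


def describe_targets():
    targets = {
        "default": sysconfig.get_paths()["purelib"],
        "user": site.getusersitepackages(),
    }
    return {
        name: (path, os.access(nearest_existing(path), os.W_OK))
        for name, path in targets.items()
    }


for name, (path, writable) in describe_targets().items():
    print(name, path, "writable" if writable else "not writable")
print("virtualenv active:", in_virtualenv())
```

If "default" is not writable and no virtualenv is active, a plain pip install is likely to hit the Errno 13 shown in the question, while the --user location should normally be writable.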
Finally, for even finer grained control, you can also use a virtualenv, which might be the superior solution for a development environment, especially if you are working on multiple projects and want to keep track of each one’s dependencies.
After activating your virtualenv with
$ source my-virtualenv/bin/activate
the following command will install the package inside the virtualenv (and not on the system-wide path):
Just clarifying what worked for me after much pain in linux (ubuntu based) on permission denied errors, and leveraging from Bert’s answer above, I now use …
$ pip install --user <package-name>
or if running pip on a requirements file …
$ pip install --user -r requirements.txt
and these work reliably for every pip install including creating virtual environments.
However, the cleanest solution in my further experience has been to install python-virtualenv and virtualenvwrapper with sudo apt-get install at the system level.
Then, inside virtual environments, use pip install without the --user flag AND without sudo. Much cleaner, safer, and easier overall.
If you need permissions, you cannot use ‘pip’ with ‘sudo’.
You can do a trick, so that you can use ‘sudo’ and install package. Just place ‘sudo python -m …’ in front of your pip command.
So, I got this same exact error for a completely different reason. Due to a totally separate, but known Homebrew + pip bug, I had followed this workaround listed on Google Cloud’s help docs, where you create a .pydistutils.cfg file in your home directory. This file has special config that you’re only supposed to use for your install of certain libraries. I should have removed that .pydistutils.cfg file after installing the packages, but I forgot to do so. So the fix for me was actually just…
rm ~/.pydistutils.cfg
And then everything worked as normal. Of course, if you have some config in that file for a real reason, then you won’t want to just straight rm that file. But in case anyone else did that workaround, and forgot to remove that file, this did the trick for me!
Answer 7
This is a permissions issue:
sudo chown -R $USER /path to your python installed directory