Tag archive: ipython

Disable the IPython exit confirmation prompt

Question: Disable the IPython exit confirmation prompt

It’s really irritating that every time I type exit(), I get prompted with a confirmation to exit; of course I want to exit! Otherwise, I would not have written exit()!!!

Is there a way to override IPython’s default behaviour to make it exit without a prompt?


Answer 0

If you also want Ctrl-D to exit without confirmation, in IPython 0.11, add c.TerminalInteractiveShell.confirm_exit = False to your config file *.

If you don’t have a config file yet, run ipython profile create to create one.

Note this ticket if you’re working within the Django shell.


* The config file is located at: $HOME/.ipython/profile_default/ipython_config.py


Answer 1

In ipython version 0.11 or higher,

  1. Run with --no-confirm-exit OR
  2. Exit via ‘exit’ instead of control-D OR
  3. Make sure the directory exists (or run ipython profile create to create it) and add these lines to $HOME/.ipython/profile_default/ipython_config.py:

    c = get_config()
    
    c.TerminalInteractiveShell.confirm_exit = False
    

Answer 2

Just type Exit, with a capital E.

Alternatively, start IPython with:

$ ipython -noconfirm_exit

Or for newer versions of IPython:

$ ipython --no-confirm-exit 

Answer 3

I like the config suggestions, but until I learned them I’ve started using “Quit” key combination.

Ctrl+\

or

Ctrl+4

This just kills what is running. No time to ask questions on confirmation.


Pandas is not displaying the graph I am trying to plot in IPython Notebook / Jupyter

Question: Pandas is not displaying the graph I am trying to plot in IPython Notebook / Jupyter

I am trying to plot some data using pandas in Ipython Notebook, and while it gives me the object, it doesn’t actually plot the graph itself. So it looks like this:

In [7]:

pledge.Amount.plot()

Out[7]:

<matplotlib.axes.AxesSubplot at 0x9397c6c>

The graph should follow after that, but it simply doesn’t appear. I have imported matplotlib, so that’s not the problem. Is there any other module I need to import?


Answer 0

Note that --pylab is deprecated and has been removed from newer builds of IPython. The recommended way to enable inline plotting in the IPython Notebook is now to run:

%matplotlib inline
import matplotlib.pyplot as plt

See this post from the ipython-dev mailing list for more details.


Answer 1

Edit: Pylab has been deprecated; please see the currently accepted answer.

OK, it seems the answer is to start IPython Notebook with --pylab=inline, i.e. ipython notebook --pylab=inline. This has it do what I saw earlier and what I wanted it to do. Sorry about the vague original question.


Answer 2

With your import matplotlib.pyplot as plt just add

plt.show()

and it will show all stored plots.


Answer 3

It's simple: after importing matplotlib, you have to execute one magic command if you have started IPython like this:

ipython notebook

%matplotlib inline

Run this command and everything will be shown perfectly.


Answer 4

Start IPython with ipython notebook --pylab inline, then the graph will show inline.


Answer 5

import matplotlib.pyplot as plt
%matplotlib inline

Answer 6

All you need to do is to import matplotlib.

import matplotlib.pyplot as plt 

ipython is reading the wrong python version

Question: ipython is reading the wrong python version

I’ve been having trouble with Python, iPython and the libraries. The following points show the chain of the problematics. I’m running Python 2.7 on Mac Lion.

  1. iPython doesn’t read the libraries of scipy, matplotlib, but it does read numpy.
  2. To fix this, I tried installing Python’s source code version, and it only gave me more problems since now I have two different versions: 2.7.1 and 2.7.2
  3. I noticed that running Python, uses version 2.7.2 and does import scipy, matplotlib, and numpy, but on iPython the version is 2.7.1 which doesn’t open scipy or matplotlib.

I’ve tried several things that I’ve encountered from other blogposts. But none of them have helped, and also unfortunately I don’t quite know what I’m doing with some of them. For example: I tried uninstalling and reinstalling ipython with easy_install and pip. I also tried reinstalling everything through homebrew, and modifying the path .bash_profile.


Answer 0

Okay quick fix:

which python

gives you /usr/bin/python, right? Do

which ipython

and I bet that’ll be /usr/local/bin/ipython. Let’s look inside:

Edit 9/7/16 — The file now looks like this:

cat /usr/local/bin/ipython

#!/usr/bin/python

# -*- coding: utf-8 -*-
import re
import sys

from IPython import start_ipython

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(start_ipython())

And mine works properly like this, but my situation isn’t exactly like the OP’s.


Original answer — 9/30/13:

cat /usr/local/bin/ipython

#!/usr/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'ipython==0.12.1','console_scripts','ipython'
__requires__ = 'ipython==0.12.1'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('ipython==0.12.1', 'console_scripts', 'ipython')()
    )

Aha – open /usr/local/bin/ipython in your editor (with privileges), and change the first line to

#!/usr/local/bin/python

save, start iPython, should say it’s using the version you want now.


Answer 1

Posting @Matt's comment as an answer just so it's more visible.

python -m IPython

Loads ipython as a module with whatever python is accessible on the path first. In my case I had one pre-installed and one I added from brew. This just works perfectly.


Answer 2

What about using a virtualenv? I really like it. Maybe it's not the fastest way, but I think it's very clear.

When you create a virtualenv, you can specify the python path with the -p flag.

for python 2.7

$ virtualenv -p /usr/bin/python2.7 venv2.7
$ source venv2.7/bin/activate
(venv2.7)$ pip install ipython
(venv2.7)$ ipython

for python 3.4

$ virtualenv -p /usr/bin/python3.4 venv3.4
$ source venv3.4/bin/activate
(venv3.4)$ pip install ipython
(venv3.4)$ ipython

Answer 3

First, I would make sure you’re using the right python. At a command prompt type:

which python
python -V

The first will tell you the path, the second tells you the Python version you’re using.


Answer 4

My solution is simple and stupid, but it works.

I use python -V to check which version I have:

$ python -V
Python 2.7.10

and then create an alias in .bash_profile:

$ vi ~/.bash_profile

Add a line

alias ipython="python -m IPython"

Then you will get an ipython running Python 2.7. 🙂

(By the way, my ipython was installed via homebrew, which by default gives an ipython running on Python 3.)

$ brew install ipython

Answer 5

Extremely relevant: http://conda.pydata.org/docs/troubleshooting.html#shell-command-location

tl;dr: the problems are caused by the shell's command 'hashing' and PATH variables.
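
If stale hashing is the culprit, a quick check is to clear the shell's cached command lookups and see which executables it now resolves; a minimal sketch, assuming a bash-like shell:

hash -r          # clear bash's hashed command table
type python      # path bash will actually use for python
type ipython     # path bash will actually use for ipython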


Answer 6

A similar method using pyenv

pyenv install 3.4.5
pyenv local 3.4.5
pip install ipython
ipython

Now it will show the correct version of python:

Python 3.4.5

Answer 7

The absolute simplest solution I could think of, which requires no fiddling with environments, installed files, or anything else, relies on the facts that

  1. The executable ipython is actually a Python script.
  2. The IPython package is installed separately for each interpreter that you ran pip install with.

If the version of Python you are running has an IPython package installed, you can just do

/path/to/desired/python $(which ipython)

This will run the ipython script with the interpreter you want instead of the one listed in the shebang.


Answer 8

Your problem is basically making ipython use the right python.

So the fix to the problem is to make ipython use the right python (the one which has libraries like scipy installed).

I have written a solution here:

How to make iPython use Python 2 instead of Python 3


Answer 9

I came across the same issue but the following was the only solution what worked for me on OSX 12, Sierra.

ipython was always launching for python 3.6 but I needed it for 2.7. I could not find an ipython startup script for 2.7, nor could I find the IPython module to execute with python -m. None of brew install ipython, pip install ipython, or pip2 install ipython could get me the 2.7 version. So I got it manually.

brew install ipython@5 installs the 2.7 version from here but won't put it on your $PATH because it knows the name conflicts with another package. ln -s /usr/local/Cellar/ipython@5/5.5.0_1/bin/ipython /usr/local/bin/ipython2 will fix this and let you just run ipython2 from your shell prompt.

For me, because I was serious about using ipython for 2.7, I also ran the following commands.

ln -s /usr/local/Cellar/ipython/6.2.1/bin/ipython /usr/local/bin/ipython3
rm -f /usr/local/bin/ipython
ln -s /usr/local/bin/ipython2 /usr/local/bin/ipython

Answer 10

All the answers mentioned here do not help in solving the issue if you are using anaconda or some other virtual environment wrapper.

This answer is based on the assumption that you are using anaconda.

Say you are in a python 3 environment and when creating a notebook on jupyter notebook it shows “Python 2” instead of “Python 3”.

This is because “ipython” is essentially a script which is run and in this script it mentions which python version it is using to execute the command. All you need to do is change this line for ipython to use the version of python you want.

First stop the ipython server and get the location of the python executable of the current environment using the command “which python”

My output is :

/home/sourabh/anaconda2/envs/py3/bin/python

Now get the executable location of ipython using the command "which ipython".

Mine is:

/home/sourabh/anaconda2/envs/py2/bin/python

Notice that it is using python from a different environment, one running a different version of python.

Now navigate to the directory anaconda2/bin (for anaconda 3 users it should be anaconda3/bin) and look for the "ipython" file. In this file, edit the first line so that it points to the python version you want, i.e. the output of "which python":

#!/home/sourabh/anaconda2/envs/py3/bin/python

Notice that I changed my python environment from py2(running python 2.7) to py3(running python 3.5).

Save the file. And run jupyter notebook, now when creating a new notebook the “Python 3” option should be visible.

Cheers!


How do I read a .xlsx file using the pandas library in IPython?

Question: How do I read a .xlsx file using the pandas library in IPython?

I want to read a .xlsx file using the Pandas Library of python and port the data to a postgreSQL table.

All I could do up until now is:

import pandas as pd
data = pd.ExcelFile("*File Name*")

Now I know that the step got executed successfully, but I want to know how I can parse the excel file that has been read, so that I can understand how the data in the excel maps to the data in the variable data.
I learnt that data is a DataFrame object, if I'm not wrong. So how do I parse this DataFrame object to extract each row, line by line?
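
For the row-by-row part of the question, one option is to parse a sheet into a DataFrame and then iterate over it. A minimal sketch (the file name here is a placeholder, not from the original question):

import pandas as pd

# parse the first sheet of a hypothetical workbook into a DataFrame
df = pd.read_excel("data.xlsx")

# iterate row by row; iterrows() yields (index, Series) pairs
for idx, row in df.iterrows():
    print(idx, row.to_dict())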


Answer 0

I usually create a dictionary containing a DataFrame for every sheet:

xl_file = pd.ExcelFile(file_name)

dfs = {sheet_name: xl_file.parse(sheet_name) 
          for sheet_name in xl_file.sheet_names}

Update: In pandas version 0.21.0+ you will get this behavior more cleanly by passing sheet_name=None to read_excel:

dfs = pd.read_excel(file_name, sheet_name=None)

In 0.20 and prior, this was sheetname rather than sheet_name (this is now deprecated in favor of the above):

dfs = pd.read_excel(file_name, sheetname=None)

Answer 1

from pandas import read_excel
# find your sheet name at the bottom left of your excel file and assign 
# it to my_sheet 
my_sheet = 'Sheet1' # change it to your sheet name
file_name = 'products_and_categories.xlsx' # change it to the name of your excel file
df = read_excel(file_name, sheet_name = my_sheet)
print(df.head()) # shows headers with top 5 rows

Answer 2

Pandas' read_excel function works like the read_csv function:

dfs = pd.read_excel(xlsx_file, sheetname="sheet1")


Help on function read_excel in module pandas.io.excel:

read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, index_col=None, names=None, parse_cols=None, parse_dates=False, date_parser=None, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, true_values=None, false_values=None, engine=None, squeeze=False, **kwds)
    Read an Excel table into a pandas DataFrame

    Parameters
    ----------
    io : string, path object (pathlib.Path or py._path.local.LocalPath),
        file-like object, pandas ExcelFile, or xlrd workbook.
        The string could be a URL. Valid URL schemes include http, ftp, s3,
        and file. For file URLs, a host is expected. For instance, a local
        file could be file://localhost/path/to/workbook.xlsx
    sheetname : string, int, mixed list of strings/ints, or None, default 0

        Strings are used for sheet names, Integers are used in zero-indexed
        sheet positions.

        Lists of strings/integers are used to request multiple sheets.

        Specify None to get all sheets.

        str|int -> DataFrame is returned.
        list|None -> Dict of DataFrames is returned, with keys representing
        sheets.

        Available Cases

        * Defaults to 0 -> 1st sheet as a DataFrame
        * 1 -> 2nd sheet as a DataFrame
        * "Sheet1" -> 1st sheet as a DataFrame
        * [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
        * None -> All sheets as a dictionary of DataFrames

    header : int, list of ints, default 0
        Row (0-indexed) to use for the column labels of the parsed
        DataFrame. If a list of integers is passed those row positions will
        be combined into a ``MultiIndex``
    skiprows : list-like
        Rows to skip at the beginning (0-indexed)
    skip_footer : int, default 0
        Rows at the end to skip (0-indexed)
    index_col : int, list of ints, default None
        Column (0-indexed) to use as the row labels of the DataFrame.
        Pass None if there is no such column.  If a list is passed,
        those columns will be combined into a ``MultiIndex``
    names : array-like, default None
        List of column names to use. If file contains no header row,
        then you should explicitly pass header=None
    converters : dict, default None
        Dict of functions for converting values in certain columns. Keys can
        either be integers or column labels, values are functions that take one
        input argument, the Excel cell content, and return the transformed
        content.
    true_values : list, default None
        Values to consider as True

        .. versionadded:: 0.19.0

    false_values : list, default None
        Values to consider as False

        .. versionadded:: 0.19.0

    parse_cols : int or list, default None
        * If None then parse all columns,
        * If int then indicates last column to be parsed
        * If list of ints then indicates list of column numbers to be parsed
        * If string then indicates comma separated list of column names and
          column ranges (e.g. "A:E" or "A,C,E:F")
    squeeze : boolean, default False
        If the parsed data only contains one column then return a Series
    na_values : scalar, str, list-like, or dict, default None
        Additional strings to recognize as NA/NaN. If dict passed, specific
        per-column NA values. By default the following values are interpreted
        as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan',
    '1.#IND', '1.#QNAN', 'N/A', 'NA', 'NULL', 'NaN', 'nan'.
    thousands : str, default None
        Thousands separator for parsing string columns to numeric.  Note that
        this parameter is only necessary for columns stored as TEXT in Excel,
        any numeric columns will automatically be parsed, regardless of display
        format.
    keep_default_na : bool, default True
        If na_values are specified and keep_default_na is False the default NaN
        values are overridden, otherwise they're appended to.
    verbose : boolean, default False
        Indicate number of NA values placed in non-numeric columns
    engine: string, default None
        If io is not a buffer or path, this must be set to identify io.
        Acceptable values are None or xlrd
    convert_float : boolean, default True
        convert integral floats to int (i.e., 1.0 --> 1). If False, all numeric
        data will be read in as floats: Excel stores all numbers as floats
        internally
    has_index_names : boolean, default None
        DEPRECATED: for version 0.17+ index names will be automatically
        inferred based on index_col.  To read Excel output from 0.16.2 and
        prior that had saved index names, use True.

    Returns
    -------
    parsed : DataFrame or Dict of DataFrames
        DataFrame from the passed in Excel file.  See notes in sheetname
        argument for more information on when a Dict of Dataframes is returned.

Answer 3

Instead of using a sheet name, in case you don’t know or can’t open the excel file to check in ubuntu (in my case, Python 3.6.7, ubuntu 18.04), I use the parameter index_col (index_col=0 for the first sheet)

import pandas as pd
file_name = 'some_data_file.xlsx' 
df = pd.read_excel(file_name, index_col=0)
print(df.head()) # print the first 5 rows

Answer 4

Assign spreadsheet filename to file

Load spreadsheet

Print the sheet names

Load a sheet into a DataFrame by name: df1

file = 'example.xlsx'
xl = pd.ExcelFile(file)
print(xl.sheet_names)
df1 = xl.parse('Sheet1')

Answer 5

If you use read_excel() on a file object obtained from open(), make sure to pass the rb mode to open() to avoid encoding errors.
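
A minimal sketch of what that looks like (the file name is just a placeholder):

import pandas as pd

# open the workbook in binary mode ('rb') and hand the file object to read_excel
with open("data.xlsx", "rb") as f:
    df = pd.read_excel(f)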


How do I make IPython Notebook run Python 3?

Question: How do I make IPython Notebook run Python 3?

I am new to Python, so bear with me.

  1. I installed Anaconda, works great.
  2. I setup a Python 3 environment following the Anaconda cmd line instructions, works great.
  3. I setup Anaconda’s Python 3 environment as Pycharm’s interpreter, works great.
  4. I launched the Anaconda “launcher.app” and launched IPython Notebook. However, iPython Notebook is running Python 2 not 3.

Over three hours of Googling later, I cannot figure out how to set IPython Notebook to run Python 3 instead of 2.


Answer 0

To set IPython Notebook to run Python 3 instead of 2 on my MAC 10.9, I did the following steps

$ sudo pip3 install ipython[all]

Then

$ ipython3 notebook


Answer 1

For Ubuntu Linux 16.04 you can use

sudo apt-get install ipython3

and then use

ipython3 notebook

to open the notebook in the browser. If you have any notebooks saved with python 2 then it will automatically convert them to Python 3 once you open the notebook.


Answer 2

To use jupyter with python 3 instead of python 2 on my Windows 10 with Anaconda, I did the following steps on anaconda prompt:

pip3 install ipython[all]

Then,

ipython3 notebook

Answer 3

Is there a package from your distro? If you're using Ubuntu, you need to install the ipython3-notebook package. If not, maybe you need to install ipython with python3.

If you've run (because it's python2 by default)

python setup.py

you need to run instead

python3 setup.py install

to install the package with python3 instead of python2. This will be a new installation of ipython3.


Answer 4

In Anaconda “launcher.app” there is “Environment:” pull down menu. The default environment is called “root”. In order to launch application using another environment, just select the desired environment from the list, to make it active.


Answer 5

If you are running anaconda, then the preferred way to install notebook/jupyter is using conda:

conda install jupyter

Answer 6

If you have both versions available in Jupyter Notebook, you can change the kernel from the menu.


Answer 7

Switch the role of 2 and 3 in this answer as appropriate.

Say you already have jupyter set up with a python 2 kernel and an anaconda environment with python 3. Activate the python 3 environment and then run

conda install ipykernel

After that you can select both a 2 and 3 kernel when creating a new notebook, or in a running notebook from the kernels menu.
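
If the new kernel does not show up on its own, it can usually be registered explicitly with ipykernel; a minimal sketch, run with the python 3 environment active (the kernel name and display name are arbitrary):

python -m ipykernel install --user --name py3 --display-name "Python 3"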


Answer 8

Another solution would be to create a virtualenv with python3:

From this environment, install tensorflow (the version you prefer) there:

pip install tensorflow

Run your jupyter from there!
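
A minimal sketch of the whole sequence, assuming virtualenv and a python3 interpreter are on your PATH (the environment name is a placeholder):

$ virtualenv -p python3 venv3      # create the environment with python3
$ source venv3/bin/activate        # activate it
(venv3) $ pip install jupyter      # install jupyter (plus tensorflow, if you want it)
(venv3) $ jupyter notebook         # run jupyter from this environment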


Is there an equivalent of CTRL+C in IPython Notebook (in Firefox) to interrupt a running cell?

Question: Is there an equivalent of CTRL+C in IPython Notebook (in Firefox) to interrupt a running cell?

I’ve started to use the IPython Notebook and am enjoying it. Sometimes, I write buggy code that takes massive memory requirements or has an infinite loop. I find the “interrupt kernel” option sluggish or unreliable, and sometimes I have to restart the kernel, losing everything in memory.

I also sometimes write scripts that cause OS X to run out of memory, and I have to do a hard reboot. I’m not 100% sure, but when I’ve written bugs like this before and ran Python in the terminal, I can usually CTRL+C my scripts.

I am using the Anaconda distribution of IPython notebook with Firefox on Mac OS X.


Answer 0

I could be wrong, but I’m pretty sure that the “interrupt kernel” button just sends a SIGINT signal to the code that you’re currently running (this idea is supported by Fernando’s comment here), which is the same thing that hitting CTRL+C would do. Some processes within python handle SIGINTs more abruptly than others.

If you desperately need to stop something that is running in iPython Notebook and you started iPython Notebook from a terminal, you can hit CTRL+C twice in that terminal to interrupt the entire iPython Notebook server. This will stop iPython Notebook altogether, which means it won't be possible to restart or save your work, so this is obviously not a great solution (you need to hit CTRL+C twice because it's a safety feature so that people don't do it by accident). In case of emergency, however, it generally kills the process more quickly than the "interrupt kernel" button.
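
Since the interrupt arrives in your code as a KeyboardInterrupt exception, a long-running cell can also be written to stop cleanly when it is interrupted; a minimal sketch:

import time

try:
    # stand-in for a long-running computation
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    # "interrupt kernel" (or Ctrl+C in a terminal IPython session) lands here
    print("Interrupted; partial results are still available")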


Answer 1

You can press I twice to interrupt the kernel.

This only works if you’re in Command mode. If not already enabled, press Esc to enable it.


Answer 2

Here are shortcuts for the IPython Notebook.

Ctrl-m i interrupts the kernel (that is, the sole letter i after Ctrl-m).

According to this answer, pressing I twice works as well.


Answer 3

To add to the above: If interrupt is not working, you can restart the kernel.

Go to the kernel dropdown >> restart >> restart and clear output. This usually does the trick. If this still doesn’t work, kill the kernel in the terminal (or task manager) and then restart.

Interrupt doesn’t work well for all processes. I especially have this problem using the R kernel.


Answer 4

UPDATE: Turned my solution into a stand-alone python script.

This solution has saved me more than once. Hopefully others find it useful. This python script will find any jupyter kernel using more than cpu_threshold CPU and prompts the user to send a SIGINT to the kernel (KeyboardInterrupt). It will keep sending SIGINT until the kernel’s cpu usage goes below cpu_threshold. If there are multiple misbehaving kernels it will prompt the user to interrupt each of them (ordered by highest CPU usage to lowest). A big thanks goes to gcbeltramini for writing code to find the name of a jupyter kernel using the jupyter api. This script was tested on MACOS with python3 and requires jupyter notebook, requests, json and psutil.

Put the script in your home directory and then usage looks like:

python ~/interrupt_bad_kernels.py
Interrupt kernel chews cpu.ipynb; PID: 57588; CPU: 2.3%? (y/n) y

Script code below:

from os import getpid, kill
from time import sleep
import re
import signal

from notebook.notebookapp import list_running_servers
from requests import get
from requests.compat import urljoin
import ipykernel
import json
import psutil


def get_active_kernels(cpu_threshold):
    """Get a list of active jupyter kernels."""
    active_kernels = []
    pids = psutil.pids()
    my_pid = getpid()

    for pid in pids:
        if pid == my_pid:
            continue
        try:
            p = psutil.Process(pid)
            cmd = p.cmdline()
            for arg in cmd:
                if arg.count('ipykernel'):
                    cpu = p.cpu_percent(interval=0.1)
                    if cpu > cpu_threshold:
                        active_kernels.append((cpu, pid, cmd))
        except psutil.AccessDenied:
            continue
    return active_kernels


def interrupt_bad_notebooks(cpu_threshold=0.2):
    """Interrupt active jupyter kernels. Prompts the user for each kernel."""

    active_kernels = sorted(get_active_kernels(cpu_threshold), reverse=True)

    servers = list_running_servers()
    for ss in servers:
        response = get(urljoin(ss['url'].replace('localhost', '127.0.0.1'), 'api/sessions'),
                       params={'token': ss.get('token', '')})
        for nn in json.loads(response.text):
            for kernel in active_kernels:
                for arg in kernel[-1]:
                    if arg.count(nn['kernel']['id']):
                        pid = kernel[1]
                        cpu = kernel[0]
                        interrupt = input(
                            'Interrupt kernel {}; PID: {}; CPU: {}%? (y/n) '.format(nn['notebook']['path'], pid, cpu))
                        if interrupt.lower() == 'y':
                            p = psutil.Process(pid)
                            while p.cpu_percent(interval=0.1) > cpu_threshold:
                                kill(pid, signal.SIGINT)
                                sleep(0.5)

if __name__ == '__main__':
    interrupt_bad_notebooks()

How to pickle or store a Jupyter (IPython) notebook session for later use

Question: How to pickle or store a Jupyter (IPython) notebook session for later use

Let's say I am doing a larger data analysis in Jupyter/IPython Notebook, with lots of time-consuming computations done. Then, for some reason, I have to shut down the Jupyter local server, but I would like to return to the analysis later, without having to go through all the time-consuming computations again.

What I would love to do is pickle or store the whole Jupyter session (all pandas dataframes, np.arrays, variables, …) so I can safely shut down the server knowing I can return to my session in exactly the same state as before.

Is it even technically possible? Is there a built-in functionality I overlooked?


EDIT: based on this answer there is a %store magic which should be “lightweight pickle”. However you have to store the variables manually like so:

#inside a ipython/nb session
foo = "A dummy string"
%store foo
# closing session, restarting kernel
%store -r foo # r for refresh
print(foo) # "A dummy string"

which is fairly close to what I would want, but having to do it manually and being unable to distinguish between different sessions makes it less useful.


Answer 0

I think Dill answers your question well.

pip install dill

Save a Notebook session:

import dill
dill.dump_session('notebook_env.db')

Restore a Notebook session:

import dill
dill.load_session('notebook_env.db')

Source


Answer 1

(I’d rather comment than offer this as an actual answer, but I need more reputation to comment.)

You can store most data-like variables in a systematic way. What I usually do is store all dataframes, arrays, etc. in pandas.HDFStore. At the beginning of the notebook, declare

backup = pd.HDFStore('backup.h5')

and then store any new variables as you produce them

backup['var1'] = var1

At the end, probably a good idea to do

backup.close()

before turning off the server. The next time you want to continue with the notebook:

backup = pd.HDFStore('backup.h5')
var1 = backup['var1']

Truth be told, I'd prefer built-in functionality in ipython notebook, too. You can't save everything this way (e.g. objects, connections), and it's hard to keep the notebook organized with so much boilerplate code.


Answer 2

This question is related to: How to cache in IPython Notebook?

To save the results of individual cells, the caching magic comes in handy.

%%cache longcalc.pkl var1 var2 var3
var1 = longcalculation()
....

When rerunning the notebook, the contents of this cell is loaded from the cache.

This does not exactly answer your question, but it might be enough when the results of all the lengthy calculations can be recovered quickly. This, in combination with hitting the run-all button at the top of the notebook, is a workable solution for me.

The cache magic cannot save the state of a whole notebook yet. To my knowledge there is no other system yet to resume a "notebook". This would require saving all the history of the python kernel. After loading the notebook and connecting to a kernel, this information would have to be loaded.


How to display a pandas DataFrame of floats using a format string for columns?

Question: How to display a pandas DataFrame of floats using a format string for columns?

I would like to display a pandas dataframe with a given format using print() and the IPython display(). For example:

df = pd.DataFrame([123.4567, 234.5678, 345.6789, 456.7890],
                  index=['foo','bar','baz','quux'],
                  columns=['cost'])
print df

         cost
foo   123.4567
bar   234.5678
baz   345.6789
quux  456.7890

I would like to somehow coerce this into printing

         cost
foo   $123.46
bar   $234.57
baz   $345.68
quux  $456.79

without having to modify the data itself or create a copy, just change the way it is displayed.

How can I do this?


Answer 0

import pandas as pd
pd.options.display.float_format = '${:,.2f}'.format
df = pd.DataFrame([123.4567, 234.5678, 345.6789, 456.7890],
                  index=['foo','bar','baz','quux'],
                  columns=['cost'])
print(df)

yields

        cost
foo  $123.46
bar  $234.57
baz  $345.68
quux $456.79

but this only works if you want every float to be formatted with a dollar sign.

Otherwise, if you want dollar formatting for some floats only, then I think you’ll have to pre-modify the dataframe (converting those floats to strings):

import pandas as pd
df = pd.DataFrame([123.4567, 234.5678, 345.6789, 456.7890],
                  index=['foo','bar','baz','quux'],
                  columns=['cost'])
df['foo'] = df['cost']
df['cost'] = df['cost'].map('${:,.2f}'.format)
print(df)

yields

         cost       foo
foo   $123.46  123.4567
bar   $234.57  234.5678
baz   $345.68  345.6789
quux  $456.79  456.7890

Answer 1

If you don’t want to modify the dataframe, you could use a custom formatter for that column.

import pandas as pd
pd.options.display.float_format = '${:,.2f}'.format
df = pd.DataFrame([123.4567, 234.5678, 345.6789, 456.7890],
                  index=['foo','bar','baz','quux'],
                  columns=['cost'])


print df.to_string(formatters={'cost':'${:,.2f}'.format})

yields

        cost
foo  $123.46
bar  $234.57
baz  $345.68
quux $456.79

Answer 2

As of Pandas 0.17 there is now a styling system which essentially provides formatted views of a DataFrame using Python format strings:

import pandas as pd
import numpy as np

constants = pd.DataFrame([('pi',np.pi),('e',np.e)],
                   columns=['name','value'])
C = constants.style.format({'name': '~~ {} ~~', 'value':'--> {:15.10f} <--'})
C

which displays a formatted view of the DataFrame in the notebook.

This is a view object; the DataFrame itself does not change formatting, but updates in the DataFrame are reflected in the view:

constants.name = ['pie','eek']
C

However it appears to have some limitations:

  • Adding new rows and/or columns in-place seems to cause inconsistency in the styled view (doesn’t add row/column labels):

    constants.loc[2] = dict(name='bogus', value=123.456)
    constants['comment'] = ['fee','fie','fo']
    constants
    

which looks ok but:

C

  • Formatting works only for values, not index entries:

    constants = pd.DataFrame([('pi',np.pi),('e',np.e)],
                   columns=['name','value'])
    constants.set_index('name',inplace=True)
    C = constants.style.format({'name': '~~ {} ~~', 'value':'--> {:15.10f} <--'})
    C
    


Answer 3

Similar to unutbu above, you could also use applymap as follows:

import pandas as pd
df = pd.DataFrame([123.4567, 234.5678, 345.6789, 456.7890],
                  index=['foo','bar','baz','quux'],
                  columns=['cost'])

df = df.applymap("${0:.2f}".format)

Answer 4

I like using pandas.apply() with python format().

import pandas as pd
s = pd.Series([1.357, 1.489, 2.333333])

make_float = lambda x: "${:,.2f}".format(x)
s.apply(make_float)

Also, it can be easily used with multiple columns…

df = pd.concat([s, s * 2], axis=1)

make_floats = lambda row: "${:,.2f}, ${:,.3f}".format(row[0], row[1])
df.apply(make_floats, axis=1)

Answer 5

You can also set the locale to your region and set float_format to use a currency format. This will automatically use the $ sign for currency in the US locale.

import locale

locale.setlocale(locale.LC_ALL, "en_US.UTF-8")

pd.set_option("float_format", locale.currency)

df = pd.DataFrame(
    [123.4567, 234.5678, 345.6789, 456.7890],
    index=["foo", "bar", "baz", "quux"],
    columns=["cost"],
)
print(df)

        cost
foo  $123.46
bar  $234.57
baz  $345.68
quux $456.79

Answer 6

Summary:


    df = pd.DataFrame({'money': [100.456, 200.789], 'share': ['100,000', '200,000']})
    print(df)
    print(df.to_string(formatters={'money': '${:,.2f}'.format}))
    for col_name in ('share',):
        df[col_name] = df[col_name].map(lambda p: int(p.replace(',', '')))
    print(df)
    """
        money    share
    0  100.456  100,000
    1  200.789  200,000

        money    share
    0 $100.46  100,000
    1 $200.79  200,000

         money   share
    0  100.456  100000
    1  200.789  200000
    """

How do I run an .ipynb Jupyter Notebook from the terminal?

Question: How do I run an .ipynb Jupyter Notebook from the terminal?

I have some code in a .ipynb file and got it to the point where I don’t really need the “interactive” feature of IPython Notebook. I would like to just run it straight from a Mac Terminal Command Line.

Basically, if this were just a .py file, I believe I could just do python filename.py from the command line. Is there something similar for a .ipynb file?


Answer 0

From the command line you can convert a notebook to python with this command:

jupyter nbconvert --to python nb.ipynb

https://github.com/jupyter/nbconvert

You may have to install the python mistune package:

sudo pip install -U mistune

Answer 1

nbconvert allows you to run notebooks with the --execute flag:

jupyter nbconvert --execute <notebook>

If you want to run a notebook and produce a new notebook, you can add --to notebook:

jupyter nbconvert --execute --to notebook <notebook>

Or if you want to replace the existing notebook with the new output:

jupyter nbconvert --execute --to notebook --inplace <notebook>

Since that’s a really long command, you can use an alias:

alias nbx="jupyter nbconvert --execute --to notebook"
nbx [--inplace] <notebook>

回答 2

您可以从 .ipynb 中导出所有代码并将其另存为 .py 脚本,然后在终端中运行该脚本。

希望能帮助到你。

You can export all your code from .ipynb and save it as a .py script. Then you can run the script in your terminal.

Hope it helps.
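
As a concrete sketch of that workflow (file names are placeholders), nbconvert's script exporter does the export, after which the result runs like any other Python file:

jupyter nbconvert --to script your_notebook.ipynb
python your_notebook.py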


回答 3

在终端中运行ipython:

ipython

然后定位到您的脚本并运行:

%run your_script.ipynb

In your Terminal run ipython:

ipython

then locate your script and run it with:

%run your_script.ipynb

回答 4

对于较新的版本,不再使用:

ipython nbconvert --to python <YourNotebook>.ipynb

您可以使用 jupyter 代替 ipython:

jupyter nbconvert --to python <YourNotebook>.ipynb

For newer versions, instead of:

ipython nbconvert --to python <YourNotebook>.ipynb

You can use jupyter instead of ipython:

jupyter nbconvert --to python <YourNotebook>.ipynb

回答 5

更新:引用作者的评论以提高可见度:

作者注释:“该项目早于 Jupyter 的 execute API,而后者现在是从命令行运行笔记本的推荐方式。请将 runipy 视为已弃用且不再维护。” –塞巴斯蒂安·帕尔马(Sebastian Palma)

安装 runipy 库,它允许在终端中运行您的笔记本:

pip install runipy

安装之后,直接运行您的笔记本:

runipy <YourNotebookName>.ipynb

您也可以尝试cronjob。所有信息都在这里

Update with quoted comment by author for better visibility:

Author’s note “This project started before Jupyter’s execute API, which is now the recommended way to run notebooks from the command-line. Consider runipy deprecated and unmaintained.” – Sebastian Palma

Install the runipy library, which allows running your notebook from the terminal:

pip install runipy

After installing, just run your notebook:

runipy <YourNotebookName>.ipynb

You can try cronjob as well. All information is here
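
For the cronjob idea, a minimal crontab sketch (the schedule and paths are hypothetical) that uses the now-recommended nbconvert execution instead of the deprecated runipy, running the notebook every day at 06:00:

# hypothetical schedule and paths; adjust the jupyter path to your environment
0 6 * * * /usr/local/bin/jupyter nbconvert --execute --to notebook --inplace /home/user/notebooks/nb.ipynb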


回答 6

就我而言,最适合我的命令是:

jupyter nbconvert --execute --clear-output <notebook>.ipynb

为什么?因为此命令不会创建额外的文件(例如 .py 文件),并且每次执行笔记本时,单元格的输出都会被覆盖。

如果您运行:

jupyter nbconvert --help

--clear-output 清除当前文件的输出并就地保存,覆盖现有笔记本。

In my case, the command that best suited me was:

jupyter nbconvert --execute --clear-output <notebook>.ipynb

Why? This command does not create extra files (just like a .py file) and the output of the cells is overwritten everytime the notebook is executed.

If you run:

jupyter nbconvert --help

--clear-output Clear output of current file and save in place, overwriting the existing notebook.


回答 7

您还可以使用该boar软件包在python代码中运行笔记本。

from boar.running import run_notebook

outputs = run_notebook("nb.ipynb")

如果更新笔记本,则无需再次将其转换为python文件。


有关更多信息,请访问:

https://github.com/alexandreCameron/boar/blob/master/USAGE.md

You can also use the boar package to run your notebook within a python code.

from boar.running import run_notebook

outputs = run_notebook("nb.ipynb")

If you update your notebook, you won’t have to convert it again to a python file.


More information at:

https://github.com/alexandreCameron/boar/blob/master/USAGE.md


回答 8

从终端运行

jupyter nbconvert --execute --to notebook --inplace --allow-errors --ExecutePreprocessor.timeout=-1 my_nb.ipynb

默认超时为30秒。-1消除了限制。

如果您希望将执行结果保存到一个新的笔记本中,可以使用 --output my_new_nb.ipynb 标志

From the terminal run

jupyter nbconvert --execute --to notebook --inplace --allow-errors --ExecutePreprocessor.timeout=-1 my_nb.ipynb

The default timeout is 30 seconds. -1 removes the restriction.

If you wish to save the output notebook to a new notebook you can use the flag --output my_new_nb.ipynb
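
Combining the flags above, a full example that executes my_nb.ipynb with no cell timeout and writes the result to a separate notebook (file names are the placeholders used in this answer):

jupyter nbconvert --execute --to notebook --ExecutePreprocessor.timeout=-1 --output my_new_nb.ipynb my_nb.ipynb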


回答 9

我遇到了同样的问题,后来找到了 papermill。与其他解决方案相比,它的优势在于可以在笔记本运行的过程中查看结果。当笔记本需要很长时间运行时,我觉得这个功能很有意思。它非常易于使用:

pip install papermill
papermill notebook.ipynb output.ipynb

它还具有其他方便的选项,例如将输出文件保存到Amazon S3,Google Cloud等。有关更多信息,请参阅页面。

I had the same problem and I found papermill. The advantage over the other solutions is that you can see the results while the notebook is running. I find this feature interesting when the notebook takes a very long time to run. It is very easy to use:

pip install papermill
papermill notebook.ipynb output.ipynb

It also has other handy options, such as saving the output file to Amazon S3, Google Cloud, etc. See the page for more information.
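
papermill can also be driven from Python rather than the command line; a minimal sketch (the alpha parameter is hypothetical and only takes effect if the notebook contains a cell tagged "parameters"):

import papermill as pm

pm.execute_notebook(
    "notebook.ipynb",            # input notebook
    "output.ipynb",              # executed copy, written with cell outputs
    parameters={"alpha": 0.1},   # injected into the cell tagged "parameters" (hypothetical)
)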


如何将HTML嵌入IPython输出中?

问题:如何将HTML嵌入IPython输出中?

是否可以将呈现的HTML输出嵌入到IPython输出中?

一种方法是使用

from IPython.core.display import HTML
HTML('<a href="http://example.com">link</a>')

或(IPython多行单元格别名)

%%html
<a href="http://example.com">link</a>

这会返回一个格式化的链接,但是

  1. 在控制台中,此链接不会打开浏览器显示网页本身。不过,IPython 笔记本支持真正的渲染。
  2. 我不知道如何在列表或 pandas 打印的表格中呈现 HTML() 对象。可以使用 df.to_html(),但无法在单元格内生成链接。
  3. 此输出在PyCharm Python控制台中不是交互式的(因为它不是QT)。

如何克服这些缺点并使IPython输出更具交互性?

Is it possible to embed rendered HTML output into IPython output?

One way is to use

from IPython.core.display import HTML
HTML('<a href="http://example.com">link</a>')

or (IPython multiline cell alias)

%%html
<a href="http://example.com">link</a>

Which return a formatted link, but

  1. This link doesn’t open a browser with the webpage itself from the console. IPython notebooks support honest rendering, though.
  2. I’m unaware of how to render HTML() object within, say, a list or pandas printed table. You can do df.to_html(), but without making links inside cells.
  3. This output isn’t interactive in the PyCharm Python console (because it’s not QT).

How can I overcome these shortcomings and make IPython output a bit more interactive?


回答 0

下面的方法对我有效:

from IPython.core.display import display, HTML
display(HTML('<h1>Hello, world!</h1>'))

诀窍是还要将其包装在 display 中。

来源:http://python.6.x6.nabble.com/Printing-HTML-within-IPython-Notebook-IPython-specific-prettyprint-tp5016624p5016631.html

This seems to work for me:

from IPython.core.display import display, HTML
display(HTML('<h1>Hello, world!</h1>'))

The trick is to wrap it in “display” as well.

Source: http://python.6.x6.nabble.com/Printing-HTML-within-IPython-Notebook-IPython-specific-prettyprint-tp5016624p5016631.html
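
For point 2 of the question (links inside a pandas table), the same trick can be combined with df.to_html(escape=False) so the anchor tags are rendered rather than escaped; a minimal sketch:

import pandas as pd
from IPython.core.display import display, HTML

df = pd.DataFrame({"site": ['<a href="http://example.com">link</a>']})
# escape=False leaves the <a> markup intact instead of HTML-escaping it
display(HTML(df.to_html(escape=False)))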


回答 1

一段时间以前,Jupyter Notebook 开始从 HTML 内容中剥离 JavaScript [#3118]。以下是两种解决方案:

提供本地HTML

如果现在就想嵌入带有 JavaScript 的 HTML 页面,最简单的方法是把 HTML 文件保存到与笔记本相同的目录中,然后按以下方式加载:

from IPython.display import IFrame

IFrame(src='./nice.html', width=700, height=600)

提供远程HTML

如果您更喜欢托管方案,可以将 HTML 页面上传到 Amazon Web Services S3 的“存储桶”中,修改该存储桶的设置使其托管静态网站,然后在笔记本中使用 IFrame 组件:

from IPython.display import IFrame

IFrame(src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600)

就像在其他任何网页上一样,这将在iframe中呈现HTML内容和JavaScript:

<iframe src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600></iframe>

Some time ago Jupyter Notebooks started stripping JavaScript from HTML content [#3118]. Here are two solutions:

Serving Local HTML

If you want to embed an HTML page with JavaScript on your page now, the easiest thing to do is to save your HTML file to the directory with your notebook and then load the HTML as follows:

from IPython.display import IFrame

IFrame(src='./nice.html', width=700, height=600)

Serving Remote HTML

If you prefer a hosted solution, you can upload your HTML page to an Amazon Web Services “bucket” in S3, change the settings on that bucket so as to make the bucket host a static website, then use an Iframe component in your notebook:

from IPython.display import IFrame

IFrame(src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600)

This will render your HTML content and JavaScript in an iframe, just like you can on any other web page:

<iframe src='https://s3.amazonaws.com/duhaime/blog/visualizations/isolation-forests.html', width=700, height=600></iframe>

回答 2

相关:在编写类时,可以定义 def _repr_html_(self): ... 来为其实例创建自定义的 HTML 表示形式:

class Foo:
    def _repr_html_(self):
        return "Hello <b>World</b>!"

o = Foo()
o

将呈现为:

世界你好!

有关更多信息,请参阅IPython的文档

一个高级示例:

from html import escape # Python 3 only :-)

class Todo:
    def __init__(self):
        self.items = []

    def add(self, text, completed):
        self.items.append({'text': text, 'completed': completed})

    def _repr_html_(self):
        return "<ol>{}</ol>".format("".join("<li>{} {}</li>".format(
            "☑" if item['completed'] else "☐",
            escape(item['text'])
        ) for item in self.items))

my_todo = Todo()
my_todo.add("Buy milk", False)
my_todo.add("Do homework", False)
my_todo.add("Play video games", True)

my_todo

将呈现:

  1. ☐购买牛奶
  2. ☐做作业
  3. ☑玩电子游戏

Related: While constructing a class, def _repr_html_(self): ... can be used to create a custom HTML representation of its instances:

class Foo:
    def _repr_html_(self):
        return "Hello <b>World</b>!"

o = Foo()
o

will render as:

Hello World!

For more info refer to IPython’s docs.

An advanced example:

from html import escape # Python 3 only :-)

class Todo:
    def __init__(self):
        self.items = []

    def add(self, text, completed):
        self.items.append({'text': text, 'completed': completed})

    def _repr_html_(self):
        return "<ol>{}</ol>".format("".join("<li>{} {}</li>".format(
            "☑" if item['completed'] else "☐",
            escape(item['text'])
        ) for item in self.items))

my_todo = Todo()
my_todo.add("Buy milk", False)
my_todo.add("Do homework", False)
my_todo.add("Play video games", True)

my_todo

Will render:

  1. ☐ Buy milk
  2. ☐ Do homework
  3. ☑ Play video games
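
As a usage note, display() consults the same hook, so an explicit display call renders identically, and the generated markup can also be wrapped by hand; a short sketch continuing the example above:

from IPython.display import display, HTML

display(my_todo)               # rich output via Todo._repr_html_
HTML(my_todo._repr_html_())    # equivalent: wrap the generated markup yourself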

回答 3

在上面 @Harmon 的回答基础上展开:看来可以根据需要把 display 和 print 语句组合在一起。或者,把整个 HTML 格式化为一个字符串后再调用 display 可能更容易。无论哪种方式,都是不错的功能。

display(HTML('<h1>Hello, world!</h1>'))
print("Here's a link:")
display(HTML("<a href='http://www.google.com' target='_blank'>www.google.com</a>"))
print("some more printed text ...")
display(HTML('<p>Paragraph text here ...</p>'))

输出如下所示:


你好,世界!

这里是一个链接:

www.google.com

一些更多的印刷文字…

此处的段落文字…


Expanding on @Harmon above, looks like you can combine the display and print statements together … if you need. Or, maybe it’s easier to just format your entire HTML as one string and then use display. Either way, nice feature.

display(HTML('<h1>Hello, world!</h1>'))
print("Here's a link:")
display(HTML("<a href='http://www.google.com' target='_blank'>www.google.com</a>"))
print("some more printed text ...")
display(HTML('<p>Paragraph text here ...</p>'))

Outputs something like this:


Hello, world!

Here’s a link:

www.google.com

some more printed text …

Paragraph text here …