Tag archive: ipython

Progress indicator during pandas operations

Question: Progress indicator during pandas operations

I regularly perform pandas operations on data frames in excess of 15 million or so rows and I’d love to have access to a progress indicator for particular operations.

Does a text based progress indicator for pandas split-apply-combine operations exist?

For example, in something like:

df_users.groupby(['userID', 'requestDate']).apply(feature_rollup)

where feature_rollup is a somewhat involved function that takes many DF columns and creates new user columns through various methods. These operations can take a while for large data frames, so I’d like to know if it is possible to have text-based output in an IPython notebook that updates me on the progress.

So far, I’ve tried canonical loop progress indicators for Python but they don’t interact with pandas in any meaningful way.

I’m hoping there’s something I’ve overlooked in the pandas library/documentation that allows one to know the progress of a split-apply-combine. A simple implementation would maybe look at the total number of data frame subsets upon which the apply function is working and report progress as the completed fraction of those subsets.

Is this perhaps something that needs to be added to the library?


Answer 0

Due to popular demand, tqdm has added support for pandas. Unlike the other answers, this will not noticeably slow pandas down — here’s an example for DataFrameGroupBy.progress_apply:

import pandas as pd
import numpy as np
from tqdm import tqdm
# from tqdm.auto import tqdm  # for notebooks

df = pd.DataFrame(np.random.randint(0, int(1e8), (10000, 1000)))

# Create and register a new `tqdm` instance with `pandas`
# (can use tqdm_gui, optional kwargs, etc.)
tqdm.pandas()

# Now you can use `progress_apply` instead of `apply`
df.groupby(0).progress_apply(lambda x: x**2)

In case you’re interested in how this works (and how to modify it for your own callbacks), see the examples on github, the full documentation on pypi, or import the module and run help(tqdm). Other supported functions include map, applymap, aggregate, and transform.

EDIT


To directly answer the original question, replace:

df_users.groupby(['userID', 'requestDate']).apply(feature_rollup)

with:

from tqdm import tqdm
tqdm.pandas()
df_users.groupby(['userID', 'requestDate']).progress_apply(feature_rollup)

Note: tqdm <= v4.8: For versions of tqdm below 4.8, instead of tqdm.pandas() you had to do:

from tqdm import tqdm, tqdm_pandas
tqdm_pandas(tqdm())

Answer 1

To tweak Jeff’s answer (and have this as a reusable function):

def logged_apply(g, func, *args, **kwargs):
    step_percentage = 100. / len(g)
    import sys
    sys.stdout.write('apply progress:   0%')
    sys.stdout.flush()

    def logging_decorator(func):
        def wrapper(*args, **kwargs):
            progress = wrapper.count * step_percentage
            sys.stdout.write('\033[D \033[D' * 4 + format(progress, '3.0f') + '%')
            sys.stdout.flush()
            wrapper.count += 1
            return func(*args, **kwargs)
        wrapper.count = 0
        return wrapper

    logged_func = logging_decorator(func)
    res = g.apply(logged_func, *args, **kwargs)
    sys.stdout.write('\033[D \033[D' * 4 + format(100., '3.0f') + '%' + '\n')
    sys.stdout.flush()
    return res

Note: the apply progress percentage updates inline. If your function writes to stdout, this won’t work.

In [11]: g = df_users.groupby(['userID', 'requestDate'])

In [12]: f = feature_rollup

In [13]: logged_apply(g, f)
apply progress: 100%
Out[13]: 
...

As usual you can add this to your groupby objects as a method:

from pandas.core.groupby import DataFrameGroupBy
DataFrameGroupBy.logged_apply = logged_apply

In [21]: g.logged_apply(f)
apply progress: 100%
Out[21]: 
...

As mentioned in the comments, this isn’t a feature that core pandas would be interested in implementing. But Python allows you to create these for many pandas objects/methods (doing so would be quite a bit of work, although you should be able to generalise this approach).
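
One way to generalise this, sketched under the assumption that the total number of calls is known up front, is to wrap any callable so that it reports progress to a stream. The names with_progress, every and stream are made up for this illustration and are not part of pandas.

```python
import sys


def with_progress(func, total, stream=sys.stdout, every=1):
    """Wrap `func` so each call updates a one-line percentage on `stream`.

    Hypothetical helper: `total` is the expected number of calls and
    `every` limits how often the line is rewritten.
    """
    state = {"count": 0}

    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        state["count"] += 1
        # Report every `every` calls, and always on the final call.
        if state["count"] % every == 0 or state["count"] == total:
            percent = 100.0 * state["count"] / total
            # "\r" returns to the start of the line so the text is overwritten.
            stream.write("\rprogress: {:3.0f}%".format(percent))
            stream.flush()
        return result

    return wrapper
```

With a groupby, total would be len(g) as in logged_apply, for example g.apply(with_progress(f, len(g))). Older pandas versions may call the function an extra time on the first group, so the count can overshoot slightly.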


Answer 2

In case you need help using this in a Jupyter/IPython notebook, as I did, here’s a helpful guide and the source of the relevant article:

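# Newer tqdm releases also expose this class as `tqdm.notebook.tqdm_notebook`
from tqdm._tqdm_notebook import tqdm_notebook
import numpy as np  # needed for np.random below
import pandas as pd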
tqdm_notebook.pandas()
df = pd.DataFrame(np.random.randint(0, int(1e8), (10000, 1000)))
df.groupby(0).progress_apply(lambda x: x**2)

Note the underscore in the import statement for _tqdm_notebook. As the referenced article mentions, development is in late beta stage.


Answer 3

This is for anyone looking to apply tqdm to their custom parallel pandas-apply code.

(I tried some of the libraries for parallelization over the years, but I never found a 100% parallelization solution, mainly for the apply function, and I always had to come back to my “manual” code.)

df_multi_core – this is the one you call. It accepts:

  1. Your df object
  2. The function name you’d like to call
  3. The subset of columns the function can be performed upon (helps reduce time/memory)
  4. The number of jobs to run in parallel (-1 or omit for all cores)
  5. Any other kwargs the df’s function accepts (like “axis”)

_df_split – this is an internal helper function that has to be defined at module level (Pool.map is “placement dependent” because it has to pickle the function); otherwise I’d define it inside df_multi_core.

Here’s the code from my gist (I’ll add more pandas function tests there):

import pandas as pd
import numpy as np
import multiprocessing
from functools import partial

def _df_split(tup_arg, **kwargs):
    split_ind, df_split, df_f_name = tup_arg
    return (split_ind, getattr(df_split, df_f_name)(**kwargs))

def df_multi_core(df, df_f_name, subset=None, njobs=-1, **kwargs):
    if njobs == -1:
        njobs = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=njobs)

    # df[None] raises KeyError rather than ValueError, so test for None explicitly
    target = df if subset is None else df[subset]
    splits = np.array_split(target, njobs)

    pool_data = [(split_ind, df_split, df_f_name) for split_ind, df_split in enumerate(splits)]
    results = pool.map(partial(_df_split, **kwargs), pool_data)
    pool.close()
    pool.join()
    results = sorted(results, key=lambda x:x[0])
    results = pd.concat([split[1] for split in results])
    return results

Below is test code for a parallelized apply with tqdm “progress_apply”.

from time import time
from tqdm import tqdm
tqdm.pandas()

# Defined at module level so worker processes can find it under the "spawn" start method
def apply_f(row):
    return row['c1'] + 0.1

if __name__ == '__main__':
    sep = '-' * 50

    # tqdm progress_apply test
    N = 1000000
    np.random.seed(0)
    df = pd.DataFrame({'c1': np.arange(N), 'c2': np.arange(N)})

    print('testing pandas apply on {}\n{}'.format(df.shape, sep))
    t1 = time()
    res = df.progress_apply(apply_f, axis=1)
    t2 = time()
    print('result random sample\n{}'.format(res.sample(n=3, random_state=0)))
    print('time for native implementation {}\n{}'.format(round(t2 - t1, 2), sep))

    t3 = time()
    # res = df_multi_core(df=df, df_f_name='apply', subset=['c1'], njobs=-1, func=apply_f, axis=1)
    res = df_multi_core(df=df, df_f_name='progress_apply', subset=['c1'], njobs=-1, func=apply_f, axis=1)
    t4 = time()
    print('result random sample\n{}'.format(res.sample(n=3, random_state=0)))
    print('time for multi core implementation {}\n{}'.format(round(t4 - t3, 2), sep))

In the output you can see one progress bar for the run without parallelization, and per-core progress bars for the run with parallelization. There is a slight hiccup and sometimes the rest of the cores appear at once, but even then I think it’s useful, since you get the progress stats per core (it/sec and total records, for example).
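
The bookkeeping in df_multi_core (tag each chunk with its index, process the chunks concurrently, reorder by index, concatenate) can be shown without pandas. This is only a sketch: it swaps in plain lists and a thread pool from concurrent.futures in place of DataFrames and multiprocessing.Pool, and split_evenly and parallel_map_chunks are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_evenly(items, n_chunks):
    """Split a list into n_chunks contiguous parts whose sizes differ by at most 1."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_map_chunks(items, func, n_jobs=4):
    """Apply func to every item, chunk by chunk, preserving the input order."""
    chunks = split_evenly(items, n_jobs)

    def work(index, chunk):
        # Return the chunk index with the result so order can be restored later.
        return index, [func(x) for x in chunk]

    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(work, i, c) for i, c in enumerate(chunks)]
        # as_completed yields in finishing order, which may differ from input order.
        results = [f.result() for f in as_completed(futures)]

    results.sort(key=lambda pair: pair[0])
    return [value for _, part in results for value in part]
```

Threads are used here only to keep the sketch portable; for CPU-bound pandas work the process pool in the answer is the relevant choice.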

Thank you @abcdaa for this great library!


Answer 4

You can easily do this with a decorator:

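from functools import wraps

def logging_decorator(func):

    @wraps(func)  # wraps needs the wrapped function as its argument
    def wrapper(*args, **kwargs):
        wrapper.count += 1
        print("The function I modify has been called {0} time(s).".format(
              wrapper.count))
        # Return the result so apply() still receives the function's output
        return func(*args, **kwargs)
    wrapper.count = 0
    return wrapper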

modified_function = logging_decorator(feature_rollup)

Then just use modified_function (and change when you want it to print).


Answer 5

I’ve changed Jeff’s answer to include a total, so that you can track progress, and a variable to print only every X iterations (this actually improves performance by a lot if “print_at” is reasonably high):

import sys

def count_wrapper(func, total, print_at):

    def wrapper(*args):
        wrapper.count += 1
        if wrapper.count % wrapper.print_at == 0:
            clear_output()
            # Read the counters from the wrapper itself (calc_time was undefined)
            sys.stdout.write("%d / %d" % (wrapper.count, wrapper.total))
            sys.stdout.flush()
        return func(*args)
    wrapper.count = 0
    wrapper.total = total
    wrapper.print_at = print_at

    return wrapper

The clear_output() function comes from

from IPython.display import clear_output

If you are not on IPython, Andy Hayden’s answer does this without it.


How to clear variables in ipython?

Question: How to clear variables in ipython?

Sometimes I rerun a script within the same ipython session and I get bad surprises when variables haven’t been cleared. How do I clear all variables? And is it possible to force this somehow every time I invoke the magic command %run?

Thanks


Answer 0

%reset seems to clear defined variables.


Answer 1

EDITED after @ErdemKAYA comment.

To erase a variable, use the magic command:

%reset_selective <regular_expression>

The variables erased from the namespace are the ones matching the given <regular_expression>.

Therefore

%reset_selective -f a 

will erase all the variables containing an a.

Instead, to erase only a and not aa:

In: a, aa = 1, 2
In: %reset_selective -f "^a$"
In: a  # raises NameError
In: aa  # returns 2

See also %reset_selective? for more examples and https://regexone.com/ for a regex tutorial.
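
The difference between the two patterns can be checked in plain Python. This is a sketch that assumes search-style matching (a match anywhere in the name), which is what the example in this answer implies; names_matching is a hypothetical helper, not an IPython function.

```python
import re


def names_matching(pattern, namespace):
    """Return the sorted variable names in `namespace` that the pattern selects."""
    regex = re.compile(pattern)
    # search() matches anywhere in the name unless the pattern is anchored.
    return sorted(name for name in namespace if regex.search(name))


names = {"a": 1, "aa": 2, "b": 3, "data": 4}
print(names_matching("a", names))    # every name containing an "a"
print(names_matching("^a$", names))  # only the name that is exactly "a"
```

Anchoring with ^ and $ is what keeps aa (and any other name that merely contains an a) from being erased.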

To erase all the variables in the namespace see:

%reset?

Answer 2

In iPython you can remove a single variable like this:

del x

Answer 3

I tried

%reset -f

and it cleared all the variables and contents without prompting. -f forces the command to run without asking for yes/no confirmation.

Hope this helps. :)


Answer 4

Adding the following lines to a new script will clear all variables each time you rerun the script:

from IPython import get_ipython
get_ipython().run_line_magic('reset', '-sf')  # current spelling of the deprecated .magic('reset -sf')

To make life easy, you can add them to your default template.

In Spyder: Tools>Preferences>Editor>Edit template
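
If the same script is sometimes run with plain python, there is no active shell and the reset call fails. A hedged sketch of a guard follows; reset_ipython_namespace is a hypothetical helper name, and run_line_magic is used as the current way to invoke a magic from code.

```python
def reset_ipython_namespace():
    """Run `%reset -sf` when inside IPython; do nothing elsewhere.

    Returns True if the reset was issued, False otherwise.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False  # IPython is installed, but no shell is running
    shell.run_line_magic("reset", "-sf")
    return True
```

Calling reset_ipython_namespace() at the top of a script then behaves the same as the two lines in this answer under %run, and is a no-op otherwise.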


Answer 5

Apart from the methods mentioned earlier, you can also use the del statement to remove multiple variables:

del variable1,variable2

Answer 6

The quit option in the Console panel will also clear all variables in the variable explorer.

*** Note that you will lose all the code you have run in the Console panel.


“ImportError: No module named” when trying to run a Python script

Question: “ImportError: No module named” when trying to run a Python script

I’m trying to run a script that launches, amongst other things, a python script. I get an ImportError: No module named …; however, if I launch ipython and import the same module in the same way through the interpreter, the module is accepted.

What’s going on, and how can I fix it? I’ve tried to understand how python uses PYTHONPATH but I’m thoroughly confused. Any help would be greatly appreciated.


Answer 0

This issue arises from the difference between how the command-line IPython interpreter uses your current path and how a separate process does (be it an IPython notebook, an external process, etc.). IPython looks for modules to import not only in your sys.path but also in your current working directory. When starting an interpreter from the command line, the current directory you’re operating in is the same one you started ipython in. If you run

import os
os.getcwd() 

you’ll see this is true.

However, if you are using an IPython notebook and run os.getcwd(), your current working directory is instead the folder you told the notebook to operate from in your ipython_notebook_config.py file (typically using the c.NotebookManager.notebook_dir setting).

The solution is to provide the python interpreter with the path-to-your-module. The simplest solution is to append that path to your sys.path list. In your notebook, first try:

import sys
sys.path.append('my/path/to/module/folder')

import module_of_interest  # placeholder: use your module's real name

If that doesn’t work, you’ve got a different problem on your hands unrelated to path-to-import and you should provide more info about your problem.

The better (and more permanent) way to solve this is to set your PYTHONPATH, which provides the interpreter with additional directories to look in for Python packages/modules. Editing or setting PYTHONPATH as a global variable is OS dependent, and is discussed in detail here for Unix or Windows.
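
To see which of these differs between the working and the failing environment, it can help to print the same facts in both. A minimal sketch; describe_import_environment is a hypothetical helper name.

```python
import os
import sys


def describe_import_environment():
    """Collect the facts that decide where `import` looks for modules."""
    return {
        "executable": sys.executable,                # which interpreter is running
        "cwd": os.getcwd(),                          # current working directory
        "sys_path": list(sys.path),                  # directories searched, in order
        "pythonpath": os.environ.get("PYTHONPATH"),  # None if the variable is unset
    }


if __name__ == "__main__":
    for key, value in describe_import_environment().items():
        print(key, "=", value)
```

Running this once from the launching script and once inside IPython, then comparing the two outputs, usually shows whether the interpreter, the working directory, or the search path is the culprit.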


Answer 1

Just create an empty Python file named __init__.py in the folder that triggers the error when you run the Python project.


Answer 2

Make sure they are both using the same interpreter. This happened to me on Ubuntu:

$ ipython3 -c 'import sys; print(sys.version)'
3.4.2 (default, Jun 19 2015, 11:34:49) \n[GCC 4.9.1]

$ python3 -c 'import sys; print(sys.version)'
3.3.0 (default, Nov 27 2012, 12:11:06) \n[GCC 4.6.3]

And sys.path was different between the two interpreters. To fix it, I removed Python 3.3.


Answer 3

The main reason is that the sys.path values of Python and IPython are different.

Please refer to the lucypark link; the solution worked in my case. It happened after installing opencv with

conda install opencv

and then getting an import error in IPython. There are three steps to solve this issue:

import cv2
ImportError: ...

1. Check the path in Python and IPython with the following commands

import sys
sys.path

You will find different results in Python and Jupyter. As the second step, just use sys.path.append to add the missing path, found by trial and error.

2. Temporary solution

In iPython:

import sys
sys.path.append('/home/osboxes/miniconda2/lib/python2.7/site-packages')
import cv2

The ImportError issue is solved.

3. Permanent solution

Create an iPython profile and set initial append:

In bash shell:

ipython profile create
... check the path that is printed, then edit the config file it names; in my case:
vi /home/osboxes/.ipython/profile_default/ipython_kernel_config.py

In vi, append to the file:

c.InteractiveShellApp.exec_lines = [
 'import sys; sys.path.append("/home/osboxes/miniconda2/lib/python2.7/site-packages")'
]

DONE


Answer 4

Doing sys.path.append('my-path-to-module-folder') will work, but to avoid having to do this in IPython every time you want to use the module, you can add export PYTHONPATH="my-path-to-module-folder:$PYTHONPATH" to your ~/.bash_profile file.


Answer 5

Before installing ipython, I installed modules through easy_install; say sudo easy_install mechanize.

After installing ipython, I had to re-run easy_install for ipython to recognize the modules.


Answer 6

Had a similar problem; fixed it by calling python3 instead of python, as my modules were installed for Python 3.5.


Answer 7

If you are running it from the command line, sometimes the Python interpreter is not aware of the path where it should look for modules.

Below is the directory structure of my project:

/project/apps/..
/project/tests/..

I was running the commands below:

>> cd project

>> python tests/my_test.py

After running the above commands I got the error below:

no module named lib

lib was imported in my_test.py.

I printed sys.path and figured out that the path of the project I am working on was not in the sys.path list.

I added the code below at the start of my script my_test.py:

import sys
import os

module_path = os.path.abspath(os.getcwd())

if module_path not in sys.path:
    sys.path.append(module_path)

I am not sure if it is a good way of solving it, but it did work for me.
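
Relying on os.getcwd() only works when the script is launched from the project root. A variation, sketched here under the assumption that the project root is the parent of the tests folder, takes the directory explicitly so it can be derived from the script's own location. ensure_on_sys_path is a hypothetical helper name.

```python
import os
import sys


def ensure_on_sys_path(directory):
    """Append `directory` to sys.path once; return the absolute path used."""
    absolute = os.path.abspath(directory)
    if absolute not in sys.path:
        sys.path.append(absolute)
    return absolute


# Hypothetical usage at the top of tests/my_test.py: the project root is taken
# to be the parent of the folder holding this file, whatever the current directory.
# here = os.path.dirname(os.path.abspath(__file__))
# ensure_on_sys_path(os.path.join(here, os.pardir))
```

Because the check is idempotent, calling it from several test files does not grow sys.path.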


Answer 8

This is how I fixed it:

import os
import sys
module_path = os.path.abspath(os.path.join(os.getcwd(), os.pardir))  # parent directory, portable across OSes
if module_path not in sys.path:
    sys.path.append(module_path)

Answer 9

I have found that the solution to this problem was extensively documented here:

https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/

Basically, you must install the packages within the Jupyter environment, issuing shell commands like:

import sys
!{sys.executable} -m pip install numpy

Please check the above link for an authoritative full answer.
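
Outside a notebook the ! shell escape is not available. A rough script equivalent, assuming pip is installed for the running interpreter, is to launch pip through sys.executable; pip_install_command and pip_install are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command bound to the interpreter that is running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Install `package` into the current interpreter's environment."""
    # check_call raises CalledProcessError if pip exits with a non-zero status.
    subprocess.check_call(pip_install_command(package))
```

Using sys.executable rather than a bare pip is the point of the technique: it guarantees the package lands in the environment that will later import it.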


Answer 10

I found yet another source of this discrepancy:

I have ipython installed both locally and, commonly, in virtualenvs. My problem was that, inside a newly made virtualenv with ipython, the system ipython was picked up, which was a different version from the python and ipython in the virtualenv (2.7.x vs. 3.5.x), and hilarity ensued.

I think the smart thing to do whenever installing something that will have a binary in yourvirtualenv/bin is to immediately run rehash or similar for whatever shell you are using so that the correct python/ipython gets picked up. (Gotta check if there are suitable pip post-install hooks…)


Answer 11

Solution without scripting:

  1. Open Spyder -> Tools -> PYTHONPATH manager
  2. Add Python paths by clicking “Add Path”. E.g: ‘C:\Users\User\AppData\Local\Programs\Python\Python37\Lib\site-packages’
  3. Click “Synchronize…” to allow other programs (e.g. Jupyter Notebook) to use the paths set in step 2.
  4. Restart Jupyter if it is open

Answer 12

This is probably caused by different python versions installed on your system, i.e. python2 or python3.

Run $ pip --version and $ pip3 --version to check which pip belongs to Python 3.x. For example, you should see version information like this:

pip 19.0.3 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)

Then run the example.py script with the command below:

$ python3 example.py

Answer 13

This happened to me with a directory named utils. I was trying to import from this directory as:

from utils import somefile

A package named utils may already be installed in your Python environment and take precedence over your directory. Just rename your directory to something different and it should work fine.
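
Before renaming anything, it is possible to ask Python which file a name would resolve to. A small sketch using importlib from the standard library; where_is is a hypothetical helper name.

```python
import importlib.util


def where_is(module_name):
    """Return the file a top-level import would load, or None if nothing is found.

    Namespace packages also report None, because they have no single origin file.
    """
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin
```

If where_is("utils") points somewhere other than your own directory, another package is shadowing it.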


Answer 14

This kind of error most probably occurs due to Python version conflicts. For example, if your application runs only on Python 3 and you have Python 2 installed as well, it’s better to specify which version to use. For example, use

python3 .....

instead of

python

Answer 15

This answer applies to this question if

  1. You don’t want to change your code
  2. You don’t want to change PYTHONPATH permanently

Temporarily modify PYTHONPATH for a single command (the path below can be relative):

PYTHONPATH=/path/to/dir python script.py
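
When the script is launched from another Python program rather than from a shell, the same one-off override can be done through the child's environment. A sketch using only the standard library; run_with_pythonpath is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_pythonpath(extra_dir, args):
    """Run the current interpreter with `extra_dir` prepended to PYTHONPATH.

    Only the child process sees the change; os.environ is left untouched.
    """
    env = dict(os.environ)  # copy, so the parent environment is not modified
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = extra_dir if not existing else extra_dir + os.pathsep + existing
    return subprocess.run(
        [sys.executable, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
```

For example, run_with_pythonpath("/path/to/dir", ["script.py"]) mirrors the shell line in this answer.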

Answer 16

import sys
sys.path.append('/Users/{user}/Library/Python/3.7/lib/python/site-packages')
import ta


Answer 17

Remove pathlib and reinstall it. Delete the pathlib folder in site-packages and reinstall the pathlib package with pip:

pip install pathlib

Collapse cell in jupyter notebook

Question: Collapse cell in jupyter notebook

I am using ipython Jupyter notebook. Let’s say I defined a function that occupies a lot of space on my screen. Is there a way to collapse the cell?

I want the function to remain executed and callable, yet I want to hide / collapse the cell in order to better visualize the notebook. How can I do this?


Answer 0

The jupyter contrib nbextensions Python package contains a code-folding extension that can be enabled within the notebook. Follow the link (Github) for documentation.

To install using command line:

pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user

To make life easier in managing them, I’d also recommend the jupyter nbextensions configurator package. This provides an extra tab in your Notebook interface from where you can easily (de)activate all installed extensions.

Installation:

pip install jupyter_nbextensions_configurator
jupyter nbextensions_configurator enable --user

Answer 1

You can create a cell and put the following code in it:

%%html
<style>
div.input {
    display:none;
}
</style>

Running this cell will hide all input cells. To show them again, you can use the menu to clear all outputs.

Otherwise you can try notebook extensions like below:

https://github.com/ipython-contrib/IPython-notebook-extensions/wiki/Home_3x


回答 2

JupyterLab支持细胞折叠。单击左侧的蓝色单元格栏将折叠该单元格。

JupyterLab supports cell collapsing. Clicking on the blue cell bar on the left will fold the cell.


回答 3

我遇到了类似的问题,@ Energya指出的“ nbextensions”工作得非常好,很轻松。对于笔记本扩展及其配置程序,安装说明很简单(我在Windows上使用anaconda进行了尝试)。

就是说,我想补充一点,以下扩展应该引起关注。

  • 隐藏输入| 此扩展允许在笔记本中隐藏单个码元。这可以通过单击工具栏按钮来实现:

  • 可折叠的标题| 允许笔记本具有可折叠的部分,并以标题分隔

  • 代码折叠| 已经提到过,但是为了完整性我添加了它

I had a similar issue, and the “nbextensions” pointed out by @Energya worked very well with little effort. The install instructions for the notebook extensions and for their configurator are straightforward (I tried them with Anaconda on Windows).

That said, I would like to add that the following extensions should be of interest.

  • Hide Input | This extension allows hiding an individual code cell in a notebook. This can be achieved by clicking the toolbar button:

  • Collapsible Headings | Allows notebook to have collapsible sections, separated by headings

  • Codefolding | This has been mentioned but I add it for completeness


回答 4

在〜/ .jupyter / custom /内创建具有以下内容的custom.js文件:

$("<style type='text/css'> .cell.code_cell.collapse { max-height:30px; overflow:hidden;} </style>").appendTo("head");
$('.prompt.input_prompt').on('click', function(event) {
    console.log("CLICKED", arguments)   
    var c = $(event.target.closest('.cell.code_cell'))
    if(c.hasClass('collapse')) {
        c.removeClass('collapse');
    } else {
        c.addClass('collapse');
    }
});

保存后,重新启动服务器并刷新笔记本。您可以通过单击输入标签(In [])折叠任何单元格。

Create a custom.js file inside ~/.jupyter/custom/ with the following contents:

$("<style type='text/css'> .cell.code_cell.collapse { max-height:30px; overflow:hidden;} </style>").appendTo("head");
$('.prompt.input_prompt').on('click', function(event) {
    console.log("CLICKED", arguments)   
    var c = $(event.target.closest('.cell.code_cell'))
    if(c.hasClass('collapse')) {
        c.removeClass('collapse');
    } else {
        c.addClass('collapse');
    }
});

After saving, restart the server and refresh the notebook. You can collapse any cell by clicking on the input label (In[]).


回答 5

hide_code扩展名允许您隐藏单个单元格和/或它们旁边的提示。安装为

pip3 install hide_code

访问https://github.com/kirbs-/hide_code/了解有关此扩展程序的更多信息。

The hide_code extension allows you to hide individual cells, and/or the prompts next to them. Install as

pip3 install hide_code

Visit https://github.com/kirbs-/hide_code/ for more info about this extension.


回答 6

首先,按照Energya的指示进行:

pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
pip install jupyter_nbextensions_configurator
jupyter nbextensions_configurator enable --user

第二个关键是:打开木星笔记本后,单击Nbextension选项卡。现在,从Nbextension提供的搜索工具(不是Web浏览器)中搜索“ colla”,然后您将找到一个名为“ Collapsible Headings”的内容

这就是你想要的!

Firstly, follow Energya’s instruction:

pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
pip install jupyter_nbextensions_configurator
jupyter nbextensions_configurator enable --user

The second step is the key one: after opening Jupyter Notebook, click the Nbextensions tab. Search for “colla” using the search tool provided by Nbextensions (not the web browser's search), and you will find an entry called “Collapsible Headings”.

This is what you want!


回答 7

正如其他人所提到的,您可以通过nbextensions进行此操作。我想简短地解释一下我所做的事情:

要启用可协作的标题:在终端中,首先输入以下内容来启用/安装Jupyter Notebook Extensions:

pip install jupyter_contrib_nbextensions

然后输入:

jupyter contrib nbextension install

重新打开Jupyter Notebook。转到“编辑”选项卡,然后选择“ nbextensions配置”。取消直接在标题“ Configurable nbextensions”下的复选框,然后选择“可折叠标题”。

As others have mentioned, you can do this via nbextensions. I wanted to give the brief explanation of what I did, which was quick and easy:

To enable collapsible headings: in your terminal, install Jupyter Notebook Extensions by first entering:

pip install jupyter_contrib_nbextensions

Then, enter:

jupyter contrib nbextension install

Re-open Jupyter Notebook. Go to the “Edit” tab and select “nbextensions config”. Uncheck the box directly under the title “Configurable nbextensions”, then select “collapsible headings”.


回答 8

这个问题有很多答案,在许多扩展中,我觉得所有这些都不令人满意(有些比其他更好),例如代码折叠,标题折叠等。没有一个人以简单有效的方式满足我的要求。令我惊讶的是,尚未实施解决方案(对于Jupyter Lab而言)。

实际上,我非常不满意,以至于我开发了一个非常简单的笔记本扩展,可以扩展/折叠笔记本单元中的代码,同时保持其可执行性。

GitHub存储库:https : //github.com/BenedictWilkinsAI/cellfolding

以下是该扩展程序的小演示:

只需双击代码单元的左侧,即可将其折叠为一行:

再次双击将展开该单元格。

该扩展可以通过pip轻松安装:

pip install nbextension-cellfolding
jupyter nbextension install --py cellfolding --user
jupyter nbextension enable --py cellfolding --user 

并且还与nbextension configurator兼容。我希望人们会发现这很有用!

There are many answers to this question, and I find none of them fully satisfactory (some more so than others). Of the many extensions (code folding, folding by headings, and so on), none does what I want in a simple and effective way. I am genuinely surprised that a solution has not been implemented (as it has been for JupyterLab).

In fact, I was so dissatisfied that I have developed a very simple notebook extension that can expand/collapse the code in a notebook cell, while keeping it executable.

The GitHub repository: https://github.com/BenedictWilkinsAI/cellfolding

Below is a small demo of what the extension does:

Simply double clicking left of the code cell will collapse it to a single line:

Double clicking again will expand the cell.

The extension can be installed easily with pip:

pip install nbextension-cellfolding
jupyter nbextension install --py cellfolding --user
jupyter nbextension enable --py cellfolding --user 

and is also compatible with nbextension configurator. I hope that people will find this useful!


回答 9

潘岩建议的改进版本。它添加了显示代码单元的按钮:

%%html
<style id=hide>div.input{display:none;}</style>
<button type="button" 
onclick="var myStyle = document.getElementById('hide').sheet;myStyle.insertRule('div.input{display:inherit !important;}', 0);">
Show inputs</button>

或python:

# Run me to hide code cells

from IPython.display import display, HTML
display(HTML(r"""<style id=hide>div.input{display:none;}</style><button type="button" onclick="var myStyle = document.getElementById('hide').sheet;myStyle.insertRule('div.input{display:inherit !important;}', 0);">Show inputs</button>"""))

There’s also an improved version of Pan Yan’s suggestion. It adds a button that shows the code cells again:

%%html
<style id=hide>div.input{display:none;}</style>
<button type="button" 
onclick="var myStyle = document.getElementById('hide').sheet;myStyle.insertRule('div.input{display:inherit !important;}', 0);">
Show inputs</button>

Or python:

# Run me to hide code cells

from IPython.display import display, HTML
display(HTML(r"""<style id=hide>div.input{display:none;}</style><button type="button" onclick="var myStyle = document.getElementById('hide').sheet;myStyle.insertRule('div.input{display:inherit !important;}', 0);">Show inputs</button>"""))

回答 10

除了启用扩展,您不需要做太多事情:

http://localhost:8888/nbextensions?nbextension=collapsible_headings
http://localhost:8888/nbextensions?nbextension=codefolding/main

最有可能在以下位置找到所有扩展:

http://localhost:8888/nbextensions

You don’t need to do much except to enable the extensions:

http://localhost:8888/nbextensions?nbextension=collapsible_headings
http://localhost:8888/nbextensions?nbextension=codefolding/main

Most probably you will find all your extensions here:

http://localhost:8888/nbextensions


回答 11

我用来获得预期结果的是:

  1. 将以下代码块保存toggle_cell.py在与笔记本相同目录中的文件中
from IPython.core.display import display, HTML
toggle_code_str = '''
<form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Show Sloution"></form>
'''

toggle_code_prepare_str = '''
    <script>
    function code_toggle() {
        if ($('div.cell.code_cell.rendered.selected div.input').css('display')!='none'){
            $('div.cell.code_cell.rendered.selected div.input').hide();
        } else {
            $('div.cell.code_cell.rendered.selected div.input').show();
        }
    }
    </script>

'''

display(HTML(toggle_code_prepare_str + toggle_code_str))

def hide_sloution():
    display(HTML(toggle_code_str))
  1. 在笔记本的第一个单元格中添加以下内容
from toggle_cell import hide_sloution
  1. 您需要添加切换按钮即可调用的任何单元格 hide_sloution()

What I use to get the desired outcome is:

  1. Save the code block below in a file named toggle_cell.py in the same directory as your notebook:
from IPython.display import display, HTML

# Button that calls the JavaScript toggle defined below.
toggle_code_str = '''
<form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Show Solution"></form>
'''

# Script that shows or hides the input area of the currently selected cell.
toggle_code_prepare_str = '''
    <script>
    function code_toggle() {
        if ($('div.cell.code_cell.rendered.selected div.input').css('display')!='none'){
            $('div.cell.code_cell.rendered.selected div.input').hide();
        } else {
            $('div.cell.code_cell.rendered.selected div.input').show();
        }
    }
    </script>

'''

# Runs once on import: registers the script and shows the first button.
display(HTML(toggle_code_prepare_str + toggle_code_str))

def hide_solution():
    display(HTML(toggle_code_str))
  2. Add the following in the first cell of your notebook:
from toggle_cell import hide_solution
  3. In any cell where you need the toggle button, simply call hide_solution()

在IPython / Jupyter笔记本中显示行号

问题:在IPython / Jupyter笔记本中显示行号

在IPython / Jupyter Notebook中运行的大多数语言内核的错误报告都指出发生错误的行;但是(至少默认情况下)在笔记本电脑中没有显示行号。

是否可以将行号添加到IPython / Jupyter Notebook?

Error reports from most language kernels running in IPython/Jupyter Notebooks indicate the line on which the error occurred; but (at least by default) no line numbers are indicated in Notebooks.

Is it possible to add line numbers to IPython/Jupyter Notebooks?


回答 0

CTRLML在CodeMirror区域中切换行号。有关其他键盘快捷键,请参见快速帮助。

更详细地讲CTRLM(或ESC)将您带入命令模式,然后L按键将切换当前单元格行号的可见性。在较新的笔记本电脑版本中,Shift-L应切换所有单元格。

如果您忘记了快捷方式,请调出命令面板Ctrl-Shift+PCmd+Shift+P在Mac上为Mac,然后搜索“行号”),它应该允许切换并显示快捷方式。

Ctrl-M L toggles line numbers in the CodeMirror area. See the QuickHelp for other keyboard shortcuts.

In more detail: Ctrl-M (or Esc) brings you to command mode, and pressing the L key then toggles the visibility of the current cell's line numbers. In more recent notebook versions, Shift-L should toggle them for all cells.

If you can’t remember the shortcut, bring up the command palette with Ctrl+Shift+P (Cmd+Shift+P on Mac) and search for “line numbers”; it should let you toggle them and show you the shortcut.


回答 1

在IPython 2.2.0上,只需在命令模式(通过键入Esc激活)中键入l(小写L)即可。有关其他快捷方式,请参见[帮助]-[键盘快捷方式]。

另外,您可以设置默认行为以通过编辑显示行号custom.js

On IPython 2.2.0, just typing l (lowercase L) in command mode (activated by pressing Esc) works. See [Help] – [Keyboard Shortcuts] for other shortcuts.

Also, you can set default behavior to display line numbers by editing custom.js.


回答 2

View->中选择切换行号选项 Toggle Line Number

Select the Toggle Line Number option from the View menu: View -> Toggle Line Number.


回答 3

要在启动时默认打开所有单元中的行号,我建议使用此链接。我引用:

  1. 导航到jupyter配置目录,您可以通过在命令行中键入以下内容来找到该目录:

    jupyter --config-dir
  2. 从那里打开或创建custom文件夹。

  3. 在该文件夹中,您应该找到一个custom.js文件。如果没有,则应该可以创建一个。在文本编辑器中将其打开并添加以下代码:

    define([
        'base/js/namespace',
        'base/js/events'
        ],
        function(IPython, events) {
            events.on("app_initialized.NotebookApp",
                function () {
                    IPython.Cell.options_default.cm_config.lineNumbers = true;
                }
            );
        }
    );

To turn line numbers on by default in all cells at startup I recommend this link. I quote:

  1. Navigate to your jupyter config directory, which you can find by typing the following at the command line:

    jupyter --config-dir
    
  2. From there, open or create the custom folder.

  3. In that folder, you should find a custom.js file. If there isn’t one, you should be able to create one. Open it in a text editor and add this code:

    define([
        'base/js/namespace',
        'base/js/events'
        ],
        function(IPython, events) {
            events.on("app_initialized.NotebookApp",
                function () {
                    IPython.Cell.options_default.cm_config.lineNumbers = true;
                }
            );
        }
    );
    

回答 4

为了我, ctrl + m用于将网页另存为png,因此无法正常工作。但是我找到了另一种方式。

在工具栏上,有一个名为打开命令paletee的底部,您可以单击它并键入该行,并在此处看到切换单元格的行号。

For me, Ctrl+M is used to save the web page as a PNG, so the shortcut does not work. But I found another way.

On the toolbar there is a button named “open the command palette”. Click it and type “line”, and you will see the toggle for cell line numbers.


回答 5

这是了解活动快捷方式的方法(取决于您的操作系统和笔记本电脑的版本,它可能会更改)

Help > Keyboard Shortcuts > toggle line numbers

在运行ipython3的OSX上, ESC L

Here is how to find the active shortcut (it may differ depending on your OS and notebook version):

Help > Keyboard Shortcuts > toggle line numbers

On OSX running ipython3 it was ESC L


回答 6

您还可以Toggle Line NumbersView浏览器的Jupyter笔记本顶部工具栏上的下找到。这将添加/删除所有行号笔记本单元。

对我而言,Esc+ l仅添加/删除了活动单元格的行号。

You can also find Toggle Line Numbers under View on the top toolbar of the Jupyter notebook in your browser. This adds/removes the line numbers in all notebook cells.

For me, Esc+l only added/removed the line numbers of the active cell.


回答 7

正在寻找这个:Shift-L在JupyterLab 1.0.0中

Was looking for this: Shift-L in JupyterLab 1.0.0


回答 8

1.按esc进入命令模式2.按l(小写L)显示行号

1. Press Esc to enter command mode.
2. Press l (lowercase L) to show the line numbers.


检查pandas数据框索引中是否存在值

问题:检查pandas数据框索引中是否存在值

我敢肯定有一个明显的方法可以做到这一点,但是现在还不能想到任何光滑的东西。

基本上不是引发异常,而是要获取TrueFalse查看pandas df索引中是否存在值。

import pandas as pd
df = pd.DataFrame({'test':[1,2,3,4]}, index=['a','b','c','d'])
df.loc['g']  # (should give False)

我现在工作的是以下内容

sum(df.index == 'g')

I am sure there is an obvious way to do this, but I can't think of anything slick right now.

Basically, instead of raising an exception, I would like to get True or False to see if a value exists in a pandas df index.

import pandas as pd
df = pd.DataFrame({'test':[1,2,3,4]}, index=['a','b','c','d'])
df.loc['g']  # (should give False)

What I have working now is the following

sum(df.index == 'g')

回答 0

这应该可以解决问题

'g' in df.index

This should do the trick

'g' in df.index
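
As a minimal sketch, here is how the membership test compares with the .loc lookup from the question, using the same frame. The helper name has_label is made up for illustration and is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'test': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

def has_label(frame, label):
    """Return True if `label` is one of the frame's index labels."""
    return label in frame.index

found_a = has_label(df, 'a')
found_g = has_label(df, 'g')

# .loc raises KeyError for a missing label, so the test can guard the lookup.
row = df.loc['g'] if has_label(df, 'g') else None
print(found_a, found_g, row)
```

The guard avoids wrapping the lookup in try/except when a missing label is an expected case rather than an error.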

回答 1

仅供参考,这是我一直在寻找的东西,您可以通过附加“ .values”方法来测试值或索引中是否存在,例如

g in df.<your selected field>.values
g in df.index.values

我发现添加“ .values”以获取简单的列表或ndarray会使存在或“输入”检查与其他python工具一起运行更为流畅。只是以为我会把那个扔给别人。

Just for reference, as it was something I was looking for: you can test for presence within the values or the index by appending the “.values” attribute, e.g.

'g' in df.<your selected field>.values
'g' in df.index.values

I find that adding “.values” to get a plain ndarray out makes existence (“in”) checks run more smoothly with other Python tools. Just thought I’d toss that out there for people.


回答 2

多索引的工作方式与单索引略有不同。这是多索引数据框的一些方法。

df = pd.DataFrame({'col1': ['a', 'b','c', 'd'], 'col2': ['X','X','Y', 'Y'], 'col3': [1, 2, 3, 4]}, columns=['col1', 'col2', 'col3'])
df = df.set_index(['col1', 'col2'])

in df.index 仅在检查单个索引值时才适用于第一级。

'a' in df.index     # True
'X' in df.index     # False

检查df.index.levels其他级别。

'a' in df.index.levels[0] # True
'X' in df.index.levels[1] # True

签入df.index索引组合元组。

('a', 'X') in df.index  # True
('a', 'Y') in df.index  # False

A MultiIndex works a little differently from a single index. Here are some methods for a multi-indexed DataFrame.

df = pd.DataFrame({'col1': ['a', 'b','c', 'd'], 'col2': ['X','X','Y', 'Y'], 'col3': [1, 2, 3, 4]}, columns=['col1', 'col2', 'col3'])
df = df.set_index(['col1', 'col2'])

The expression in df.index works only for the first level when checking a single index value.

'a' in df.index     # True
'X' in df.index     # False

Check df.index.levels for other levels.

'a' in df.index.levels[0] # True
'X' in df.index.levels[1] # True

Use in df.index to check for a tuple of index values.

('a', 'X') in df.index  # True
('a', 'Y') in df.index  # False
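
One caveat worth sketching, under the assumption that the frame has been filtered first: df.index.levels lists every value the level was built with, including values that no remaining row uses, whereas get_level_values reflects only the rows that are present. The filter on col3 below is an illustrative choice, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd'],
                   'col2': ['X', 'X', 'Y', 'Y'],
                   'col3': [1, 2, 3, 4]}).set_index(['col1', 'col2'])

# Keep only the rows whose second-level label is 'X'.
subset = df[df['col3'] <= 2]

# The level list still remembers 'Y' even though no row uses it.
in_levels = 'Y' in subset.index.levels[1]

# The per-row labels only contain what is actually present.
in_values = 'Y' in subset.index.get_level_values('col2')
print(in_levels, in_values)
```

If the levels check is preferred, calling remove_unused_levels() on the index first should bring the two results into agreement.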

回答 3

与DataFrame:df_data

>>> df_data
  id   name  value
0  a  ampha      1
1  b   beta      2
2  c     ce      3

我试过了:

>>> getattr(df_data, 'value').isin([1]).any()
True
>>> getattr(df_data, 'value').isin(['1']).any()
True

但:

>>> 1 in getattr(df_data, 'value')
True
>>> '1' in getattr(df_data, 'value')
False

很有趣:D

with DataFrame: df_data

>>> df_data
  id   name  value
0  a  ampha      1
1  b   beta      2
2  c     ce      3

I tried:

>>> getattr(df_data, 'value').isin([1]).any()
True
>>> getattr(df_data, 'value').isin(['1']).any()
True

but:

>>> 1 in getattr(df_data, 'value')
True
>>> '1' in getattr(df_data, 'value')
False

So fun :D
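
The difference seen above comes from the fact that the in operator on a Series tests the index labels, not the values. A small sketch using the same data (the variable names are illustrative):

```python
import pandas as pd

df_data = pd.DataFrame({'id': ['a', 'b', 'c'],
                        'name': ['ampha', 'beta', 'ce'],
                        'value': [1, 2, 3]})
values = df_data['value']

# 1 is found because it is an index label (the index is 0, 1, 2).
label_hit = 1 in values

# 3 is a value but not an index label, so the same test fails.
value_only = 3 in values

# Searching the underlying array tests the values themselves.
value_hit = 3 in values.values
print(label_hit, value_only, value_hit)
```

So 1 in getattr(df_data, 'value') returns True only because row label 1 exists; to test values, use isin or search the .values array.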


回答 4

import pandas

df = pandas.DataFrame({'g': [1]}, index=['isStop'])

# df.loc['g'] would raise KeyError: 'g' is a column, not an index label

if 'g' in df.index:
    print("find g")

if 'isStop' in df.index:
    print("find isStop")

回答 5

下面的代码不打印布尔值,但允许按索引对数据框进行子集设置…我知道这可能不是解决问题的最有效方法,但是我(1)喜欢这种读取方式,并且(2)您可以轻松地进行子集化df2中存在df1索引的位置:

df3 = df1[df1.index.isin(df2.index)]

或df2中不存在df1索引的地方…

df3 = df1[~df1.index.isin(df2.index)]

The code below does not return a boolean, but it lets you subset a DataFrame by index. I understand this is probably not the most efficient way to solve the problem, but (1) I like the way it reads and (2) you can easily take the rows of df1 whose index exists in df2:

df3 = df1[df1.index.isin(df2.index)]

or the rows of df1 whose index does not exist in df2:

df3 = df1[~df1.index.isin(df2.index)]
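
To make the two subsets concrete, here is a small sketch with made-up frames; the column names and labels are assumptions chosen only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'v': [10, 20, 30]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'w': [1, 2]}, index=['b', 'z'])

# Boolean array: True where a df1 label also appears in df2's index.
mask = df1.index.isin(df2.index)

shared = df1[mask]         # rows of df1 whose label exists in df2
only_in_df1 = df1[~mask]   # rows of df1 whose label is missing from df2
print(list(shared.index), list(only_in_df1.index))
```

Computing the mask once and negating it keeps both subsets consistent with each other.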

Python和IPython有什么区别?

问题:Python和IPython有什么区别?

Python和IPython之间到底有什么区别?

如果我用Python编写代码,它是否可以按原样在IPython中运行还是需要进行修改?

我知道IPython应该是Python的交互式外壳,但仅此而已?还是有一种叫做IPython的语言?如果我在IPython下编写某些内容,它将在Python中运行,反之亦然吗?如果存在差异,我怎么知道它们是什么?Python使用的所有软件包都可以像在IPython中那样工作吗?

What exactly is the difference between Python and IPython?

If I write code in Python, will it run in IPython as is or does it need to be modified?

I know IPython is supposed to be an interactive shell for Python, but is that all? Or is there a language called IPython? If I write something under IPython, will it run in Python, and vice-versa? If there are differences, how do I know what they are? Will all packages used by Python work as is in IPython?


回答 0

ipython 是使用python构建的交互式shell。

从项目网站:

IPython提供了丰富的工具包,可帮助您充分利用Python,并具有以下功能:

  • 强大的Python Shell(基于终端和Qt)。
  • 基于Web的笔记本,具有相同的核心功能,但支持代码,文本,数学表达式,内联图和其他富媒体。
  • 支持交互式数据可视化和GUI工具箱的使用。
  • 灵活,可嵌入的解释器,可加载到您自己的项目中。
  • 易于使用的高性能并行计算工具。

请注意,前两行告诉您它可以帮助您充分利用Python。因此,您不需要更改代码,IPython shell就像常规python shell一样运行python代码,只具有更多功能。

我建议阅读IPython教程,以了解使用IPython时获得的功能。

ipython is an interactive shell built with python.

From the project website:

IPython provides a rich toolkit to help you make the most out of using Python, with:

  • Powerful Python shells (terminal and Qt-based).
  • A web-based notebook with the same core features but support for code, text, mathematical expressions, inline plots and other rich media.
  • Support for interactive data visualization and use of GUI toolkits.
  • Flexible, embeddable interpreters to load into your own projects.
  • Easy to use, high performance tools for parallel computing.

Note that the first two lines tell you it helps you make the most of using Python. Thus you don’t need to alter your code: the IPython shell runs your Python code just like the normal Python shell does, only with more features.

I recommend reading the IPython tutorial to get a sense of what features you gain when using IPython.


回答 1

IPython基本上是“推荐”的Python shell,它提供了额外的功能。没有称为IPython的语言。

IPython is basically the “recommended” Python shell, which provides extra features. There is no language called IPython.


回答 2

即使在查看了该线程之后,我仍然认为ipython是python shell的同义词,换句话说,在命令行中键入python会将其置于ipython模式。

实际上,如上所述,它实际上是一个非常酷的交互式外壳程序(命令行程序),可以从iPython.org安装或通过运行

pip install ipython

或更广泛的内容:

pip install ipython[notebook]

从命令行。

Even after viewing this thread, I had thought that ipython was a synonym for the python shell, in other words that typing python at the command line put one into ipython mode.

It is in fact, as referenced above, a very cool interactive shell (command line program) that can be installed from iPython.org or simply by running

pip install ipython

or the more extensive:

pip install ipython[notebook]

from the command line.


回答 3

IPython是功能强大的交互式Python解释器,与标准解释器相比,它更具交互性。

要获取标准的Python解释器,请键入,python然后会>>>从您可以工作的地方获得提示。

要获取IPython解释器,您需要先安装它。pip install ipython。您键入ipython并得到In [1]:提示,然后In [2]:输入下一条命令。您可以调用history以检查以前的命令列表,并写回%recall 1以调用该命令。

即使您在Python中,也可以直接运行shell命令,例如!ping www.google.com。如果您之前使用过的话,它看起来就像是命令行的Jupiter笔记本。

您可以使用[Tab]自动完成,如图所示。

IPython is a powerful interactive Python interpreter that is more interactive than the standard interpreter.

To get the standard Python interpreter, type python and you will get the >>> prompt, from which you can work.

To get the IPython interpreter, you need to install it first: pip install ipython. Then type ipython and you get In [1]: as a prompt, and In [2]: for the next command. You can call history to list previous commands, and write %recall 1 to bring command 1 back to the prompt.

Even though you are in Python, you can run shell commands directly, like !ping www.google.com. It looks like a command-line Jupyter notebook, if you have used that before.

You can use [Tab] to autocomplete.


回答 4

Python和IPython之间几乎没有区别,但是它们仅是对少量语法的解释,就像@Ryan Chase提到的语法一样,但是即使在Ipython中,Python的内在本质也得以保留。

IPython的最好部分是IPython笔记本。您可以将所有工作放入脚本,图像文件等笔记本中。但是使用基本Python,您只能将脚本制作在文件中并执行。

开始时,您需要了解IPython是为了在单个集成容器中支持富媒体和Python脚本而开发的。

There are a few differences between Python and IPython, but they come down to how a small amount of extra syntax is interpreted, such as the examples mentioned by @Ryan Chase. Underneath, IPython is still plain Python.

The best part of IPython is the IPython notebook. You can put all your work into a notebook: scripts, image files, and so on. With base Python, you can only write a script in a file and execute it.

To begin with, you need to understand that IPython was developed with the intention of supporting rich media and Python scripts in a single integrated container.


回答 5

与Python相比,IPython(由Fernando Perez在2001年创建)可以完成python可以做的所有事情。Ipython甚至提供了额外的功能,例如制表符完成,测试,调试,系统调用和许多其他功能。您可以将IPython视为Python语言的强大接口。

您可以使用pip 安装Ipython-pip install ipython

您可以通过在终端窗口中键入来运行Ipythonipython

Compared to Python, IPython (created by Fernando Perez in 2001) can do everything Python can do. IPython also provides extra features such as tab completion, testing, debugging, system calls and many others. You can think of IPython as a powerful interface to the Python language.

You can install IPython using pip: pip install ipython

You can run IPython by typing ipython in your terminal window.


回答 6

根据我的经验,我发现某些在IPython中运行的命令不在基本Python中运行。例如,pwdls不要单独在基地的Python工作。但是,如果以%诸如:%pwd和开头,则它们将起作用%ls

另外,在IPython中,您可以运行cd命令,如下所示: cd C:\Users\ …即使在前缀为a的%情况下,这在基本python中似乎也不起作用。

From my experience I’ve found that some commands which run in IPython do not run in base Python. For example, pwd and ls don’t work alone in base Python. However they will work if prefaced with a % such as: %pwd and %ls.

Also, in IPython, you can run the cd command like: cd C:\Users\… This doesn’t seem to work in base python, even when prefaced with a % however.
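
For comparison, plain Python can do the same jobs through the standard library, just without the magic syntax. The sketch below uses a temporary directory and a file name (example.txt) chosen only for illustration.

```python
import os
import tempfile

start_dir = os.getcwd()                 # what %pwd reports

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)                       # what %cd does
    open('example.txt', 'w').close()    # create an empty file to list
    listing = sorted(os.listdir('.'))   # roughly what %ls shows
    os.chdir(start_dir)                 # leave before the directory is removed

print(start_dir, listing)
```

Code written this way runs unchanged in both the standard interpreter and IPython, whereas %pwd, %ls and %cd are understood only by IPython.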


将图像插入IPython Notebook Markdown

问题:将图像插入IPython Notebook Markdown

我开始严重依赖IPython Notebook应用程序来开发和记录算法。太棒了; 但是似乎有些应该可行的方法,但是我不知道该怎么做:

我想在我的(本地)IPython笔记本降价中插入本地图像,以帮助记录算法。我知道足够<img src="image.png">在降价中添加类似内容,但据我所知。我假设我可以将图像放在127.0.0.1:8888表示的目录(或某些子目录)中,以便能够访问它,但是我不知道该目录在哪里。(我正在Mac上工作。)那么,是否可以在没有太多麻烦的情况下做我想做的事情?

I am starting to depend heavily on the IPython notebook app to develop and document algorithms. It is awesome; but there is something that seems like it should be possible, but I can’t figure out how to do it:

I would like to insert a local image into my (local) IPython notebook markdown to aid in documenting an algorithm. I know enough to add something like <img src="image.png"> to the markdown, but that is about as far as my knowledge goes. I assume I could put the image in the directory represented by 127.0.0.1:8888 (or some subdirectory) to be able to access it, but I can’t figure out where that directory is. (I’m working on a mac.) So, is it possible to do what I’m trying to do without too much trouble?


回答 0

笔记本目录中的文件位于“ files /” URL下。因此,如果它位于基本路径中,则为<img src="files/image.png">,子目录等也可用:<img src="files/subdir/image.png">,等等。

更新:从IPython 2.0开始,files/不再需要该前缀(请参阅发行说明)。因此,现在该解决方案<img src="image.png">可以按预期工作。

Files inside the notebook dir are available under a “files/” url. So if it’s in the base path, it would be <img src="files/image.png">, and subdirs etc. are also available: <img src="files/subdir/image.png">, etc.

Update: starting with IPython 2.0, the files/ prefix is no longer needed (cf. release notes). So now the solution <img src="image.png"> simply works as expected.


回答 1

到目前为止,大多数给出的答案都是错误的,建议加载其他库并使用代码而不是标记。在Ipython / Jupyter Notebooks中,这非常简单。确保单元格确实在标记中并显示图像以供使用:

![alt text](imagename.png "Title")

与建议的其他方法相比,进一步的优点是您可以显示所有常见的文件格式,包括jpg,png和gif(动画)。

Most of the answers given so far go in the wrong direction, suggesting that you load additional libraries and use code instead of markup. In IPython/Jupyter Notebooks it is very simple. Make sure the cell is a Markdown cell, and to display an image use:

![alt text](imagename.png "Title")

Further advantage compared to the other methods proposed is that you can display all common file formats including jpg, png, and gif (animations).


回答 2

我正在使用ipython 2.0,所以只有两行。

from IPython.display import Image
Image(filename='output1.png')

I am using IPython 2.0, so it takes just two lines.

from IPython.display import Image
Image(filename='output1.png')

回答 3

[过时]

IPython / Jupyter现在支持扩展模块,该模块可以通过复制和粘贴或拖放来插入图像。

https://github.com/ipython-contrib/IPython-notebook-extensions

拖放扩展似乎可以在大多数浏览器中使用

https://github.com/ipython-contrib/IPython-notebook-extensions/tree/master/nbextensions/usability/dragdrop

但是复制和粘贴仅适用于Chrome。

[Obsolete]

IPython/Jupyter now has support for extension modules that can insert images via copy and paste or drag & drop.

https://github.com/ipython-contrib/IPython-notebook-extensions

The drag & drop extension seems to work in most browsers

https://github.com/ipython-contrib/IPython-notebook-extensions/tree/master/nbextensions/usability/dragdrop

But copy and paste only works in Chrome.


回答 4

将图像导入Jupyter NB比大多数人在这里提到的要简单得多。

1)只需创建一个空的Markdown单元。2)然后将图像文件拖放到空白的Markdown单元格中。

然后出现将插入图像的降价代码。

例如,下面以灰色突出显示的字符串将出现在Jupyter单元格中:

![Venus_flytrap_taxonomy.jpg](attachment:Venus_flytrap_taxonomy.jpg)

3)然后按Shift-Enter执行Markdown单元。然后,Jupyter服务器将插入图像,然后图像将出现。

我正在运行Jupyter笔记本服务器是:Windows 7上具有Python 3.7.0的5.7.4。

这是如此简单!

Getting an image into Jupyter NB is a much simpler operation than most people have alluded to here.

1) Simply create an empty Markdown cell. 2) Then drag-and-drop the image file into the empty Markdown cell.

The Markdown code that will insert the image then appears.

For example, a string shown highlighted in gray below will appear in the Jupyter cell:

![Venus_flytrap_taxonomy.jpg](attachment:Venus_flytrap_taxonomy.jpg)

3) Then execute the Markdown cell by hitting Shift-Enter. The Jupyter server will then insert the image, and the image will then appear.

I am running Jupyter notebook server 5.7.4 with Python 3.7.0 on Windows 7.

This is so simple !!


回答 5

我将IPython笔记本与图像放在同一文件夹中。我使用Windows。图像名称是“ phuong huong xac dinh.PNG”。

在Markdown中:

<img src="phuong huong xac dinh.PNG">

码:

from IPython.display import Image
Image(filename='phuong huong xac dinh.PNG')

I put the IPython notebook in the same folder with the image. I use Windows. The image name is “phuong huong xac dinh.PNG”.

In Markdown:

<img src="phuong huong xac dinh.PNG">

Code:

from IPython.display import Image
Image(filename='phuong huong xac dinh.PNG')

回答 6

首先确保您在ipython笔记本单元格中处于markdown编辑模型中

这是其他人提出的方法的替代方法<img src="myimage.png">

![title](img/picture.png)

如果标题丢失,它似乎也可以工作:

![](img/picture.png)

请注意,路径中不应包含任何引号。不知道这是否适用于带有空格的路径!

First make sure you are in Markdown edit mode in the IPython notebook cell.

This is an alternative to the method proposed by others (<img src="myimage.png">):

![title](img/picture.png)

It also seems to work if the title is missing:

![](img/picture.png)

Note no quotations should be in the path. Not sure if this works for paths with white spaces though!
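
On the question of spaces: a Markdown link destination cannot contain a raw space, so one option is to percent-encode the path before writing the tag. The helper below is a sketch under that assumption; markdown_image is a hypothetical name, and the example path is made up.

```python
from urllib.parse import quote

def markdown_image(path, alt=""):
    """Build a Markdown image tag, percent-encoding characters such as spaces."""
    # quote() leaves '/' untouched by default, so folder separators survive.
    return f"![{alt}]({quote(path)})"

tag = markdown_image("img/my picture.png", alt="title")
print(tag)
```

Paste the printed tag into a Markdown cell; whether the encoded path resolves may depend on the notebook version, so it is worth checking in your own setup.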


回答 7

Jupyter Notebook的最新版本本机接受图像的复制/粘贴

The latest version of Jupyter Notebook accepts copy/paste of images natively.


回答 8

如果要在Markdown单元中显示图像,请使用:

<img src="files/image.png" width="800" height="400">

如果要在“代码”单元格中显示图像,请使用:

from IPython.display import Image
Image(filename='output1.png',width=800, height=400)

If you want to display the image in a Markdown cell then use:

<img src="files/image.png" width="800" height="400">

If you want to display the image in a Code cell then use:

from IPython.display import Image
Image(filename='output1.png',width=800, height=400)

回答 9

在运行此代码之前,将默认块从“代码”更改为“ Markdown”:

![<caption>](image_filename.png)

如果图像文件在另一个文件夹中,则可以执行以下操作:

![<caption>](folder/image_filename.png)

Change the default block from “Code” to “Markdown” before running this code:

![<caption>](image_filename.png)

If image file is in another folder, you can do the following:

![<caption>](folder/image_filename.png)

回答 10

对于那些希望将图像文件放在Jupyter机器上的位置的人,以便可以从本地文件系统显示它。

我把我的mypic.png

/root/Images/mypic.png

(即Jupyter在线文件浏览器中显示的Images文件夹)

在这种情况下,我需要将以下行放入Markdown单元中,以使我的图片显示在记事本中:

![My Title](Images/mypic.png)

For those looking for where to place the image file on the Jupyter machine so that it can be shown from the local file system:

I put my mypic.png into

/root/Images/mypic.png

(that is the Images folder that shows up in the Jupyter online file browser)

In that case I need to put the following line into the Markdown cell to make my picture show up in the notebook:

![My Title](Images/mypic.png)

回答 11

明克的答案是正确的。

但是,我发现这些图像在“打印视图”中显得不完整(在运行Chrome浏览器中IPython 0.13.2版的Anaconda发行版的Windows计算机上)

解决方法是使用 <img src="../files/image.png">

这使得图像可以在“打印视图”和普通的iPython编辑视图中正确显示。

更新:自从我升级到iPython v1.1.0以来,由于打印视图不再存在,因此不再需要这种解决方法。实际上,您必须避免这种解决方法,因为它会阻止nbconvert工具查找文件。

minrk’s answer is right.

However, I found that the images appeared broken in Print View (on my Windows machine running the Anaconda distribution of IPython version 0.13.2 in a Chrome browser)

The workaround for this was to use <img src="../files/image.png"> instead.

This made the image appear correctly in both Print View and the normal iPython editing view.

UPDATE: as of my upgrade to iPython v1.1.0 there is no more need for this workaround since the print view no longer exists. In fact, you must avoid this workaround since it prevents the nbconvert tool from finding the files.


回答 12

您可以在jupyter笔记本中使用“ pwd”命令查找当前工作目录,不带引号。

You can find your current working directory with the pwd command in a Jupyter notebook (type it without quotes).


如何在Jupyter Notebook中显示文件中的图像?

问题:如何在Jupyter Notebook中显示文件中的图像?

我想使用IPython笔记本作为交互式分析我使用Biopython GenomeDiagram模块制作的一些基因组图的方法。尽管有大量有关如何matplotlib在IPython笔记本中用于内联获取图形的文档,但GenomeDiagram使用ReportLab工具箱,我认为IPython不支持内联图形。

但是,我当时想,一种解决方法是将绘图/基因组图写到文件中,然后打开图像内联图像,结果将是相同的,如下所示:

gd_diagram.write("test.png", "PNG")
display(file="test.png")

但是,我不知道该怎么做-或知道是否可能。有人知道是否可以在IPython中打开/显示图像吗?

I would like to use an IPython notebook as a way to interactively analyze some genome charts I am making with Biopython’s GenomeDiagram module. While there is extensive documentation on how to use matplotlib to get graphs inline in IPython notebook, GenomeDiagram uses the ReportLab toolkit which I don’t think is supported for inline graphing in IPython.

I was thinking, however, that a way around this would be to write out the plot/genome diagram to a file and then open the image inline which would have the same result with something like this:

gd_diagram.write("test.png", "PNG")
display(file="test.png")

However, I can’t figure out how to do this – or know if it’s possible. So does anyone know if images can be opened/displayed in IPython?


Answer 0

Courtesy of this post, you can do the following:

from IPython.display import Image
Image(filename='test.png') 

(official docs)


Answer 1

If you are trying to display an Image in this way inside a loop, then you need to wrap the Image constructor in a display method.

from IPython.display import Image, display

listOfImageNames = ['/path/to/images/1.png',
                    '/path/to/images/2.png']

for imageName in listOfImageNames:
    display(Image(filename=imageName))

Answer 2

Note that the solutions posted so far only work for PNG and JPG!

If you want something even simpler that needs no extra imports, or you want to display a GIF file (animated or not) in your IPython Notebook, change the cell where you want to display it to Markdown and use this short hack:

![alt text](test.gif "Title")

Answer 3

This will import and display a .jpg image in Jupyter (tested with Python 2.7 in Anaconda environment)

from IPython.display import display
from PIL import Image


path="/path/to/image.jpg"
display(Image.open(path))

You may need to install PIL (distributed as the Pillow package).

In Anaconda this is done by typing:

conda install pillow

Answer 4

Courtesy of this page, I found this worked when the suggestions above didn’t:

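import PIL.Image
from io import BytesIO  # cStringIO exists only on Python 2; io.BytesIO also works on Python 3
import IPython.display
import numpy as np

def showarray(a, fmt='png'):
    # Convert to 8-bit, encode the image in memory, then display the encoded bytes
    a = np.uint8(a)
    f = BytesIO()
    PIL.Image.fromarray(a).save(f, fmt)
    IPython.display.display(IPython.display.Image(data=f.getvalue()))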

Answer 5

You can use HTML code in a Markdown cell. For example:

 <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />
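
When the image is a local file rather than something hosted at a URL, one possible variant is to embed the bytes in the tag as a base64 data URI. This is only a sketch using the standard library: the helper name img_tag_from_bytes is hypothetical, and whether a given notebook front end renders data URIs inside Markdown cells is an assumption you should check.

```python
import base64
import mimetypes


def img_tag_from_bytes(data, filename):
    """Build an <img> tag that embeds the image bytes as a base64 data URI."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png)
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError("not a recognised image file name: %r" % filename)
    encoded = base64.b64encode(data).decode("ascii")
    return '<img src="data:{};base64,{}" />'.format(mime, encoded)


# The 8-byte PNG signature, used here only as stand-in data
png_signature = b"\x89PNG\r\n\x1a\n"
print(img_tag_from_bytes(png_signature, "test.png"))
```

In practice you would pass the result of open(path, "rb").read() as the data argument and paste the returned tag into the Markdown cell.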

Answer 6

A cleaner Python 3 version that uses standard numpy, matplotlib and PIL, merged with the answer for opening an image from a URL.

import matplotlib.pyplot as plt
from PIL import Image
import numpy as np

pil_im = Image.open('image.png')  # accepts jpg and png
## Uncomment to open from URL
#import requests
#from io import BytesIO
#r = requests.get('https://www.vegvesen.no/public/webkamera/kamera?id=131206')
#pil_im = Image.open(BytesIO(r.content))
im_array = np.asarray(pil_im)
plt.imshow(im_array)
plt.show()

Answer 7

If you want to display a large number of images efficiently, I recommend the IPyPlot package:

import ipyplot

ipyplot.plot_images(images_array, max_images=20, img_width=150)

The package also has other useful functions, for example displaying images in interactive tabs (a separate tab for each label/class), which is very helpful for ML classification tasks.


Answer 8

When using GenomeDiagram with Jupyter (IPython), the easiest way to display images is to convert the GenomeDiagram to a PNG image. The result can be wrapped in an IPython.display.Image object to make it display in the notebook.

from Bio.Graphics import GenomeDiagram
from Bio.SeqFeature import SeqFeature, FeatureLocation
from IPython.display import display, Image
gd_diagram = GenomeDiagram.Diagram("Test diagram")
gd_track_for_features = gd_diagram.new_track(1, name="Annotated Features")
gd_feature_set = gd_track_for_features.new_set()
gd_feature_set.add_feature(SeqFeature(FeatureLocation(25, 75), strand=+1))
gd_diagram.draw(format="linear", orientation="landscape", pagesize='A4',
                fragments=1, start=0, end=100)
Image(gd_diagram.write_to_string("PNG"))

[See Notebook]


Simple way to measure cell execution time in IPython notebook

Question: Simple way to measure cell execution time in IPython notebook

I would like to get the time spent on the cell execution in addition to the original output from the cell.

To this end, I tried %%timeit -r1 -n1, but it doesn't expose the variables defined within the cell.

%%time works for a cell that contains only one statement.

In[1]: %%time
       1
CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 5.96 µs
Out[1]: 1

In[2]: %%time
       # Notice there is no out result in this case.
       x = 1
       x
CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.96 µs

What’s the best way to do it?

Update

I have been using Execute Time in Nbextension for quite some time now. It is great.


Answer 0

Use cell magic and this project on github by Phillip Cloud:

Load it by putting this at the top of your notebook, or put it in your config file if you always want it loaded by default:

%install_ext https://raw.github.com/cpcloud/ipython-autotime/master/autotime.py
%load_ext autotime

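Once loaded, the output of every subsequently executed cell includes the time it took to run, in minutes and seconds.

Note that %install_ext has been removed from recent IPython releases. There, the extension can instead be installed with pip install ipython-autotime before running %load_ext autotime.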


Answer 1

The only way I found to overcome this problem is by executing the last statement with print.

Do not forget that cell magic starts with %% and line magic starts with %.

%%time
clf = tree.DecisionTreeRegressor().fit(X_train, y_train)
res = clf.predict(X_test)
print(res)

Notice that any changes performed inside the cell are not taken into consideration in the next cells, which is counterintuitive when there is a pipeline:


Answer 2

%time and %timeit are now part of IPython's built-in magic commands.


Answer 3

An easier way is to use the ExecuteTime plugin from the jupyter_contrib_nbextensions package.

pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime

Answer 4

I simply added %%time at the beginning of the cell and got the time. The same works on a Jupyter Spark cluster or in a virtual environment: just add %%time at the top of the cell and you will get the output. On a Spark cluster using Jupyter, I added it to the top of the cell and got output like the following:

[1]  %%time
     import pandas as pd
     from pyspark.ml import Pipeline
     from pyspark.ml.classification import LogisticRegression
     import numpy as np
     .... code ....

Output :-

CPU times: user 59.8 s, sys: 4.97 s, total: 1min 4s
Wall time: 1min 18s

Answer 5

import time

start = time.time()
# the code you want to time goes here
end = time.time()
print(end - start)  # elapsed wall-clock time in seconds
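
A small variation on the same idea, sketched under the assumption that the code to be timed can be wrapped in a function: time.perf_counter() is a monotonic, high-resolution clock intended for measuring intervals, so it is not affected by system clock adjustments. The helper name timed_call is hypothetical.

```python
import time


def timed_call(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Time a single call and keep its return value
total, seconds = timed_call(sum, range(1_000_000))
print("result:", total)
print("elapsed: {:.6f} s".format(seconds))
```

Because the helper returns the result as well as the elapsed time, variables computed in the timed code remain available afterwards.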

Answer 6

You can use the timeit magic function for that.

%timeit CODE_LINE

Or for a whole cell:

%%timeit 

SOME_CELL_CODE

Check more IPython magic functions at https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb


Answer 7

This is not exactly beautiful, but it needs no extra software:

class timeit():
    from datetime import datetime
    def __enter__(self):
        self.tic = self.datetime.now()
    def __exit__(self, *args, **kwargs):
        print('runtime: {}'.format(self.datetime.now() - self.tic))

Then you can run it like:

with timeit():
    # your code, e.g., 
    print(sum(range(int(1e7))))

% 49999995000000
% runtime: 0:00:00.338492
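
The same pattern can be written more compactly with contextlib.contextmanager from the standard library. This is a sketch rather than a drop-in replacement: the name stopwatch and the dictionary used to hand the elapsed time back to the caller are illustrative choices, not part of the original answer.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label="runtime"):
    """Time the body of a with-block and print the elapsed seconds."""
    timing = {"elapsed": None}
    start = time.perf_counter()
    try:
        yield timing
    finally:
        # Runs even if the timed block raises an exception
        timing["elapsed"] = time.perf_counter() - start
        print("{}: {:.6f} s".format(label, timing["elapsed"]))


with stopwatch("sum") as t:
    total = sum(range(int(1e6)))

print(total)
```

After the block finishes, t["elapsed"] holds the measured time, so it can be logged or compared without parsing the printed output.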

Answer 8

Sometimes the output is formatted differently when you use print(res), but Jupyter/IPython comes with a display function. See the example below of the formatting difference with pandas.

%%time
import pandas as pd 
from IPython.display import display

df = pd.DataFrame({"col0":{"a":0,"b":0}
              ,"col1":{"a":1,"b":1}
              ,"col2":{"a":2,"b":2}
             })

#compare the following
print(df)
display(df)

The display statement can preserve the formatting.


Answer 9

You may also want to look into IPython's profiling magic command %prun, which gives something like the following.

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

then

%prun sum_of_lists(1000000)

will return

14 function calls in 0.714 seconds  

Ordered by: internal time      

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    5    0.599    0.120    0.599    0.120 <ipython-input-19>:4(<listcomp>)
    5    0.064    0.013    0.064    0.013 {built-in method sum}
    1    0.036    0.036    0.699    0.699 <ipython-input-19>:1(sum_of_lists)
    1    0.014    0.014    0.714    0.714 <string>:1(<module>)
    1    0.000    0.000    0.714    0.714 {built-in method exec}

I find it useful when working with large chunks of code.
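
Outside IPython, a comparable report can be produced with the standard library's cProfile and pstats modules. The sketch below reuses the sum_of_lists function from the example with a smaller input; the exact rows and timings will differ between machines and Python versions.

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


# Collect profiling data only while the function of interest runs
profiler = cProfile.Profile()
profiler.enable()
sum_of_lists(100000)
profiler.disable()

# Sort by internal time, the ordering used in the %prun output above
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("tottime").print_stats(5)
print(report.getvalue())
```

Writing the report into a StringIO buffer makes it easy to save or search the text instead of only printing it.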


Answer 10

When in doubt about what means what, run:

?%timeit or ??timeit

to get the details:

Usage, in line mode:
  %timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] statement
or in cell mode:
  %%timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] setup_code
  code
  code...

Time execution of a Python statement or expression using the timeit
module.  This function can be used both as a line and cell magic:

- In line mode you can time a single-line statement (though multiple
  ones can be chained with using semicolons).

- In cell mode, the statement in the first line is used as setup code
  (executed but not timed) and the body of the cell is timed.  The cell
  body has access to any variables created in the setup code.
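
The split between setup code and timed code described above also exists in the standard library's timeit module, which the magic is built on. A minimal sketch, with arbitrarily chosen statements and with repeat and number playing roughly the role of the -r and -n options:

```python
import timeit

# Setup code is executed but not included in the measurement
setup_code = "data = list(range(1000))"
statement = "sorted(data, reverse=True)"

# 3 repeats of 1000 loops each; each entry is the total time for one repeat
timings = timeit.repeat(stmt=statement, setup=setup_code, repeat=3, number=1000)
best_per_loop = min(timings) / 1000
print("best of 3: {:.3e} s per loop".format(best_per_loop))
```

The statement can use the variable data because, as in cell mode, the timed code has access to anything created in the setup code.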

Answer 11

If you want to print the wall-clock execution time of a cell, here is a trick. Use:

%%time
<--code goes here-->

But keep in mind that %%time is a cell magic, so it must be the first line of the cell.

If you put it after other lines of code, it raises a usage error and does not work.