tqdm in Jupyter Notebook repeatedly prints new progress bars

Question: tqdm in Jupyter Notebook repeatedly prints new progress bars

I am using tqdm to print progress in a script I’m running in a Jupyter notebook. I am printing all messages to the console via tqdm.write(). However, this still gives me a skewed output like so:

That is, each time a new line has to be printed, a new progress bar is printed on the next line. This does not happen when I run the script via terminal. How can I solve this?


Answer 0

Try using tqdm.notebook.tqdm instead of tqdm, as outlined here.

This could be as simple as changing your import to:

from tqdm.notebook import tqdm

Good luck!

EDIT: After testing, it seems that tqdm actually works fine in ‘text mode’ in Jupyter notebook. It’s hard to tell because you haven’t provided a minimal example, but it looks like your problem is caused by a print statement in each iteration. The print statement is outputting a number (~0.89) in between each status bar update, which is messing up the output. Try removing the print statement.


Answer 1

This is an alternative answer for the case where tqdm_notebook doesn’t work for you.

Given the following example:

from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values)) as pbar:
    for i in values:
        pbar.write('processed: %d' % (1 + i))
        pbar.update(1)
        sleep(1)

The output would look something like this (progress would show up red):

  0%|          | 0/3 [00:00<?, ?it/s]
processed: 1
 67%|██████▋   | 2/3 [00:01<00:00,  1.99it/s]
processed: 2
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]
processed: 3

The problem is that the outputs to stdout and stderr are processed asynchronously, and each stream is handled separately when it comes to new lines.

Say Jupyter receives the first line on stderr and then the “processed” output on stdout. Once it then receives output on stderr to update the progress, it cannot go back and update the first line, because it can only update the last line. Instead it has to write a new line.
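
To make the mechanism concrete, here is a toy model of it in Python. It is only a sketch under the assumption described above (the front end keeps a separate output block per stream, and a carriage return can only redraw the last line of the current block). The `render` function and the event lists are made up for illustration and are not Jupyter’s actual code.

```python
def render(events):
    """Toy model of a notebook output area; not Jupyter's real implementation."""
    blocks = []  # each entry: [stream name, list of text lines]
    for stream, text in events:
        # Output from a different stream than the previous chunk starts a new block.
        if not blocks or blocks[-1][0] != stream:
            blocks.append([stream, [""]])
        lines = blocks[-1][1]
        for ch in text:
            if ch == "\r":
                lines[-1] = ""       # carriage return: redraw the current line
            elif ch == "\n":
                lines.append("")     # newline: start a fresh line
            else:
                lines[-1] += ch
    return [line for _, lines in blocks for line in lines if line]


# Bar updates on stderr interleaved with messages on stdout.
mixed = [
    ("stderr", "\r0/3"),
    ("stdout", "processed: 1\n"),
    ("stderr", "\r1/3"),
    ("stdout", "processed: 2\n"),
    ("stderr", "\r2/3"),
]
# The same bar updates with nothing in between.
bar_only = [("stderr", "\r0/3"), ("stderr", "\r1/3"), ("stderr", "\r2/3")]

print(render(mixed))     # every bar update ends up on its own line
print(render(bar_only))  # a single line that was overwritten in place
```

In this model the interleaved case leaves every bar state on screen, while the single-stream case collapses to one line, which matches the behaviour seen in the notebook.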

Workaround 1, writing to stdout

One workaround would be to output both to stdout instead:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.write('processed: %d' % (1 + i))
        pbar.update(1)
        sleep(1)

The output will change to (no more red):

processed: 1   | 0/3 [00:00<?, ?it/s]
processed: 2   | 0/3 [00:00<?, ?it/s]
processed: 3   | 2/3 [00:01<00:00,  1.99it/s]
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]

Here we can see that Jupyter doesn’t seem to clear through to the end of the line. We can work around that as well by padding the message with spaces, for example:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.write('processed: %d%s' % (1 + i, ' ' * 50))
        pbar.update(1)
        sleep(1)

Which gives us:

processed: 1                                                  
processed: 2                                                  
processed: 3                                                  
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]
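
Rather than hard-coding 50 spaces, the padding could be computed. A minimal sketch, assuming the bar is never wider than a fixed number of columns; `pad_message` and the default width of 80 are hypothetical, not part of tqdm:

```python
def pad_message(message, width=80):
    """Right-pad message with spaces so it covers a leftover bar of up to `width` columns."""
    # str.ljust never truncates: messages longer than `width` are returned unchanged.
    return message.ljust(width)


line = pad_message("processed: 1")
print(repr(line))
```

It would then be used as `pbar.write(pad_message('processed: %d' % (1 + i)))` inside the loop.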

Workaround 2, set description instead

In general it might be more straightforward not to have two outputs at all, and to update the description instead, e.g.:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.set_description('processed: %d' % (1 + i))
        pbar.update(1)
        sleep(1)

With the output (description updated while it’s processing):

processed: 3: 100%|██████████| 3/3 [00:02<00:00,  1.53it/s]

Conclusion

You can mostly get it to work fine with plain tqdm. But if tqdm_notebook works for you, just use that (though in that case you probably wouldn’t have read this far).


Answer 2

Most of the answers are outdated now. Better if you import tqdm correctly.

from tqdm import tqdm_notebook as tqdm


Answer 3

If the other tips here don’t work and – just like me – you’re using the pandas integration through progress_apply, you can let tqdm handle it:

from tqdm.autonotebook import tqdm
tqdm.pandas()

df.progress_apply(row_function, axis=1)

The main point here lies in the tqdm.autonotebook module. As stated in their instructions for use in IPython Notebooks, this makes tqdm choose between the progress bar formats used in Jupyter notebooks and Jupyter consoles. For reasons I have not yet investigated, the format chosen by tqdm.autonotebook works smoothly with pandas, and with progress_apply specifically, while all the others did not.
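
Roughly, that choice comes down to detecting whether the code is running under a Jupyter kernel. The sketch below shows one common way to make such a check. It illustrates the idea and is not a copy of tqdm’s implementation; `running_in_notebook_kernel` and `pick_tqdm_module` are hypothetical helper names.

```python
import sys


def running_in_notebook_kernel():
    """Best-effort guess: True only when IPython is loaded and a kernel app is configured."""
    ipython = sys.modules.get("IPython")
    if ipython is None:
        return False                      # IPython was never imported: plain Python
    shell = ipython.get_ipython()
    if shell is None:
        return False                      # IPython is loaded but no shell is active
    return "IPKernelApp" in shell.config  # present when running under an IPython kernel


def pick_tqdm_module():
    """Name of the tqdm submodule to import the progress bar class from."""
    return "tqdm.notebook" if running_in_notebook_kernel() else "tqdm.std"


print(pick_tqdm_module())
```

With tqdm installed, `importlib.import_module(pick_tqdm_module()).tqdm` would give the matching progress bar class; in practice `from tqdm.autonotebook import tqdm` already does this for you.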


Answer 4

To complete oscarbranson’s answer: it’s possible to automatically pick console or notebook versions of progress bar depending on where it’s being run from:

from tqdm.autonotebook import tqdm

More info can be found here


Answer 5

None of the above works for me. I find that running the following sorts out the issue after an error (it just clears all the instances of progress bars in the background):

from tqdm import tqdm

# blah blah your code errored

tqdm._instances.clear()

Answer 6

Use tqdm_notebook

from tqdm import tqdm_notebook as tqdm

x = [1, 2, 3, 4, 5]

for i in tqdm(range(len(x))):
    print(x[i])

Answer 7

For everyone who is on Windows and couldn’t solve the duplicated bars issue with any of the solutions mentioned here: I had to install the colorama package, as stated in tqdm’s known issues, which fixed it.

pip install colorama

Try it with this example:

from tqdm import tqdm
from time import sleep

for _ in tqdm(range(5), "All", ncols = 80, position = 0):
    for _ in tqdm(range(100), "Sub", ncols = 80, position = 1, leave = False):
        sleep(0.01)

Which will produce something like:

All:  60%|████████████████████████                | 3/5 [00:03<00:02,  1.02s/it]
Sub:  50%|██████████████████▌                  | 50/100 [00:00<00:00, 97.88it/s]


How to prevent Google Colab from disconnecting?

Question: How to prevent Google Colab from disconnecting?

Q: Is there any way to programmatically prevent Google Colab from disconnecting on a timeout?

The following describes the conditions causing a notebook to automatically disconnect:

Google Colab notebooks have an idle timeout of 90 minutes and an absolute timeout of 12 hours. This means that if a user does not interact with their Google Colab notebook for more than 90 minutes, its instance is automatically terminated. Also, the maximum lifetime of a Colab instance is 12 hours.

Naturally, we want to automatically squeeze the maximum out of the instance, without having to manually interact with it constantly. Here I will assume commonly seen system requirements:

  • Ubuntu 18 LTS / Windows 10 / Mac Operating systems
  • In case of Linux-based systems, using popular DEs like Gnome 3 or Unity
  • Firefox or Chromium browsers

I should point out here that such behavior does not violate Google Colab’s Terms of Use, although it is not encouraged according to their FAQ (in short: morally it is not okay to use up all of the GPUs if you don’t really need it).


My current solution is very dumb:

  • First, I turn the screensaver off, so my screen is always on.
  • I have an Arduino board, so I just turned it into a Rubber Ducky USB and made it emulate primitive user interaction while I sleep (just because I have it at hand for other use cases).

Are there better ways?


Answer 0

Edit: Apparently the solution is very easy, and doesn’t need any JavaScript. Just create a new cell at the bottom having the following line:

while True:pass

Now keep the cell in the run sequence so that the infinite loop does not stop, which keeps your session alive.

Old method: Set a JavaScript interval to click on the connect button every 60 seconds. Open the developer tools in your web browser with Ctrl+Shift+I (on a Mac, press Option+Command+I), click on the Console tab, and type this at the console prompt.

function ConnectButton(){
    console.log("Connect pushed"); 
    document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click() 
}
setInterval(ConnectButton,60000);

Answer 1

Since the id of the connect button is now changed to “colab-connect-button”, the following code can be used to keep clicking on the button.

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("colab-connect-button").click()
}
setInterval(ClickConnect,60000)

If this still doesn’t work, follow the steps given below:

  1. Right-click on the connect button (at the top right of the Colab page)
  2. Click on Inspect
  3. Get the HTML id of the button and substitute it into the following code

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("#PUT_ID_HERE").click() // Replace PUT_ID_HERE with the button's id, keeping the leading #
}
setInterval(ClickConnect,60000)

Answer 2

Well, this is working for me:

Run the following code in the browser console and it will prevent you from being disconnected. Press Ctrl+Shift+I to open the inspector view, then go to the Console tab.

function ClickConnect(){
    console.log("Working"); 
    document.querySelector("colab-toolbar-button#connect").click() 
}
setInterval(ClickConnect,60000)

How to prevent google colab from disconnecting


Answer 3

For me the following examples:

  • document.querySelector("#connect").click() or
  • document.querySelector("colab-toolbar-button#connect").click() or
  • document.querySelector("colab-connect-button").click()

were throwing errors.

I had to adapt them to the following:

Version 1:

function ClickConnect(){
  console.log("Connect Clicked - Start");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
  console.log("Connect Clicked - End");
}
setInterval(ClickConnect, 60000)

Version 2: If you would like to be able to stop the function, here is the new code:

var startClickConnect = function startClickConnect(){
    var clickConnect = function clickConnect(){
        console.log("Connect Clicked - Start");
        document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
        console.log("Connect Clicked - End");
    };

    // Keep the interval id so the timer can be cancelled later
    var intervalId = setInterval(clickConnect, 60000);

    var stopClickConnectHandler = function stopClickConnect() {
        console.log("Connect Clicked Stopped - Start");
        clearInterval(intervalId);
        console.log("Connect Clicked Stopped - End");
    };

    // Hand the stop function back to the caller
    return stopClickConnectHandler;
};

var stopClickConnect = startClickConnect();

In order to stop, call:

stopClickConnect();

Answer 4

Create a Python script on your PC using pynput:

from pynput.mouse import Button, Controller
import time

mouse = Controller()

while True:
    mouse.click(Button.left, 1)
    time.sleep(30)

Run this code on your desktop, then hover the mouse pointer over any directory in the directory structure (Colab’s left panel, Files section). The code clicks that directory every 30 seconds, so it expands and collapses every 30 seconds and your session does not expire. Important: you have to run this code on your own PC.


Answer 5

Instead of clicking the connect button, I just click the comment button to keep my session alive (August 2020).

function ClickConnect(){
    console.log("Working");
    document.querySelector("#comments > span").click()
}
setInterval(ClickConnect,5000)

Answer 6

I use a macro program to periodically click the RAM/Disk button so that I can train the model all night. The trick is to configure the macro program to click the RAM/Disk button in the Colab toolbar twice, with a short interval between the two clicks, so that even if the runtime gets disconnected it will reconnect (the first click closes the dialog box and the second click reconnects). However, you still have to leave your laptop open all night, and maybe pin the Colab tab.


Answer 7

The answers above, which rely on scripts, may work well. I have a solution (or a kind of trick) for that annoying disconnection that needs no scripts. It is especially useful when your program must read data from your Google Drive, as when training a deep learning model: there, a reconnect script is of no use, because once you are disconnected from Colab the program is dead, and you have to connect to Google Drive again manually before your model can read the dataset, which the scripts will not do.
I’ve tested it many times and it works well.
When you run a program on the Colab page in a browser (I use Chrome), don’t do anything in the browser once the program starts running: don’t switch to other web pages, and don’t open or close another page. Just leave it alone and wait for your program to finish. You can switch to other software, such as PyCharm, to keep writing code, but not to another web page. I don’t know why opening, closing, or switching pages causes connection problems for the Colab page, but each time I disturb my browser, for example by running a search, my connection to Colab soon breaks.


Answer 8

Try this:

function ClickConnect(){
  console.log("Working"); 
  document
    .querySelector("#top-toolbar > colab-connect-button")
    .shadowRoot
    .querySelector("#connect")
    .click()
}

setInterval(ClickConnect,60000)

Answer 9

Using Python with Selenium:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service('/usr/lib/chromium-browser/chromedriver'))

notebook_url = ''  # put the URL of your notebook here
driver.get(notebook_url)

# run all cells
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.CONTROL + Keys.F9)
time.sleep(5)

# click to stay connected
start_time = time.time()
current_time = time.time()
max_time = 11*59*60  # about 10.8 hours, just under the 12-hour limit

while (current_time - start_time) < max_time:
    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()
    driver.find_element(By.XPATH, '//*[@id="top-toolbar"]/colab-connect-button').click()
    time.sleep(30)
    current_time = time.time()
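
As a quick check of the time budget in the loop above: 11*59*60 seconds is a little under 11 hours, which leaves some margin before Colab’s 12-hour limit. A small sketch of the arithmetic (it ignores the time each click itself takes, so the click count is only an upper bound):

```python
from datetime import timedelta

max_time = 11 * 59 * 60                    # seconds, as in the loop above
budget = timedelta(seconds=max_time)
margin = timedelta(hours=12) - budget      # slack before the 12-hour limit
max_clicks = max_time // 30                # at most one click per 30-second sleep

print(budget)      # 10:49:00
print(margin)      # 1:11:00
print(max_clicks)  # 1298
```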

Answer 10

I don’t believe the JavaScript solutions work anymore. I was doing it from within my notebook with:

    from IPython.display import display, HTML
    js = ('<script>function ConnectButton(){ '
           'console.log("Connect pushed"); '
           'document.querySelector("#connect").click()} '
           'setInterval(ConnectButton,3000);</script>')
    display(HTML(js))

When you first do a Run all (before the JavaScript or Python code has started), the console displays:

Connected to 
wss://colab.research.google.com/api/kernels/0e1ce105-0127-4758-90e48cf801ce01a3/channels?session_id=5d8...

However, every time the JavaScript runs, you see the console.log portion, but the click portion simply gives:

Connect pushed

Uncaught TypeError: Cannot read property 'click' of null
 at ConnectButton (<anonymous>:1:92)

Others suggested that the button name has changed to #colab-connect-button, but that gives the same error.

After the runtime is started, the button changes to show RAM/DISK, and a drop-down is presented. Clicking the drop-down creates a new <DIV class=goog menu...> that was not in the DOM previously, with two options: “Connect to hosted runtime” and “Connect to local runtime”. If the console window is open and showing elements, you can see this DIV appear when you click the drop-down element. Simply moving the mouse focus between the two options in the new window adds additional elements to the DOM; as soon as the mouse loses focus, they are removed from the DOM completely, even without clicking.


Answer 11

I tried the code above but it did not work for me. So here is my JS code for reconnecting.

let interval = setInterval(function(){
    let ok = document.getElementById('ok');
    if(ok != null){
        console.log("Connect pushed");
        ok.click();
    }
},60000)

Run it the same way, in your browser’s console. To stop the script, enter clearInterval(interval). To start it again, re-run the snippet above, since setInterval needs the callback function rather than the old interval id.

I hope this helps you.


Answer 12

An updated one. It works for me.

function ClickConnect(){
    console.log("Working");
    document.querySelector("paper-icon-button").click()
}
const myjob = setInterval(ClickConnect, 60000)

If it isn’t working for you, clear it by running:

clearInterval(myjob)

Answer 13

This one worked for me (it seems like they changed the button classname or id) :

function ClickConnect(){
    console.log("Working"); 
    document.querySelector("colab-connect-button").click() 
}
setInterval(ClickConnect,60000)

Answer 14

The most-voted answer certainly works for me, but it makes the Manage session window pop up again and again.
I’ve solved that by auto-clicking the refresh button from the browser console, as below:

function ClickRefresh(){
    console.log("Clicked on refresh button"); 
    document.querySelector("paper-icon-button").click()
}
setInterval(ClickRefresh, 60000)

Feel free to contribute more snippets for this at this gist https://gist.github.com/Subangkar/fd1ef276fd40dc374a7c80acc247613e


Answer 15

Perhaps many of the previous solutions no longer work. For example, the code below does work, but it keeps creating new code cells in Colab. Creating a bunch of code cells is undoubtedly an inconvenience: if too many are created over some hours of running and there is not enough RAM, the browser may freeze.

This repeatedly creates code cells:

function ClickConnect(){
    console.log("Working");
    document.querySelector("colab-toolbar-button").click()
}
setInterval(ClickConnect,60000)

But I found that the code below works and doesn’t cause any problems. In the Colab notebook tab, press Ctrl + Shift + I and paste the code below into the console. An interval of 120000 ms is enough.

function ClickConnect(){
    console.log("Working");
    document.querySelector("colab-toolbar-button#connect").click()
}
setInterval(ClickConnect,120000)

I tested this code in Firefox in November 2020. It will work in Chrome too.


Answer 16

I would recommend using jQuery (it seems that Colab includes jQuery by default).

function ClickConnect(){
  console.log("Working");
  $("colab-toolbar-button").click();
}
setInterval(ClickConnect,60000);

Answer 17

I have a problem with these javascript functions:

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("colab-connect-button").click()
}
setInterval(ClickConnect,60000)

They print “Clicked on connect button” on the console before the button is actually clicked. As you can see from the different answers in this thread, the id of the connect button has changed a couple of times since Google Colab was launched, and it could change again in the future. So if you copy an old answer from this thread, it may say “Clicked on connect button” without actually having done so. Of course, if the click doesn’t work, an error is printed to the console, but what if you happen not to see it? So you had better do this:

function ClickConnect(){
    document.querySelector("colab-connect-button").click()
    console.log("Clicked on connect button"); 
}
setInterval(ClickConnect,60000)

And you’ll definitely see if it truly works or not.


Answer 18

function ClickConnect()
{
    console.log("Working...."); 
    document.querySelector("paper-button#comments").click()
}
setInterval(ClickConnect,600)

This worked for me, but use it wisely.

Happy learning :)


Answer 19

The following latest solution works for me:

function ClickConnect(){
  colab.config
  console.log("Connect Clicked - Start");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
  console.log("Connect Clicked - End");
}
setInterval(ClickConnect, 60000)

Answer 20

The javascript below works for me. Credits to @artur.k.space.

function ColabReconnect() {
    var dialog = document.querySelector("colab-dialog.yes-no-dialog");
    var dialogTitle = dialog && dialog.querySelector("div.content-area>h2");
    if (dialogTitle && dialogTitle.innerText == "Runtime disconnected") {
        dialog.querySelector("paper-button#ok").click();
        console.log("Reconnecting...");
    } else {
        console.log("ColabReconnect is in service.");
    }
}
timerId = setInterval(ColabReconnect, 60000);

In the Colab notebook, press Ctrl + Shift + I to open the browser’s developer tools. Copy and paste the script into the console prompt, then hit Enter before closing the developer tools.

By doing so, the function will check every 60 seconds to see if the onscreen connection dialog is shown, and if it is, the function would then click the ok button automatically for you.


回答 21

好吧,我不是python家伙,也不知道这个’Colab’的实际用途是什么,我将其用作构建系统。我以前在其中设置了ssh转发,然后将这段代码放到运行中,是的。

import getpass
authtoken = getpass.getpass()

Well, I am not a Python guy, nor do I know what the actual use of this ‘Colab’ is; I use it as a build system, lol. I used to set up SSH forwarding in it, then put in this code and just leave it running, and yeah, it works.

import getpass
authtoken = getpass.getpass()

回答 22

此代码在文件资源管理器窗格中单击“刷新文件夹”。

function ClickRefresh(){
  console.log("Working"); 
  document.querySelector("[icon='colab:folder-refresh']").click()
}
const myjob = setInterval(ClickRefresh, 60000)

This code keeps clicking “Refresh folder” in the file explorer pane.

function ClickRefresh(){
  console.log("Working"); 
  document.querySelector("[icon='colab:folder-refresh']").click()
}
const myjob = setInterval(ClickRefresh, 60000)

回答 23

GNU Colab使您可以在Colaboratory实例之上运行标准的持久桌面环境。

实际上,它包含一种不让机器死掉的机制。

这是一个视频演示

GNU Colab lets you run a standard persistent desktop environment on top of a Colaboratory instance.

Indeed it contains a mechanism to not let machines die of idling.

Here’s a video demonstration.


回答 24

您也可以使用Python按下箭头键。我也在以下代码中添加了一些随机性。

from pyautogui import press, typewrite, hotkey
import time
from random import shuffle

array = ["left", "right", "up", "down"]

while True:
    shuffle(array)
    time.sleep(10)
    press(array[0])
    press(array[1])
    press(array[2])
    press(array[3])

You can also use Python to press the arrow keys. I added a little bit of randomness in the following code as well.

from pyautogui import press, typewrite, hotkey
import time
from random import shuffle

array = ["left", "right", "up", "down"]

while True:
    shuffle(array)
    time.sleep(10)
    press(array[0])
    press(array[1])
    press(array[2])
    press(array[3])

回答 25

只需在要运行的单元格之后运行以下代码,以防止数据丢失。

!python

同样要退出此模式,请写

exit()

Just run the code below after the cell you want to run, to guard against data loss.

!python

Also to exit from this mode, write

exit()

回答 26

我一直在寻找解决方案,直到找到一个Python3,该Python3总是在同一位置来回移动鼠标并单击,但这足以使Colab误以为我在笔记本电脑上很活跃并且没有断开连接。

import numpy as np
import time
import mouse
import threading

def move_mouse():
    while True:
        random_row = np.random.random_sample()*100
        random_col = np.random.random_sample()*10
        random_time = np.random.random_sample()*np.random.random_sample() * 100
        mouse.wheel(1000)
        mouse.wheel(-1000)
        mouse.move(random_row, random_col, absolute=False, duration=0.2)
        mouse.move(-random_row, -random_col, absolute=False, duration = 0.2)
        mouse.LEFT
        time.sleep(random_time)


x = threading.Thread(target=move_mouse)
x.start()

您需要安装所需的软件包:sudo -H pip3 install <package_name> 您只需要使用(在本地计算机中)运行它即可(sudo因为它可以控制鼠标)并且它应该可以工作,从而使您能够充分利用Colab的12h会话。

积分: 对于使用Colab(Pro)的用户:防止会话由于不活动而断开连接

I was looking for a solution until I found a Python 3 script that randomly moves the mouse back and forth and clicks, always in the same place, but that’s enough to fool Colab into thinking I’m active on the notebook so that it does not disconnect.

import numpy as np
import time
import mouse
import threading

def move_mouse():
    while True:
        random_row = np.random.random_sample()*100
        random_col = np.random.random_sample()*10
        random_time = np.random.random_sample()*np.random.random_sample() * 100
        mouse.wheel(1000)
        mouse.wheel(-1000)
        mouse.move(random_row, random_col, absolute=False, duration=0.2)
        mouse.move(-random_row, -random_col, absolute=False, duration = 0.2)
        mouse.click(mouse.LEFT)  # a bare mouse.LEFT is only a constant and performs no click
        time.sleep(random_time)


x = threading.Thread(target=move_mouse)
x.start()

You need to install the required packages: sudo -H pip3 install <package_name>. Then just run the script on your local machine with sudo (as it takes control of the mouse) and it should work, allowing you to take full advantage of Colab’s 12h sessions.

Credits: For those using Colab (Pro): Preventing Session from disconnecting due to inactivity


回答 27

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("connect").click() // Change id here
}
setInterval(ClickConnect,60000)

试试上面对我有用的代码:)

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("connect").click() // Change id here
}
setInterval(ClickConnect,60000)

Try the code above; it worked for me :)


从IPython Notebook中的日志记录模块获取输出

问题:从IPython Notebook中的日志记录模块获取输出

当我在IPython Notebook中运行以下命令时,看不到任何输出:

import logging
logging.basicConfig(level=logging.DEBUG)
logging.debug("test")

有人知道怎么做,这样我才能在笔记本中看到“测试”消息吗?

When I run the following inside IPython Notebook, I don’t see any output:

import logging
logging.basicConfig(level=logging.DEBUG)
logging.debug("test")

Anyone know how to make it so I can see the “test” message inside the notebook?


回答 0

请尝试以下操作:

import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logging.debug("test")

根据logging.basicConfig

通过创建带有默认Formatter的StreamHandler并将其添加到根记录器,对记录系统进行基本配置。如果没有为根记录器定义处理程序,则debug(),info(),warning(),error()和critical()函数将自动调用basicConfig()。

如果根记录器已经为其配置了处理程序,则此功能不执行任何操作。

似乎ipython笔记本在某处调用basicConfig(或设置处理程序)。

Try the following:

import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logging.debug("test")

According to logging.basicConfig:

Does basic configuration for the logging system by creating a StreamHandler with a default Formatter and adding it to the root logger. The functions debug(), info(), warning(), error() and critical() will call basicConfig() automatically if no handlers are defined for the root logger.

This function does nothing if the root logger already has handlers configured for it.

It seems that IPython Notebook calls basicConfig (or sets a handler) somewhere.
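
The behavior quoted from the documentation can be checked directly. The following is a minimal sketch, not what the notebook itself does internally: it attaches a NullHandler to the root logger to stand in for whatever handler the notebook may have installed, and the variable names are only illustrative.

```python
import logging

root = logging.getLogger()
saved_handlers, saved_level = root.handlers[:], root.level

# Assumption: simulate an environment that attached a handler before user code ran.
root.handlers[:] = [logging.NullHandler()]
root.setLevel(logging.WARNING)

# basicConfig is ignored here because the root logger already has a handler.
logging.basicConfig(level=logging.DEBUG)
level_after_basic_config = root.level

# Setting the level on the root logger directly always takes effect.
root.setLevel(logging.DEBUG)
level_after_set_level = root.level

# Put the root logger back the way it was found.
root.handlers[:] = saved_handlers
root.setLevel(saved_level)
```

In this sketch the level stays at WARNING after basicConfig and only changes once setLevel is called, which matches why the snippet at the top of this answer works.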


回答 1

如果仍要使用basicConfig,请像这样重新加载日志记录模块

from importlib import reload  # Not needed in Python 2
import logging
reload(logging)
logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s', level=logging.DEBUG, datefmt='%I:%M:%S')

If you still want to use basicConfig, reload the logging module like this

from importlib import reload  # Not needed in Python 2
import logging
reload(logging)
logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s', level=logging.DEBUG, datefmt='%I:%M:%S')

回答 2

我的理解是IPython会话开始记录日志,因此basicConfig不起作用。这是对我有用的设置(我希望这看起来不太好,因为我想将其用于几乎所有笔记本电脑):

import logging
logger = logging.getLogger()
fhandler = logging.FileHandler(filename='mylog.log', mode='a')
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fhandler.setFormatter(formatter)
logger.addHandler(fhandler)
logger.setLevel(logging.DEBUG)

现在,当我运行时:

logging.error('hello!')
logging.debug('This is a debug message')
logging.info('this is an info message')
logging.warning('tbllalfhldfhd, warning.')

我在与笔记本相同的目录中得到一个“ mylog.log”文件,其中包含:

2015-01-28 09:49:25,026 - root - ERROR - hello!
2015-01-28 09:49:25,028 - root - DEBUG - This is a debug message
2015-01-28 09:49:25,029 - root - INFO - this is an info message
2015-01-28 09:49:25,032 - root - WARNING - tbllalfhldfhd, warning.

请注意,如果您在不重新启动IPython会话的情况下重新运行它,则会将重复的条目写入文件,因为现在将定义两个文件处理程序

My understanding is that the IPython session starts up logging so basicConfig doesn’t work. Here is the setup that works for me (I wish this was not so gross looking since I want to use it for almost all my notebooks):

import logging
logger = logging.getLogger()
fhandler = logging.FileHandler(filename='mylog.log', mode='a')
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fhandler.setFormatter(formatter)
logger.addHandler(fhandler)
logger.setLevel(logging.DEBUG)

Now when I run:

logging.error('hello!')
logging.debug('This is a debug message')
logging.info('this is an info message')
logging.warning('tbllalfhldfhd, warning.')

I get a “mylog.log” file in the same directory as my notebook that contains:

2015-01-28 09:49:25,026 - root - ERROR - hello!
2015-01-28 09:49:25,028 - root - DEBUG - This is a debug message
2015-01-28 09:49:25,029 - root - INFO - this is an info message
2015-01-28 09:49:25,032 - root - WARNING - tbllalfhldfhd, warning.

Note that if you rerun this without restarting the IPython session it will write duplicate entries to the file since there would now be two file handlers defined
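
One way to avoid the duplicate entries is to check for an existing file handler before adding a new one. This is only a sketch; add_file_handler_once is a hypothetical helper name, not part of the logging module.

```python
import logging
import os


def add_file_handler_once(logger, filename, level=logging.DEBUG):
    """Attach a FileHandler for `filename` unless one is already attached."""
    target = os.path.abspath(filename)
    for handler in logger.handlers:
        # FileHandler stores the absolute path of its file in baseFilename.
        if isinstance(handler, logging.FileHandler) and handler.baseFilename == target:
            return handler  # re-running the cell reuses the existing handler
    handler = logging.FileHandler(filename=filename, mode="a")
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler
```

Calling add_file_handler_once(logging.getLogger(), 'mylog.log') in the setup cell should then be safe to re-run without restarting the session.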


回答 3

请记住,stderr是logging模块的默认流,因此在IPython和Jupyter笔记本中,除非将流配置为stdout,否则可能看不到任何内容:

import logging
import sys

logging.basicConfig(format='%(asctime)s | %(levelname)s : %(message)s',
                     level=logging.INFO, stream=sys.stdout)

logging.info('Hello world!')

Bear in mind that stderr is the default stream for the logging module, so in IPython and Jupyter notebooks you might not see anything unless you configure the stream to stdout:

import logging
import sys

logging.basicConfig(format='%(asctime)s | %(levelname)s : %(message)s',
                     level=logging.INFO, stream=sys.stdout)

logging.info('Hello world!')

回答 4

现在对我有用的(Jupyter,笔记本服务器是:5.4.1,IPython 7.0.1)

import logging
logging.basicConfig()
logger = logging.getLogger('Something')
logger.setLevel(logging.DEBUG)

现在,我可以使用记录器来打印信息,否则,我只会看到默认级别(logging.WARNING)或更高级别的消息。

What worked for me now (Jupyter, notebook server is: 5.4.1, IPython 7.0.1)

import logging
logging.basicConfig()
logger = logging.getLogger('Something')
logger.setLevel(logging.DEBUG)

Now I can use logger to print info; otherwise I would see only messages at the default level (logging.WARNING) or above.


回答 5

您可以通过运行配置日志记录 %config Application.log_level="INFO"

有关更多信息,请参见IPython内核选项。

You can configure logging by running %config Application.log_level="INFO"

For more information, see IPython kernel options


回答 6

我为这两个文件都设置了一个记录器,我希望它能显示在笔记本上。事实证明,添加文件处理程序会清除默认的流处理程序。

logger = logging.getLogger()

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# Setup file handler
fhandler  = logging.FileHandler('my.log')
fhandler.setLevel(logging.DEBUG)
fhandler.setFormatter(formatter)

# Configure stream handler for the cells
chandler = logging.StreamHandler()
chandler.setLevel(logging.DEBUG)
chandler.setFormatter(formatter)

# Add both handlers
logger.addHandler(fhandler)
logger.addHandler(chandler)
logger.setLevel(logging.DEBUG)

# Show the handlers
logger.handlers

# Log Something
logger.info("Test info")
logger.debug("Test debug")
logger.error("Test error")

I set up a logger for a file and also wanted the output to show up in the notebook. It turns out that adding a file handler clears out the default stream handler.

import logging
logger = logging.getLogger()

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# Setup file handler
fhandler  = logging.FileHandler('my.log')
fhandler.setLevel(logging.DEBUG)
fhandler.setFormatter(formatter)

# Configure stream handler for the cells
chandler = logging.StreamHandler()
chandler.setLevel(logging.DEBUG)
chandler.setFormatter(formatter)

# Add both handlers
logger.addHandler(fhandler)
logger.addHandler(chandler)
logger.setLevel(logging.DEBUG)

# Show the handlers
logger.handlers

# Log Something
logger.info("Test info")
logger.debug("Test debug")
logger.error("Test error")

回答 7

似乎适用于ipython / jupyter早期版本的解决方案不再起作用。

这是适用于ipython 7.9.0的有效解决方案(也已通过jupyter服务器6.0.2测试):

import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logging.debug("test message")

DEBUG:root:test message

It seems that solutions that worked for older versions of ipython/jupyter no longer work.

Here is a working solution for ipython 7.9.0 (also tested with jupyter server 6.0.2):

import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logging.debug("test message")

DEBUG:root:test message

使用Python 3从Jupyter Notebook中相对导入的另一个目录中的模块导入本地函数

问题:使用Python 3从Jupyter Notebook中相对导入的另一个目录中的模块导入本地函数

我有一个类似于以下内容的目录结构

meta_project
    project1
        __init__.py
        lib
            module.py
            __init__.py
    notebook_folder
        notebook.jpynb

当在工作notebook.jpynb,如果我尝试使用相对导入来访问函数function()module.py有:

from ..project1.lib.module import function

我收到以下错误:

SystemError                               Traceback (most recent call last)
<ipython-input-7-6393744d93ab> in <module>()
----> 1 from ..project1.lib.module import function

SystemError: Parent module '' not loaded, cannot perform relative import

有什么办法可以使用相对导入来使它起作用?

注意,笔记本服务器是在meta_project目录级别实例化的,因此它应该有权访问这些文件中的信息。

同样要注意的是,至少没有按照最初的意图project1被认为是模块,因此没有__init__.py文件,它只是作为文件系统目录。如果解决问题的方法需要将其视为模块,并包括一个__init__.py很好的文件(甚至是空白文件),但这样做还不足以解决问题。

我在机器之间共享此目录,相对的导入使我可以在任何地方使用相同的代码,而且我经常使用笔记本进行快速原型制作,因此涉及将绝对路径捆绑在一起的建议不太可能有帮助。


编辑:这与Python 3中的相对导入不同,后者相对于Python 3中的相对导入一般来说,尤其是从包目录中运行脚本。这与在jupyter笔记本中工作有关,该笔记本试图调用另一个目录中具有不同常规和特定方面的本地模块中的函数。

I have a directory structure similar to the following

meta_project
    project1
        __init__.py
        lib
            module.py
            __init__.py
    notebook_folder
        notebook.jpynb

When working in notebook.jpynb if I try to use a relative import to access a function function() in module.py with:

from ..project1.lib.module import function

I get the following error:

SystemError                               Traceback (most recent call last)
<ipython-input-7-6393744d93ab> in <module>()
----> 1 from ..project1.lib.module import function

SystemError: Parent module '' not loaded, cannot perform relative import

Is there any way to get this to work using relative imports?

Note, the notebook server is instantiated at the level of the meta_project directory, so it should have access to the information in those files.

Note, also, that at least as originally intended project1 wasn’t thought of as a module and therefore does not have an __init__.py file, it was just meant as a file-system directory. If the solution to the problem requires treating it as a module and including an __init__.py file (even a blank one) that is fine, but doing so is not enough to solve the problem.

I share this directory between machines and relative imports allow me to use the same code everywhere, & I often use notebooks for quick prototyping, so suggestions that involve hacking together absolute paths are unlikely to be helpful.


Edit: This is unlike Relative imports in Python 3, which talks about relative imports in Python 3 in general and – in particular – running a script from within a package directory. This has to do with working within a jupyter notebook trying to call a function in a local module in another directory which has both different general and particular aspects.


回答 0

此笔记本中,我有一个与您几乎相同的示例,在我想以DRY方式说明相邻模块功能的用法。

我的解决方案是通过向笔记本中添加如下代码段来告知Python该额外的模块导入路径:

import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

这使您可以从模块层次结构中导入所需的功能:

from project1.lib.module import function
# use the function normally
function(...)

请注意,如果还没有空__init__.py文件,则必须将它们添加到project1 /lib /文件夹中。

I had almost the same example as you in this notebook where I wanted to illustrate the usage of an adjacent module’s function in a DRY manner.

My solution was to tell Python of that additional module import path by adding a snippet like this one to the notebook:

import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

This allows you to import the desired function from the module hierarchy:

from project1.lib.module import function
# use the function normally
function(...)

Note that it is necessary to add empty __init__.py files to project1/ and lib/ folders if you don’t have them already.


回答 1

在这里使用笔记本时,正在寻求将代码抽象到子模块的最佳实践。我不确定是否有最佳做法。我一直在提出这个建议。

这样的项目层次结构:

├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

来自20170609-Initial_Database_Connection.ipynb

    In [1]: cd ..

    In [2]: from lib.postgres import database_connection

之所以可行,是因为默认情况下Jupyter Notebook可以解析该cd命令。请注意,这没有利用Python Notebook魔术。它只是工作而无需前置%bash

考虑到我使用Project Jupyter Docker映像之一在Docker中工作的100次中有99次,以下修改幂等的

    In [1]: cd /home/jovyan

    In [2]: from lib.postgres import database_connection

Came here searching for best practices in abstracting code to submodules when working in Notebooks. I’m not sure that there is a best practice. I have been proposing this.

A project hierarchy as such:

├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

And from 20170609-Initial_Database_Connection.ipynb:

    In [1]: cd ..

    In [2]: from lib.postgres import database_connection

This works because, by default, IPython’s automagic feature lets the %cd line magic be called as a bare cd, so there is no need to prepend % or to use a %%bash cell.

Considering that 99 times out of 100 I am working in Docker using one of the Project Jupyter Docker images, the following modification is idempotent:

    In [1]: cd /home/jovyan

    In [2]: from lib.postgres import database_connection

回答 2

到目前为止,已接受的答案对我来说效果最好。但是,我一直担心的是,在某些情况下,我可能会将notebooks目录重构为子目录,从而需要module_path在每个笔记本中进行更改。我决定在每个笔记本目录中添加一个python文件,以导入所需的模块。

因此,具有以下项目结构:

project
|__notebooks
   |__explore
      |__ notebook1.ipynb
      |__ notebook2.ipynb
      |__ project_path.py
   |__ explain
       |__notebook1.ipynb
       |__project_path.py
|__lib
   |__ __init__.py
   |__ module.py

project_path.py在每个笔记本子目录(notebooks/explorenotebooks/explain)中添加了文件。此文件包含相对导入的代码(来自@metakermit):

import sys
import os

module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

这样,我只需要在project_path.py文件中而不是在笔记本中进行相对导入即可。然后,笔记本文件仅需要在导入project_path之前导入lib。例如在0.0-notebook.ipynb

import project_path
import lib

需要注意的是,逆转进口将行不通。这不起作用:

import lib
import project_path

因此在进口期间必须小心。

So far, the accepted answer has worked best for me. However, my concern has always been the likely scenario where I refactor the notebooks directory into subdirectories, which would require changing the module_path in every notebook. I decided to add a Python file within each notebook directory to import the required modules.

Thus, having the following project structure:

project
|__notebooks
   |__explore
      |__ notebook1.ipynb
      |__ notebook2.ipynb
      |__ project_path.py
   |__ explain
       |__notebook1.ipynb
       |__project_path.py
|__lib
   |__ __init__.py
   |__ module.py

I added the file project_path.py in each notebook subdirectory (notebooks/explore and notebooks/explain). This file contains the code for relative imports (from @metakermit):

import sys
import os

module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

This way, I just need to do relative imports within the project_path.py file, and not in the notebooks. The notebooks files would then just need to import project_path before importing lib. For example in 0.0-notebook.ipynb:

import project_path
import lib

The caveat here is that reversing the imports would not work. THIS DOES NOT WORK:

import lib
import project_path

Thus care must be taken during imports.
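
If the notebooks may move to a different depth later, project_path.py could search upward for the project root instead of hard-coding two os.pardir steps. The sketch below is one way to do that; find_project_root and add_project_root_to_path are hypothetical names, and using the lib directory as the marker is an assumption based on the layout above.

```python
import sys
from pathlib import Path


def find_project_root(start=".", marker="lib"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"no parent of {start} contains a '{marker}' directory")


def add_project_root_to_path(start=".", marker="lib"):
    """Append the project root to sys.path once and return it as a string."""
    root = str(find_project_root(start, marker))
    if root not in sys.path:
        sys.path.append(root)
    return root
```

With this in project_path.py, calling add_project_root_to_path() at import time would keep working however deep the notebook directory ends up.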


回答 3

我刚刚找到了这个漂亮的解决方案:

import sys; sys.path.insert(0, '..') # add parent folder path where lib folder is
import lib.store_load # store_load is a file on my library folder

您只需要该文件的某些功能

from lib.store_load import your_function_name

如果python版本> = 3.3,则不需要文件夹中的init.py文件

I have just found this pretty solution:

import sys; sys.path.insert(0, '..') # add parent folder path where lib folder is
import lib.store_load # store_load is a file on my library folder

If you just want some functions from that file:

from lib.store_load import your_function_name

If the Python version is >= 3.3, you do not need an __init__.py file in the folder.


回答 4

我自己研究此主题并阅读答案,因此我建议使用path.py库,因为该提供了用于更改当前工作目录的上下文管理器。

然后你有类似的东西

import path
if path.Path('../lib').isdir():
    with path.Path('..'):
        import lib

虽然,您可能只是省略了isdir声明。

在这里,我将添加打印语句,以便于跟踪正在发生的事情

import path
import pandas

print(path.Path.getcwd())
print(path.Path('../lib').isdir())
if path.Path('../lib').isdir():
    with path.Path('..'):
        print(path.Path.getcwd())
        import lib
        print('Success!')
print(path.Path.getcwd())

在此示例中输出(其中lib在/home/jovyan/shared/notebooks/by-team/data-vis/demos/lib):

/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart
/home/jovyan/shared/notebooks/by-team/data-vis/demos
/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart

由于该解决方案使用上下文管理器,因此无论内核在单元之前处于什么状态,以及导入库代码引发了什么异常,都可以保证返回到先前的工作目录。

Researching this topic myself and having read the answers I recommend using the path.py library since it provides a context manager for changing the current working directory.

You then have something like

import path
if path.Path('../lib').isdir():
    with path.Path('..'):
        import lib

Although, you might just omit the isdir statement.

Here I’ll add print statements to make it easy to follow what’s happening

import path
import pandas

print(path.Path.getcwd())
print(path.Path('../lib').isdir())
if path.Path('../lib').isdir():
    with path.Path('..'):
        print(path.Path.getcwd())
        import lib
        print('Success!')
print(path.Path.getcwd())

which outputs in this example (where lib is at /home/jovyan/shared/notebooks/by-team/data-vis/demos/lib):

/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart
/home/jovyan/shared/notebooks/by-team/data-vis/demos
/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart

Since the solution uses a context manager, you are guaranteed to go back to your previous working directory, no matter what state your kernel was in before the cell and no matter what exceptions are thrown by importing your library code.
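
If installing path.py is not an option, the same guarantee can be sketched with the standard library alone. working_directory below is a hypothetical helper name; Python 3.11 and later also ship contextlib.chdir, which serves the same purpose.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the kernel ends up where it started.
        os.chdir(previous)
```

It would be used in the same way as above, for example with working_directory('..'): followed by import lib.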


回答 5

这是我的2美分:

import sys

映射模块文件所在的路径。就我而言,它是台式机

sys.path.append('/Users/John/Desktop')

要么导入整个映射模块,要么然后使用.notation来映射诸如mapping.Shipping()的类。

import mapping  # mapping.py is the name of my module file

shipit = mapping.Shipment()  # Shipment is the name of the class I need to use in the mapping module

或从映射模块导入特定的类

from mapping import Shipment

shipit = Shipment()  # now you don't have to use dot notation

Here’s my 2 cents:

import sys

Add the path where the module file is located. In my case it was the desktop:

sys.path.append('/Users/John/Desktop')

Either import the whole mapping module, in which case you have to use dot notation to reach its classes, like mapping.Shipment():

import mapping  # mapping.py is the name of my module file

shipit = mapping.Shipment()  # Shipment is the name of the class I need to use in the mapping module

Or import the specific class from the mapping module:

from mapping import Shipment

shipit = Shipment()  # now you don't have to use dot notation


回答 6

我发现python-dotenv可以非常有效地解决此问题。您的项目结构最终会稍有变化,但是笔记本中的代码在笔记本之间更简单,更一致。

对于您的项目,请进行一些安装。

pipenv install python-dotenv

然后,项目更改为:

├── .env (this can be empty)
├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

最后,您的导入更改为:

import os
import sys

from dotenv import find_dotenv


sys.path.append(os.path.dirname(find_dotenv()))

此软件包的+1是您的笔记本可以位于多个目录中。python-dotenv将在父目录中找到最接近的目录并使用它。此方法的+2是jupyter将在启动时从.env文件加载环境变量。双重打击。

I have found that python-dotenv helps solve this issue pretty effectively. Your project structure ends up changing slightly, but the code in your notebook is a bit simpler and consistent across notebooks.

For your project, do a little install.

pipenv install python-dotenv

Then, project changes to:

├── .env (this can be empty)
├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

And finally, your import changes to:

import os
import sys

from dotenv import find_dotenv


sys.path.append(os.path.dirname(find_dotenv()))

A +1 for this package is that your notebooks can be several directories deep. python-dotenv will find the closest one in a parent directory and use it. A +2 for this approach is that jupyter will load environment variables from the .env file on startup. Double whammy.


如何在IPython Notebook中打开交互式matplotlib窗口?

问题:如何在IPython Notebook中打开交互式matplotlib窗口?

我正在使用IPython,--pylab=inline有时想快速切换到交互式可缩放的matplotlib GUI来查看图(在终端Python控制台中绘制图时会弹出的图)。我该怎么办?最好不要离开或重新启动笔记本。

IPy笔记本中的内联绘图的问题在于它们的分辨率有限,我无法放大以查看一些较小的部分。使用从终端启动的maptlotlib GUI,我可以选择要放大的图形矩形,并相应地调整轴。我尝试过

from matplotlib import interactive
interactive(True)

interactive(False)

但这什么也没做。我在网上也找不到任何提示。

I am using IPython with --pylab=inline and would sometimes like to quickly switch to the interactive, zoomable matplotlib GUI for viewing plots (the one that pops up when you plot something in a terminal Python console). How could I do that? Preferably without leaving or restarting my notebook.

The problem with inline plots in IPy notebook is that they are of a limited resolution and I can’t zoom into them to see some smaller parts. With the maptlotlib GUI that starts from a terminal, I can select a rectangle of the graph that I want to zoom into and the axes adjust accordingly. I tried experimenting with

from matplotlib import interactive
interactive(True)

and

interactive(False)

but that didn’t do anything. I couldn’t find any hint online either.


回答 0

根据文档,您应该能够像这样来回切换:

In [2]: %matplotlib inline 
In [3]: plot(...)

In [4]: %matplotlib qt  # wx, gtk, osx, tk, empty uses default
In [5]: plot(...) 

然后会弹出一个常规绘图窗口(可能需要在笔记本计算机上重新启动)。

我希望这有帮助。

According to the documentation, you should be able to switch back and forth like this:

In [2]: %matplotlib inline 
In [3]: plot(...)

In [4]: %matplotlib qt  # wx, gtk, osx, tk, empty uses default
In [5]: plot(...) 

and that will pop up a regular plot window (a restart on the notebook may be necessary).

I hope this helps.


回答 1

如果您要做的只是从内联图切换到交互式图,然后再切换回去(以便可以平移/缩放),则最好使用%matplotlib magic。

#interactive plotting in separate window
%matplotlib qt 

然后返回html

#normal charts inside notebooks
%matplotlib inline 

%pylab magic会导入很多其他内容,甚至可能导致冲突。它执行“从pylab导入*”。

您还可以使用新的笔记本后端(在matplotlib 1.4中添加):

#interactive charts inside notebooks, matplotlib 1.4+
%matplotlib notebook 

如果您想在图表中增加交互性,可以查看mpld3bokeh。mpld3很棒,如果您没有大量数据点(例如<5k +),并且您想要使用普通的matplotlib语法,但与%matplotlib notebook相比,则具有更多的交互性。Bokeh可以处理大量数据,但是您需要学习它的语法,因为它是一个单独的库。

你也可以签出pivottablejs(pip installivottablejs)

from pivottablejs import pivot_ui
pivot_ui(df)

不管是多么酷的交互式数据探索,它都完全会破坏可重复性。它发生在我身上,所以一旦我感觉到数据,我就尝试只在早期就使用它,并切换到纯内联matplotlib / seaborn。

If all you want to do is to switch from inline plots to interactive and back (so that you can pan/zoom), it is better to use %matplotlib magic.

#interactive plotting in separate window
%matplotlib qt 

and back to html

#normal charts inside notebooks
%matplotlib inline 

%pylab magic imports a bunch of other things and may even result in a conflict. It does “from pylab import *”.

You can also use the new notebook backend (added in matplotlib 1.4):

#interactive charts inside notebooks, matplotlib 1.4+
%matplotlib notebook 

If you want more interactivity in your charts, you can look at mpld3 and bokeh. mpld3 is great if you don’t have tons of data points (e.g. <5k+) and you want to use normal matplotlib syntax but with more interactivity than %matplotlib notebook. Bokeh can handle lots of data, but you need to learn its syntax, as it is a separate library.

Also you can check out pivottablejs (pip install pivottablejs)

from pivottablejs import pivot_ui
pivot_ui(df)

However cool interactive data exploration is, it can totally mess with reproducibility. It has happened to me, so I try to use it only at the very early stage and switch to pure inline matplotlib/seaborn, once I got the feel for the data.


回答 2

从matplotlib 1.4.0开始,现在有一个用于笔记本的交互式后端

%matplotlib notebook

有一些版本的IPython尚未注册该别名,回退是:

%matplotlib nbagg

如果那不起作用,请更新您的IPython。

要玩这个游戏,请转到tmpnb.org

并粘贴

%matplotlib notebook

import pandas as pd
import numpy as np
import matplotlib

from matplotlib import pyplot as plt
import seaborn as sns

ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()

df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
                  columns=['A', 'B', 'C', 'D'])
df = df.cumsum()
df.plot(); plt.legend(loc='best')    

进入代码单元(或仅修改现有的python演示笔记本)

Starting with matplotlib 1.4.0, there is now an interactive backend for use in the notebook:

%matplotlib notebook

There are a few versions of IPython which do not have that alias registered; the fallback is:

%matplotlib nbagg

If that does not work, update your IPython.

To play with this, go to tmpnb.org

and paste

%matplotlib notebook

import pandas as pd
import numpy as np
import matplotlib

from matplotlib import pyplot as plt
import seaborn as sns

ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts = ts.cumsum()

df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
                  columns=['A', 'B', 'C', 'D'])
df = df.cumsum()
df.plot(); plt.legend(loc='best')    

into a code cell (or just modify the existing python demo notebook)


回答 3

更好的解决方案可能是图表库。它使您能够使用出色的Highcharts javascript库制作精美的交互式绘图。Highcharts使用HTMLsvg标记,因此您的所有图表实际上都是矢量图像。

一些功能:

  • 您可以下载.png,.jpg和.svg格式的矢量图,因此永远不会遇到分辨率问题
  • 交互式图表(缩放,滑动,将鼠标悬停在点上,…)
  • 在IPython笔记本中可用
  • 使用异步绘图功能可同时探索数百个数据结构。

免责声明:我是图书馆的开发人员

A better solution for your problem might be the Charts library. It enables you to use the excellent Highcharts javascript library to make beautiful and interactive plots. Highcharts uses the HTML svg tag so all your charts are actually vector images.

Some features:

  • Vector plots which you can download in .png, .jpg and .svg formats so you will never run into resolution problems
  • Interactive charts (zoom, slide, hover over points, …)
  • Usable in an IPython notebook
  • Explore hundreds of data structures at the same time using the asynchronous plotting capabilities.

Disclaimer: I’m the developer of the library


回答 4

我在2011年5月28日从www.continuum.io/downloads的Anaconda的“ jupyter QTConsole”中使用ipython。

这是一个使用ipython magic在一个单独的窗口和一个内联绘图模式之间来回切换的示例。

>>> import matplotlib.pyplot as plt

# data to plot
>>> x1 = [x for x in range(20)]

# Show in separate window
>>> %matplotlib
>>> plt.plot(x1)
>>> plt.close() 

# Show in console window
>>> %matplotlib inline
>>> plt.plot(x1)
>>> plt.close() 

# Show in separate window
>>> %matplotlib
>>> plt.plot(x1)
>>> plt.close() 

# Show in console window
>>> %matplotlib inline
>>> plt.plot(x1)
>>> plt.close() 

# Note: the %matplotlib magic above causes:
#      plt.plot(...) 
# to implicitly include a:
#      plt.show()
# after the command.
#
# (Not sure how to turn off this behavior
# so that it matches behavior without using %matplotlib magic...)
# but its ok for interactive work...

I’m using IPython in the “Jupyter QtConsole” from Anaconda (www.continuum.io/downloads), as of 5/28/2017.

Here’s an example to flip back and forth between a separate window and an inline plot mode using ipython magic.

>>> import matplotlib.pyplot as plt

# data to plot
>>> x1 = [x for x in range(20)]

# Show in separate window
>>> %matplotlib
>>> plt.plot(x1)
>>> plt.close() 

# Show in console window
>>> %matplotlib inline
>>> plt.plot(x1)
>>> plt.close() 

# Show in separate window
>>> %matplotlib
>>> plt.plot(x1)
>>> plt.close() 

# Show in console window
>>> %matplotlib inline
>>> plt.plot(x1)
>>> plt.close() 

# Note: the %matplotlib magic above causes:
#      plt.plot(...) 
# to implicitly include a:
#      plt.show()
# after the command.
#
# (Not sure how to turn off this behavior
# so that it matches the behavior without the %matplotlib magic,
# but it's OK for interactive work.)

回答 5

重新启动内核并清除输出(如果不是从新笔记本开始),然后运行

%matplotlib tk

有关更多信息,请转到使用matplotlib进行绘图

Restart kernel and clear output (if not starting with new notebook), then run

%matplotlib tk

For more info go to Plotting with matplotlib


回答 6

您可以使用

%matplotlib qt

如果出现错误,ImportError: Failed to import any qt binding则将PyQt5安装为:pip install PyQt5它对我有用。

You can use

%matplotlib qt

If you get the error ImportError: Failed to import any qt binding, install PyQt5 with pip install PyQt5. That worked for me.


What is the right way to debug in iPython Notebook?

Question: What is the right way to debug in iPython Notebook?

我所知, %debug magic可以在一个单元内进行调试。

但是,我有跨多个单元格的函数调用。

例如,

In[1]: def fun1(a)
           def fun2(b)
               # I want to set a breakpoint for the following line #
               return do_some_thing_about(b)

       return fun2(a)

In[2]: import multiprocessing as mp
       pool=mp.Pool(processes=2)
       results=pool.map(fun1, 1.0)
       pool.close()
       pool.join

我试过的

  1. 我试图%debug在cell-1的第一行中设置。但是它甚至在执行单元2之前就立即进入调试模式。

  2. 我试图%debug在代码之前添加该行return do_some_thing_about(b)。但是,代码将永远运行,永远不会停止。

在ipython笔记本中设置断点的正确方法是什么?

As far as I know, the %debug magic can debug within one cell.

However, I have function calls across multiple cells.

For example,

In[1]: def fun1(a)
           def fun2(b)
               # I want to set a breakpoint for the following line #
               return do_some_thing_about(b)

       return fun2(a)

In[2]: import multiprocessing as mp
       pool=mp.Pool(processes=2)
       results=pool.map(fun1, 1.0)
       pool.close()
       pool.join

What I tried:

  1. I tried to set %debug in the first line of cell-1. But it enters debug mode immediately, even before executing cell-2.

  2. I tried to add %debug in the line right before the code return do_some_thing_about(b). But then the code runs forever and never stops.

What is the right way to set a breakpoint within the ipython notebook?


回答 0

使用ipdb

通过安装

pip install ipdb

用法:

In[1]: def fun1(a):
           def fun2(b):
               import ipdb; ipdb.set_trace()  # debugging starts here
               return do_some_thing_about(b)
           return fun2(a)

In[2]: fun1(1)

用于逐行执行n和进入功能使用,s并退出调试提示使用c

有关可用命令的完整列表:https : //appletree.or.kr/quick_reference_cards/Python/Python%20Debugger%20Cheatsheet.pdf

Use ipdb

Install it via

pip install ipdb

Usage:

In[1]: def fun1(a):
           def fun2(b):
               import ipdb; ipdb.set_trace()  # debugging starts here
               return do_some_thing_about(b)
           return fun2(a)

In[2]: fun1(1)

Use n to execute line by line, s to step into a function, and c to continue execution, which leaves the debugging prompt.

For complete list of available commands: https://appletree.or.kr/quick_reference_cards/Python/Python%20Debugger%20Cheatsheet.pdf


回答 1

您可以ipdb在jupyter内部使用以下命令:

from IPython.core.debugger import Tracer; Tracer()()

编辑:自IPython 5.1起,不推荐使用上述功能。这是新方法:

from IPython.core.debugger import set_trace

set_trace()在需要断点的地方添加。键入help用于ipdb命令输入字段出现时。

You can use ipdb inside jupyter with:

from IPython.core.debugger import Tracer; Tracer()()

Edit: the functions above are deprecated since IPython 5.1. This is the new approach:

from IPython.core.debugger import set_trace

Add set_trace() where you need a breakpoint. Type help for ipdb commands when the input field appears.


回答 2

您的返回函数位于def函数(主函数)的行中,您必须给它一个制表符。和使用

%%debug 

代替

%debug 

调试整个单元而不仅仅是行。希望这可能对您有帮助。

Your return statement is at the same indentation level as the def of the main function; you must indent it by one level. Also use

%%debug 

instead of

%debug 

to debug the whole cell rather than a single line. Hope this helps.


回答 3

您始终可以在任何单元格中添加它:

import pdb; pdb.set_trace()

调试器将在该行停止。例如:

In[1]: def fun1(a):
           def fun2(a):
               import pdb; pdb.set_trace() # debugging starts here
           return fun2(a)

In[2]: fun1(1)

You can always add this in any cell:

import pdb; pdb.set_trace()

and the debugger will stop on that line. For example:

In[1]: def fun1(a):
           def fun2(a):
               import pdb; pdb.set_trace() # debugging starts here
           return fun2(a)

In[2]: fun1(1)

回答 4

在Python 3.7中,您可以使用breakpoint()函数。只需输入

breakpoint()

无论您想在哪里停止运行时,都可以使用相同的pdb命令(r,c,n,…)或评估变量。

In Python 3.7 and later, you can use the built-in breakpoint() function. Just enter

breakpoint()

wherever you would like runtime to stop and from there you can use the same pdb commands (r, c, n, …) or evaluate your variables.
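
As a rough illustration of how breakpoint() behaves, the sketch below relies on the fact that breakpoint() calls sys.breakpointhook, which defaults to pdb.set_trace. The function running_total and the logging_hook replacement are made-up names for this example; the hook is swapped only so the snippet runs without pausing for input.

```python
import sys


def running_total(values):
    """Sum values, hitting a breakpoint before each addition."""
    total = 0
    for v in values:
        breakpoint()  # calls sys.breakpointhook(); pdb.set_trace() by default
        total += v
    return total


# Hypothetical non-interactive hook: record where the breakpoint fired
# instead of opening the pdb prompt.
hits = []


def logging_hook(*args, **kwargs):
    caller = sys._getframe(1)  # frame of the function that called breakpoint()
    hits.append(caller.f_code.co_name)


sys.breakpointhook = logging_hook
result = running_total([1, 2, 3])
sys.breakpointhook = sys.__breakpointhook__  # restore the default (pdb)

print(result, hits)
```

With the default hook left in place, each breakpoint() call would instead stop at the pdb prompt, where the usual commands apply. Setting the environment variable PYTHONBREAKPOINT=0 before starting Python disables all breakpoint() calls.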


回答 5

只需键入import pdb在jupyter笔记本,然后用这个的cheatsheet调试。非常方便

c->继续,s->步进,b 12->在第12行设置断点,依此类推。

一些有用的链接: pdb上的Python官方文档Python pdb调试器示例,以更好地了解如何使用调试器命令


Just type import pdb in jupyter notebook, and then use this cheatsheet to debug. It’s very convenient.

c –> continue, s –> step, b 12 –> set break point at line 12 and so on.

Some useful links: Python Official Document on pdb, Python pdb debugger examples for better understanding how to use the debugger commands.



回答 6

得到错误后,在下一个单元格中运行%debug,仅此而已。

After you get an error, in the next cell just run %debug and that’s it.


回答 7

%pdb魔术的命令是很好用为好。只需说一遍%pdb on,随后pdb调试器将在所有异常上运行,无论调用堆栈中有多深。非常便利。

如果您有要调试的特定行,只需在此处引发一个异常(通常您已经!),或使用%debug其他人一直在建议的magic命令。

The %pdb magic command is good to use as well. Just say %pdb on and subsequently the pdb debugger will run on all exceptions, no matter how deep in the call stack. Very handy.

If you have a particular line that you want to debug, just raise an exception there (often you already are!) or use the %debug magic command that other folks have been suggesting.
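
To make the "no matter how deep in the call stack" point concrete, here is a minimal sketch of what a post-mortem debugger has to work with: the traceback of an exception links down to the frame where the error was raised, and that frame still holds its local variables. The functions inner and outer are invented for the example, and the snippet inspects the frame directly rather than opening an interactive prompt.

```python
import sys


def inner(b):
    ratio = 1 / b  # raises ZeroDivisionError when b == 0
    return ratio


def outer(a):
    return inner(a - 1)


try:
    outer(1)
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    # Walk to the innermost traceback entry, i.e. the frame that raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    failing_function = frame.f_code.co_name
    failing_locals = dict(frame.f_locals)
    # pdb.post_mortem(tb) would open an interactive prompt in this frame.

print(failing_function, failing_locals)
```

In a notebook, running %debug in the next cell after an error gives interactive access to the same frame.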


回答 8

我刚刚发现了PixieDebugger。甚至以为我还没有时间进行测试,这似乎确实是调试我们在ipdb中使用ipython的方式的最相似方法

它还有一个“评估”标签

I just discovered PixieDebugger. Even though I have not yet had time to test it, it seems to be the closest match to the way we are used to debugging in IPython with ipdb.

It also has an “evaluate” tab


回答 9

提供了本机调试器作为JupyterLab的扩展。可以在几周前发布,可以通过获取相关扩展以及xeus-python内核(尤其是没有ipykernel用户众所周知的魔术)来安装它:

jupyter labextension install @jupyterlab/debugger
conda install xeus-python -c conda-forge

这样可以实现其他IDE众所周知的可视化调试体验。

来源:Jupyter的可视调试器

A native debugger is being made available as an extension to JupyterLab. Released a few weeks ago, this can be installed by getting the relevant extension, as well as xeus-python kernel (which notably comes without the magics well-known to ipykernel users):

jupyter labextension install @jupyterlab/debugger
conda install xeus-python -c conda-forge

This enables a visual debugging experience well-known from other IDEs.

Source: A visual debugger for Jupyter


How to know which is running in Jupyter notebook?

Question: How to know which is running in Jupyter notebook?

我在用于Python编程的浏览器中使用Jupyter笔记本,已经安装了Anaconda(Python 3.5)。但是我很确定Jupyter使用本地python解释器而不是anaconda运行我的python命令。如何更改它并将Anaconda用作解释器?

Ubuntu 16.10-Anaconda3

I use Jupyter notebook in a browser for Python programming, and I have installed Anaconda (Python 3.5). But I’m quite sure that Jupyter is running my Python commands with the native Python interpreter and not with Anaconda. How can I change it and use Anaconda as the interpreter?

Ubuntu 16.10 — Anaconda3


回答 0

from platform import python_version

print(python_version())

这将为您提供运行脚本的python的确切版本。例如输出:

3.6.5

from platform import python_version

print(python_version())

This will give you the exact version of the Python interpreter running your script. Example output:

3.6.5

回答 1

import sys
sys.executable

会给你翻译。您可以在创建新笔记本时选择所需的解释器。确保您的anaconda解释器的路径已添加到您的路径中(最有可能在bashrc / bash_profile中的某个位置)。

例如,我以前在.bash_profile中有以下行,我是手动添加的:

export PATH="$HOME/anaconda3/bin:$PATH"

编辑:如评论中所述,这不是将anaconda添加到路径的正确方法。引用Anaconda的文档,应在安装后改为使用以下方法conda init

我应该将Anaconda添加到macOS或Linux PATH吗?

我们不建议手动将Anaconda添加到PATH。在安装过程中,系统将询问您“是否希望安装程序通过运行conda init来初始化Anaconda3?” 我们建议“是”。如果输入“ no”,则conda根本不会修改您的Shell脚本。为了在安装过程完成后进行初始化,请先运行source <path to conda>/bin/activate然后再运行conda init

import sys
sys.executable

will give you the path to the interpreter. You can select the interpreter you want when you create a new notebook. Make sure the path to your Anaconda interpreter is added to your PATH (most likely somewhere in your bashrc/bash_profile).

For example, I used to have the following line in my .bash_profile, which I added manually:

export PATH="$HOME/anaconda3/bin:$PATH"

EDIT: As mentioned in a comment, this is not the proper way to add anaconda to the path. Quoting Anaconda’s doc, this should be done instead after install, using conda init:

Should I add Anaconda to the macOS or Linux PATH?

We do not recommend adding Anaconda to the PATH manually. During installation, you will be asked “Do you wish the installer to initialize Anaconda3 by running conda init?” We recommend “yes”. If you enter “no”, then conda will not modify your shell scripts at all. In order to initialize after the installation process is done, first run source <path to conda>/bin/activate and then run conda init
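
A small sketch that can be pasted into a notebook cell to see which interpreter the kernel is actually using. The helper name describe_interpreter is made up for this example, and the conda-meta check is only a heuristic (conda environments normally contain that directory), not an official API.

```python
import os
import sys


def describe_interpreter():
    """Collect facts that identify which Python the kernel is running."""
    prefix = sys.prefix
    return {
        "executable": sys.executable,
        "prefix": prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Heuristic: conda environments contain a conda-meta directory.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the executable path points outside the Anaconda installation, the notebook is running on a different kernel than expected.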


回答 2

import sys
print(sys.executable)
print(sys.version)
print(sys.version_info)

见下文:-当我在CONDA venv之外运行JupyterNotebook时的输出

/home/dhankar/anaconda2/bin/python
2.7.12 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:42:40) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)
 

当我在使用命令创建的CONDA Venv中运行相同的JupyterNoteBook时看到以下内容-

conda create -n py35 python=3.5 ## Here - py35 , is name of my VENV

在我的Jupyter笔记本中打印:

/home/dhankar/anaconda2/envs/py35/bin/python
3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:53:06) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
sys.version_info(major=3, minor=5, micro=2, releaselevel='final', serial=0)

另外,如果您已经使用不同版本的Python创建了各种VENV,则可以通过从JupyterNotebook菜单中选择KERNEL >> CHANGE KERNEL切换到所需的内核… JupyterNotebookScreencapture

还要在现有的CONDA虚拟环境中安装ipykernel-

http://ipython.readthedocs.io/en/stable/install/kernel_install.html#kernels-for-different-environments

来源-https ://github.com/jupyter/notebook/issues/1524

 $ /path/to/python -m  ipykernel install --help
 usage: ipython-kernel-install [-h] [--user] [--name NAME]
                          [--display-name DISPLAY_NAME]
                          [--profile PROFILE] [--prefix PREFIX]
                          [--sys-prefix]

安装IPython内核规范。

可选参数:-h,–help显示此帮助消息并退出–user为当前用户而不是系统范围内安装–name NAME指定kernelspec的名称。需要同时具有多个IPython内核。–display-name DISPLAY_NAME指定kernelspec的显示名称。当您有多个IPython内核时,这将很有帮助。–profile PROFILE指定要加载的IPython配置文件。这可以用来创建内核的自定义版本。–prefix PREFIX为kernelspec指定安装前缀。需要将其安装到非默认位置,例如conda / virtual-env。–sys-prefix安装到Python的sys.prefix。–prefix =’/ Users / bussonniermatthias / anaconda’的简写。用于conda / virtual-envs。

 import sys
 print(sys.executable)
 print(sys.version)
 print(sys.version_info)

Below is the output when I run Jupyter Notebook outside a conda environment:

/home/dhankar/anaconda2/bin/python
2.7.12 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:42:40) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)

Below is what I see when I run the same notebook inside a conda environment created with the command:

conda create -n py35 python=3.5 ## Here - py35 , is name of my VENV

In my Jupyter Notebook it prints:

/home/dhankar/anaconda2/envs/py35/bin/python
3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:53:06) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
sys.version_info(major=3, minor=5, micro=2, releaselevel='final', serial=0)

Also, if you already have several environments created with different versions of Python, you can switch to the desired kernel by choosing Kernel >> Change kernel from the Jupyter Notebook menu.

To install ipykernel within an existing conda environment, see:

http://ipython.readthedocs.io/en/stable/install/kernel_install.html#kernels-for-different-environments

Source — https://github.com/jupyter/notebook/issues/1524

 $ /path/to/python -m  ipykernel install --help
 usage: ipython-kernel-install [-h] [--user] [--name NAME]
                          [--display-name DISPLAY_NAME]
                          [--profile PROFILE] [--prefix PREFIX]
                          [--sys-prefix]

Install the IPython kernel spec.

optional arguments:
  -h, --help            show this help message and exit
  --user                Install for the current user instead of system-wide
  --name NAME           Specify a name for the kernelspec. This is needed to
                        have multiple IPython kernels at the same time.
  --display-name DISPLAY_NAME
                        Specify the display name for the kernelspec. This is
                        helpful when you have multiple IPython kernels.
  --profile PROFILE     Specify an IPython profile to load. This can be used
                        to create custom versions of the kernel.
  --prefix PREFIX       Specify an install prefix for the kernelspec. This is
                        needed to install into a non-default location, such as
                        a conda/virtual-env.
  --sys-prefix          Install to Python's sys.prefix. Shorthand for
                        --prefix='/Users/bussonniermatthias/anaconda'. For use
                        in conda/virtual-envs.


回答 3

假设您的后端系统错误,则可以kernel通过kernel.jsonkernelsjupyter数据路径的文件夹中创建新的或编辑现有的后端来更改后端jupyter --paths。您可以有多个内核(R,Python2,Python3(+ virtualenvs),Haskell),例如,可以创建一个Anaconda特定的内核:

$ <anaconda-path>/bin/python3 -m ipykernel install --user --name anaconda --display-name "Anaconda"

应该创建一个新内核:

<jupyter-data-dir>/kernels/anaconda/kernel.json

{
    "argv": [ "<anaconda-path>/bin/python3", "-m", "ipykernel", "-f", "{connection_file}" ],
    "display_name": "Anaconda",
    "language": "python"
}

您需要确保ipykernel在anaconda发行版中安装了软件包。

这样,您就可以在内核之间切换,并使用不同的内核使用不同的笔记本。

Assuming the wrong backend is being used, you can change the kernel by creating a new kernel.json, or editing the existing one, in the kernels folder of your Jupyter data path (run jupyter --paths to list the data paths). You can have multiple kernels (R, Python2, Python3 (+virtualenvs), Haskell); for example, you can create an Anaconda-specific kernel:

$ <anaconda-path>/bin/python3 -m ipykernel install --user --name anaconda --display-name "Anaconda"

This should create a new kernel spec:

<jupyter-data-dir>/kernels/anaconda/kernel.json

{
    "argv": [ "<anaconda-path>/bin/python3", "-m", "ipykernel", "-f", "{connection_file}" ],
    "display_name": "Anaconda",
    "language": "python"
}

You need to ensure ipykernel package is installed in the anaconda distribution.

This way you can just switch between kernels and have different notebooks using different kernels.
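
For readers who prefer to see the file layout spelled out, the sketch below writes a kernel.json with the same fields as the one shown above. It writes into a temporary directory rather than the real Jupyter data path, and the function name write_kernel_spec and the interpreter path /opt/anaconda3/bin/python3 are placeholders for this example. In practice the ipykernel install command above is the safer route.

```python
import json
import os
import tempfile


def write_kernel_spec(kernels_dir, name, python_path, display_name):
    """Create <kernels_dir>/<name>/kernel.json and return its path."""
    spec_dir = os.path.join(kernels_dir, name)
    os.makedirs(spec_dir, exist_ok=True)
    spec = {
        # Same argv layout as the kernel.json shown in the answer.
        "argv": [python_path, "-m", "ipykernel", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_path = os.path.join(spec_dir, "kernel.json")
    with open(spec_path, "w") as fh:
        json.dump(spec, fh, indent=4)
    return spec_path


# Demo in a throwaway directory; the interpreter path is a placeholder.
demo_dir = tempfile.mkdtemp()
spec_path = write_kernel_spec(
    demo_dir, "anaconda", "/opt/anaconda3/bin/python3", "Anaconda"
)
with open(spec_path) as fh:
    loaded = json.load(fh)
print(spec_path)
```

The directory name ("anaconda" here) is the kernel name, while display_name is what appears in the notebook's kernel menu.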


回答 4

为Jupyter Notebook创建虚拟环境

最小的Python安装是

sudo apt install python3.7 python3.7-venv python3.7-minimal python3.7-distutils python3.7-dev python3.7-gdbm python3-gdbm-dbg python3-pip

然后您可以创建和使用环境

/usr/bin/python3.7 -m venv test
cd test
source bin/activate  # already inside test/ after the cd above
pip install jupyter matplotlib seaborn numpy pandas scipy
# install other packages you need with pip/apt
jupyter notebook
deactivate

您可以使用以下命令为Jupyter创建内核

ipython3 kernel install --user --name=test

Creating a virtual environment for Jupyter Notebooks

A minimal Python install is

sudo apt install python3.7 python3.7-venv python3.7-minimal python3.7-distutils python3.7-dev python3.7-gdbm python3-gdbm-dbg python3-pip

Then you can create and use the environment

/usr/bin/python3.7 -m venv test
cd test
source bin/activate  # already inside test/ after the cd above
pip install jupyter matplotlib seaborn numpy pandas scipy
# install other packages you need with pip/apt
jupyter notebook
deactivate

You can make a kernel for Jupyter with

ipython3 kernel install --user --name=test
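
Once the kernel is registered, a quick check from inside a notebook cell can confirm that the kernel really runs in the virtual environment. This is a minimal sketch: in a venv-style environment sys.prefix points at the environment while sys.base_prefix points at the base installation. The helper name in_virtualenv is made up for the example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix is the environment directory, while
    # sys.base_prefix still refers to the base Python installation.
    return sys.prefix != sys.base_prefix


print("executable:", sys.executable)
print("inside a venv:", in_virtualenv())
```

If this prints False in a notebook that was supposed to use the test environment, the notebook is probably attached to a different kernel.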

In Ipython notebook / Jupyter, Pandas is not displaying the graph I try to plot

Question: In Ipython notebook / Jupyter, Pandas is not displaying the graph I try to plot

我正在尝试使用Ipython Notebook中的熊猫绘制一些数据,尽管它给了我对象,但实际上并没有绘制图形本身。所以看起来像这样:

In [7]:

pledge.Amount.plot()

Out[7]:

<matplotlib.axes.AxesSubplot at 0x9397c6c>

该图应在此之后,但根本不会出现。我已经导入了matplotlib,所以这不是问题。我还需要导入其他模块吗?

I am trying to plot some data using pandas in Ipython Notebook, and while it gives me the object, it doesn’t actually plot the graph itself. So it looks like this:

In [7]:

pledge.Amount.plot()

Out[7]:

<matplotlib.axes.AxesSubplot at 0x9397c6c>

The graph should follow after that, but it simply doesn’t appear. I have imported matplotlib, so that’s not the problem. Is there any other module I need to import?


回答 0

请注意,–pylab已被弃用,并且已从较新的IPython版本中删除。建议在IPython Notebook中启用内联绘图的方法现已运行:

%matplotlib inline
import matplotlib.pyplot as plt

有关更多详细信息,请参阅ipython-dev邮件列表中的这篇文章

Note that --pylab is deprecated and has been removed from newer builds of IPython. The recommended way to enable inline plotting in the IPython Notebook is now to run:

%matplotlib inline
import matplotlib.pyplot as plt

See this post from the ipython-dev mailing list for more details.


回答 1

编辑:Pylab已被弃用,请参阅当前接受的答案

好的,看来答案是使用–pylab = inline启动ipython Notebook。因此,ipython notebook –pylab = inline可以完成我之前看到的以及我想要它做的事情。对不起这个原始的问题。

Edit: Pylab has been deprecated; please see the current accepted answer.

OK, it seems the answer is to start ipython notebook with --pylab=inline, that is, ipython notebook --pylab=inline. This makes it do what I saw earlier and what I wanted it to do. Sorry about the vague original question.


回答 2

与您import matplotlib.pyplot as plt只需添加

plt.show()

它将显示所有存储的图。

With your import matplotlib.pyplot as plt just add

plt.show()

and it will show all stored plots.


回答 3

导入matplotlib之后很简单,如果像这样启动ipython,就可以执行一个魔术

ipython notebook 

%matplotlib inline 

运行此命令,一切都会完美显示

It is simple: after importing matplotlib, you have to execute one magic command. If you started IPython like this:

ipython notebook

then run the following in a cell:

%matplotlib inline

After running this command, everything will be shown correctly.


回答 4

使用来启动ipython ipython notebook --pylab inline,然后图形将内联显示。

Start IPython with ipython notebook --pylab inline, and the graph will then show inline.


回答 5

import matplotlib.pyplot as plt
%matplotlib inline

回答 6

您需要做的就是导入 matplotlib。

import matplotlib.pyplot as plt 

All you need to do is to import matplotlib.

import matplotlib.pyplot as plt 

Show all dataframe columns in a Jupyter Python Notebook

Question: Show all dataframe columns in a Jupyter Python Notebook

我想在Jupyter Notebook的数据框中显示所有列。Jupyter显示一些列,并在最后一列中添加点,如下图所示:

如何显示所有列?

I want to show all columns in a dataframe in a Jupyter Notebook. Jupyter shows some of the columns and adds dots to the last columns like in the following picture:

How can I display all columns?


回答 0

尝试如下显示max_columns设置:

import pandas as pd
from IPython.display import display

df = pd.read_csv("some_data.csv")
pd.options.display.max_columns = None
display(df)

要么

pd.set_option('display.max_columns', None)

编辑:熊猫0.11.0向后

不建议使用此功能,但在低于0.11.0的Pandas版本中,该max_columns设置指定如下:

pd.set_printoptions(max_columns=500)

Try the display max_columns setting as follows:

import pandas as pd
from IPython.display import display

df = pd.read_csv("some_data.csv")
pd.options.display.max_columns = None
display(df)

Or

pd.set_option('display.max_columns', None)

Edit: Pandas versions before 0.11.0

This is deprecated but in versions of Pandas older than 0.11.0 the max_columns setting is specified as follows:

pd.set_printoptions(max_columns=500)

回答 1

我知道这个问题有点老了,但是以下内容在运行pandas 0.22.0和Python 3的Jupyter Notebook中为我工作:

import pandas as pd
pd.set_option('display.max_columns', <number of columns>)

您也可以对行执行相同的操作:

pd.set_option('display.max_rows', <number of rows>)

这样可以节省导入IPython的时间,并且pandas.set_option文档中还有更多选项:https ://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.set_option.html

I know this question is a little old but the following worked for me in a Jupyter Notebook running pandas 0.22.0 and Python 3:

import pandas as pd
pd.set_option('display.max_columns', <number of columns>)

You can do the same for the rows too:

pd.set_option('display.max_rows', <number of rows>)

This saves importing IPython, and there are more options in the pandas.set_option documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.set_option.html


回答 2

适用于大型(但不是太大)DataFrame的Python 3.x

也许是因为我有较旧版本的熊猫,但在Jupyter笔记本上,这对我有用

import pandas as pd
from IPython.display import display, HTML

df = pd.read_pickle('Data1')
display(HTML(df.to_html()))

Python 3.x for large (but not too large) DataFrames

Maybe it is because I have an older version of pandas, but in a Jupyter notebook this works for me:

import pandas as pd
from IPython.display import display, HTML

df = pd.read_pickle('Data1')
display(HTML(df.to_html()))

回答 3

我建议在上下文管理器中设置显示选项,以使其仅影响单个输出。如果您还想打印“漂亮”的html版本,我将定义一个函数并df使用force_show_all(df)以下命令显示数据框:

import pandas as pd
from IPython.display import display, HTML

def force_show_all(df):
    with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
        display(HTML(df.to_html()))

正如其他人提到的那样,请谨慎地仅在合理大小的数据帧上调用它。

I recommend setting the display options inside a context manager so that it only affects a single output. If you also want to print a “pretty” html-version, I would define a function and display the dataframe df using force_show_all(df):

import pandas as pd
from IPython.display import display, HTML

def force_show_all(df):
    with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', None):
        display(HTML(df.to_html()))

As others have mentioned, be cautious to only call this on a reasonably-sized dataframe.
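
To illustrate why the context manager is attractive, the sketch below uses a made-up 30-column DataFrame and plain repr() instead of HTML output. It shows the option taking effect inside the with block and being restored afterwards; the column names and sizes are arbitrary.

```python
import pandas as pd

# Hypothetical wide frame: 30 columns named col_0 ... col_29.
df = pd.DataFrame({f"col_{i}": range(3) for i in range(30)})

before = pd.get_option("display.max_columns")

# Inside the block, no column limit applies to the text representation.
with pd.option_context("display.max_columns", None, "display.width", None):
    wide_text = repr(df)

# With a small limit, the middle columns are elided.
with pd.option_context("display.max_columns", 5):
    narrow_text = repr(df)

after = pd.get_option("display.max_columns")
print(before == after)  # the global setting is untouched
```

Because the setting is restored on exit, other DataFrames displayed later in the notebook keep the default truncation.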


回答 4

您可以使用pandas.set_option()作为列,您可以指定以下任何选项

pd.set_option("display.max_rows", 200)
pd.set_option("display.max_columns", 100)
pd.set_option("display.max_colwidth", 200)

对于完整的打印列,您可以这样使用

import pandas as pd
pd.set_option('display.max_colwidth', None)  # None means no limit in pandas 1.0+; older versions used -1
print(words.head())  # words is your DataFrame

You can use pandas.set_option(). For rows and columns, you can set any of these options:

pd.set_option("display.max_rows", 200)
pd.set_option("display.max_columns", 100)
pd.set_option("display.max_colwidth", 200)

To print column contents in full, you can use:

import pandas as pd
pd.set_option('display.max_colwidth', None)  # None means no limit in pandas 1.0+; older versions used -1
print(words.head())  # words is your DataFrame


回答 5

如果要显示所有行设置如下

pd.options.display.max_rows = None

如果要显示所有列,例如波纹管

pd.options.display.max_columns = None

If you want to show all rows, use the setting below:

pd.options.display.max_rows = None

If you want to show all columns, use the setting below:

pd.options.display.max_columns = None

Is there an equivalent of CTRL+C in IPython Notebook in Firefox to break cells that are running?

Question: Is there an equivalent of CTRL+C in IPython Notebook in Firefox to break cells that are running?

我已经开始使用IPython Notebook并很喜欢它。有时,我编写需要占用大量内存或存在无限循环的错误代码。我发现“中断内核”选项缓慢或不可靠,有时我不得不重新启动内核,从而丢失了内存中的所有内容。

有时我还会编写一些脚本,导致O​​S X内存不足,并且必须进行硬重启。我肯定不是100%,但如果我写这样的错误报告之前,并在终端运行的Python,我平时可以CTRL+ C我的脚本。

我在Mac OS X上使用IPython Notebook的Anaconda发行版和Firefox。

I’ve started to use the IPython Notebook and am enjoying it. Sometimes, I write buggy code that takes massive memory requirements or has an infinite loop. I find the “interrupt kernel” option sluggish or unreliable, and sometimes I have to restart the kernel, losing everything in memory.

I also sometimes write scripts that cause OS X to run out of memory, and I have to do a hard reboot. I’m not 100% sure, but when I’ve written bugs like this before and ran Python in the terminal, I can usually CTRL+C my scripts.

I am using the Anaconda distribution of IPython notebook with Firefox on Mac OS X.


回答 0

我可能是错的,但我敢肯定的是,“中断内核”按钮,只需发送一个SIGINT信号到代码,您当前运行(这种想法是费尔南多的评论支持在这里,这是相同的东西,击打) CTRL + C可以。python中的某些进程比其他进程更突然地处理SIGINT。

如果您迫切需要停止iPython Notebook中正在运行的内容,并从终端启动iPython Notebook,则可以在该终端中按两次CTRL + C来中断整个iPython Notebook服务器。这将完全停止iPython Notebook,这意味着将无法重新启动或保存您的工作,因此,这显然不是一个很好的解决方案(您需要按CTRL + C两次,因为这是一项安全功能,因此人们无需意外地做)。但是,在紧急情况下,它通常比“中断内核”按钮更快地终止进程。

I could be wrong, but I’m pretty sure that the “interrupt kernel” button just sends a SIGINT signal to the code that you’re currently running (this idea is supported by Fernando’s comment here), which is the same thing that hitting CTRL+C would do. Some processes within python handle SIGINTs more abruptly than others.

If you desperately need to stop something that is running in iPython Notebook and you started iPython Notebook from a terminal, you can hit CTRL+C twice in that terminal to interrupt the entire iPython Notebook server. This will stop iPython Notebook altogether, which means it won’t be possible to restart or save your work, so this is obviously not a great solution (you need to hit CTRL+C twice because it’s a safety feature so that people don’t do it by accident). In case of emergency, however, it generally kills the process more quickly than the “interrupt kernel” button.
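
As a rough sketch of what a SIGINT looks like from the Python side: the default handler turns the signal into a KeyboardInterrupt, which long-running code can catch to stop cleanly and keep its partial results. The function long_computation and its interrupt_at parameter are invented for this example, and the process sends SIGINT to itself with os.kill to stand in for the "interrupt kernel" button. It assumes a POSIX system and that the code runs in the main thread.

```python
import os
import signal

# Make sure the default handler (which raises KeyboardInterrupt) is installed.
signal.signal(signal.SIGINT, signal.default_int_handler)


def long_computation(n_steps, interrupt_at=None):
    """Square numbers in a loop, returning whatever was done when interrupted."""
    results = []
    try:
        for step in range(n_steps):
            if step == interrupt_at:
                # Stand-in for pressing "interrupt kernel" or CTRL+C.
                os.kill(os.getpid(), signal.SIGINT)
            results.append(step * step)
    except KeyboardInterrupt:
        pass  # stop early but keep the partial results
    return results


partial = long_computation(10, interrupt_at=4)
print(len(partial), "of 10 steps completed before the interrupt")
```

Code that spends a long time inside a single call into a compiled extension may not see the KeyboardInterrupt until that call returns, which is one reason the interrupt can feel sluggish.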


回答 1

您可以按I两次以中断内核。

仅当您处于命令模式时,这才有效。如果尚未启用,请按Esc启用它。

You can press I twice to interrupt the kernel.

This only works if you’re in Command mode. If not already enabled, press Esc to enable it.


回答 2

是IPython Notebook的快捷方式。

Ctrl-m i中断内核。(即后面的唯一字母i Ctrl-m

根据这个答案,I两次也可以。

Here are shortcuts for the IPython Notebook.

Ctrl-m i interrupts the kernel. (that is, the sole letter i after Ctrl-m)

According to this answer, I twice works as well.


回答 3

添加到上面的内容:如果中断不起作用,则可以重新启动内核。

转到内核下拉菜单>>重新启动>>重新启动并清除输出。这通常可以解决问题。如果仍然无法解决问题,请在终端(或任务管理器)中终止内核,然后重新启动。

中断不适用于所有进程。使用R内核时,我尤其遇到这个问题。

To add to the above: If interrupt is not working, you can restart the kernel.

Go to the kernel dropdown >> restart >> restart and clear output. This usually does the trick. If this still doesn’t work, kill the kernel in the terminal (or task manager) and then restart.

Interrupt doesn’t work well for all processes. I especially have this problem using the R kernel.


回答 4

更新:将我的解决方案变成了独立的python脚本。

此解决方案为我节省了不止一次。希望其他人觉得它有用。该python脚本将查找使用不止cpu_thresholdCPU的jupyter内核,并提示用户将a发送SIGINT给内核(KeyboardInterrupt)。它将一直发送,SIGINT直到内核的cpu使用率低于为止cpu_threshold。如果存在多个行为异常的内核,它将提示用户中断每个内核(按CPU使用率从高到低的顺序排列)。非常感谢gcbeltramini编写了使用jupyter api查找jupyter内核名称的代码。该脚本已经在python3的MACOS上进行了测试,并且需要jupyter笔记本,请求,json和psutil。

将脚本放在您的主目录中,然后用法如下所示:

python ~/interrupt_bad_kernels.py
Interrupt kernel chews cpu.ipynb; PID: 57588; CPU: 2.3%? (y/n) y

下面的脚本代码:

from os import getpid, kill
from time import sleep
import re
import signal

from notebook.notebookapp import list_running_servers
from requests import get
from requests.compat import urljoin
import ipykernel
import json
import psutil


def get_active_kernels(cpu_threshold):
    """Get a list of active jupyter kernels."""
    active_kernels = []
    pids = psutil.pids()
    my_pid = getpid()

    for pid in pids:
        if pid == my_pid:
            continue
        try:
            p = psutil.Process(pid)
            cmd = p.cmdline()
            for arg in cmd:
                if arg.count('ipykernel'):
                    cpu = p.cpu_percent(interval=0.1)
                    if cpu > cpu_threshold:
                        active_kernels.append((cpu, pid, cmd))
        except (psutil.AccessDenied, psutil.NoSuchProcess):  # also skip processes that exited mid-scan
            continue
    return active_kernels


def interrupt_bad_notebooks(cpu_threshold=0.2):
    """Interrupt active jupyter kernels. Prompts the user for each kernel."""

    active_kernels = sorted(get_active_kernels(cpu_threshold), reverse=True)

    servers = list_running_servers()
    for ss in servers:
        response = get(urljoin(ss['url'].replace('localhost', '127.0.0.1'), 'api/sessions'),
                       params={'token': ss.get('token', '')})
        for nn in json.loads(response.text):
            for kernel in active_kernels:
                for arg in kernel[-1]:
                    if arg.count(nn['kernel']['id']):
                        pid = kernel[1]
                        cpu = kernel[0]
                        interrupt = input(
                            'Interrupt kernel {}; PID: {}; CPU: {}%? (y/n) '.format(nn['notebook']['path'], pid, cpu))
                        if interrupt.lower() == 'y':
                            p = psutil.Process(pid)
                            while p.cpu_percent(interval=0.1) > cpu_threshold:
                                kill(pid, signal.SIGINT)
                                sleep(0.5)

if __name__ == '__main__':
    interrupt_bad_notebooks()

UPDATE: Turned my solution into a stand-alone python script.

This solution has saved me more than once. Hopefully others find it useful. This Python script finds any Jupyter kernel using more than cpu_threshold CPU and prompts the user to send a SIGINT to the kernel (KeyboardInterrupt). It keeps sending SIGINT until the kernel’s CPU usage goes below cpu_threshold. If there are multiple misbehaving kernels, it prompts the user to interrupt each of them (ordered from highest CPU usage to lowest). A big thanks goes to gcbeltramini for writing code to find the name of a Jupyter kernel using the Jupyter API. This script was tested on macOS with Python 3 and requires the Jupyter notebook package, requests, and psutil (json is part of the standard library).

Put the script in your home directory and then usage looks like:

python ~/interrupt_bad_kernels.py
Interrupt kernel chews cpu.ipynb; PID: 57588; CPU: 2.3%? (y/n) y

Script code below:

from os import getpid, kill
from time import sleep
import re
import signal

from notebook.notebookapp import list_running_servers
from requests import get
from requests.compat import urljoin
import ipykernel
import json
import psutil


def get_active_kernels(cpu_threshold):
    """Get a list of active jupyter kernels."""
    active_kernels = []
    pids = psutil.pids()
    my_pid = getpid()

    for pid in pids:
        if pid == my_pid:
            continue
        try:
            p = psutil.Process(pid)
            cmd = p.cmdline()
            for arg in cmd:
                if arg.count('ipykernel'):
                    cpu = p.cpu_percent(interval=0.1)
                    if cpu > cpu_threshold:
                        active_kernels.append((cpu, pid, cmd))
        except (psutil.AccessDenied, psutil.NoSuchProcess):  # also skip processes that exited mid-scan
            continue
    return active_kernels


def interrupt_bad_notebooks(cpu_threshold=0.2):
    """Interrupt active jupyter kernels. Prompts the user for each kernel."""

    active_kernels = sorted(get_active_kernels(cpu_threshold), reverse=True)

    servers = list_running_servers()
    for ss in servers:
        response = get(urljoin(ss['url'].replace('localhost', '127.0.0.1'), 'api/sessions'),
                       params={'token': ss.get('token', '')})
        for nn in json.loads(response.text):
            for kernel in active_kernels:
                for arg in kernel[-1]:
                    if arg.count(nn['kernel']['id']):
                        pid = kernel[1]
                        cpu = kernel[0]
                        interrupt = input(
                            'Interrupt kernel {}; PID: {}; CPU: {}%? (y/n) '.format(nn['notebook']['path'], pid, cpu))
                        if interrupt.lower() == 'y':
                            p = psutil.Process(pid)
                            while p.cpu_percent(interval=0.1) > cpu_threshold:
                                kill(pid, signal.SIGINT)
                                sleep(0.5)

if __name__ == '__main__':
    interrupt_bad_notebooks()
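
The matching step in the script (checking whether a kernel id from api/sessions appears in a process command line) can be looked at in isolation. The sketch below assumes the usual Jupyter convention that a kernel is started with a connection file named kernel-<id>.json. The helper kernel_id_from_cmdline and the example command line are made up for illustration.

```python
import re

# Connection files are conventionally named kernel-<id>.json.
KERNEL_FILE = re.compile(r"kernel-([0-9a-fA-F-]+)\.json$")


def kernel_id_from_cmdline(cmd):
    """Return the kernel id found in a process command line, or None."""
    # Only consider processes that look like IPython kernels.
    if not any("ipykernel" in arg for arg in cmd):
        return None
    for arg in cmd:
        match = KERNEL_FILE.search(arg)
        if match:
            return match.group(1)
    return None


# Hypothetical command line of a running kernel process.
example_cmd = [
    "/usr/bin/python3", "-m", "ipykernel_launcher", "-f",
    "/run/user/1000/jupyter/kernel-3f2a9c1e-7b64-4d1a-9c0e-5a1b2c3d4e5f.json",
]
print(kernel_id_from_cmdline(example_cmd))
```

Extracting the id once per process would let the script build a dictionary from kernel id to PID instead of scanning every argument for every session.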