Tag archive: google-colaboratory

How to prevent Google Colab from disconnecting?

Q: Is there any way to programmatically prevent Google Colab from disconnecting on a timeout?

The following describes the conditions causing a notebook to automatically disconnect:

Google Colab notebooks have an idle timeout of 90 minutes and an absolute timeout of 12 hours. This means that if the user does not interact with the notebook for more than 90 minutes, its instance is automatically terminated. In addition, the maximum lifetime of a Colab instance is 12 hours.
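
The two limits can be combined into a single deadline. The sketch below only illustrates the rule as quoted in the question (90-minute idle timeout, 12-hour absolute timeout); the function name `session_deadline` and its parameters are hypothetical, and the limits Colab actually enforces may differ.

```python
from datetime import datetime, timedelta

# Limits as quoted in the question; treat them as assumptions, not guarantees.
IDLE_TIMEOUT = timedelta(minutes=90)
ABSOLUTE_TIMEOUT = timedelta(hours=12)


def session_deadline(started_at: datetime, last_interaction: datetime) -> datetime:
    """Return the earliest moment at which the instance would be terminated."""
    idle_deadline = last_interaction + IDLE_TIMEOUT
    absolute_deadline = started_at + ABSOLUTE_TIMEOUT
    return min(idle_deadline, absolute_deadline)


start = datetime(2020, 1, 1, 8, 0)
# No interaction since the start: the idle timeout is reached first.
print(session_deadline(start, start))
# Interaction late in the session: the absolute cap is reached first.
print(session_deadline(start, datetime(2020, 1, 1, 19, 0)))
```

Whichever deadline comes first wins, which is why simulated interaction can at best stretch a session to the absolute limit.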

Naturally, we want to automatically squeeze the maximum out of the instance, without having to manually interact with it constantly. Here I will assume commonly seen system requirements:

  • Ubuntu 18 LTS / Windows 10 / Mac Operating systems
  • In case of Linux-based systems, using popular DEs like Gnome 3 or Unity
  • Firefox or Chromium browsers

I should point out here that such behavior does not violate Google Colab’s Terms of Use, although it is not encouraged according to their FAQ (in short: morally it is not okay to use up all of the GPUs if you don’t really need it).


My current solution is very dumb:

  • First, I turn the screensaver off, so my screen is always on.
  • I have an Arduino board, so I just turned it into a Rubber Ducky USB device and made it emulate primitive user interaction while I sleep (just because I have it at hand for other use cases).

Are there better ways?


Answer 0

Edit: Apparently the solution is very easy, and doesn’t need any JavaScript. Just create a new cell at the bottom having the following line:

while True:pass

Now keep the cell in the run sequence so that the infinite loop won’t stop, thus keeping your session alive.
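
A busy `while True: pass` loop keeps a CPU core spinning. A gentler variant, assuming (as this answer claims) that any running cell is enough to keep the session alive, sleeps between iterations and stops after a fixed budget. The name `keep_alive` and its parameters are hypothetical; this is a sketch, not a tested Colab recipe.

```python
import time


def keep_alive(max_seconds, poll_seconds=60, sleep=time.sleep, clock=time.monotonic):
    """Block for roughly max_seconds, sleeping instead of spinning.

    `sleep` and `clock` are injectable so the loop can be exercised
    without actually waiting. Returns the number of sleeps performed.
    """
    deadline = clock() + max_seconds
    ticks = 0
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return ticks
        # Never sleep past the deadline.
        sleep(min(poll_seconds, remaining))
        ticks += 1
```

In a notebook this would go in the last cell, for example `keep_alive(11 * 60 * 60)`. Like the original loop, it blocks any cell queued after it.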

Old method: Set a JavaScript interval to click the connect button every 60 seconds. Open the developer tools in your web browser with Ctrl+Shift+I (on Mac, Option+Command+I), click the Console tab, and enter this at the console prompt:

function ConnectButton(){
    console.log("Connect pushed"); 
    document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click() 
}
setInterval(ConnectButton,60000);

Answer 1

Since the id of the connect button is now changed to “colab-connect-button”, the following code can be used to keep clicking on the button.

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("colab-connect-button").click()
}
setInterval(ClickConnect,60000)

If this still doesn’t work, follow the steps below:

  1. Right-click the connect button (at the top right of the Colab page)
  2. Click Inspect
  3. Get the HTML id of the button and substitute it into the following code
function ClickConnect(){
    console.log("Clicked on connect button");
    document.querySelector("#Put-ID-here").click() // Replace Put-ID-here with the button's id; keep the leading # for an id selector
}
setInterval(ClickConnect,60000)

Answer 2

Well, this is working for me:

Run the following code in the console and it will prevent you from being disconnected. Press Ctrl + Shift + I to open the inspector view, then go to the console.

function ClickConnect(){
    console.log("Working"); 
    document.querySelector("colab-toolbar-button#connect").click() 
}
setInterval(ClickConnect,60000)

How to prevent google colab from disconnecting


Answer 3

For me the following examples:

  • document.querySelector("#connect").click() or
  • document.querySelector("colab-toolbar-button#connect").click() or
  • document.querySelector("colab-connect-button").click()

were throwing errors.

I had to adapt them to the following:

Version 1:

function ClickConnect(){
  console.log("Connect Clicked - Start");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
  console.log("Connect Clicked - End");
};
setInterval(ClickConnect, 60000)

Version 2: If you would like to be able to stop the function, here is the new code:

var startClickConnect = function startClickConnect(){
    var clickConnect = function clickConnect(){
        console.log("Connect Clicked - Start");
        document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
        console.log("Connect Clicked - End");
    };

    // Keep the interval id so the timer can be cancelled later.
    var intervalId = setInterval(clickConnect, 60000);

    var stopClickConnectHandler = function stopClickConnect() {
        console.log("Connect Clicked Stopped - Start");
        clearInterval(intervalId);
        console.log("Connect Clicked Stopped - End");
    };

    return stopClickConnectHandler;
};

var stopClickConnect = startClickConnect();

In order to stop, call:

stopClickConnect();

Answer 4

Create a Python script on your PC using pynput:

from pynput.mouse import Button, Controller
import time

mouse = Controller()

while True:
    mouse.click(Button.left, 1)
    time.sleep(30)

Run this code on your desktop, then point the mouse at any directory in the directory structure of Colab’s left panel (the Files section). The code will click on that directory every 30 seconds, so it will expand and collapse every 30 seconds and your session will not expire. Important: you have to run this code on your own PC.


Answer 5

Instead of clicking the connect button, I just click the comment button to keep my session alive (August 2020).

function ClickConnect(){
    console.log("Working");
    document.querySelector("#comments > span").click()
}
setInterval(ClickConnect,5000)

Answer 6

I use a Macro Program to periodically click on the RAM/Disk button to train the model all night. The trick is to configure a macro program to click on the Ram/Disk Colab Toolbar Button twice with a short interval between the two clicks so that even if the Runtime gets disconnected it will reconnect back. (the first click used to close the dialog box and the second click used to RECONNECT). However, you still have to leave your laptop open all night and maybe pin the Colab tab.


Answer 7

The answers above, which rely on scripts, may work well. I have a solution (or rather a trick) for that annoying disconnection that needs no scripts. It is especially useful when your program must read data from your Google Drive, as when training a deep learning model. In that case a reconnect script is of no use: once you are disconnected from Colab the program is dead, and you have to connect to your Google Drive again by hand before your model can read the dataset, which the scripts will not do.
I’ve tested it many times and it works well.
When you run a program on the Colab page in a browser (I use Chrome), do not do anything in the browser once the program starts running: do not switch to other web pages, open or close another page, and so on. Just leave it alone and wait for the program to finish. You can switch to other software, such as PyCharm, to keep writing code, but not to another web page. I don’t know why opening, closing, or switching to other pages causes connection problems for the Colab page, but each time I disturb my browser, for example to do some searching, my connection to Colab soon breaks.


Answer 8

Try this:

function ClickConnect(){
  console.log("Working"); 
  document
    .querySelector("#top-toolbar > colab-connect-button")
    .shadowRoot
    .querySelector("#connect")
    .click()
}

setInterval(ClickConnect,60000)

Answer 9

Using Python with Selenium:

from selenium.webdriver.common.keys import Keys
from selenium import webdriver
import time   

driver = webdriver.Chrome('/usr/lib/chromium-browser/chromedriver')

notebook_url = ''
driver.get(notebook_url)

# run all cells
driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.F9)
time.sleep(5)

# click to stay connected
start_time = time.time()
current_time = time.time()
max_time = 11*59*60  # 38,940 s, about 10.8 hours (under the 12-hour limit)

while (current_time - start_time) < max_time:
    webdriver.ActionChains(driver).send_keys(Keys.ESCAPE).perform()
    driver.find_element_by_xpath('//*[@id="top-toolbar"]/colab-connect-button').click()
    time.sleep(30)
    current_time = time.time()
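
The expression `11*59*60` in the loop above works out to 38,940 seconds, which is about 10.8 hours rather than 12. If the intent is to stop shortly before the 12-hour cap, the budget can be spelled out with `timedelta`. The 10-minute safety margin below is an arbitrary assumption, not something the answer specifies.

```python
from datetime import timedelta

# Value used in the answer above.
original_budget = 11 * 59 * 60
print(original_budget, round(original_budget / 3600, 2))

# Hypothetical alternative: 12 hours minus an assumed 10-minute margin.
max_time = (timedelta(hours=12) - timedelta(minutes=10)).total_seconds()
print(max_time)
```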

Answer 10

I don’t believe the JavaScript solutions work anymore. I was doing it from within my notebook with:

    from IPython.display import display, HTML
    js = ('<script>function ConnectButton(){ '
           'console.log("Connect pushed"); '
           'document.querySelector("#connect").click()} '
           'setInterval(ConnectButton,3000);</script>')
    display(HTML(js))

When you first do a Run all (before the JavaScript or Python code has started), the console displays:

Connected to 
wss://colab.research.google.com/api/kernels/0e1ce105-0127-4758-90e48cf801ce01a3/channels?session_id=5d8...

However, every time the JavaScript runs, you see the console.log portion, but the click portion simply gives:

Connect pushed

Uncaught TypeError: Cannot read property 'click' of null
 at ConnectButton (<anonymous>:1:92)

Others suggested the button name has changed to #colab-connect-button, but that gives same error.

After the runtime is started, the button changes to show RAM/DISK, and a drop-down is presented. Clicking the drop-down creates a new <DIV class=goog menu...> that was not in the DOM previously, with two options: “Connect to hosted runtime” and “Connect to local runtime”. If the console window is open and showing elements, you can see this DIV appear when you click the drop-down element. Simply moving the mouse focus between the two options in the new window adds additional elements to the DOM; as soon as the mouse loses focus, they are removed from the DOM completely, even without clicking.


Answer 11

I tried the codes above but they did not work for me. So here is my JS code for reconnecting.

let interval = setInterval(function () {
    let ok = document.getElementById('ok');
    if (ok != null) {
        console.log("Connect pushed");
        ok.click();
    }
}, 60000)

You can run it the same way (in your browser’s console). To stop the script, enter clearInterval(interval); to start it again, re-run the setInterval statement above.

I hope this helps you.


Answer 12

An updated version that works for me:

function ClickConnect(){
    console.log("Working");
    document.querySelector("paper-icon-button").click()
}
const myjob = setInterval(ClickConnect, 60000)

If it isn’t working for you, clear it by running:

clearInterval(myjob)

Answer 13

This one worked for me (it seems like they changed the button classname or id) :

function ClickConnect(){
    console.log("Working"); 
    document.querySelector("colab-connect-button").click() 
}
setInterval(ClickConnect,60000)

Answer 14

The most-voted answer certainly works for me, but it makes the Manage session window pop up again and again.
I’ve solved that by auto-clicking the refresh button from the browser console, as below:

function ClickRefresh(){
    console.log("Clicked on refresh button"); 
    document.querySelector("paper-icon-button").click()
}
setInterval(ClickRefresh, 60000)

Feel free to contribute more snippets for this at this gist https://gist.github.com/Subangkar/fd1ef276fd40dc374a7c80acc247613e


Answer 15

Many of the previous solutions may no longer work. For example, the code below does work, but it keeps creating new code cells in Colab. Creating a bunch of code cells is certainly an inconvenience: if too many are created over some hours of running and there is not enough RAM, the browser may freeze.

This repeatedly creates code cells:

function ClickConnect(){
    console.log("Working");
    document.querySelector("colab-toolbar-button").click()
}
setInterval(ClickConnect,60000)

However, I found that the code below works without causing any problems. In the Colab notebook tab, press Ctrl + Shift + I and paste the code below into the console. An interval of 120000 ms is enough.

function ClickConnect(){
    console.log("Working");
    document.querySelector("colab-toolbar-button#connect").click()
}
setInterval(ClickConnect,120000)

I tested this code in Firefox in November 2020. It will work in Chrome too.


Answer 16

I would recommend using jQuery (it seems that Colab includes jQuery by default).

function ClickConnect(){
  console.log("Working");
  $("colab-toolbar-button").click();
}
setInterval(ClickConnect,60000);

Answer 17

I have a problem with these javascript functions:

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("colab-connect-button").click()
}
setInterval(ClickConnect,60000)

They print “Clicked on connect button” to the console before the button is actually clicked. As the different answers in this thread show, the id of the connect button has changed a couple of times since Google Colab was launched, and it could change again in the future. So if you copy an old answer from this thread, it may say “Clicked on connect button” without actually clicking anything. Of course, if the click fails it will print an error to the console, but what if you happen not to notice it? So you had better do this:

function ClickConnect(){
    document.querySelector("colab-connect-button").click()
    console.log("Clicked on connect button"); 
}
setInterval(ClickConnect,60000)

And you’ll definitely see if it truly works or not.


Answer 18

function ClickConnect()
{
    console.log("Working...."); 
    document.querySelector("paper-button#comments").click()
}
setInterval(ClickConnect,600)

This worked for me, but use it wisely.

Happy learning :)


Answer 19

The following, latest solution works for me:

function ClickConnect(){
  colab.config
  console.log("Connect Clicked - Start");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
  console.log("Connect Clicked - End");
};
setInterval(ClickConnect, 60000)

Answer 20

The javascript below works for me. Credits to @artur.k.space.

function ColabReconnect() {
    var dialog = document.querySelector("colab-dialog.yes-no-dialog");
    var dialogTitle = dialog && dialog.querySelector("div.content-area>h2");
    if (dialogTitle && dialogTitle.innerText == "Runtime disconnected") {
        dialog.querySelector("paper-button#ok").click();
        console.log("Reconnecting...");
    } else {
        console.log("ColabReconnect is in service.");
    }
}
timerId = setInterval(ColabReconnect, 60000);

In the Colab notebook, press Ctrl + Shift + I. Copy and paste the script into the console prompt, then hit Enter before closing the developer tools.

By doing so, the function will check every 60 seconds to see if the onscreen connection dialog is shown, and if it is, the function would then click the ok button automatically for you.


Answer 21

Well, I am not a Python guy, nor do I know what this ‘Colab’ is actually meant for; I use it as a build system, lol. I used to set up SSH forwarding in it, then put in this code and just leave it running, and yes, it works.

import getpass
authtoken = getpass.getpass()

Answer 22

This code keeps clicking “Refresh folder” in the file explorer pane.

function ClickRefresh(){
  console.log("Working"); 
  document.querySelector("[icon='colab:folder-refresh']").click()
}
const myjob = setInterval(ClickRefresh, 60000)

Answer 23

GNU Colab lets you run a standard persistent desktop environment on top of a Colaboratory instance.

Indeed it contains a mechanism to not let machines die of idling.

Here’s a video demonstration.


Answer 24

You can also use Python to press the arrow keys. I added a little bit of randomness in the following code as well.

from pyautogui import press, typewrite, hotkey
import time
from random import shuffle

array = ["left", "right", "up", "down"]

while True:
    shuffle(array)
    time.sleep(10)
    press(array[0])
    press(array[1])
    press(array[2])
    press(array[3])

Answer 25

Just run the code below after the cell you want to run, to protect against data loss.

!python

To exit this mode, enter:

exit()

Answer 26

I was looking for a solution until I found a Python 3 script that randomly moves the mouse back and forth and clicks, always in the same place, but that’s enough to fool Colab into thinking I’m active on the notebook so that it does not disconnect.

import numpy as np
import time
import mouse
import threading

def move_mouse():
    while True:
        random_row = np.random.random_sample()*100
        random_col = np.random.random_sample()*10
        random_time = np.random.random_sample()*np.random.random_sample() * 100
        mouse.wheel(1000)
        mouse.wheel(-1000)
        mouse.move(random_row, random_col, absolute=False, duration=0.2)
        mouse.move(-random_row, -random_col, absolute=False, duration = 0.2)
        mouse.LEFT  # as written, this only references the constant and does not click
        time.sleep(random_time)


x = threading.Thread(target=move_mouse)
x.start()

You need to install the required packages: sudo -H pip3 install <package_name>. Then run the script on your local machine with sudo (as it takes control of the mouse) and it should work, allowing you to take full advantage of Colab’s 12-hour sessions.

Credits: For those using Colab (Pro): Preventing Session from disconnecting due to inactivity


Answer 27

function ClickConnect(){
    console.log("Clicked on connect button"); 
    document.querySelector("connect").click() // Change id here
}
setInterval(ClickConnect,60000)

Try the code above; it worked for me :)


Google Colab: how to read data from my Google Drive?

The problem is simple: I have some data on gDrive, for example at /projects/my_project/my_data*.

Also I have a simple notebook in gColab.

So, I would like to do something like:

for file in glob.glob("/projects/my_project/my_data*"):
    do_something(file)

Unfortunately, all the examples (like this one, for example: https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb) suggest loading all the necessary data into the notebook.

But if I have a lot of pieces of data, that can be quite complicated. Is there any way to solve this issue?

Thanks for help!


Answer 0

Good news, PyDrive has first class support on CoLab! PyDrive is a wrapper for the Google Drive python client. Here is an example on how you would download ALL files from a folder, similar to using glob + *:

!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_download_path = os.path.expanduser('~/data')
os.makedirs(local_download_path, exist_ok=True)  # no error if the directory already exists

# 2. Auto-iterate using the query syntax
#    https://developers.google.com/drive/v2/web/search-parameters
file_list = drive.ListFile(
    {'q': "'1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk' in parents"}).GetList()

for f in file_list:
  # 3. Create & download by id.
  print('title: %s, id: %s' % (f['title'], f['id']))
  fname = os.path.join(local_download_path, f['title'])
  print('downloading to {}'.format(fname))
  f_ = drive.CreateFile({'id': f['id']})
  f_.GetContentFile(fname)


# Print the last downloaded file (assumes it is a text file).
with open(fname, 'r') as fh:
  print(fh.read())

Notice that the argument to drive.ListFile is a dictionary whose keys match the parameters of the Google Drive HTTP API (you can customize the q parameter to suit your use case).

Note that in all cases, files and folders on Google Drive are identified by IDs (such as 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk). This means you first have to find the ID of the folder you want to root your search in.

For example, navigate to the folder "/projects/my_project/my_data" in your Google Drive.

Suppose it contains some files that we want to download to Colab. To get the folder’s ID for use with PyDrive, look at the URL and extract the id parameter. In this case, the URL corresponding to the folder was:

where the ID is the last piece of the URL: 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk.
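Pulling the ID out of the URL by hand is easy to get wrong, so here is a minimal sketch of doing it in code. It assumes two URL shapes commonly seen for Drive folders (the ID as the path segment after /folders/, or an id= query parameter); the function name extract_folder_id and the URL shapes are illustrative assumptions, not part of PyDrive.

```python
from urllib.parse import urlparse, parse_qs

def extract_folder_id(url):
    """Return the Drive folder ID from a folder URL, or None if not found.

    Assumed URL shapes (not an exhaustive list):
      https://drive.google.com/drive/folders/<id>
      https://drive.google.com/open?id=<id>
    """
    parsed = urlparse(url)
    # Case 1: the ID is passed as an `id` query parameter.
    params = parse_qs(parsed.query)
    if 'id' in params:
        return params['id'][0]
    # Case 2: the ID is the path segment right after "folders".
    parts = [p for p in parsed.path.split('/') if p]
    if 'folders' in parts:
        idx = parts.index('folders')
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None

folder_id = extract_folder_id(
    'https://drive.google.com/drive/folders/1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk')
# Build the same kind of query string that is passed to drive.ListFile above.
query = "'{}' in parents".format(folder_id)
print(query)
```

The resulting string can be used as the value of the q key in the dictionary given to drive.ListFile.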


Answer 1

Edit: As of February 2020, there is now a first-class UI for automatically mounting Drive.

First, open the file browser on the left-hand side. It will show a ‘Mount Drive’ button. Once you click it, you’ll see a permissions prompt to mount Drive; afterwards, your Drive files will be present with no setup when you return to the notebook. The completed flow looks like so:

The original answer follows, below. (This will also still work for shared notebooks.)

You can mount your Google Drive files by running the following code snippet:

from google.colab import drive
drive.mount('/content/drive')

Then, you can interact with your Drive files in the file browser side panel or using command-line utilities.

Here’s an example notebook


Answer 2

Thanks for the great answers! The fastest way to get a few one-off files from Google Drive into Colab is to load the Drive helper and mount:

from google.colab import drive

This will prompt for authorization.

drive.mount('/content/drive')

Open the link in a new tab; you will get a code. Copy it back into the prompt, and you now have access to Google Drive. Check:

!ls "/content/drive/My Drive"

Then copy the file(s) as needed:

!cp "/content/drive/My Drive/xy.py" "xy.py"

Confirm that the files were copied:

!ls
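The same copy can also be done from Python rather than with shell commands, which avoids quoting problems with the space in "My Drive". This is a minimal sketch, assuming Drive is already mounted at /content/drive; copy_from_drive is a hypothetical helper, and drive_root is a parameter so the function can be tried outside Colab with a temporary folder.

```python
import os
import shutil
import tempfile

def copy_from_drive(filename, dest_dir='.', drive_root='/content/drive/My Drive'):
    """Copy one file from the mounted Drive folder into dest_dir.

    Returns the path of the copy. Raises FileNotFoundError if the source
    file does not exist (for example, when Drive is not mounted).
    """
    src = os.path.join(drive_root, filename)
    if not os.path.isfile(src):
        raise FileNotFoundError(src)
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy(src, dest_dir)

# Demo: a temporary folder plays the role of "My Drive".
# In Colab, after drive.mount('/content/drive'), call copy_from_drive('xy.py').
fake_drive = os.path.join(tempfile.mkdtemp(), 'My Drive')
os.makedirs(fake_drive)
with open(os.path.join(fake_drive, 'xy.py'), 'w') as fh:
    fh.write('print("hello")\n')

copied = copy_from_drive('xy.py', dest_dir=tempfile.mkdtemp(), drive_root=fake_drive)
print(os.path.basename(copied))
```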

Answer 3

Most of the previous answers are a bit (very) complicated:

from google.colab import drive
drive.mount("/content/drive", force_remount=True)

I found this to be the easiest and fastest way to mount Google Drive in Colab. You can change the mount directory to whatever you want by changing the path argument to drive.mount. It will give you a link to grant permissions with your account; copy and paste the generated key, and the drive will be mounted at the selected path.

force_remount is needed only when you want to mount the drive regardless of whether it was mounted previously. You can omit this parameter if you don’t want to force a remount.

Edit: See this notebook for more ways of doing I/O in Colab: https://colab.research.google.com/notebooks/io.ipynb


Answer 4

You can’t permanently store a file on Colab. You can, however, import files from your Drive and save them back each time you are done with them.

To mount Google Drive in your Colab session:

from google.colab import drive
drive.mount('/content/gdrive')

You can then write to Google Drive as you would to a local file system. Your Google Drive will now appear in the Files tab, and you can read and write any of its files from Colab. Changes are applied to your Drive in real time, and anyone with the access link to a file can see the changes you make from Colab.

Example:

with open('/content/gdrive/My Drive/filename.txt', 'w') as f:
   f.write('values')

Answer 5

I’m lazy and my memory is bad, so I decided to create easycolab, which is easier to memorize and type:

import easycolab as ec
ec.mount()

Make sure to install it first: !pip install easycolab

The mount() method basically implements this:

from google.colab import drive
drive.mount('/content/drive')
%cd '/content/drive/My Drive/'

Answer 6

You can simply make use of the code snippets on the left of the screen.

Insert “Mounting Google Drive in your VM”.

Run the snippet, open the URL it gives you, then copy the authorization code and paste it back in.

Then use !ls to check the directories:

!ls /gdrive

In most cases, you will find what you want in the directory “/gdrive/My Drive”.

Then you can process the files like this:

import glob
from google.colab import drive
drive.mount('/gdrive')

# do_something is a placeholder for your own processing function.
file_paths = glob.glob("/gdrive/My Drive/*.txt")
for file_path in file_paths:
    do_something(file_path)

Answer 7

What I did first was:

from google.colab import drive
drive.mount('/content/drive/')

Then

%cd /content/drive/My Drive/Colab Notebooks/

After that I can, for example, read CSV files with:

import pandas as pd
df = pd.read_csv("data_example.csv")

If your files are in a different location, just add the correct path after My Drive.
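Relying on %cd means every later relative path depends on the current directory. An alternative is to build absolute paths once and pass them around. The sketch below uses only the standard library (csv instead of pandas) so it can run anywhere; the folder names mirror the ones above, notebook_file is a hypothetical helper, and a temporary directory stands in for the mounted Drive.

```python
import csv
import tempfile
from pathlib import Path

def notebook_file(name, drive_root='/content/drive/My Drive',
                  subfolder='Colab Notebooks'):
    """Build the absolute path of a file inside a Drive subfolder."""
    return Path(drive_root) / subfolder / name

# Demo with a temporary directory standing in for the mounted Drive.
fake_root = Path(tempfile.mkdtemp()) / 'My Drive'
(fake_root / 'Colab Notebooks').mkdir(parents=True)
csv_path = notebook_file('data_example.csv', drive_root=fake_root)
csv_path.write_text('a,b\n1,2\n')

# Read the file back through its absolute path; no %cd is needed.
with open(csv_path, newline='') as fh:
    rows = list(csv.DictReader(fh))
print(rows)
```

In Colab, the path returned by notebook_file('data_example.csv') can be passed directly to pd.read_csv.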


Answer 8

I wrote a class that downloads all of the data to the ‘.’ location on the Colab server.

The whole thing can be pulled from here: https://github.com/brianmanderson/Copy-Shared-Google-to-Colab

!pip install PyDrive


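from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import os

# Authenticate and create the PyDrive client used by the class below.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)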

class download_data_from_folder(object):
    def __init__(self,path):
        path_id = path[path.find('id=')+3:]
        self.file_list = self.get_files_in_location(path_id)
        self.unwrap_data(self.file_list)
    def get_files_in_location(self,folder_id):
        file_list = drive.ListFile({'q': "'{}' in parents and trashed=false".format(folder_id)}).GetList()
        return file_list
    def unwrap_data(self,file_list,directory='.'):
        for i, file in enumerate(file_list):
            print(str((i + 1) / len(file_list) * 100) + '% done copying')
            if file['mimeType'].find('folder') != -1:
                if not os.path.exists(os.path.join(directory, file['title'])):
                    os.makedirs(os.path.join(directory, file['title']))
                print('Copying folder ' + os.path.join(directory, file['title']))
                self.unwrap_data(self.get_files_in_location(file['id']), os.path.join(directory, file['title']))
            else:
                if not os.path.exists(os.path.join(directory, file['title'])):
                    downloaded = drive.CreateFile({'id': file['id']})
                    downloaded.GetContentFile(os.path.join(directory, file['title']))
        return None
# Replace with the shared folder URL; it must contain 'id=<folder id>'.
data_path = 'shared_path_location'
download_data_from_folder(data_path)

Answer 9

For example, to extract a zip file stored on Google Drive from a Colab notebook:

import zipfile
from google.colab import drive

drive.mount('/content/drive/')

# The context manager closes the archive even if extraction fails.
with zipfile.ZipFile("/content/drive/My Drive/ML/DataSet.zip", 'r') as zip_ref:
    zip_ref.extractall("/tmp")
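Before extracting a large archive from Drive, it can help to look at what is inside it. The following is a minimal sketch using only the standard zipfile module; extract_zip is a hypothetical helper, and the archive is created in a temporary directory as a stand-in for a file such as /content/drive/My Drive/ML/DataSet.zip.

```python
import os
import tempfile
import zipfile

def extract_zip(zip_path, target_dir):
    """Extract every member into target_dir and return the member names."""
    with zipfile.ZipFile(zip_path, 'r') as zf:
        names = zf.namelist()
        zf.extractall(target_dir)
    return names

# Build a small stand-in archive so the example runs outside Colab.
work = tempfile.mkdtemp()
zip_path = os.path.join(work, 'DataSet.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('train/a.txt', 'alpha')
    zf.writestr('train/b.txt', 'beta')

out_dir = os.path.join(work, 'extracted')
members = extract_zip(zip_path, out_dir)
print(members)
```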

Answer 10

@wenkesj

I am speaking about copying a directory and all its subdirectories.

For me, I found a solution that looks like this:

import os

# Assumes `drive` is an authenticated PyDrive GoogleDrive client.
def copy_directory(source_id, local_target):
  os.makedirs(local_target, exist_ok=True)
  file_list = drive.ListFile(
    {'q': "'{source_id}' in parents".format(source_id=source_id)}).GetList()
  for f in file_list:
    # Log the fields of interest for each entry.
    print(', '.join('{}: {}'.format(key, f[key])
                    for key in ['title', 'id', 'mimeType']))
    if f["title"].startswith("."):
      continue
    fname = os.path.join(local_target, f['title'])
    if f['mimeType'] == 'application/vnd.google-apps.folder':
      copy_directory(f['id'], fname)
    else:
      f_ = drive.CreateFile({'id': f['id']})
      f_.GetContentFile(fname)

Nevertheless, it looks like Google Drive doesn’t like copying too many files.
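The recursion in copy_directory can be tried without touching Drive by separating the traversal from the API calls. The sketch below is an illustration under assumptions, not the PyDrive API: list_children and download are hypothetical callables that you would back with drive.ListFile(...).GetList() and GetContentFile in Colab. Here they are backed by an in-memory dictionary with made-up IDs.

```python
import os
import tempfile

FOLDER_MIME = 'application/vnd.google-apps.folder'

def copy_tree(source_id, local_target, list_children, download):
    """Recursively mirror a Drive-like folder into local_target.

    list_children(folder_id) -> list of dicts with 'title', 'id', 'mimeType'.
    download(file_id, local_path) -> writes the file to local_path.
    Returns the list of local file paths that were downloaded.
    """
    os.makedirs(local_target, exist_ok=True)
    copied = []
    for entry in list_children(source_id):
        if entry['title'].startswith('.'):
            continue  # skip hidden entries, as copy_directory does
        path = os.path.join(local_target, entry['title'])
        if entry['mimeType'] == FOLDER_MIME:
            # Folders are descended into; their files are added to the result.
            copied.extend(copy_tree(entry['id'], path, list_children, download))
        else:
            download(entry['id'], path)
            copied.append(path)
    return copied

# In-memory stand-in for a small folder tree.
FAKE_TREE = {
    'root': [
        {'title': 'a.txt', 'id': 'f1', 'mimeType': 'text/plain'},
        {'title': '.hidden', 'id': 'f2', 'mimeType': 'text/plain'},
        {'title': 'sub', 'id': 'd1', 'mimeType': FOLDER_MIME},
    ],
    'd1': [{'title': 'b.txt', 'id': 'f3', 'mimeType': 'text/plain'}],
}

def fake_download(file_id, local_path):
    # Stand-in for GetContentFile: writes the file ID as the file body.
    with open(local_path, 'w') as fh:
        fh.write(file_id)

target = tempfile.mkdtemp()
copied = copy_tree('root', target, FAKE_TREE.__getitem__, fake_download)
print([os.path.relpath(p, target) for p in copied])
```

Keeping the Drive calls behind two small functions also makes it easier to add retries or rate limiting in one place if Drive starts rejecting requests.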


Answer 11

There are many ways to read files in your Colab notebook (.ipynb); a few are:

  1. Mounting your Google Drive in the runtime’s virtual machine (here and here);
  2. Using google.colab.files.upload(), the easiest solution;
  3. Using the native REST API;
  4. Using a wrapper around the API, such as PyDrive.

Methods 1 and 2 worked for me; the rest I wasn’t able to figure out. If anyone can, as others tried in the posts above, please write an elegant answer. Thanks in advance!

First method:

I wasn’t able to mount my Google Drive, so I installed these libraries:

# Install a Drive FUSE wrapper.
# https://github.com/astrada/google-drive-ocamlfuse

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass

!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Once the installation and authorization process is finished, mount your drive:

!mkdir -p drive
!google-drive-ocamlfuse drive

After installation I was able to mount Google Drive; everything in your Google Drive starts from /content/drive:

!ls /content/drive/ML/../../../../path_to_your_folder/

Now you can simply read the file from the path_to_your_folder folder into pandas using the above path:

import pandas as pd
df = pd.read_json('drive/ML/../../../../path_to_your_folder/file.json')
df.head(5)

You are supposed to use the absolute path you received, not the /../.. placeholder.

Second method:

This is convenient if the file you want to read is in the current working directory.

If you need to upload any files from your local file system, you can use the code below; otherwise skip it.

from google.colab import files
uploaded = files.upload()
for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Suppose you have the following folder hierarchy in your Google Drive:

/content/drive/ML/../../../../path_to_your_folder/

Then you simply need the code below to load the file into pandas:

import pandas as pd
import io
df = pd.read_json(io.StringIO(uploaded['file.json'].decode('utf-8')))
df
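files.upload() returns a dictionary that maps each uploaded file name to its raw bytes, which is why the snippet above decodes before parsing. The same step can be sketched without pandas or Colab. Here uploaded is a hand-made stand-in for the real return value, and load_uploaded_json is a hypothetical helper.

```python
import json

def load_uploaded_json(uploaded, name, encoding='utf-8'):
    """Decode one uploaded file (bytes) and parse it as JSON."""
    if name not in uploaded:
        raise KeyError('{} was not uploaded; got {}'.format(name, sorted(uploaded)))
    return json.loads(uploaded[name].decode(encoding))

# Stand-in for the dict returned by files.upload(): file name -> bytes.
uploaded = {'file.json': b'[{"x": 1, "y": "a"}, {"x": 2, "y": "b"}]'}

records = load_uploaded_json(uploaded, 'file.json')
print(records[0]['x'], len(records))
```

Checking for the key first gives a clearer error when the uploaded file has a different name than expected.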

Answer 12

To read all files in a folder:

import glob
from google.colab import drive
drive.mount('/gdrive', force_remount=True)

# !ls "/gdrive/My Drive/folder"

# do_something is a placeholder for your own processing function.
files = glob.glob("/gdrive/My Drive/folder/*.txt")
for file in files:
  do_something(file)
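do_something above is a placeholder. As one possible shape for it, the sketch below reads every .txt file in a folder into a dictionary keyed by file name. It is a minimal example, assuming plain UTF-8 text files; read_text_files is a hypothetical helper, and a temporary directory stands in for /gdrive/My Drive/folder.

```python
import glob
import os
import tempfile

def read_text_files(folder, pattern='*.txt'):
    """Return {file name: contents} for files in folder matching pattern."""
    contents = {}
    # sorted() gives a stable order; glob itself does not guarantee one.
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(path, 'r', encoding='utf-8') as fh:
            contents[os.path.basename(path)] = fh.read()
    return contents

# Stand-in folder with two text files and one file that should be ignored.
folder = tempfile.mkdtemp()
for name, text in [('b.txt', 'second'), ('a.txt', 'first'), ('notes.md', 'skip')]:
    with open(os.path.join(folder, name), 'w', encoding='utf-8') as fh:
        fh.write(text)

texts = read_text_files(folder)
print(list(texts))
```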

Answer 13

from google.colab import drive
drive.mount('/content/drive')

This worked perfectly for me. I was later able to use the os library to access my files just as I do on my PC.


Answer 14

Consider simply downloading the file via its permanent link with gdown, which comes preinstalled, like here
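As a rough sketch of that approach: gdown can fetch a file that is shared by link, given a URL that carries its file ID. The uc?id= URL form and the gdown.download(url, output, quiet=...) call follow gdown's README as far as I know; treat both as assumptions to check against the installed version. The file ID below is a placeholder, drive_download_url is a hypothetical helper, and the actual download is left commented out because it needs network access.

```python
def drive_download_url(file_id):
    """Build a download URL for a link-shared Drive file ID (assumed URL form)."""
    return 'https://drive.google.com/uc?id={}'.format(file_id)

url = drive_download_url('FILE_ID_PLACEHOLDER')
print(url)

# In Colab (requires network access and a file shared by link):
# import gdown
# gdown.download(url, 'data.zip', quiet=False)
```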