Tag archive: Python

How to print to console in pytest?

Question: How to print to console in pytest?

I’m trying to use TDD (test-driven development) with pytest. pytest will not print to the console when I use print.

I am using pytest my_tests.py to run it.

The documentation seems to say that it should work by default: http://pytest.org/latest/capture.html

But:

import myapplication as tum

class TestBlogger:

    @classmethod
    def setup_class(self):
        self.user = "alice"
        self.b = tum.Blogger(self.user)
        print "This should be printed, but it won't be!"

    def test_inherit(self):
        assert issubclass(tum.Blogger, tum.Site)
        links = self.b.get_links(posts)
        print len(links)   # This won't print either.

Nothing gets printed to my standard output console (just the normal progress and how many tests passed/failed).

And the script that I’m testing contains print:

class Blogger(Site):
    def get_links(self, posts):
        print len(posts)   # It won't get printed in the test.

In unittest module, everything gets printed by default, which is exactly what I need. However, I wish to use pytest for other reasons.

Does anyone know how to make the print statements get shown?


Answer 0

By default, py.test captures the result of standard out so that it can control how it prints it out. If it didn’t do this, it would spew out a lot of text without the context of what test printed that text.

However, if a test fails, it will include a section in the resulting report that shows what was printed to standard out in that particular test.

For example,

def test_good():
    for i in range(1000):
        print(i)

def test_bad():
    print('this should fail!')
    assert False

Results in the following output:

>>> py.test tmp.py
============================= test session starts ==============================
platform darwin -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2
plugins: cache, cov, pep8, xdist
collected 2 items

tmp.py .F

=================================== FAILURES ===================================
___________________________________ test_bad ___________________________________

    def test_bad():
        print('this should fail!')
>       assert False
E       assert False

tmp.py:7: AssertionError
------------------------------- Captured stdout --------------------------------
this should fail!
====================== 1 failed, 1 passed in 0.04 seconds ======================

Note the Captured stdout section.

If you would like to see print statements as they are executed, you can pass the -s flag to py.test. However, note that this can sometimes be difficult to parse.

>>> py.test tmp.py -s
============================= test session starts ==============================
platform darwin -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2
plugins: cache, cov, pep8, xdist
collected 2 items

tmp.py 0
1
2
3
... and so on ...
997
998
999
.this should fail!
F

=================================== FAILURES ===================================
___________________________________ test_bad ___________________________________

    def test_bad():
        print('this should fail!')
>       assert False
E       assert False

tmp.py:7: AssertionError
====================== 1 failed, 1 passed in 0.02 seconds ======================

Answer 1

Using the -s option will print the output of all functions, which may be too much.

If you need particular output, the doc page you mentioned offers a few suggestions:

  1. Insert assert False, "dumb assert to make PyTest print my stuff" at the end of your function, and you will see your output because the test fails.

  2. PyTest passes you a special object (the capsys fixture), and you can write the captured output into a file to inspect it later, like

    def test_good1(capsys):
        for i in range(5):
            print(i)
        out, err = capsys.readouterr()
        # Save the captured streams so they can be inspected after the run.
        with open("err.txt", "w") as f:
            f.write(err)
        with open("out.txt", "w") as f:
            f.write(out)

    You can open the out and err files in a separate tab and let your editor refresh them automatically, or run a simple py.test; cat out.txt shell command to run your test.

That is a rather hackish way to do things, but maybe it is what you need: after all, TDD means you mess with stuff and leave it clean and silent when it’s ready :-).
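
If the goal is to check what was printed rather than to read it by eye, the captured text can also be asserted on directly instead of being written to files. A minimal Python 3 sketch, assuming pytest's capsys fixture; count_links is a hypothetical stand-in for the code under test:

```python
def count_links(posts):
    """Hypothetical code under test: prints how many posts it received."""
    print(len(posts))
    return list(posts)


def test_count_links_prints_length(capsys):
    count_links(["a", "b", "c"])
    # readouterr() returns everything captured so far on stdout and stderr.
    out, err = capsys.readouterr()
    assert out == "3\n"
    assert err == ""
```

Run under pytest, this passes without -s, because the assertion reads the captured text rather than the console.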


Answer 2

Short Answer

Use the -s option:

pytest -s

Detailed answer

From the docs:

During test execution any output sent to stdout and stderr is captured. If a test or a setup method fails its according captured output will usually be shown along with the failure traceback.

pytest has the option --capture=method in which method is per-test capturing method, and could be one of the following: fd, sys or no. pytest also has the option -s which is a shortcut for --capture=no, and this is the option that will allow you to see your print statements in the console.

pytest --capture=no     # show print statements in console
pytest -s               # equivalent to previous command

Setting capturing methods or disabling capturing

There are two ways in which pytest can perform capturing:

  1. file descriptor (FD) level capturing (default): All writes going to the operating system file descriptors 1 and 2 will be captured.

  2. sys level capturing: Only writes to Python files sys.stdout and sys.stderr will be captured. No capturing of writes to filedescriptors is performed.

pytest -s            # disable all capturing
pytest --capture=sys # replace sys.stdout/stderr with in-mem files
pytest --capture=fd  # also point filedescriptors 1 and 2 to temp file
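
To get a feel for what sys-level capturing means, here is a rough standard-library sketch. It is only an analogy and not pytest's implementation; run_captured and noisy are made-up names:

```python
import contextlib
import io


def run_captured(func):
    """Call func and return (result, text it printed via sys.stdout)."""
    buffer = io.StringIO()
    # redirect_stdout temporarily replaces sys.stdout with the in-memory buffer.
    with contextlib.redirect_stdout(buffer):
        result = func()
    return result, buffer.getvalue()


def noisy():
    print("hidden until someone asks for it")
    return 42


result, captured = run_captured(noisy)
```

Only writes that go through sys.stdout end up in the buffer, which mirrors the sys-level description above; writes made directly to file descriptor 1 would bypass it.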

Answer 3

I needed to print an important warning about skipped tests exactly when PyTest muted literally everything.

I didn’t want to fail a test to send a signal, so I used a hack as follows:

def test_2_YellAboutBrokenAndMutedTests():
    import atexit
    import sys
    def report():
        print(C_patch.tidy_text("""
In silent mode PyTest breaks low level stream structure I work with, so
I cannot test if my functionality work fine. I skipped corresponding tests.
Run `py.test -s` to make sure everything is tested."""))
    # pytest replaces sys.stdout while capturing, so a mismatch suggests capture is active.
    if sys.stdout != sys.__stdout__:
        atexit.register(report)

The atexit module allows me to print stuff after PyTest has released the output streams. The output looks as follows:

============================= test session starts ==============================
platform linux2 -- Python 2.7.3, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /media/Storage/henaro/smyth/Alchemist2-git/sources/C_patch, inifile: 
collected 15 items 

test_C_patch.py .....ssss....s.

===================== 10 passed, 5 skipped in 0.15 seconds =====================
In silent mode PyTest breaks low level stream structure I work with, so
I cannot test if my functionality work fine. I skipped corresponding tests.
Run `py.test -s` to make sure everything is tested.
~/.../sources/C_patch$

The message is printed even when PyTest is in silent mode, and it is not printed when you run py.test -s, because in that case everything is actually tested.


Answer 4

According to the pytest docs, pytest --capture=sys should work. If you want to capture standard out inside a test, refer to the capsys fixture.


Answer 5

I originally came here to find out how to make PyTest print in VSCode’s console while running/debugging unit tests from there. This can be done with the following launch.json configuration, where .venv is the virtual environment folder.

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "PyTest",
            "type": "python",
            "request": "launch",
            "stopOnEntry": false,
            "pythonPath": "${config:python.pythonPath}",
            "module": "pytest",
            "args": [
                "-sv"
            ],
            "cwd": "${workspaceRoot}",
            "env": {},
            "envFile": "${workspaceRoot}/.venv",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        }
    ]
}

How to get the last N rows of a pandas DataFrame?

Question: How to get the last N rows of a pandas DataFrame?

I have pandas dataframes df1 and df2 (df1 is a vanilla dataframe, df2 is indexed by ‘STK_ID’ & ‘RPT_Date’):

>>> df1
    STK_ID  RPT_Date  TClose   sales  discount
0   000568  20060331    3.69   5.975       NaN
1   000568  20060630    9.14  10.143       NaN
2   000568  20060930    9.49  13.854       NaN
3   000568  20061231   15.84  19.262       NaN
4   000568  20070331   17.00   6.803       NaN
5   000568  20070630   26.31  12.940       NaN
6   000568  20070930   39.12  19.977       NaN
7   000568  20071231   45.94  29.269       NaN
8   000568  20080331   38.75  12.668       NaN
9   000568  20080630   30.09  21.102       NaN
10  000568  20080930   26.00  30.769       NaN

>>> df2
                 TClose   sales  discount  net_sales    cogs
STK_ID RPT_Date                                             
000568 20060331    3.69   5.975       NaN      5.975   2.591
       20060630    9.14  10.143       NaN     10.143   4.363
       20060930    9.49  13.854       NaN     13.854   5.901
       20061231   15.84  19.262       NaN     19.262   8.407
       20070331   17.00   6.803       NaN      6.803   2.815
       20070630   26.31  12.940       NaN     12.940   5.418
       20070930   39.12  19.977       NaN     19.977   8.452
       20071231   45.94  29.269       NaN     29.269  12.606
       20080331   38.75  12.668       NaN     12.668   3.958
       20080630   30.09  21.102       NaN     21.102   7.431

I can get the last 3 rows of df2 by:

>>> df2.ix[-3:]
                 TClose   sales  discount  net_sales    cogs
STK_ID RPT_Date                                             
000568 20071231   45.94  29.269       NaN     29.269  12.606
       20080331   38.75  12.668       NaN     12.668   3.958
       20080630   30.09  21.102       NaN     21.102   7.431

while df1.ix[-3:] gives all the rows:

>>> df1.ix[-3:]
    STK_ID  RPT_Date  TClose   sales  discount
0   000568  20060331    3.69   5.975       NaN
1   000568  20060630    9.14  10.143       NaN
2   000568  20060930    9.49  13.854       NaN
3   000568  20061231   15.84  19.262       NaN
4   000568  20070331   17.00   6.803       NaN
5   000568  20070630   26.31  12.940       NaN
6   000568  20070930   39.12  19.977       NaN
7   000568  20071231   45.94  29.269       NaN
8   000568  20080331   38.75  12.668       NaN
9   000568  20080630   30.09  21.102       NaN
10  000568  20080930   26.00  30.769       NaN

Why? How do I get the last 3 rows of df1 (the dataframe without an index)? Pandas 0.10.1


Answer 0

Don’t forget DataFrame.tail! e.g. df1.tail(10)


Answer 1

This is because df1 has an integer index: ix treats -3 as a label rather than a position, and this is by design (see integer indexing in pandas “gotchas”*).

*In newer versions of pandas, prefer loc or iloc, which remove the ambiguity of ix between position and label:

df.iloc[-3:]

see the docs.

As Wes points out, in this specific case you should just use tail!
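
For readers on a current pandas (where ix no longer exists), a small sketch of the two positional options on a frame with a default integer index. The column values are borrowed from the question; the variable names are my own:

```python
import pandas as pd

# A default integer index, like df1 in the question.
df = pd.DataFrame({"TClose": [3.69, 9.14, 9.49, 15.84, 17.00]})

last_by_position = df.iloc[-3:]  # purely positional slice
last_by_tail = df.tail(3)        # same rows, arguably more readable
```

Both expressions select the rows labelled 2, 3 and 4 here, regardless of what the index labels happen to be.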


Answer 2

How to get the last N rows of a pandas DataFrame?

If you are slicing by position, __getitem__ (i.e., slicing with []) works well, and is the most succinct solution I’ve found for this problem.

import numpy as np
import pandas as pd

pd.__version__
# '0.24.2'

df = pd.DataFrame({'A': list('aaabbbbc'), 'B': np.arange(1, 9)})
df

   A  B
0  a  1
1  a  2
2  a  3
3  b  4
4  b  5
5  b  6
6  b  7
7  c  8

df[-3:]

   A  B
5  b  6
6  b  7
7  c  8

This is the same as calling df.iloc[-3:], for instance (iloc internally delegates to __getitem__).


As an aside, if you want to find the last N rows for each group, use groupby and GroupBy.tail:

df.groupby('A').tail(2)

   A  B
1  a  2
2  a  3
5  b  6
6  b  7
7  c  8

How to do a less than or equal to filter in Django queryset?

Question: How to do a less than or equal to filter in Django queryset?

I am attempting to filter users by a custom field in each user’s profile (called profile). This field is called level and is an integer from 0 to 3.

If I filter using equals, I get a list of users with the chosen level as expected:

user_list = User.objects.filter(userprofile__level = 0)

When I try to filter using less than:

user_list = User.objects.filter(userprofile__level < 3)

I get the error:

global name ‘userprofile__level’ is not defined

Is there a way to filter by < or >, or am I barking up the wrong tree?


Answer 0

Less than or equal:

User.objects.filter(userprofile__level__lte=0)

Greater than or equal:

User.objects.filter(userprofile__level__gte=0)

Likewise, lt for less than and gt for greater than. You can find them all in the documentation.
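
The reason userprofile__level < 3 fails is that Python evaluates it as an ordinary expression before filter() is ever called, whereas userprofile__level__lt=3 is just a keyword argument whose name gets parsed. A toy, pure-Python sketch of that idea follows. toy_filter, LOOKUPS and the sample users are hypothetical, and this is not how Django is implemented (the real ORM builds SQL):

```python
import operator

# Toy mapping from lookup suffix to comparison function.
LOOKUPS = {
    "exact": operator.eq,
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def toy_filter(rows, **conditions):
    """Return the dicts in rows that match every field__lookup=value condition."""
    result = []
    for row in rows:
        for key, value in conditions.items():
            field, _, lookup = key.rpartition("__")
            if lookup not in LOOKUPS:
                # No recognised suffix: treat the whole key as a field name.
                field, lookup = key, "exact"
            if not LOOKUPS[lookup](row[field], value):
                break
        else:
            result.append(row)
    return result


users = [
    {"name": "ann", "level": 0},
    {"name": "bob", "level": 2},
    {"name": "cy", "level": 3},
]
below_three = toy_filter(users, level__lt=3)
at_most_two = toy_filter(users, level__lte=2)
```

The comparison operator travels inside the argument name, which is why no < or > ever appears in the call.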


TypeError: unhashable type: 'dict'

Question: TypeError: unhashable type: 'dict'

This piece of code is giving me the error unhashable type: dict. Can anyone explain what the solution is?

negids = movie_reviews.fileids('neg')
def word_feats(words):
    return dict([(word, True) for word in words])

negfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'neg') for f in negids]
stopset = set(stopwords.words('english'))

def stopword_filtered_word_feats(words):
    return dict([(word, True) for word in words if word not in stopset])

result=stopword_filtered_word_feats(negfeats)

Answer 0

You’re trying to use a dict as a key to another dict or in a set. That does not work because the keys have to be hashable. As a general rule, only immutable objects (strings, integers, floats, frozensets, tuples of immutables) are hashable (though exceptions are possible). So this does not work:

>>> dict_key = {"a": "b"}
>>> some_dict[dict_key] = True
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

To use a dict as a key you need to turn it into something that may be hashed first. If the dict you wish to use as key consists of only immutable values, you can create a hashable representation of it like this:

>>> key = frozenset(dict_key.items())

Now you may use key as a key in a dict or set:

>>> some_dict[key] = True
>>> some_dict
{frozenset([('a', 'b')]): True}

Of course you need to repeat the exercise whenever you want to look up something using a dict:

>>> some_dict[dict_key]                     # Doesn't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> some_dict[frozenset(dict_key.items())]  # Works
True

If the dict you wish to use as key has values that are themselves dicts and/or lists, you need to recursively “freeze” the prospective key. Here’s a starting point:

def freeze(d):
    if isinstance(d, dict):
        return frozenset((key, freeze(value)) for key, value in d.items())
    elif isinstance(d, list):
        return tuple(freeze(value) for value in d)
    return d
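
A short usage sketch of freeze for a nested dict used as a cache key. The function is repeated so the block runs on its own; config_a, config_b and cache are made-up names:

```python
def freeze(d):
    """Recursively convert dicts and lists into hashable frozensets and tuples."""
    if isinstance(d, dict):
        return frozenset((key, freeze(value)) for key, value in d.items())
    elif isinstance(d, list):
        return tuple(freeze(value) for value in d)
    return d


config_a = {"name": "alice", "tags": ["x", "y"], "opts": {"depth": 2}}
config_b = {"opts": {"depth": 2}, "tags": ["x", "y"], "name": "alice"}

cache = {freeze(config_a): "cached result"}
hit = cache[freeze(config_b)]  # same contents, different insertion order
```

Because a frozenset ignores ordering, two dicts with the same items produce the same key no matter how they were built.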

Answer 1

A possible solution might be to use the JSON dumps() method, so you can convert the dictionary to a string —

import json

a={"a":10, "b":20}
b={"b":20, "a":10}
c = [json.dumps(a), json.dumps(b)]


set(c)
json.dumps(a) in c

Output –

set(['{"a": 10, "b": 20}'])
True
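
One caveat: on Python 3.7+ dicts keep insertion order, so json.dumps(a) and json.dumps(b) above would produce different strings. A sketch of the same idea with sort_keys=True, which should make the text independent of key order (key_a, key_b and unique are my own names):

```python
import json

a = {"a": 10, "b": 20}
b = {"b": 20, "a": 10}

# sort_keys=True serialises keys in sorted order, whatever the insertion order was.
key_a = json.dumps(a, sort_keys=True)
key_b = json.dumps(b, sort_keys=True)

unique = {key_a, key_b}
```

This only works for dicts whose contents are JSON-serialisable, so it is less general than the frozenset approach.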

How to pass arguments to a Button command in Tkinter?

Question: How to pass arguments to a Button command in Tkinter?

Suppose I have the following Button made with Tkinter in Python:

import Tkinter as Tk
win = Tk.Toplevel()
frame = Tk.Frame(master=win).grid(row=1, column=1)
button = Tk.Button(master=frame, text='press', command=action)

The method action is called when I press the button, but what if I wanted to pass some arguments to the method action?

I have tried with the following code:

button = Tk.Button(master=frame, text='press', command=action(someNumber))

This just invokes the method immediately, and pressing the button does nothing.


Answer 0

I personally prefer to use lambdas in such a scenario, because imo it’s clearer and simpler and also doesn’t force you to write lots of wrapper methods if you don’t have control over the called method, but that’s certainly a matter of taste.

That’s how you’d do it with a lambda (note there’s also some implementation of currying in the functional module, so you can use that too):

button = Tk.Button(master=frame, text='press', command= lambda: action(someNumber))
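
To see why the lambda helps without opening a window, here is a GUI-free sketch in which calling the stored callable stands in for a button press. calls, action and handlers are illustrative names, not part of Tkinter:

```python
calls = []


def action(number):
    calls.append(number)


# Writing command=action(1) would run action right away and pass on its
# return value (None). Wrapping the call in a lambda passes a function
# that runs only when it is invoked.
on_press = lambda: action(1)
# At this point calls is still empty: nothing has run yet.
on_press()  # stands in for a button press

# In a loop, bind the current value as a default argument; otherwise
# every lambda would see the final value of n.
handlers = [lambda n=n: action(n) for n in (10, 20, 30)]
for handler in handlers:
    handler()
```

The n=n trick matters when creating several buttons in a loop, since a plain lambda: action(n) looks n up only at call time.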

Answer 1

This can also be done by using partial from the standard library functools, like this:

from functools import partial
#(...)
action_with_arg = partial(action, arg)
button = Tk.Button(master=frame, text='press', command=action_with_arg)
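
A GUI-free sketch of what partial does here, with a direct call standing in for Tkinter invoking the command. action and its label parameter are made up for illustration:

```python
from functools import partial


def action(number, label="pressed"):
    return f"{label}: {number}"


# partial stores the arguments now; the call itself happens later.
on_press = partial(action, 7, label="button A")
result = on_press()  # stands in for Tkinter invoking the command
```

Unlike a lambda, a partial object captures the argument values at creation time, so it is not affected by later rebinding of the variables used to build it.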

Answer 2

Example GUI:

Let’s say I have the GUI:

import tkinter as tk

root = tk.Tk()

btn = tk.Button(root, text="Press")
btn.pack()

root.mainloop()

What Happens When a Button Is Pressed

See that when btn is pressed it calls its own function which is very similar to button_press_handle in the following example:

def button_press_handle(callback=None):
    if callback:
        callback() # Where exactly the method assigned to btn['command'] is being called

with:

button_press_handle(btn['command'])

You can simply think of the command option as a reference to the method we want to be called, similar to callback in button_press_handle.


Calling a Method (Callback) When the Button Is Pressed

Without arguments

So if I wanted to print something when the button is pressed I would need to set:

btn['command'] = print # with no arguments, print outputs just a newline

Pay close attention to the lack of () after print, which means: “This is the method I want you to call when pressed, but don’t call it right now.” However, I didn’t pass any arguments to print, so it prints whatever it prints when called without arguments.

With Argument(s)

Now if I also wanted to pass arguments to the method to be called when the button is pressed, I could use an anonymous function, created with a lambda expression, in this case for the built-in print function, like the following:

btn['command'] = lambda arg1="Hello", arg2=" ", arg3="World!" : print(arg1 + arg2 + arg3)

Calling Multiple Methods when the Button Is Pressed

Without Arguments

You can also achieve that using lambda statement but it is considered bad practice and thus I won’t include it here. The good practice is to define a separate method, multiple_methods, that calls the methods wanted and then set it as the callback to the button press:

def multiple_methods():
    print("Vicariously") # the first inner callback
    print("I") # another inner callback

With Argument(s)

In order to pass argument(s) to method that calls other methods, again make use of lambda statement, but first:

def multiple_methods(*args, **kwargs):
    print(args[0]) # the first inner callback
    print(kwargs['opt1']) # another inner callback

and then set:

btn['command'] = lambda arg="live", kw="as the" : multiple_methods(arg, opt1=kw)

Returning Object(s) From the Callback

Also further note that callback can’t really return because it’s only called inside button_press_handle with callback() as opposed to return callback(). It does return but not anywhere outside that function. Thus you should rather modify object(s) that are accessible in the current scope.
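
A small GUI-free sketch of that point: the handler drops whatever the callback returns, so the callback records its outcome in an object that outlives the call. state and on_press are illustrative names; button_press_handle is the stand-in from above:

```python
state = {"presses": 0, "last": None}


def on_press(value):
    # The caller discards the return value, so store the outcome
    # in an object that is still reachable after the call.
    state["presses"] += 1
    state["last"] = value * 2
    return state["last"]  # ignored when invoked as a button command


def button_press_handle(callback=None):
    if callback:
        callback()  # the result is dropped here


button_press_handle(lambda: on_press(21))
```

After the simulated press, state holds the count and the computed value even though nothing was returned to the caller.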


Complete Example with global Object Modification(s)

The example below calls a method that changes btn’s text each time the button is pressed:

import tkinter as tk

i = 0
def text_mod():
    global i                # only i is rebound here; btn is only mutated, so it needs no global
    txt = ("Vicariously", "I", "live", "as", "the", "whole", "world", "dies")
    btn['text'] = txt[i]    # the global object that is modified
    i = (i + 1) % len(txt)  # another global object that gets modified

root = tk.Tk()

btn = tk.Button(root, text="My Button")
btn['command'] = text_mod

btn.pack(fill='both', expand=True)

root.mainloop()


Answer 3

Python’s ability to provide default values for function arguments gives us a way out.

def fce(x=myX, y=myY):
    myFunction(x,y)
button = Tk.Button(mainWin, text='press', command=fce)

See: http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/extra-args.html

For more buttons you can create a function which returns a function:

def fce(myX, myY):
    def wrapper(x=myX, y=myY):
        # ... do whatever the button press should do with x and y ...
        return x+y
    return wrapper

button1 = Tk.Button(mainWin, text='press 1', command=fce(1,2))
button2 = Tk.Button(mainWin, text='press 2', command=fce(3,4))
button3 = Tk.Button(mainWin, text='press 3', command=fce(9,8))
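
The function-returning-a-function idea can be tried without opening a window. This is only a sketch: make_callback and the commands list are made-up names standing in for fce and the three buttons, and the callbacks are invoked by hand the way Tkinter would invoke them on a press.

```python
def make_callback(x, y):
    """Return a zero-argument function that remembers x and y."""
    def callback():
        # x and y are captured from the enclosing call, one pair per callback
        return x + y
    return callback

# Hypothetical stand-ins for the command= arguments of three buttons
commands = [make_callback(1, 2), make_callback(3, 4), make_callback(9, 8)]

# Tkinter calls each command with no arguments when the button is pressed
results = [command() for command in commands]
print(results)  # [3, 7, 17]
```

Keep in mind that Tkinter discards whatever the command returns, so a real callback would act on x and y rather than return their sum.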

Answer 4

Building on Matt Thompson’s answer: a class can be made callable, so it can be used in place of a function:

import tkinter as tk

class Callback:
    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs
    def __call__(self):
        self.func(*self.args, **self.kwargs)

def default_callback(t):
    print("Button '{}' pressed.".format(t))

root = tk.Tk()

buttons = ["A", "B", "C"]

for i, b in enumerate(buttons):
    tk.Button(root, text=b, command=Callback(default_callback, b)).grid(row=i, column=0)

tk.mainloop()
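
As a side note, the standard library's functools.partial does much the same job as the Callback class above. The sketch below is not the answer's own method; it swaps the class for partial, returns the message instead of printing it so it can be checked, and simulates a press by calling the stored callable directly.

```python
from functools import partial

def default_callback(t):
    return "Button '{}' pressed.".format(t)

# partial(...) returns a callable that needs no further arguments,
# which is the shape Tkinter expects for command=
callbacks = {name: partial(default_callback, name) for name in ["A", "B", "C"]}

# Simulate pressing button "B" without opening a window
message = callbacks["B"]()
print(message)  # Button 'B' pressed.
```

Because each partial object stores its own argument, this also works inside a loop without the late-binding surprise that a bare lambda can cause.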

Answer 5

The reason it invokes the method immediately, and pressing the button does nothing, is that action(somenumber) is evaluated and its return value is assigned as the command for the button. So if action prints something to tell you it has run and returns None, you just run action to evaluate its return value and give None as the command for the button.

To have buttons call functions with different arguments, you can use global variables, although I can’t recommend it:

import Tkinter as Tk

frame = Tk.Frame(width=5, height=2, bd=1, relief=Tk.SUNKEN)
frame.grid(row=2,column=2)
frame.pack(fill=Tk.X, padx=5, pady=5)
def action():
    global output
    global variable
    output.insert(Tk.END,variable.get())
button = Tk.Button(master=frame, text='press', command=action)
button.pack()
variable = Tk.Entry(master=frame)
variable.pack()
output = Tk.Text(master=frame)
output.pack()

if __name__ == '__main__':
    Tk.mainloop()

What I would do is make a class whose objects would contain every variable required and methods to change those as needed:

import Tkinter as Tk
class Window:
    def __init__(self):
        self.frame = Tk.Frame(width=5, height=2, bd=1, relief=Tk.SUNKEN)
        self.frame.grid(row=2,column=2)
        self.frame.pack(fill=Tk.X, padx=5, pady=5)

        self.button = Tk.Button(master=self.frame, text='press', command=self.action)
        self.button.pack()

        self.variable = Tk.Entry(master=self.frame)
        self.variable.pack()

        self.output = Tk.Text(master=self.frame)
        self.output.pack()

    def action(self):
        self.output.insert(Tk.END,self.variable.get())

if __name__ == '__main__':
    window = Window()
    Tk.mainloop()

Answer 6

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

I believe this should fix it.


Answer 7

The best thing to do is use lambda as follows:

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

Answer 8

I am extremely late, but here is a very simple way of accomplishing it.

import tkinter as tk
def function1(param1, param2):
    print(str(param1) + str(param2))

var1 = "Hello "
var2 = "World!"
def function2():
    function1(var1, var2)

root = tk.Tk()

myButton = tk.Button(root, text="Button", command=function2)
myButton.pack()  # without a geometry manager call the button never appears
root.mainloop()

You simply wrap the function you want to use in another function and call the second function on the button press.


Answer 9

Lambdas are all well and good, but you can also try this (which works in a for loop btw):

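from tkinter import Tk

root = Tk()

# Map each key character to the extra arguments its handler needs
# (the values here are placeholders for illustration).
dct = {"1": ["one", 1], "2": ["two", 2]}

def keypress(event):
    args = dct.get(event.char, [])  # keys without an entry get no extra arguments
    for arg in args:
        pass  # use each argument here

for i in range(10):
    root.bind(str(i), keypress)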

This works because, once the binding is set, a key press passes the event as an argument. You can then read attributes off the event, such as event.char to get “1” (or event.keysym for special keys such as “Up”), etc. If you need one or more arguments other than the event attributes, just create a dictionary to store them.


Answer 10

I have encountered this problem before, too. You can just use lambda:

button = Tk.Button(master=frame, text='press',command=lambda: action(someNumber))

Answer 11

Use a lambda to pass the entry data to the command function if you have more actions to carry out, like this (I’ve tried to make it generic, so just adapt):

event1 = Entry(master)
button1 = Button(master, text="OK", command=lambda: test_event(event1.get()))

def test_event(event_text):
    if not event_text:
        print("Nothing entered")
    else:
        print(str(event_text))
        #  do stuff

This will pass the information in the event to the button function. There may be more Pythonesque ways of writing this, but it works for me.


Answer 12

JasonPy – a few things…

If you stick a button in a loop, it will be created over and over again, which is probably not what you want (maybe it is).

The reason it always gets the last index is that lambda callbacks run when you click the button, not when the program starts. I’m not 100% sure what you are doing, but maybe try storing the value when it is made and then calling it later from the button’s lambda.

E.g. (don’t use this code, it’s just an example):

for entry in stuff_that_is_happening:
    value_store[entry] = stuff_that_is_happening

Then you can say:

button... command: lambda: value_store[1]

Hope this helps!
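
The "always gets the last index" effect can be reproduced without any GUI. In this rough sketch the two helper functions are made-up names; each builds three callbacks in a loop, once with a plain lambda and once with the default-argument trick that stores the value at creation time.

```python
def make_late_callbacks():
    callbacks = []
    for index in range(3):
        # The lambda looks up index only when it is finally called
        callbacks.append(lambda: index)
    return callbacks

def make_early_callbacks():
    callbacks = []
    for index in range(3):
        # The default argument stores the current value of index right now
        callbacks.append(lambda index=index: index)
    return callbacks

late_results = [callback() for callback in make_late_callbacks()]
early_results = [callback() for callback in make_early_callbacks()]
print(late_results)   # [2, 2, 2]
print(early_results)  # [0, 1, 2]
```

The same applies to command=lambda: ... on buttons created in a loop: bind the loop value as a default argument if each button should keep its own value.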


Answer 13

One simple way would be to configure button with lambda like the following syntax:

button['command'] = lambda arg1 = local_var1, arg2 = local_var2 : function(arg1, arg2)

Answer 14

For posterity: you can also use classes to achieve something similar. For instance:

class Function_Wrapper():
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z
    def func(self):
        return self.x + self.y + self.z # execute function

The button can then be created with:

instance1 = Function_Wrapper(x, y, z)
button1  = Button(master, text = "press", command = instance1.func)

This approach also allows you to change the function arguments later, for example by setting instance1.x = 3.


Answer 15

You need to use lambda:

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

Answer 16

Use lambda

import tkinter as tk

root = tk.Tk()
def go(text):
    print(text)

b = tk.Button(root, text="Click", command=lambda: go("hello"))
b.pack()
root.mainloop()

output:

hello

How to append multiple values to a list in Python

Question: How to append multiple values to a list in Python

I am trying to figure out how to append multiple values to a list in Python. I know there are a few ways to do so, such as entering the values manually, putting the append operation in a for loop, or using the append and extend functions.

However, I wonder if there is a neater way to do so? Maybe a certain package or function?


Answer 0

You can use the sequence method list.extend to extend the list by multiple values from any kind of iterable, whether it is another list or anything else that provides a sequence of values.

>>> lst = [1, 2]
>>> lst.append(3)
>>> lst.append(4)
>>> lst
[1, 2, 3, 4]

>>> lst.extend([5, 6, 7])
>>> lst.extend((8, 9, 10))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>>> lst.extend(range(11, 14))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

So you can use list.append() to append a single value, and list.extend() to append multiple values.
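
For comparison, here is a small sketch of a few in-place spellings next to each other, plus the classic slip of passing a list to append. The variable names are just for illustration.

```python
lst = [1, 2]

lst.extend([3, 4])        # add every item of an iterable
lst += (5, 6)             # in-place +=, behaves like extend for lists
lst[len(lst):] = [7, 8]   # slice assignment at the end also grows the list

nested = [1, 2]
nested.append([3, 4])     # append adds its argument as ONE item

print(lst)     # [1, 2, 3, 4, 5, 6, 7, 8]
print(nested)  # [1, 2, [3, 4]]
```

All three forms modify lst itself, so other references to the same list see the new items too.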


Answer 1

Other than the append function, if by “multiple values” you mean another list, you can simply concatenate them like so. Note that a + b builds a new list and leaves a unchanged.

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a + b
[1, 2, 3, 4, 5, 6]

Answer 2

If you take a look at the official docs, you’ll see extend right below append. That’s what you’re looking for.

There’s also itertools.chain if you are more interested in efficient iteration than ending up with a fully populated data structure.
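
A minimal sketch of the itertools.chain idea, with made-up sample values: the three iterables are walked one after another without first building a combined list.

```python
from itertools import chain

first = [1, 2]
second = (3, 4)
third = range(5, 7)

# chain yields items lazily from each iterable in turn; nothing is copied up front
total = sum(chain(first, second, third))

# If a real list is needed after all, materialize the chain explicitly
combined = list(chain(first, second, third))
print(total, combined)  # 21 [1, 2, 3, 4, 5, 6]
```

A chain object can only be consumed once, which is why the example builds it twice.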


Convert Pandas column containing NaNs to dtype `int`

Question: Convert Pandas column containing NaNs to dtype `int`

I read data from a .csv file to a Pandas dataframe as below. For one of the columns, namely id, I want to specify the column type as int. The problem is the id series has missing/empty values.

When I try to cast the id column to integer while reading the .csv, I get:

df= pd.read_csv("data.csv", dtype={'id': int}) 
error: Integer column has NA values

Alternatively, I tried to convert the column type after reading as below, but this time I get:

df= pd.read_csv("data.csv") 
df[['id']] = df[['id']].astype(int)
error: Cannot convert NA to integer

How can I tackle this?


Answer 0

The lack of a NaN representation in integer columns is a pandas “gotcha”.

The usual workaround is to simply use floats.


Answer 1

Since version 0.24, pandas has been able to hold integer dtypes with missing values.

Nullable Integer Data Type.

Pandas can represent integer data with possibly missing values using arrays.IntegerArray. This is an extension type implemented within pandas. It is not the default dtype for integers and will not be inferred; you must explicitly pass the dtype into array() or Series:

arr = pd.array([1, 2, np.nan], dtype=pd.Int64Dtype())
pd.Series(arr)

0      1
1      2
2    NaN
dtype: Int64

To convert a column to nullable integers, use:

df['myCol'] = df['myCol'].astype('Int64')
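
Applied to the question's setup, the nullable dtype can also be requested while reading the file. This is a sketch assuming pandas 0.24 or later; the inline CSV text is a made-up stand-in for data.csv.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data.csv; the second row has no id
csv_text = "id,name\n1,alice\n,bob\n3,carol\n"

# Request the nullable "Int64" dtype (capital I) at read time
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "Int64"})

print(df["id"].dtype)         # the column stays an integer dtype
print(df["id"].isna().sum())  # the missing id is kept as a missing value
```

Note the capital "I": the lowercase "int64" is the ordinary NumPy dtype, which still cannot hold missing values.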

Answer 2

My use case is munging data prior to loading into a DB table:

df[col] = df[col].fillna(-1)
df[col] = df[col].astype(int)
df[col] = df[col].astype(str)
df[col] = df[col].replace('-1', np.nan)

Remove NaNs, convert to int, convert to str, and then reinsert the NaNs.

It’s not pretty but it gets the job done!


Answer 3

It is now possible to create a pandas column containing NaNs with an integer dtype, since this was officially added in pandas 0.24.0.

pandas 0.24.x release notes. Quote: “Pandas has gained the ability to hold integer dtypes with missing values.”


Answer 4

If you absolutely want to combine integers and NaNs in a column, you can use the ‘object’ data type:

df['col'] = (
    df['col'].fillna(0)
    .astype(int)
    .astype(object)
    .where(df['col'].notnull())
)

This will replace NaNs with an integer (doesn’t matter which), convert to int, convert to object and finally reinsert NaNs.


Answer 5

If you can modify your stored data, use a sentinel value for a missing id. A common case, judging by the column name, is that id is an integer strictly greater than zero; you could then use 0 as the sentinel value, so that you can write

if row['id']:
   regular_process(row)
else:
   special_process(row)

Answer 6

You could use .dropna() if it is OK to drop the rows with the NaN values.

df = df.dropna(subset=['id'])

Alternatively, use .fillna() and .astype() to replace the NaN with values and convert them to int.

I ran into this problem when processing a CSV file with large integers, some of which were missing (NaN). Using float as the type was not an option, because I might lose precision.

My solution was to use str as the intermediate type. Then you can convert the string to int as you please later in the code. I replaced NaN with 0, but you could choose any value.

df = pd.read_csv(filename, dtype={'id':str})
df["id"] = df["id"].fillna("0").astype(int)

For illustration, here is an example of how floats may lose precision:

s = "12345678901234567890"
f = float(s)
i = int(f)
i2 = int(s)
print(f, i, i2)

And the output is:

1.2345678901234567e+19 12345678901234567168 12345678901234567890
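
Where the loss starts can be checked directly. The sketch below relies on Python floats being IEEE 754 doubles with a 53-bit significand, so whole numbers above 2**53 are no longer all representable; the variable names are only for illustration.

```python
# Floats (IEEE 754 doubles) carry a 53-bit significand
limit = 2 ** 53

exact = int(float(limit)) == limit             # 2**53 itself is still representable
collides = float(limit + 1) == float(limit)    # limit + 1 rounds back to limit

big_id = 12345678901234567890
survives_str = int(str(big_id)) == big_id      # the str round trip is lossless
survives_float = int(float(big_id)) == big_id  # the float round trip is not

print(exact, collides, survives_str, survives_float)  # True True True False
```

So ids below 2**53 (about 9e15) survive a float column intact, while larger ones need str or a nullable integer dtype.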

Answer 7

Most solutions here tell you how to use a placeholder integer to represent nulls. That approach isn’t helpful if you can’t be certain that the placeholder integer won’t show up in your source data, though. My method formats floats without their decimal part and converts nulls to None. The result is an object dtype column that looks like an integer field with null values when written to a CSV.

keep_df[col] = keep_df[col].apply(lambda x: None if pandas.isnull(x) else '{0:.0f}'.format(pandas.to_numeric(x)))

Answer 8

I ran into this issue working with pyspark. As this is a python frontend for code running on a jvm, it requires type safety and using float instead of int is not an option. I worked around the issue by wrapping the pandas pd.read_csv in a function that will fill user-defined columns with user-defined fill values before casting them to the required type. Here is what I ended up using:

def custom_read_csv(file_path, custom_dtype = None, fill_values = None, **kwargs):
    if custom_dtype is None:
        return pd.read_csv(file_path, **kwargs)
    else:
        assert 'dtype' not in kwargs.keys()
        df = pd.read_csv(file_path, dtype = {}, **kwargs)
        for col, typ in custom_dtype.items():
            if fill_values is None or col not in fill_values.keys():
                fill_val = -1
            else:
                fill_val = fill_values[col]
            df[col] = df[col].fillna(fill_val).astype(typ)
    return df

Answer 9

First remove the rows that contain NaN. Then do the integer conversion on the remaining rows. Finally, insert the removed rows again. Hope it works.


Answer 10

import pandas as pd

df= pd.read_csv("data.csv")
df['id'] = pd.to_numeric(df['id'])

Answer 11

Assume your DateColumn holds values formatted like 3312018.0 that should be converted to the string 03/31/2018, and that some records are missing or 0.

df['DateColumn'] = df['DateColumn'].astype(int)
df['DateColumn'] = df['DateColumn'].astype(str)
df['DateColumn'] = df['DateColumn'].apply(lambda x: x.zfill(8))
df.loc[df['DateColumn'] == '00000000','DateColumn'] = '01011980'
df['DateColumn'] = pd.to_datetime(df['DateColumn'], format="%m%d%Y")
df['DateColumn'] = df['DateColumn'].apply(lambda x: x.strftime('%m/%d/%Y'))

Multiple levels of ‘collection.defaultdict’ in Python

Question: Multiple levels of ‘collection.defaultdict’ in Python

Thanks to some great folks on SO, I discovered the possibilities offered by collections.defaultdict, notably in readability and speed. I have put them to use with success.

Now I would like to implement three levels of dictionaries, the two top ones being defaultdict and the lowest one being int. I can’t find the appropriate way to do this. Here is my attempt:

from collections import defaultdict
d = defaultdict(defaultdict)
a = [("key1", {"a1":22, "a2":33}),
     ("key2", {"a1":32, "a2":55}),
     ("key3", {"a1":43, "a2":44})]
for i in a:
    d[i[0]] = i[1]

Now this works, but the following, which is the desired behavior, doesn’t:

d["key4"]["a1"] + 1

I suspect that I should have declared somewhere that the second level defaultdict is of type int, but I didn’t find where or how to do so.

The reason I am using defaultdict in the first place is to avoid having to initialize the dictionary for each new key.

Any more elegant suggestion?

Thanks pythoneers!


Answer 0

Use:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

This will create a new defaultdict(int) whenever a new key is accessed in d.
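
Here is how that plays out with the question's sample data, as a rough sketch. One assumption differs from the question's loop: the rows are merged with update() instead of being assigned, because d[key] = plain_dict would replace the inner defaultdict and bring the KeyError back.

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

# The question's sample data, loaded without replacing the inner defaultdicts
a = [("key1", {"a1": 22, "a2": 33}),
     ("key2", {"a1": 32, "a2": 55}),
     ("key3", {"a1": 43, "a2": 44})]
for key, values in a:
    d[key].update(values)

d["key4"]["a1"] += 1   # neither level existed before this line
d["key1"]["a3"] += 5   # new inner key on an existing outer key
print(d["key4"]["a1"], d["key1"]["a3"], d["key1"]["a1"])  # 1 5 22
```

Merely reading d["key5"]["a1"] also creates both levels, which is worth remembering before iterating over the keys.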


Answer 1

Another way to make a pickleable, nested defaultdict is to use a partial object instead of a lambda:

from functools import partial
...
d = defaultdict(partial(defaultdict, int))

This will work because the defaultdict class is globally accessible at the module level:

“You can’t pickle a partial object unless the function [or in this case, class] it wraps is globally accessible … under its __name__ (within its __module__)” — Pickling wrapped partial functions
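
The difference is easy to observe with a round trip. This sketch uses only the standard library; the exact exception type raised for the lambda version is an assumption that may vary between Python versions, so both likely candidates are caught.

```python
import pickle
from collections import defaultdict
from functools import partial

picklable = defaultdict(partial(defaultdict, int))
picklable["a"]["b"] += 1

# Round trip through pickle; the factory survives, so new keys still work
restored = pickle.loads(pickle.dumps(picklable))
restored["x"]["y"] += 2

not_picklable = defaultdict(lambda: defaultdict(int))
try:
    pickle.dumps(not_picklable)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    # pickle stores functions by name and cannot look a lambda up by name
    lambda_pickled = False

print(restored["a"]["b"], restored["x"]["y"], lambda_pickled)  # 1 2 False
```

This matters mostly when the dictionary has to cross a process boundary, for example with multiprocessing.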


Answer 2

这里查看nosklo的答案以获得更通用的解决方案。

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

测试:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print a

输出:

{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}

Look at nosklo’s answer here for a more general solution.

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

Testing:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print(a)

Output:

{1: {2: {3: 4, 'test': 6}, 3: {3: 5}}}
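
A different spelling of the same idea, not the class above, is a defaultdict whose factory is the function that creates it. In this sketch tree and to_dict are made-up helper names; to_dict only exists to turn the result into plain dicts for printing.

```python
from collections import defaultdict

def tree():
    """A defaultdict whose missing values are themselves trees."""
    return defaultdict(tree)

a = tree()
a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

def to_dict(node):
    """Convert nested defaultdicts to plain dicts, e.g. for printing."""
    if isinstance(node, defaultdict):
        return {key: to_dict(value) for key, value in node.items()}
    return node

plain = to_dict(a)
print(plain)  # {1: {2: {3: 4, 'test': 6}, 3: {3: 5}}}
```

Compared with the dict subclass, this version nests to any depth as well, but its repr shows the defaultdict wrappers unless converted.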

Answer 3

As per @rschwieb’s request for D['key'] += 1, we can expand on the previous answer by overriding addition, defining an __add__ method, to make this behave more like a collections.Counter().

First __missing__ will be called to create a new empty value, which will be passed into __add__. We test the value, counting on empty values to be False.

See emulating numeric types for more information on overriding.

from numbers import Number


class autovivify(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition for numeric types when self is empty """
        if not self and isinstance(x, Number):
            return x
        raise ValueError

    def __sub__(self, x):
        if not self and isinstance(x, Number):
            return -1 * x
        raise ValueError

Examples:

>>> import autovivify
>>> a = autovivify.autovivify()
>>> a
{}
>>> a[2]
{}
>>> a
{2: {}}
>>> a[4] += 1
>>> a[5][3][2] -= 1
>>> a
{2: {}, 4: 1, 5: {3: {2: -1}}}

Rather than checking that the argument is a Number (very non-Python, amirite!), we could just provide a default 0 value and then attempt the operation:

class av2(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition when self is empty """
        if not self:
            return 0 + x
        raise ValueError

    def __sub__(self, x):
        """ override subtraction when self is empty """
        if not self:
            return 0 - x
        raise ValueError

Answer 4

Late to the party, but for arbitrary depth I just found myself doing something like this:

from collections import defaultdict

class DeepDict(defaultdict):
    def __call__(self):
        return DeepDict(self.default_factory)

The trick here is basically to make the DeepDict instance itself a valid factory for constructing missing values. Now we can do things like

dd = DeepDict(DeepDict(list))
dd[1][2].extend([3,4])
sum(dd[1][2])  # 7

ddd = DeepDict(DeepDict(DeepDict(list)))
ddd[1][2][3].extend([4,5])
sum(ddd[1][2][3])  # 9

Answer 5

def _sub_getitem(self, k):
    try:
        # sub.__class__.__bases__[0]
        real_val = self.__class__.mro()[-2].__getitem__(self, k)
        val = '' if real_val is None else real_val
    except Exception:
        val = ''
        real_val = None
    # isinstance(Avoid,dict)也是true,会一直递归死
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
        # Store the wrapped value back under the current key so that later assignments through it propagate
        if all([real_val is not None, isinstance(self, (dict, list)), type(k) is not slice]):
            self[k] = val
    return val


def _sub_pop(self, k=-1):
    try:
        val = self.__class__.mro()[-2].pop(self, k)
        val = '' if val is None else val
    except Exception:
        val = ''
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
    return val


class DefaultDict(dict):
    def __getitem__(self, k):
        return _sub_getitem(self, k)

    def pop(self, k):
        return _sub_pop(self, k)

In[8]: d=DefaultDict()
In[9]: d['a']['b']['c']['d']
Out[9]: ''
In[10]: d['a']="ggggggg"
In[11]: d['a']
Out[11]: 'ggggggg'
In[12]: d['a']['pp']
Out[12]: ''

No more errors, no matter how many levels are nested. pop does not raise an error either.

dd = DefaultDict({"1": 333333})


How do I modify a text file?

Question: How do I modify a text file?

I’m using Python, and would like to insert a string into a text file without deleting or copying the file. How can I do that?


Answer 0

Unfortunately there is no way to insert into the middle of a file without re-writing it. As previous posters have indicated, you can append to a file or overwrite part of it using seek but if you want to add stuff at the beginning or the middle, you’ll have to rewrite it.

This is an operating system thing, not a Python thing. It is the same in all languages.

What I usually do is read from the file, make the modifications and write it out to a new file called myfile.txt.tmp or something like that. This is better than reading the whole file into memory because the file may be too large for that. Once the temporary file is completed, I rename it the same as the original file.

This is a good, safe way to do it because if the file write crashes or aborts for any reason, you still have your untouched original file.
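
The read, modify, write-to-temp, rename sequence described above could look roughly like this. It is only a sketch: the function name insert_line and the choice to insert before a given 0-based line number are assumptions for illustration, and it assumes the file ends with a newline.

```python
import os


def insert_line(path, line_number, text):
    """Insert `text` as a new line before the 0-based `line_number`.

    The original file is streamed line by line into `path + ".tmp"`,
    so the whole file is never held in memory.
    """
    tmp_path = path + ".tmp"
    inserted = False
    with open(path, "r") as src, open(tmp_path, "w") as dst:
        for i, line in enumerate(src):
            if i == line_number:
                dst.write(text + "\n")
                inserted = True
            dst.write(line)
        if not inserted:
            # The file has fewer lines than line_number: append at the end.
            dst.write(text + "\n")
    # Swap the finished temp file in for the original only after the write succeeded.
    os.replace(tmp_path, path)
```

If anything fails before os.replace runs, the original file is still untouched and only the .tmp file is left behind.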


Can anyone explain Python's relative imports?

Question: Can anyone explain Python's relative imports?

I can’t for the life of me get Python’s relative imports to work. I have created a simple example where they do not work:

The directory structure is:

/__init__.py
/start.py
/parent.py
/sub/__init__.py
/sub/relative.py

/start.py contains just: import sub.relative

/sub/relative.py contains just: from .. import parent

All other files are blank.

When executing the following on the command line:

$ cd /
$ python start.py

I get:

Traceback (most recent call last):
  File "start.py", line 1, in <module>
    import sub.relative
  File "/home/cvondrick/sandbox/sub/relative.py", line 1, in <module>
    from .. import parent
ValueError: Attempted relative import beyond toplevel package

I am using Python 2.6. Why is this the case? How do I make this sandbox example work?


Answer 0

You are importing from package “sub”. start.py is not itself in a package even if there is a __init__.py present.

You would need to start your program from one directory above parent.py:

./start.py

./pkg/__init__.py
./pkg/parent.py
./pkg/sub/__init__.py
./pkg/sub/relative.py

With start.py:

import pkg.sub.relative

Now pkg is the top level package and your relative import should work.
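
To check this without touching a real project, both layouts can be rebuilt in a temporary directory and run. This is just a sketch: the helper run_layout and the print inside parent.py are made up for the demonstration, and it assumes Python 3.7+ for capture_output.

```python
import os
import subprocess
import sys
import tempfile


def run_layout(files, entry):
    """Write `files` (relative path -> source) into a temp dir and run `entry` there."""
    with tempfile.TemporaryDirectory() as root:
        for rel_path, source in files.items():
            full = os.path.join(root, rel_path)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w") as f:
                f.write(source)
        return subprocess.run(
            [sys.executable, entry],
            cwd=root, capture_output=True, text=True,
        )


# Layout from the question: "sub" is the top-level package, so ".." goes too far.
flat = run_layout(
    {
        "__init__.py": "",
        "start.py": "import sub.relative\n",
        "parent.py": "print('parent imported')\n",
        "sub/__init__.py": "",
        "sub/relative.py": "from .. import parent\n",
    },
    "start.py",
)

# Layout from this answer: "pkg" is the top-level package, so ".." stays inside it.
nested = run_layout(
    {
        "start.py": "import pkg.sub.relative\n",
        "pkg/__init__.py": "",
        "pkg/parent.py": "print('parent imported')\n",
        "pkg/sub/__init__.py": "",
        "pkg/sub/relative.py": "from .. import parent\n",
    },
    "start.py",
)

print("flat layout exit code:", flat.returncode)
print("nested layout exit code:", nested.returncode)
```

The flat layout should exit with a traceback in flat.stderr, while the nested one should exit cleanly after importing parent.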


If you want to stick with your current layout you can just use import parent. Because you use start.py to launch your interpreter, the directory where start.py is located is in your python path. parent.py lives there as a separate module.

You can also safely delete the top level __init__.py, if you don’t import anything into a script further up the directory tree.


Answer 1

If you are going to call relative.py directly, i.e. if you really want to import from a top-level module, you have to explicitly add its directory to the sys.path list.
Here is how it should work:

# Add this line to the beginning of relative.py file
import sys
sys.path.append('..')

# Now you can import from the directory one level up, because it is in sys.path
import parent

# And even like this:
from parent import Parent

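If you think the above can cause some kind of inconsistency (the relative '..' is resolved against the current working directory, not against the location of the script), you can use this instead: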

sys.path.append(sys.path[0] + "/..")

sys.path[0] is the directory containing the script that was used to start the interpreter.
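
Both variants above still depend on how the script was launched. A possible third option, shown here only as a sketch, is to compute the directory from the file's own location. The helper name add_parent_dir_to_path is hypothetical and not part of the approach above.

```python
import os
import sys


def add_parent_dir_to_path(script_path):
    """Append the directory one level above `script_path`'s directory to sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    parent_dir = os.path.dirname(script_dir)
    if parent_dir not in sys.path:  # avoid adding duplicates on repeated calls
        sys.path.append(parent_dir)
    return parent_dir


# Inside relative.py this would be called as:
#     add_parent_dir_to_path(__file__)
#     import parent
```

Because the path is derived from the file itself, the result does not change with the current working directory.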


Answer 2

Checking it out in Python 3:

python -V
Python 3.6.5

Example 1:

.
├── parent.py
├── start.py
└── sub
    └── relative.py

- start.py
import sub.relative

- parent.py
print('Hello from parent.py')

- sub/relative.py
from .. import parent

If we run it like this (just to make sure PYTHONPATH is empty):

PYTHONPATH='' python3 start.py

Output:

Traceback (most recent call last):
  File "start.py", line 1, in <module>
    import sub.relative
  File "/python-import-examples/so-example-v1/sub/relative.py", line 1, in <module>
    from .. import parent
ValueError: attempted relative import beyond top-level package

If we change the import in sub/relative.py:

- sub/relative.py
import parent

If we run it like this:

PYTHONPATH='' python3 start.py

Output:

Hello from parent.py

Example 2:

.
├── parent.py
└── sub
    ├── relative.py
    └── start.py

- parent.py
print('Hello from parent.py')

- sub/relative.py
print('Hello from relative.py')

- sub/start.py
import relative
from .. import parent

Run it like:

PYTHONPATH='' python3 sub/start.py

Output:

Hello from relative.py
Traceback (most recent call last):
  File "sub/start.py", line 2, in <module>
    from .. import parent
ValueError: attempted relative import beyond top-level package

If we change the import in sub/start.py:

- sub/start.py
import relative
import parent

Run it like:

PYTHONPATH='' python3 sub/start.py

Output:

Hello from relative.py
Traceback (most recent call last):
  File "sub/start.py", line 3, in <module>
    import parent
ModuleNotFoundError: No module named 'parent'

Run it like:

PYTHONPATH='.' python3 sub/start.py

Output:

Hello from relative.py
Hello from parent.py

It is also better to write the imports relative to the root folder, i.e.:

- sub/start.py
import sub.relative
import parent

Run it like:

PYTHONPATH='.' python3 sub/start.py

Output:

Hello from relative.py
Hello from parent.py
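
The PYTHONPATH part of Example 2 can be reproduced in one go with a small script. This is a rough sketch: the helper run_start is made up for the demonstration, the files are created in a temporary directory, and it assumes Python 3.7+ for capture_output.

```python
import os
import subprocess
import sys
import tempfile

# The three files of Example 2, using the plain "import parent" variant of start.py.
FILES = {
    "parent.py": "print('Hello from parent.py')\n",
    "sub/relative.py": "print('Hello from relative.py')\n",
    "sub/start.py": "import relative\nimport parent\n",
}


def run_start(pythonpath):
    """Run sub/start.py from the project root with the given PYTHONPATH value."""
    with tempfile.TemporaryDirectory() as root:
        for rel_path, source in FILES.items():
            full = os.path.join(root, rel_path)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w") as f:
                f.write(source)
        env = dict(os.environ, PYTHONPATH=pythonpath)
        return subprocess.run(
            [sys.executable, os.path.join("sub", "start.py")],
            cwd=root, env=env, capture_output=True, text=True,
        )


without_root = run_start("")   # only sub/ (the script's directory) is on sys.path
with_root = run_start(".")     # the project root is added as well

print(without_root.stderr)
print(with_root.stdout)
```

The first run should end with the ModuleNotFoundError for parent, the second should print both greetings, matching the outputs listed above.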