Tag archive: Python

How to properly ignore exceptions

Question: How to properly ignore exceptions

When you just want to do a try-except without handling the exception, how do you do it in Python?

Is the following the right way to do it?

try:
    shutil.rmtree(path)
except:
    pass

Answer 0

try:
    doSomething()
except: 
    pass

or

try:
    doSomething()
except Exception: 
    pass

The difference is that the first one will also catch KeyboardInterrupt, SystemExit, and similar exceptions, which derive directly from BaseException rather than from Exception.

See the Python documentation on built-in exceptions for details.
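The distinction between a bare except: and except Exception: can be checked directly against the class hierarchy. A minimal sketch; the helper name caught_by_except_exception is made up for illustration and is not part of any library:

```python
def caught_by_except_exception(exc_type):
    """Return True if an `except Exception:` clause would catch exc_type."""
    return issubclass(exc_type, Exception)

# Ordinary errors derive from Exception; interpreter-exit signals do not.
for exc_type in (ValueError, OSError, KeyboardInterrupt, SystemExit):
    print(exc_type.__name__, caught_by_except_exception(exc_type))
```

KeyboardInterrupt and SystemExit report False here, which is why except Exception: lets Ctrl-C and sys.exit() through while a bare except: swallows them.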


Answer 1

It’s generally considered best-practice to only catch the errors you are interested in. In the case of shutil.rmtree it’s probably OSError:

>>> shutil.rmtree("/fake/dir")
Traceback (most recent call last):
    [...]
OSError: [Errno 2] No such file or directory: '/fake/dir'

If you want to silently ignore that error, you would do:

try:
    shutil.rmtree(path)
except OSError:
    pass

Why? Say you (somehow) accidentally pass the function an integer instead of a string, like:

shutil.rmtree(2)

In Python 2 this gives the error “TypeError: coercing to Unicode: need string or buffer, int found”. You probably don’t want to ignore that, because a silently swallowed error like this can be difficult to debug.

If you definitely want to ignore all errors, catch Exception rather than a bare except: statement. Again, why?

Not specifying an exception catches every exception, including the SystemExit exception which for example sys.exit() uses:

>>> try:
...     sys.exit(1)
... except:
...     pass
... 
>>>

Compare this to the following, which correctly exits:

>>> try:
...     sys.exit(1)
... except Exception:
...     pass
... 
shell:~$ 

If you want to write even better-behaved code: the OSError exception can represent various errors, but in the example above we only want to ignore Errno 2, so we can be even more specific:

import errno

try:
    shutil.rmtree(path)
except OSError as e:
    if e.errno != errno.ENOENT:
        # ignore "No such file or directory", but re-raise other errors
        raise

Answer 2

When you just want to do a try catch without handling the exception, how do you do it in Python?

It depends on what you mean by “handling.”

If you mean to catch it without taking any action, the code you posted will work.

If you mean that you want to take action on an exception without stopping the exception from going up the stack, then you want something like this:

try:
    do_something()
except:
    handle_exception()
    raise  #re-raise the exact same exception that was thrown
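As a concrete version of the pattern above, here is a minimal sketch that logs the failure and then lets it propagate. The names run_and_log and divide_by_zero are hypothetical, and the sketch catches Exception rather than using a bare except: so that KeyboardInterrupt and SystemExit are left alone:

```python
import logging

def run_and_log(action):
    """Call action(); log any exception, then let it propagate unchanged."""
    try:
        return action()
    except Exception:
        logging.exception("action failed")  # records the message and traceback
        raise  # re-raise the same exception that was caught

def divide_by_zero():
    return 1 / 0

try:
    run_and_log(divide_by_zero)
except ZeroDivisionError as error:
    print("caller still sees:", repr(error))
```

The caller receives the original ZeroDivisionError with its original traceback, because a bare raise does not create a new exception.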

Answer 3

First, I quote the answer of Jack o’Connor from this thread. The referenced thread was closed, so I am writing here:

“There’s a new way to do this coming in Python 3.4:

from contextlib import suppress

with suppress(Exception):
    # your code

Here’s the commit that added it: http://hg.python.org/cpython/rev/406b47c64480

And here’s the author, Raymond Hettinger, talking about this and all sorts of other Python hotness: https://youtu.be/OSGv2VnC0go?t=43m23s

My addition to this is the Python 2.7 equivalent:

from contextlib import contextmanager

@contextmanager
def ignored(*exceptions):
    try:
        yield
    except exceptions:
        pass

Then you use it like in Python 3.4:

with ignored(Exception):
    # your code
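One detail worth seeing in action: a suppressing context manager does not resume the block after the failing line. It ends the whole with block and continues after it. A small sketch using contextlib.suppress (Python 3.4+); the list name steps is only for illustration:

```python
from contextlib import suppress

steps = []
with suppress(ZeroDivisionError):
    steps.append("before")
    1 / 0                   # raises; the rest of the block is skipped
    steps.append("after")   # never reached
steps.append("continued")   # execution resumes after the with block

print(steps)  # ['before', 'continued']
```

So if several statements must each be attempted independently, each one needs its own with block.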

Answer 4

For completeness:

>>> def divide(x, y):
...     try:
...         result = x / y
...     except ZeroDivisionError:
...         print("division by zero!")
...     else:
...         print("result is", result)
...     finally:
...         print("executing finally clause")

Also note that you can capture the exception like this:

>>> try:
...     this_fails()
... except ZeroDivisionError as err:
...     print("Handling run-time error:", err)

…and re-raise the exception like this:

>>> try:
...     raise NameError('HiThere')
... except NameError:
...     print('An exception flew by!')
...     raise

…examples from the Python tutorial.


Answer 5

How to properly ignore Exceptions?

There are several ways of doing this.

However, the choice of example has a simple solution that does not cover the general case.

Specific to the example:

Instead of

try:
    shutil.rmtree(path)
except:
    pass

Do this:

shutil.rmtree(path, ignore_errors=True)

This is an argument specific to shutil.rmtree. You can see its help by doing the following, and you’ll see that it also accepts an error-handling callback.

>>> import shutil
>>> help(shutil.rmtree)

General approach

Since the above only covers the narrow case of the example, I’ll further demonstrate how to handle this if those keyword arguments didn’t exist.

New in Python 3.4:

You can import the suppress context manager:

from contextlib import suppress

But only suppress the most specific exception:

with suppress(FileNotFoundError):
    shutil.rmtree(path)

You will silently ignore a FileNotFoundError:

>>> with suppress(FileNotFoundError):
...     shutil.rmtree('bajkjbkdlsjfljsf')
... 
>>> 

From the docs:

As with any other mechanism that completely suppresses exceptions, this context manager should be used only to cover very specific errors where silently continuing with program execution is known to be the right thing to do.

Note that suppress and FileNotFoundError are only available in Python 3.

If you want your code to work in Python 2 as well, see the next section:

Python 2 & 3:

When you just want to do a try/except without handling the exception, how do you do it in Python?

Is the following the right way to do it?

try:
    shutil.rmtree(path)
except:
    pass

For Python 2 compatible code, pass is the correct way to have a statement that’s a no-op. But when you do a bare except:, that’s the same as doing except BaseException: which includes GeneratorExit, KeyboardInterrupt, and SystemExit, and in general, you don’t want to catch those things.

In fact, you should be as specific in naming the exception as you can.

Here’s part of the Python (2) exception hierarchy, and as you can see, if you catch more general Exceptions, you can hide problems you did not expect:

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
... and so on

You probably want to catch an OSError here, and perhaps the only case you want to ignore is a missing directory.

We can get that specific error number from the errno library, and reraise if we don’t have that:

import errno

try:
    shutil.rmtree(path)
except OSError as error:
    if error.errno == errno.ENOENT: # no such file or directory
        pass
    else: # we had an OSError we didn't expect, so reraise it
        raise 

Note that a bare raise re-raises the original exception, which is probably what you want in this case. This can be written more concisely, since the handler doesn’t need an explicit pass:

try:
    shutil.rmtree(path)
except OSError as error:
    if error.errno != errno.ENOENT: # no such file or directory
        raise 
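On Python 3.3 and later, the errno check can usually be replaced by catching FileNotFoundError, which is the OSError subclass raised for ENOENT. A minimal sketch under that assumption; the helper name remove_tree_if_present is hypothetical:

```python
import shutil
import tempfile

def remove_tree_if_present(path):
    """Delete a directory tree; return False if it did not exist."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:  # OSError subclass for "no such file or directory"
        return False
    return True

scratch = tempfile.mkdtemp()
print(remove_tree_if_present(scratch))  # True: the directory existed
print(remove_tree_if_present(scratch))  # False: it is already gone
```

Any other OSError, such as a permissions problem, still propagates to the caller.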

Answer 6

When you just want to do a try catch without handling the exception, how do you do it in Python?

This will help you print what the exception is (i.e. catch the exception without handling it, and print it):

import sys
try:
    doSomething()
except:
    print("Unexpected error:", sys.exc_info()[0])

Answer 7

try:
    doSomething()
except Exception:
    pass
else:
    stuffDoneIf()
    TryClauseSucceeds()

FYI, the else clause goes after all the except clauses and runs only if the code in the try block doesn’t raise an exception.
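To make the role of else concrete, here is a small sketch. The function parse_port and its range check are invented for illustration:

```python
def parse_port(text):
    """Return the port number in text, or None if it is not a valid port."""
    try:
        port = int(text)
    except ValueError:
        return None  # conversion failed; the else clause is skipped
    else:
        # Runs only when int(text) succeeded. Errors raised here are
        # not caught by the except clause above.
        return port if 0 < port < 65536 else None

print(parse_port("8080"))  # 8080
print(parse_port("http"))  # None
```

Keeping the follow-up work in else rather than inside try narrows the guarded code to the one statement that is expected to fail.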


Answer 8

I needed to ignore errors in multiple commands and fuckit did the trick

import fuckit

@fuckit
def helper():
    print('before')
    1/0
    print('after1')
    1/0
    print('after2')

helper()

Answer 9

In Python, we handle exceptions much as in other languages; the difference is mainly in the syntax. For example:

try:
    result = 1 / 0  # your code in which an exception can occur
except ZeroDivisionError:  # put a particular exception name here
    result = None  # your handling code
# We can also add a finally block
finally:
    print("done")  # your cleanup code, which always runs

Answer 10

I usually just do:

try:
    doSomething()
except:
    _ = ""

What is the naming convention in Python for variable and function names?

Question: What is the naming convention in Python for variable and function names?

Coming from a C# background, the naming convention for variables and method names is usually either camelCase or PascalCase:

// C# example
string thisIsMyVariable = "a"
public void ThisIsMyMethod()

In Python, I have seen the above but I have also seen underscores being used:

# python example
this_is_my_variable = 'a'
def this_is_my_function():

Is there a more preferable, definitive coding style for Python?


Answer 0

See Python PEP 8: Function and Variable Names:

Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

mixedCase is allowed only in contexts where that’s already the prevailing style (e.g. threading.py), to retain backwards compatibility.


Answer 1

The Google Python Style Guide has the following convention:

module_name, package_name, ClassName, method_name, ExceptionName, function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name, function_parameter_name, local_var_name.

A similar naming scheme should be applied to a CLASS_CONSTANT_NAME


Answer 2

David Goodger (in “Code Like a Pythonista” here) describes the PEP 8 recommendations as follows:

  • joined_lower for functions, methods, attributes, variables

  • joined_lower or ALL_CAPS for constants

  • StudlyCaps for classes

  • camelCase only to conform to pre-existing conventions
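The list above can be illustrated with a short sketch. All names here (MAX_RETRIES, RetryPolicy, make_default_policy) are invented purely to show the casing conventions:

```python
MAX_RETRIES = 3  # ALL_CAPS for a constant


class RetryPolicy:  # StudlyCaps for a class
    def __init__(self, retry_count):
        self.retry_count = retry_count  # joined_lower for an attribute

    def is_exhausted(self):  # joined_lower for a method
        return self.retry_count >= MAX_RETRIES


def make_default_policy():  # joined_lower for a function
    default_policy = RetryPolicy(retry_count=0)  # joined_lower for a variable
    return default_policy


print(make_default_policy().is_exhausted())  # False
```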


Answer 3

As the Style Guide for Python Code admits,

The naming conventions of Python’s library are a bit of a mess, so we’ll never get this completely consistent

Note that this refers just to Python’s standard library. If they can’t get that consistent, then there hardly is much hope of having a generally-adhered-to convention for all Python code, is there?

From that, and the discussion here, I would deduce that it’s not a horrible sin if one keeps using e.g. Java’s or C#’s (clear and well-established) naming conventions for variables and functions when crossing over to Python. Keeping in mind, of course, that it is best to abide by whatever the prevailing style for a codebase / project / team happens to be. As the Python Style Guide points out, internal consistency matters most.

Feel free to dismiss me as a heretic. :-) Like the OP, I’m not a “Pythonista”, not yet anyway.


Answer 4

There is PEP 8, as other answers show, but PEP 8 is only the style guide for the standard library, and it’s only taken as gospel therein. One of the most frequent deviations from PEP 8 in other code is variable naming, specifically for methods. There is no single predominant style, although considering the volume of code that uses mixedCase, if one were to make a strict census one would probably end up with a version of PEP 8 with mixedCase. Few other deviations from PEP 8 are quite as common.


Answer 5

As mentioned, PEP 8 says to use lower_case_with_underscores for variables, methods and functions.

I prefer using lower_case_with_underscores for variables and mixedCase for methods and functions, as it makes the code more explicit and readable, thus following the Zen of Python’s “explicit is better than implicit” and “Readability counts”.


Answer 6

Further to what @JohnTESlade has answered, Google’s Python style guide has some pretty neat recommendations:

Names to Avoid

  • single character names except for counters or iterators
  • dashes (-) in any package/module name
  • __double_leading_and_trailing_underscore__ names (reserved by Python)

Naming Convention

  • “Internal” means internal to a module or protected or private within a class.
  • Prepending a single underscore (_) has some support for protecting module variables and functions (not included with import * from). Prepending a double underscore (__) to an instance variable or method effectively serves to make the variable or method private to its class (using name mangling).
  • Place related classes and top-level functions together in a module. Unlike Java, there is no need to limit yourself to one class per module.
  • Use CapWords for class names, but lower_with_under.py for module names. Although there are many existing modules named CapWords.py, this is now discouraged because it’s confusing when the module happens to be named after a class. (“wait — did I write import StringIO or from StringIO import StringIO?”)
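The underscore rules in the list above can be seen in a few lines. This is only a sketch; the class Account and its attributes are made up for the demonstration:

```python
class Account:
    def __init__(self, owner):
        self._owner = owner  # single underscore: internal by convention only
        self.__pin = "0000"  # double underscore: stored as _Account__pin


account = Account("ann")
print(account._owner)             # ann: still accessible from outside
print(hasattr(account, "__pin"))  # False: the name was mangled
print(account._Account__pin)      # 0000: reachable under the mangled name
```

Name mangling guards against accidental clashes in subclasses; it is not an access-control mechanism.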

Guidelines derived from Guido’s Recommendations


Answer 7

Most Python people prefer underscores, but even though I have been using Python for more than 5 years now, I still do not like them. They just look ugly to me, but maybe that’s all the Java in my head.

I simply like CamelCase better since it fits better with the way classes are named. It feels more logical to have SomeClass.doSomething() than SomeClass.do_something(). If you look around in the global module index in Python, you will find both, which is due to the fact that it’s a collection of libraries from various sources that grew over time and not something that was developed by one company like Sun with strict coding rules. I would say the bottom line is: use whatever you like better, it’s just a question of personal taste.


Answer 8

Personally I try to use CamelCase for classes and mixedCase for methods and functions. Variables are usually underscore-separated (when I can remember). This way I can tell at a glance what exactly I’m calling, rather than everything looking the same.


Answer 9

There is a paper about this: http://www.cs.kent.edu/~jmaletic/papers/ICPC2010-CamelCaseUnderScoreClouds.pdf

TL;DR: It says that snake_case is more readable than camelCase. That’s why modern languages use (or should use) snake_case wherever they can.


Answer 10

The coding style is usually part of an organization’s internal policy/convention standards, but I think in general, the all_lower_case_underscore_separator style (also called snake_case) is most common in python.


Answer 11

I personally use Java’s naming conventions when developing in other programming languages, as they are consistent and easy to follow. That way I am not continuously struggling over which conventions to use, which shouldn’t be the hardest part of my project!


Answer 12

Typically, one follows the conventions used in the language’s standard library.


How do I remove the first item from a list?

Question: How do I remove the first item from a list?

I have the list [0, 1, 2, 3, 4] and I’d like to make it into [1, 2, 3, 4]. How do I go about this?


Answer 0

Python List

list.pop(index)

>>> l = ['a', 'b', 'c', 'd']
>>> l.pop(0)
'a'
>>> l
['b', 'c', 'd']
>>> 

del list[index]

>>> l = ['a', 'b', 'c', 'd']
>>> del l[0]
>>> l
['b', 'c', 'd']
>>> 

These both modify your original list.

Others have suggested using slicing:

  • Copies the list
  • Can return a subset

Also, if you are performing many pop(0), you should look at collections.deque:

>>> from collections import deque
>>> l = deque(['a', 'b', 'c', 'd'])
>>> l.popleft()
'a'
>>> l
deque(['b', 'c', 'd'])

  • Provides higher-performance popping from the left end of the list

Answer 1

Slicing:

x = [0,1,2,3,4]
x = x[1:]

This actually returns a new list containing a subset of the original, without modifying the original list.
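The difference between slicing and in-place removal matters when another name refers to the same list. A short sketch; the variable names are chosen only for illustration:

```python
original = [0, 1, 2, 3, 4]
alias = original  # second name for the same list object

sliced = original[1:]    # builds a new list; original is untouched
print(original, sliced)  # [0, 1, 2, 3, 4] [1, 2, 3, 4]

del original[0]            # mutates the list in place
print(alias)               # [1, 2, 3, 4]: the alias sees the change
print(sliced is original)  # False: they are different objects
```

Rebinding with x = x[1:] therefore leaves any other references pointing at the old, unshortened list.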


Answer 2

>>> x = [0, 1, 2, 3, 4]
>>> x.pop(0)
0

More on this here.


Answer 3

With list slicing, see the Python tutorial about lists for more details:

>>> l = [0, 1, 2, 3, 4]
>>> l[1:]
[1, 2, 3, 4]

Answer 4

You would just do this:

l = [0, 1, 2, 3, 4]
l.pop(0)

or l = l[1:]

Pros and Cons

Using pop you can retrieve the value:

for example, after x = l.pop(0), x would be 0


Answer 5

Then just delete it:

x = [0, 1, 2, 3, 4]
del x[0]
print(x)
# [1, 2, 3, 4]

Answer 6

You can use list.reverse() to reverse the list, then list.pop() to remove the last element, for example:

>>> l = [0, 1, 2, 3, 4]
>>> l.reverse()
>>> print(l)
[4, 3, 2, 1, 0]
>>> l.pop()
0
>>> l.pop()
1
>>> l.pop()
2
>>> l.pop()
3
>>> l.pop()
4

Answer 7

You can also use list.remove(a[0]) to remove the first element in the list.

>>> a = [1, 2, 3, 4, 5]
>>> a.remove(a[0])
>>> print(a)
[2, 3, 4, 5]

Answer 8

If you are working with numpy, you need to use the numpy.delete function:

import numpy as np

a = np.array([1, 2, 3, 4, 5])

a = np.delete(a, 0)

print(a) # [2 3 4 5]

Answer 9

There is a data structure called “deque”, or double-ended queue, which is faster and more efficient than a list for adding and removing items at the left end. You can convert your list to a deque and do the required transformations on it. You can also convert the deque back to a list.

import collections
mylist = [0, 1, 2, 3, 4]

#make a deque from your list
de = collections.deque(mylist)

#you can remove from a deque from either left side or right side
de.popleft()
print(de)

#you can convert the deque back to list
mylist = list(de)
print(mylist)

Deque also provides very useful functions like inserting elements to either side of the list or to any specific index. You can also rotate or reverse a deque. Give it a try!!
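A few of the deque operations mentioned above, as a minimal sketch (the variable names are arbitrary):

```python
from collections import deque

queue = deque([0, 1, 2, 3, 4])
first = queue.popleft()  # removes and returns the leftmost item
queue.appendleft(-1)     # adds an item on the left
queue.rotate(1)          # shifts items right; the last one wraps to the front

print(first)        # 0
print(list(queue))  # [4, -1, 1, 2, 3]
```

Both popleft and appendleft work at the left end without shifting the remaining items, which is what list.pop(0) and list.insert(0, x) have to do.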


Shuffling a list of objects

Question: Shuffling a list of objects

I have a list of objects and I want to shuffle them. I thought I could use the random.shuffle method, but this seems to fail when the list is of objects. Is there a method for shuffling objects or another way around this?

import random

class A:
    foo = "bar"

a1 = A()
a2 = A()
b = [a1, a2]

print(random.shuffle(b))

This will fail.


Answer 0

random.shuffle should work. Here’s an example, where the objects are lists:

from random import shuffle
x = [[i] for i in range(10)]
shuffle(x)

# print(x)  gives  [[9], [2], [7], [0], [4], [5], [3], [1], [8], [6]]
# of course your results will vary

Note that shuffle works in place, and returns None.
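The same behaviour holds for a list of arbitrary objects. A minimal sketch, where the Card class is invented for the example:

```python
import random


class Card:
    def __init__(self, rank):
        self.rank = rank


cards = [Card(rank) for rank in range(5)]
result = random.shuffle(cards)  # reorders cards in place

print(result)                               # None
print(sorted(card.rank for card in cards))  # [0, 1, 2, 3, 4]
```

Printing the return value shows None, while the list itself still holds the same five objects in a new order.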


Answer 1

As you learned, the in-place shuffling was the problem. I run into this frequently too, and often seem to forget how to copy a list as well. Using sample(a, len(a)) is the solution, with len(a) as the sample size. See https://docs.python.org/3.6/library/random.html#random.sample for the Python documentation.

Here’s a simple version using random.sample() that returns the shuffled result as a new list.

import random

a = list(range(5))
b = random.sample(a, len(a))
print(a, b, "two lists same:", a == b)
# prints e.g.: [0, 1, 2, 3, 4] [2, 1, 3, 4, 0] two lists same: False

# random.sample draws without replacement, so it yields no duplicates.
# The result can be smaller but not larger than the input.
a = list(range(555))
b = random.sample(a, len(a))
print("no duplicates:", a == sorted(b))

try:
    random.sample(a, len(a) + 1)
except ValueError as e:
    print("Nope!", e)

# prints: no duplicates: True
# prints: Nope! followed by the ValueError message

Answer 2

It took me some time to get that too. But the documentation for shuffle is very clear:

shuffle list x in place; return None.

So you shouldn’t print(random.shuffle(b)). Instead do random.shuffle(b) and then print(b).


Answer 3

#!/usr/bin/python3

import random

s=list(range(5))
random.shuffle(s) # << shuffle before print or assignment
print(s)

# print: [2, 4, 1, 3, 0]

Answer 4

If you happen to be using numpy already (very popular for scientific and financial applications) you can save yourself an import.

import numpy as np    
np.random.shuffle(b)
print(b)

http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.shuffle.html


Answer 5

>>> import random
>>> a = ['hi','world','cat','dog']
>>> random.shuffle(a,random.random)
>>> a
['hi', 'cat', 'dog', 'world']

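It works fine for me. Make sure to set the random method. Note that the optional second argument of random.shuffle was deprecated in Python 3.9 and removed in Python 3.11; on those versions, call random.shuffle(a) instead.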


Answer 6

If you have multiple lists, you might want to define the permutation (the way you shuffle the list / rearrange the items in the list) first and then apply it to all lists:

import random

perm = list(range(len(list_one)))
random.shuffle(perm)
list_one = [list_one[index] for index in perm]
list_two = [list_two[index] for index in perm]

Numpy / Scipy

If your lists are numpy arrays, it is simpler:

import numpy as np

perm = np.random.permutation(len(list_one))
list_one = list_one[perm]
list_two = list_two[perm]

mpu

I’ve created the small utility package mpu which has the consistent_shuffle function:

import mpu

# Necessary if you want consistent results
import random
random.seed(8)

# Define example lists
list_one = [1,2,3]
list_two = ['a', 'b', 'c']

# Call the function
list_one, list_two = mpu.consistent_shuffle(list_one, list_two)

Note that mpu.consistent_shuffle takes an arbitrary number of arguments. So you can also shuffle three or more lists with it.
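If you would rather not add a dependency, the same idea can be sketched with the standard library alone. The helper name shuffle_together is hypothetical and only for illustration; it assumes all lists have the same length and returns new lists rather than modifying the inputs:

```python
import random

def shuffle_together(*lists):
    """Return new lists, all rearranged by the same random permutation.

    Hypothetical helper for illustration; assumes equal-length lists.
    """
    if not lists:
        return []
    length = len(lists[0])
    if any(len(lst) != length for lst in lists):
        raise ValueError("all lists must have the same length")
    perm = list(range(length))
    random.shuffle(perm)  # one permutation, applied to every list
    return [[lst[i] for i in perm] for lst in lists]

list_one = [1, 2, 3]
list_two = ['a', 'b', 'c']
list_one, list_two = shuffle_together(list_one, list_two)
print(list(zip(list_one, list_two)))  # pairs stay aligned, e.g. (1, 'a')
```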


Answer 7

from random import random
my_list = range(10)
shuffled_list = sorted(my_list, key=lambda x: random())

This alternative may be useful for some applications where you want to swap the ordering function.


Answer 8

In some cases when using numpy arrays, using random.shuffle created duplicate data in the array.

An alternative is to use numpy.random.shuffle. If you’re working with numpy already, this is the preferred method over the generic random.shuffle.

numpy.random.shuffle

Example

>>> import numpy as np
>>> import random

Using random.shuffle:

>>> foo = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> foo

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])


>>> random.shuffle(foo)
>>> foo

array([[1, 2, 3],
       [1, 2, 3],
       [4, 5, 6]])

Using numpy.random.shuffle:

>>> foo = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> foo

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])


>>> np.random.shuffle(foo)
>>> foo

array([[1, 2, 3],
       [7, 8, 9],
       [4, 5, 6]])

Answer 9

For one-liners, use random.sample(list_to_be_shuffled, length_of_the_list). For example:

import random
random.sample(list(range(10)), 10)

outputs: [2, 9, 7, 8, 3, 0, 4, 1, 6, 5]


Answer 10

‘print func(foo)’ will print the return value of ‘func’ when called with ‘foo’. ‘shuffle’ however has None as its return type, as the list will be modified in place, hence it prints nothing. Workaround:

# shuffle the list in place 
random.shuffle(b)

# print it
print(b)

If you’re more into functional programming style you might want to make the following wrapper function:

def myshuffle(ls):
    random.shuffle(ls)
    return ls

Answer 11

One can define a function called shuffled (in the same sense of sort vs sorted)

def shuffled(x):
    import random
    y = x[:]
    random.shuffle(y)
    return y

x = shuffled([1, 2, 3, 4])
print(x)

Answer 12

import random

class a:
    foo = "bar"

a1 = a()
a2 = a()
a3 = a()
a4 = a()
b = [a1,a2,a3,a4]

random.shuffle(b)
print(b)

shuffle is in place, so do not print result, which is None, but the list.


Answer 13

You can go for this:

>>> A = ['r','a','n','d','o','m']
>>> B = [1,2,3,4,5,6]
>>> import random
>>> random.sample(A+B, len(A+B))
[3, 'r', 4, 'n', 6, 5, 'm', 2, 1, 'a', 'o', 'd']

If you want to go back to two lists, you can then split this long list in two.


Answer 14

You could build a function that takes a list as a parameter and returns a shuffled version of the list:

from random import *

def listshuffler(inputlist):
    for i in range(len(inputlist)):
        swap = randint(0,len(inputlist)-1)
        temp = inputlist[swap]
        inputlist[swap] = inputlist[i]
        inputlist[i] = temp
    return inputlist
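One caveat: swapping every position with a random position anywhere in the list does shuffle, but it is known not to make all orderings equally likely. A common alternative is the Fisher-Yates algorithm, sketched below. The function name is made up for this example, and unlike the function above it returns a copy instead of modifying its argument:

```python
import random

def fisher_yates_shuffled(items):
    """Return a shuffled copy of items using the Fisher-Yates algorithm."""
    result = list(items)
    # Walk from the last index down; swap each slot with a random slot
    # at or before it, so each position is settled exactly once.
    for i in range(len(result) - 1, 0, -1):
        j = random.randint(0, i)  # randint is inclusive on both ends
        result[i], result[j] = result[j], result[i]
    return result

data = [1, 2, 3, 4, 5]
print(fisher_yates_shuffled(data))
print(data)  # the input list is left unchanged
```

In everyday code random.shuffle is still the simpler choice; the sketch is only meant to show what a hand-written version could look like.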

Answer 15

""" to shuffle random, set random= True """

import numpy as np

def shuffle(x, random=False):
    shuffled = []
    ma = x
    if random:
        # draw indices independently, so items may repeat or be left out
        rando = [ma[i] for i in np.random.randint(0, len(ma), len(ma))]
        return rando
    # non-random branch: shift indices by a third of the length
    ave = len(ma) // 3
    for i in range(len(ma)):
        if i < ave:
            shuffled.append(ma[i + ave])
        else:
            shuffled.append(ma[i - ave])
    return shuffled

Answer 16

You can use either shuffle or sample, both of which come from the random module.

import random
def shuffle(arr1):
    n=len(arr1)
    b=random.sample(arr1,n)
    return b

OR

import random
def shuffle(arr1):
    random.shuffle(arr1)
    return arr1

Answer 17

Make sure you are not naming your source file random.py, and that there is no file called random.pyc in your working directory. Either could cause your program to import your local random.py file instead of Python's random module.
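A quick way to check which file is actually being imported is to print the module's path. This is only a diagnostic sketch; the exact path depends on your installation:

```python
import random

# For the standard-library module this path points into the Python
# installation. If it points into your project directory instead, a
# local random.py is shadowing the real module.
print(random.__file__)

# The real module provides shuffle; a shadowing file usually does not.
print(hasattr(random, "shuffle"))
```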


Answer 18

def shuffle(_list):
    if not _list == []:
        import random
        list2 = []
        while _list != []:
            card = random.choice(_list)
            _list.remove(card)
            list2.append(card)
        while list2 != []:
            card1 = list2[0]
            list2.remove(card1)
            _list.append(card1)
        return _list

Answer 19

import random
class a:
    foo = "bar"

a1 = a()
a2 = a()
b = [a1.foo,a2.foo]
random.shuffle(b)

Answer 20

The shuffling process is “with replacement”, so the number of occurrences of each item may change! At least when the items in your list are themselves lists.

E.g.,

ml = [[0], [1]] * 10

After,

random.shuffle(ml)

The number of [0] may be 9 or 8, but not exactly 10.
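This is easy to check on your own machine. The sketch below uses the same data in a plain Python list and counts the [0] items before and after shuffling. As far as I know, random.shuffle only rearranges the existing items of a list, so with a plain list the two counts should match; the duplicated rows shown in the numpy answer above concern numpy arrays, not lists:

```python
import random

ml = [[0], [1]] * 10          # 20 items: ten [0] and ten [1]
before = ml.count([0])

random.shuffle(ml)            # rearranges the items in place
after = ml.count([0])

print(before, after)          # compare the two counts
```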


Answer 21

Plan: Write out the shuffle without relying on a library to do the heavy lifting. Example: Go through the list from the beginning starting with element 0; find a new random position for it, say 6, put 0’s value in 6 and 6’s value in 0. Move on to element 1 and repeat this process, and so on through the rest of the list.

import random

my_list = list(range(10))  # example data; use your own list here

for i in range(len(my_list)):  # have to use range with len()
    # find a new random position for element i
    j = random.randrange(len(my_list))
    # Using temp_var as my place holder so I don't lose values
    temp_var = my_list[i]
    my_list[i] = my_list[j]
    my_list[j] = temp_var

Answer 22

It works fine. I am trying it here with functions as list objects:

    from random import shuffle

    def foo1():
        print "foo1",

    def foo2():
        print "foo2",

    def foo3():
        print "foo3",

    A=[foo1,foo2,foo3]

    for x in A:
        x()

    print "\r"

    shuffle(A)
    for y in A:
        y()

It prints out: foo1 foo2 foo3 foo2 foo3 foo1 (the foos in the last row have a random order)


How do you remove duplicates from a list whilst preserving order?

Question: How do you remove duplicates from a list whilst preserving order?

Is there a built-in that removes duplicates from list in Python, whilst preserving order? I know that I can use a set to remove duplicates, but that destroys the original order. I also know that I can roll my own like this:

def uniq(input):
  output = []
  for x in input:
    if x not in output:
      output.append(x)
  return output

(Thanks to unwind for that code sample.)

But I’d like to avail myself of a built-in or a more Pythonic idiom if possible.

Related question: In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?


Answer 0

Here you have some alternatives: http://www.peterbe.com/plog/uniqifiers-benchmark

Fastest one:

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

Why assign seen.add to seen_add instead of just calling seen.add? Python is a dynamic language, and resolving seen.add each iteration is more costly than resolving a local variable. seen.add could have changed between iterations, and the runtime isn’t smart enough to rule that out. To play it safe, it has to check the object each time.

If you plan on using this function a lot on the same dataset, perhaps you would be better off with an ordered set: http://code.activestate.com/recipes/528878/

O(1) insertion, deletion and member-check per operation.

(Small additional note: seen.add() always returns None, so the or above is there only as a way to attempt a set update, and not as an integral part of the logical test.)
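If the `x in seen or seen_add(x)` trick is hard to read, the same logic can be unrolled into an explicit loop. This is a sketch of the equivalent behaviour, not a faster version; the function name is only for this example:

```python
def unique_in_order(seq):
    """Explicit-loop equivalent of f7 (the name is just for this sketch)."""
    seen = set()
    result = []
    for x in seq:
        if x not in seen:   # first time we meet x
            seen.add(x)     # remember it ...
            result.append(x)  # ... and keep it in the output
    return result

print(unique_in_order([1, 2, 0, 1, 3, 2]))  # [1, 2, 0, 3]
```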


Answer 1

Edit 2016

As Raymond pointed out, in python 3.5+ where OrderedDict is implemented in C, the list comprehension approach will be slower than OrderedDict (unless you actually need the list at the end – and even then, only if the input is very short). So the best solution for 3.5+ is OrderedDict.

Important Edit 2015

As @abarnert notes, the more_itertools library (pip install more_itertools) contains a unique_everseen function that is built to solve this problem without any unreadable (not seen.add) mutations in list comprehensions. This is also the fastest solution:

>>> from more_itertools import unique_everseen
>>> items = [1, 2, 0, 1, 3, 2]
>>> list(unique_everseen(items))
[1, 2, 0, 3]

Just one simple library import and no hacks. This comes from an implementation of the itertools recipe unique_everseen which looks like:

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

In Python 2.7+, the accepted common idiom for this uses collections.OrderedDict (it works but isn’t optimized for speed; I would now use unique_everseen):

Runtime: O(N)

>>> from collections import OrderedDict
>>> items = [1, 2, 0, 1, 3, 2]
>>> list(OrderedDict.fromkeys(items))
[1, 2, 0, 3]

This looks much nicer than:

seen = set()
[x for x in seq if x not in seen and not seen.add(x)]

and doesn’t utilize the ugly hack:

not seen.add(x)

which relies on the fact that set.add is an in-place method that always returns None so not None evaluates to True.

Note however that the hack solution is faster in raw speed though it has the same runtime complexity O(N).


Answer 2

In Python 2.7, the new way of removing duplicates from an iterable while keeping it in the original order is:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.5, the OrderedDict has a C implementation. My timings show that this is now both the fastest and shortest of the various approaches for Python 3.5.

In Python 3.6, the regular dict became both ordered and compact. (This feature holds for CPython and PyPy but may not be present in other implementations.) That gives us a new fastest way of deduping while retaining order:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.7, the regular dict is guaranteed to be ordered across all implementations. So, the shortest and fastest solution is:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

Response to @max: Once you move to 3.6 or 3.7 and use the regular dict instead of OrderedDict, you can’t really beat the performance in any other way. The dictionary is dense and readily converts to a list with almost no overhead. The target list is pre-sized to len(d) which saves all the resizes that occur in a list comprehension. Also, since the internal key list is dense, copying the pointers is almost as fast as a list copy.


Answer 3

sequence = ['1', '2', '3', '3', '6', '4', '5', '6']
unique = []
[unique.append(item) for item in sequence if item not in unique]

unique → ['1', '2', '3', '6', '4', '5']


Answer 4

Not to kick a dead horse (this question is very old and already has lots of good answers), but here is a solution using pandas that is quite fast in many circumstances and is dead simple to use.

import pandas as pd

my_list = [0, 1, 2, 3, 4, 1, 2, 3, 5]

>>> pd.Series(my_list).drop_duplicates().tolist()
# Output:
# [0, 1, 2, 3, 4, 5]

Answer 5

from itertools import groupby
[ key for key,_ in groupby(sortedList)]

The list doesn’t even have to be sorted, the sufficient condition is that equal values are grouped together.

Edit: I assumed that “preserving order” implies that the list is actually ordered. If this is not the case, then the solution from MizardX is the right one.

Community edit: This is however the most elegant way to “compress duplicate consecutive elements into a single element”.
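A short sketch of the difference, using made-up data: groupby merges only runs of equal neighbours, so a value that reappears later is kept again.

```python
from itertools import groupby

data = [1, 1, 2, 2, 1, 3, 3]

# groupby only merges runs of equal neighbours ...
compressed = [key for key, _ in groupby(data)]
print(compressed)                  # [1, 2, 1, 3]

# ... so the second run of 1 survives. Full order-preserving
# deduplication needs a different tool, e.g. dict.fromkeys (Python 3.7+).
deduplicated = list(dict.fromkeys(data))
print(deduplicated)                # [1, 2, 3]
```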


Answer 6

If you want to maintain the order, you can try this:

list1 = ['b','c','d','b','c','a','a']
list2 = list(set(list1))
list2.sort(key=list1.index)
print(list2)

Or, similarly, you can do this:

list1 = ['b','c','d','b','c','a','a']
list2 = sorted(set(list1), key=list1.index)
print(list2)

You can also do this:

list1 = ['b','c','d','b','c','a','a']
list2 = []
for i in list1:
    if i not in list2:
        list2.append(i)
print(list2)

It can also be written like this:

list1 = ['b','c','d','b','c','a','a']
list2 = []
[list2.append(i) for i in list1 if i not in list2]
print(list2)

Answer 7

In Python 3.7 and above, dictionaries are guaranteed to remember their key insertion order. The answer to this question summarizes the current state of affairs.

The OrderedDict solution thus becomes obsolete and without any import statements we can simply issue:

>>> lst = [1, 2, 1, 3, 3, 2, 4]
>>> list(dict.fromkeys(lst))
[1, 2, 3, 4]

Answer 8

For another very late answer to another very old question:

The itertools recipes have a function that does this, using the seen set technique, but:

  • Handles a standard key function.
  • Uses no unseemly hacks.
  • Optimizes the loop by pre-binding seen.add instead of looking it up N times. (f7 also does this, but some versions don’t.)
  • Optimizes the loop by using ifilterfalse, so you only have to loop over the unique elements in Python, instead of all of them. (You still iterate over all of them inside ifilterfalse, of course, but that’s in C, and much faster.)

Is it actually faster than f7? It depends on your data, so you’ll have to test it and see. If you want a list in the end, f7 uses a listcomp, and there’s no way to do that here. (You can directly append instead of yielding, or you can feed the generator into the list function, but neither one can be as fast as the LIST_APPEND inside a listcomp.) At any rate, usually, squeezing out a few microseconds is not going to be as important as having an easily-understandable, reusable, already-written function that doesn’t require DSU when you want to decorate.

As with all of the recipes, it’s also available in more-itertools.

If you just want the no-key case, you can simplify it as:

def unique(iterable):
    seen = set()
    seen_add = seen.add
    for element in itertools.ifilterfalse(seen.__contains__, iterable):
        seen_add(element)
        yield element
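The snippet above is written for Python 2, where the function is called itertools.ifilterfalse. In Python 3 it is named itertools.filterfalse, so a Python 3 version of the same simplified recipe could look like this:

```python
from itertools import filterfalse

def unique(iterable):
    """Yield each element the first time it appears (no key function)."""
    seen = set()
    seen_add = seen.add
    # filterfalse skips elements already in `seen`; the set is updated
    # as the generator is consumed, so later duplicates are dropped too.
    for element in filterfalse(seen.__contains__, iterable):
        seen_add(element)
        yield element

print(list(unique('abracadabra')))  # ['a', 'b', 'r', 'c', 'd']
```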

Answer 9

Just to add another (very performant) implementation of such a functionality from an external module [1]: iteration_utilities.unique_everseen:

>>> from iteration_utilities import unique_everseen
>>> lst = [1,1,1,2,3,2,2,2,1,3,4]

>>> list(unique_everseen(lst))
[1, 2, 3, 4]

Timings

I did some timings (Python 3.6) and these show that it’s faster than all other alternatives I tested, including OrderedDict.fromkeys, f7 and more_itertools.unique_everseen:

%matplotlib notebook

from iteration_utilities import unique_everseen
from collections import OrderedDict
from more_itertools import unique_everseen as mi_unique_everseen

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

def iteration_utilities_unique_everseen(seq):
    return list(unique_everseen(seq))

def more_itertools_unique_everseen(seq):
    return list(mi_unique_everseen(seq))

def odict(seq):
    return list(OrderedDict.fromkeys(seq))

from simple_benchmark import benchmark

b = benchmark([f7, iteration_utilities_unique_everseen, more_itertools_unique_everseen, odict],
              {2**i: list(range(2**i)) for i in range(1, 20)},
              'list size (no duplicates)')
b.plot()

[Figure: benchmark timings by list size (no duplicates)]

And just to make sure I also did a test with more duplicates just to check if it makes a difference:

import random

b = benchmark([f7, iteration_utilities_unique_everseen, more_itertools_unique_everseen, odict],
              {2**i: [random.randint(0, 2**(i-1)) for _ in range(2**i)] for i in range(1, 20)},
              'list size (lots of duplicates)')
b.plot()

[Figure: benchmark timings by list size (lots of duplicates)]

And one containing only one value:

b = benchmark([f7, iteration_utilities_unique_everseen, more_itertools_unique_everseen, odict],
              {2**i: [1]*(2**i) for i in range(1, 20)},
              'list size (only duplicates)')
b.plot()

[Figure: benchmark timings by list size (only duplicates)]

In all of these cases the iteration_utilities.unique_everseen function is the fastest (on my computer).


This iteration_utilities.unique_everseen function can also handle unhashable values in the input (however with an O(n*n) performance instead of the O(n) performance when the values are hashable).

>>> lst = [{1}, {1}, {2}, {1}, {3}]

>>> list(unique_everseen(lst))
[{1}, {2}, {3}]

[1] Disclaimer: I’m the author of that package.
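For readers who want to see the idea of handling unhashable values without installing anything, here is a rough standard-library sketch. It is not the package's actual implementation, and the function name is hypothetical: hashable items go through a set, and anything that cannot be hashed falls back to a linear scan of a list, which is where the quadratic worst case comes from.

```python
def unique_everseen_fallback(iterable):
    """Order-preserving dedupe that tolerates unhashable items.

    Illustrative sketch only, not the iteration_utilities implementation.
    """
    seen_hashable = set()
    seen_unhashable = []   # linear scan, hence the quadratic worst case
    for item in iterable:
        try:
            if item not in seen_hashable:
                seen_hashable.add(item)   # raises TypeError if unhashable
                yield item
        except TypeError:  # item is unhashable (e.g. a set or a list)
            if item not in seen_unhashable:
                seen_unhashable.append(item)
                yield item

print(list(unique_everseen_fallback([{1}, {1}, {2}, {1}, {3}])))  # [{1}, {2}, {3}]
print(list(unique_everseen_fallback([1, [1], 1, [1], 2])))        # [1, [1], 2]
```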


Answer 10

For non-hashable types (e.g. a list of lists), based on MizardX’s answer:

def f7_noHash(seq):
    seen = set()
    return [x for x in seq if str(x) not in seen and not seen.add(str(x))]

Answer 11

Borrowing the recursive idea used in defining Haskell’s nub function for lists, this would be a recursive approach:

def unique(lst):
    return [] if lst==[] else [lst[0]] + unique(filter(lambda x: x!= lst[0], lst[1:]))

e.g.:

In [118]: unique([1,5,1,1,4,3,4])
Out[118]: [1, 5, 4, 3]

I tried it for growing data sizes and saw sub-linear time-complexity (not definitive, but suggests this should be fine for normal data).

In [122]: %timeit unique(np.random.randint(5, size=(1)))
10000 loops, best of 3: 25.3 us per loop

In [123]: %timeit unique(np.random.randint(5, size=(10)))
10000 loops, best of 3: 42.9 us per loop

In [124]: %timeit unique(np.random.randint(5, size=(100)))
10000 loops, best of 3: 132 us per loop

In [125]: %timeit unique(np.random.randint(5, size=(1000)))
1000 loops, best of 3: 1.05 ms per loop

In [126]: %timeit unique(np.random.randint(5, size=(10000)))
100 loops, best of 3: 11 ms per loop

I also think it’s interesting that this could be readily generalized to uniqueness by other operations. Like this:

import operator
def unique(lst, cmp_op=operator.ne):
    return [] if lst==[] else [lst[0]] + unique(filter(lambda x: cmp_op(x, lst[0]), lst[1:]), cmp_op)

For example, you could pass in a function that uses the notion of rounding to the same integer as if it was “equality” for uniqueness purposes, like this:

def test_round(x,y):
    return round(x) != round(y)

then unique(some_list, test_round) would provide the unique elements of the list where uniqueness no longer meant traditional equality (which is implied by using any sort of set-based or dict-key-based approach to this problem) but instead meant to take only the first element that rounds to K for each possible integer K that the elements might round to, e.g.:

In [6]: unique([1.2, 5, 1.9, 1.1, 4.2, 3, 4.8], test_round)
Out[6]: [1.2, 5, 1.9, 4.2, 3]
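When the custom notion of equality can be expressed as a key function, as rounding can, an iterative version avoids the recursion and runs in linear time. This is a sketch under that assumption; it does not cover arbitrary comparison operators like the cmp_op parameter above, and the name unique_by is made up for the example:

```python
def unique_by(items, key):
    """Keep the first item for each distinct key(item) value."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:   # first item that maps to this key
            seen.add(k)
            result.append(item)
    return result

print(unique_by([1.2, 5, 1.9, 1.1, 4.2, 3, 4.8], key=round))
# [1.2, 5, 1.9, 4.2, 3]
```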

Answer 12

A 5x faster reduce variant, though more sophisticated:

>>> l = [5, 6, 6, 1, 1, 2, 2, 3, 4]
>>> reduce(lambda r, v: v in r[1] and r or (r[0].append(v) or r[1].add(v)) or r, l, ([], set()))[0]
[5, 6, 1, 2, 3, 4]

Explanation:

default = (list(), set())
# use list to keep order
# use set to make lookup faster

def reducer(result, item):
    if item not in result[1]:
        result[0].append(item)
        result[1].add(item)
    return result

>>> reduce(reducer, l, default)[0]
[5, 6, 1, 2, 3, 4]
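
On Python 3, `reduce` is no longer a builtin and has to be imported from `functools`. A minimal sketch of the same reducer under that assumption (the wrapper name `unique_reduce` is only illustrative):

```python
from functools import reduce


def unique_reduce(items):
    """Order-preserving de-duplication with reduce; items must be hashable."""
    def reducer(result, item):
        ordered, seen = result
        if item not in seen:
            ordered.append(item)   # the list keeps the order
            seen.add(item)         # the set makes the membership test fast
        return result

    # A fresh ([], set()) pair is created on every call, so calls do not
    # share state.
    return reduce(reducer, items, ([], set()))[0]


print(unique_reduce([5, 6, 6, 1, 1, 2, 2, 3, 4]))
```

Building the initial value inside the function avoids the pitfall of reusing one module-level `default` tuple, which would keep accumulating items across calls.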

Answer 13

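You can reference a list comprehension as it is being built via the symbol ‘_[1]’. This relies on an undocumented CPython implementation detail of old Python 2 releases and does not work in Python 3.
For example, the following function uniquifies a list of elements, without changing their order, by referencing its own list comprehension.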

def unique(my_list): 
    return [x for x in my_list if x not in locals()['_[1]']]

Demo:

l1 = [1, 2, 3, 4, 1, 2, 3, 4, 5]
l2 = [x for x in l1 if x not in locals()['_[1]']]
print l2

Output:

[1, 2, 3, 4, 5]

Answer 14

MizardX’s answer gives a good collection of multiple approaches.

This is what I came up with while thinking aloud (note that it keeps the last occurrence of each duplicated item rather than the first):

mylist = [x for i,x in enumerate(mylist) if x not in mylist[i+1:]]
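
To make the behaviour of this comprehension concrete, here is a small sketch comparing it with a variant that looks backwards instead (`keep_last` and `keep_first` are names invented for this example). Both are quadratic, since each element triggers a linear scan of a slice.

```python
def keep_last(seq):
    # An element survives only if it does not appear again later.
    return [x for i, x in enumerate(seq) if x not in seq[i + 1:]]


def keep_first(seq):
    # An element survives only if it has not appeared earlier.
    return [x for i, x in enumerate(seq) if x not in seq[:i]]


data = [1, 2, 1, 3, 2]
print(keep_last(data))   # [1, 3, 2]
print(keep_first(data))  # [1, 2, 3]
```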

Answer 15

Here is a simple way to do it:

list1 = ["hello", " ", "w", "o", "r", "l", "d"]
sorted(set(list1 ), key=lambda x:list1.index(x))

That gives the output:

["hello", " ", "w", "o", "r", "l", "d"]

Answer 16

You could do a sort of ugly list comprehension hack.

[l[i] for i in range(len(l)) if l.index(l[i]) == i]

Answer 17

A relatively efficient approach for _sorted_ NumPy arrays:

import numpy as np

b = np.array([1,3,3, 8, 12, 12,12])
np.hstack([b[0], [x[0] for x in zip(b[1:], b[:-1]) if x[0]!=x[1]]])

Outputs:

array([ 1,  3,  8, 12])

Answer 18

l = [1,2,2,3,3,...]
n = []
n.extend(ele for ele in l if ele not in set(n))

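A generator expression that checks each element against a set built from the new list to decide whether or not to include it. Note that set(n) is rebuilt for every element, so each check costs time proportional to len(n) rather than O(1); keeping a single set of seen items alongside the list avoids that.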


Answer 19

A simple recursive solution:

def uniquefy_list(a):
    return uniquefy_list(a[1:]) if a[0] in a[1:] else [a[0]]+uniquefy_list(a[1:]) if len(a)>1 else [a[0]]

Answer 20

Eliminate the duplicate values in a sequence while preserving the order of the remaining items, using a general-purpose generator function.

# for hashable sequence
def remove_duplicates(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

a = [1, 5, 2, 1, 9, 1, 5, 10]
list(remove_duplicates(a))
# [1, 5, 2, 9, 10]



# for unhashable sequence
def remove_duplicates(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

a = [ {'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 2}, {'x': 2, 'y': 4}]
list(remove_duplicates(a, key=lambda d: (d['x'],d['y'])))
# [{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

Answer 21

If you need a one-liner, then maybe this would help:

reduce(lambda x, y: x + y if y[0] not in x else x, map(lambda x: [x],lst))

… should work, but correct me if I’m wrong


Answer 22

If you routinely use pandas, and aesthetics is preferred over performance, then consider the built-in function pandas.Series.drop_duplicates:

    import pandas as pd
    import numpy as np

    uniquifier = lambda alist: pd.Series(alist).drop_duplicates().tolist()

    # from the chosen answer 
    def f7(seq):
        seen = set()
        seen_add = seen.add
        return [ x for x in seq if not (x in seen or seen_add(x))]

    alist = np.random.randint(low=0, high=1000, size=10000).tolist()

    print uniquifier(alist) == f7(alist)  # True

Timing:

    In [104]: %timeit f7(alist)
    1000 loops, best of 3: 1.3 ms per loop
    In [110]: %timeit uniquifier(alist)
    100 loops, best of 3: 4.39 ms per loop

Answer 23

This will preserve order and run in O(n) time. Basically, the idea is to create a hole wherever a duplicate is found and sink it down to the bottom. It makes use of a read pointer and a write pointer: whenever a duplicate is found, only the read pointer advances, and the write pointer stays on the duplicate entry so that it gets overwritten.

def deduplicate(l):
    count = {}
    (read,write) = (0,0)
    while read < len(l):
        if l[read] in count:
            read += 1
            continue
        count[l[read]] = True
        l[write] = l[read]
        read += 1
        write += 1
    return l[0:write]
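
Note that the function above overwrites the front of the input list and then returns a slice, which is a new list. If the goal is to de-duplicate the list object itself, one possible variation (a sketch, with the hypothetical name `deduplicate_in_place` and a set in place of the dict) truncates the tail instead of slicing:

```python
def deduplicate_in_place(items):
    """Remove duplicates from `items` itself, keeping first occurrences."""
    seen = set()
    write = 0
    for value in items:        # the loop position plays the role of `read`
        if value in seen:
            continue
        seen.add(value)
        items[write] = value   # write never runs ahead of the read position
        write += 1
    del items[write:]          # drop the leftover tail


data = [3, 1, 3, 2, 1]
deduplicate_in_place(data)
print(data)  # [3, 1, 2]
```

Like the original, this assumes the elements are hashable.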

Answer 24

A solution without using imported modules or sets:

text = "ask not what your country can do for you ask what you can do for your country"
sentence = text.split(" ")
noduplicates = [(sentence[i]) for i in range (0,len(sentence)) if sentence[i] not in sentence[:i]]
print(noduplicates)

Gives output:

['ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you']

Answer 25

An in-place method

This method is quadratic, because we have a linear lookup into the list for every element of the list (to which we have to add the cost of rearranging the list after each del).

That said, it is possible to operate in place if we start from the end of the list and proceed toward the origin, removing each term that is also present in the sub-list to its left.

This idea in code is simply

for i in range(len(l)-1,0,-1): 
    if l[i] in l[:i]: del l[i] 

A simple test of the implementation

In [91]: from random import randint, seed                                                                                            
In [92]: seed('20080808') ; l = [randint(1,6) for _ in range(12)] # Beijing Olympics                                                                 
In [93]: for i in range(len(l)-1,0,-1): 
    ...:     print(l) 
    ...:     print(i, l[i], l[:i], end='') 
    ...:     if l[i] in l[:i]: 
    ...:          print( ': remove', l[i]) 
    ...:          del l[i] 
    ...:     else: 
    ...:          print() 
    ...: print(l)
[6, 5, 1, 4, 6, 1, 6, 2, 2, 4, 5, 2]
11 2 [6, 5, 1, 4, 6, 1, 6, 2, 2, 4, 5]: remove 2
[6, 5, 1, 4, 6, 1, 6, 2, 2, 4, 5]
10 5 [6, 5, 1, 4, 6, 1, 6, 2, 2, 4]: remove 5
[6, 5, 1, 4, 6, 1, 6, 2, 2, 4]
9 4 [6, 5, 1, 4, 6, 1, 6, 2, 2]: remove 4
[6, 5, 1, 4, 6, 1, 6, 2, 2]
8 2 [6, 5, 1, 4, 6, 1, 6, 2]: remove 2
[6, 5, 1, 4, 6, 1, 6, 2]
7 2 [6, 5, 1, 4, 6, 1, 6]
[6, 5, 1, 4, 6, 1, 6, 2]
6 6 [6, 5, 1, 4, 6, 1]: remove 6
[6, 5, 1, 4, 6, 1, 2]
5 1 [6, 5, 1, 4, 6]: remove 1
[6, 5, 1, 4, 6, 2]
4 6 [6, 5, 1, 4]: remove 6
[6, 5, 1, 4, 2]
3 4 [6, 5, 1]
[6, 5, 1, 4, 2]
2 1 [6, 5]
[6, 5, 1, 4, 2]
1 5 [6]
[6, 5, 1, 4, 2]


Answer 26

zmk’s approach uses a list comprehension, which is very fast yet keeps the order naturally. It can easily be modified to compare strings case-insensitively, as below. This also preserves the original case of the first occurrence.

def DelDupes(aseq) :
    seen = set()
    return [x for x in aseq if (x.lower() not in seen) and (not seen.add(x.lower()))]

Closely associated functions are:

def HasDupes(aseq) :
    s = set()
    return any(((x.lower() in s) or s.add(x.lower())) for x in aseq)

def GetDupes(aseq) :
    s = set()
    return set(x for x in aseq if ((x.lower() in s) or s.add(x.lower())))

Answer 27

pandas users should check out pandas.unique.

>>> import pandas as pd
>>> lst = [1, 2, 1, 3, 3, 2, 4]
>>> pd.unique(lst)
array([1, 2, 3, 4])

The function returns a NumPy array. If needed, you can convert it to a list with the tolist method.


Answer 28

One-liner list comprehension:

values_non_duplicated = [value for index, value in enumerate(values) if value not in values[ : index]]

Simply add a condition to check that the value does not appear at an earlier position.


Convert a string representation of a dictionary to a dictionary?

Question: Convert a string representation of a dictionary to a dictionary?

How can I convert the str representation of a dict, such as the following string, into a dict?

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

I prefer not to use eval. What else can I use?

The main reason for this is that one of the classes my coworker wrote converts all input into strings. I’m not in the mood to go and modify his classes to deal with this issue.


Answer 0

Starting in Python 2.6 you can use the built-in ast.literal_eval:

>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}

This is safer than using eval. As its own docs say:

>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.

For example:

>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
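
In practice the call is often wrapped so that bad input produces a clear error. A hedged sketch for Python 3 follows; the helper name `parse_dict` is an assumption made for this illustration, and it relies on `ast.literal_eval` typically raising `ValueError` or `SyntaxError` on input it cannot handle.

```python
import ast


def parse_dict(text):
    """Parse a dict literal from a string without executing any code."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("not a valid Python literal: %r" % (text,)) from exc
    # literal_eval also accepts lists, numbers, strings and so on,
    # so check that a dict actually came back.
    if not isinstance(value, dict):
        raise ValueError("expected a dict, got %s" % type(value).__name__)
    return value


print(parse_dict("{'muffin' : 'lolz', 'foo' : 'kitty'}"))
```

The extra `isinstance` check is a design choice for callers that only ever expect a dictionary.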

Answer 1

https://docs.python.org/3.8/library/json.html

JSON can solve this problem though its decoder wants double quotes around keys and values. If you don’t mind a replace hack…

import json
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
json_acceptable_string = s.replace("'", "\"")
d = json.loads(json_acceptable_string)
# d = {u'muffin': u'lolz', u'foo': u'kitty'}

NOTE that if you have single quotes as a part of your keys or values this will fail due to improper character replacement. This solution is only recommended if you have a strong aversion to the eval solution.

More about json single quote: jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON
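
To see the caveat about embedded single quotes in action, the following sketch feeds both approaches a value that contains an apostrophe (the sample string is made up for this illustration):

```python
import ast
import json

s = "{'muffin': \"it's lolz\", 'foo': 'kitty'}"

# The replace hack also turns the apostrophe into a double quote,
# which ends the JSON string early and makes the document invalid.
try:
    json.loads(s.replace("'", '"'))
    hack_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    hack_ok = False

print(hack_ok)
print(ast.literal_eval(s))
```

`ast.literal_eval` parses the string as a Python literal, so the apostrophe inside the double-quoted value is not a problem for it.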


Answer 2

Using json.loads:

>>> import json
>>> h = '{"foo":"bar", "foo2":"bar2"}'
>>> d = json.loads(h)
>>> d
{u'foo': u'bar', u'foo2': u'bar2'}
>>> type(d)
<type 'dict'>

Answer 3

To OP’s example:

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

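We can use YAML to deal with this kind of non-standard JSON in a string (safe_load is used here because recent PyYAML versions require an explicit Loader for yaml.load):

>>> import yaml
>>> s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> s
"{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> yaml.safe_load(s)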
{'muffin': 'lolz', 'foo': 'kitty'}

Answer 4

If the string can always be trusted, you could use eval (or use literal_eval as suggested; it’s safe no matter what the string is.) Otherwise you need a parser. A JSON parser (such as simplejson) would work if he only ever stores content that fits with the JSON scheme.


Answer 5

Use json. The ast library consumes a lot of memory and is slower. I have a process that needs to read a 156 MB text file: with ast the conversion to a dictionary took 5 minutes, while json took 1 minute and used 60% less memory!


Answer 6

To summarize:

import ast, yaml, json, timeit

descs=['short string','long string']
strings=['{"809001":2,"848545":2,"565828":1}','{"2979":1,"30581":1,"7296":1,"127256":1,"18803":2,"41619":1,"41312":1,"16837":1,"7253":1,"70075":1,"3453":1,"4126":1,"23599":1,"11465":3,"19172":1,"4019":1,"4775":1,"64225":1,"3235":2,"15593":1,"7528":1,"176840":1,"40022":1,"152854":1,"9878":1,"16156":1,"6512":1,"4138":1,"11090":1,"12259":1,"4934":1,"65581":1,"9747":2,"18290":1,"107981":1,"459762":1,"23177":1,"23246":1,"3591":1,"3671":1,"5767":1,"3930":1,"89507":2,"19293":1,"92797":1,"32444":2,"70089":1,"46549":1,"30988":1,"4613":1,"14042":1,"26298":1,"222972":1,"2982":1,"3932":1,"11134":1,"3084":1,"6516":1,"486617":1,"14475":2,"2127":1,"51359":1,"2662":1,"4121":1,"53848":2,"552967":1,"204081":1,"5675":2,"32433":1,"92448":1}']
funcs=[json.loads,eval,ast.literal_eval,yaml.load]

for  desc,string in zip(descs,strings):
    print('***',desc,'***')
    print('')
    for  func in funcs:
        print(func.__module__+' '+func.__name__+':')
        %timeit func(string)        
    print('')

Results:

*** short string ***

json loads:
4.47 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
builtins eval:
24.1 µs ± 163 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ast literal_eval:
30.4 µs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
yaml load:
504 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

*** long string ***

json loads:
29.6 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
builtins eval:
219 µs ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
ast literal_eval:
331 µs ± 1.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
yaml load:
9.02 ms ± 92.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Conclusion: prefer json.loads
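
The `%timeit` magic only works inside IPython. A rough equivalent with the standard `timeit` module, restricted to the two parsers that ship with Python, could look like the sketch below. Absolute numbers depend on the machine and the Python version, so no results are shown.

```python
import ast
import json
import timeit

sample = '{"809001":2,"848545":2,"565828":1}'
parsers = {"json.loads": json.loads, "ast.literal_eval": ast.literal_eval}
runs = 10000

for name, parse in parsers.items():
    # Time `runs` calls of each parser on the same input string.
    seconds = timeit.timeit(lambda: parse(sample), number=runs)
    print("%-18s %.2f us per call" % (name, seconds / runs * 1e6))
```

Keep in mind that json.loads only accepts strict JSON (double-quoted keys and strings), so the comparison applies only when the input already has that form.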


Answer 7

string = "{'server1':'value','server2':'value'}"

# Remove the surrounding { and }
s = string.replace("{", "")
finalstring = s.replace("}", "")

# Split on "," to get the key-value pairs
# (named `pairs` to avoid shadowing the built-in `list`)
pairs = finalstring.split(",")

dictionary = {}
for pair in pairs:
    # Separate the key from the value
    keyvalue = pair.split(":")

    # Strip surrounding whitespace and quotes from the key and the value
    key = keyvalue[0].strip().strip('"\'')
    dictionary[key] = keyvalue[1].strip().strip('"\'')

print(dictionary)

Answer 8

Without using any libraries:

dict_format_string = "{'1':'one', '2' : 'two'}"
d = {}
# list(...) is needed on Python 3, where filter returns an iterator
elems  = list(filter(str.isalnum,dict_format_string.split("'")))
values = elems[1::2]
keys   = elems[0::2]
d.update(zip(keys,values))

NOTE: Because split("'") is hardcoded, this works only for strings whose data is single-quoted.


pg_config executable not found

Question: pg_config executable not found

I am having trouble installing psycopg2. I get the following error when I try to pip install psycopg2:

Error: pg_config executable not found.

Please add the directory containing pg_config to the PATH

or specify the full executable path with the option:



    python setup.py build_ext --pg-config /path/to/pg_config build ...



or with the pg_config option in 'setup.cfg'.

----------------------------------------
Command python setup.py egg_info failed with error code 1 in /tmp/pip-build/psycopg2

But the problem is pg_config is actually in my PATH; it runs without any problem:

$ which pg_config
/usr/pgsql-9.1/bin/pg_config

I tried adding the pg_config path to the setup.cfg file and building it using the source files I downloaded from their website (http://initd.org/psycopg/) and I get the following error message!

Error: Unable to find 'pg_config' file in '/usr/pgsql-9.1/bin/'

But it is actually THERE!!!

I am baffled by these errors. Can anyone help please?

By the way, I sudo all the commands. Also I am on RHEL 5.5.


Answer 0

pg_config is in postgresql-devel (libpq-dev in Debian/Ubuntu, libpq-devel on CentOS/Cygwin/Babun).


Answer 1

On Mac OS X, I solved it using the Homebrew package manager:

brew install postgresql

Answer 2

Have you installed python-dev? If you already have, try also installing libpq-dev

sudo apt-get install libpq-dev python-dev

From the article: How to install psycopg2 under virtualenv


Answer 3

Also on OS X. I installed Postgres.app from http://postgresapp.com/ but had the same issue.

I found pg_config in that app’s contents and added the dir to $PATH.

It was at /Applications/Postgres.app/Contents/Versions/latest/bin. So this worked: export PATH="/Applications/Postgres.app/Contents/Versions/latest/bin:$PATH".


Answer 4

On Alpine, the package containing pg_config is postgresql-dev. To install, run:

apk add postgresql-dev

Answer 5

This is what worked for me on CentOS. First install:

sudo yum install postgresql postgresql-devel python-devel

On Ubuntu just use the equivalent apt-get packages.

sudo apt-get install postgresql postgresql-dev python-dev

And now include the path to your PostgreSQL binary dir with your pip install; this should work for either Debian- or RHEL-based Linux:

sudo PATH=$PATH:/usr/pgsql-9.3/bin/ pip install psycopg2

Make sure to include the correct path. That's all :)


Answer 6

apt-get build-dep python-psycopg2

Answer 7

You should install the Python and PostgreSQL development packages on Ubuntu. Run:

sudo apt-get install libpq-dev python-dev

Answer 8

Just to sum up, I also faced exactly the same problem. After reading a lot of Stack Overflow posts and online blogs, the final solution that worked for me is this:

1) PostgreSQL (development or any stable version) should be installed before installing psycopg2.

2) The directory containing the pg_config file (normally the bin folder of the PostgreSQL installation folder) had to be explicitly added to PATH before installing psycopg2. In my case, the installation path for PostgreSQL is:

/opt/local/lib/postgresql91/

so in order to explicitly add the location of the pg_config file to PATH, I entered the following command in my terminal:

PATH=$PATH:/opt/local/lib/postgresql91/bin/

This command ensures that when you try to pip install psycopg2, it finds pg_config automatically this time.

I have also posted the full error with trace and its solution on my blog, which you may want to refer to. It's for Mac OS X, but the pg_config PATH problem is generic and applies to Linux as well.


Answer 9

sudo apt-get install libpq-dev works for me on Ubuntu 15.4


Answer 10

On Linux Mint sudo apt-get install libpq-dev worked for me.


Answer 11

You can install pre-compiled binaries on any platform with pip or conda:

python -m pip install psycopg2-binary

or

conda install psycopg2

Please be advised that the psycopg2-binary pypi page recommends building from source in production:

The binary package is a practical choice for development and testing but in production it is advised to use the package built from sources

To use the package built from sources, use python -m pip install psycopg2. That process will require several dependencies (documentation) (emphasis mine):

  • A C compiler.
  • The Python header files. They are usually installed in a package such as python-dev. A message such as error: Python.h: No such file or directory is an indication that the Python headers are missing.
  • The libpq header files. They are usually installed in a package such as libpq-dev. If you get an error: libpq-fe.h: No such file or directory you are missing them.
  • The pg_config program: it is usually installed by the libpq-dev package but sometimes it is not in a PATH directory. Having it in the PATH greatly streamlines the installation, so try running pg_config --version: if it returns an error or an unexpected version number then locate the directory containing the pg_config shipped with the right libpq version (usually /usr/lib/postgresql/X.Y/bin/) and add it to the PATH:
    $ export PATH=/usr/lib/postgresql/X.Y/bin/:$PATH
    You only need pg_config to compile psycopg2, not for its regular usage.
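
A quick way to check the pg_config requirement from Python before running pip is sketched below. It uses only the standard library; the helper name `find_pg_config` is made up for this example.

```python
import shutil
import subprocess


def find_pg_config():
    """Return (path, version) for pg_config on PATH, or None if missing."""
    path = shutil.which("pg_config")
    if path is None:
        return None
    # `pg_config --version` prints the PostgreSQL version it belongs to.
    version = subprocess.check_output([path, "--version"]).decode().strip()
    return path, version


result = find_pg_config()
if result is None:
    print("pg_config not found on PATH; install the libpq development "
          "package or add its directory to PATH")
else:
    print("found %s (%s)" % result)
```

If the reported version is not the one you expect, a different pg_config earlier on PATH is probably shadowing the right one.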

回答 12

UPDATE /etc/yum.repos.d/CentOS-Base.repo、[base]和[updates]部分
ADD exclude = postgresql *

curl -O http://yum.postgresql.org/9.1/redhat/rhel-6-i386/pgdg-centos91-9.1-4.noarch.rpm
rpm -ivh pgdg-centos91-9.1-4.noarch.rpm

yum install postgresql  
yum install postgresql-devel

PATH=$PATH:/usr/pgsql-9.1/bin/

pip install psycopg2

UPDATE /etc/yum.repos.d/CentOS-Base.repo, [base] and [updates] sections
ADD exclude=postgresql*

curl -O http://yum.postgresql.org/9.1/redhat/rhel-6-i386/pgdg-centos91-9.1-4.noarch.rpm
rpm -ivh pgdg-centos91-9.1-4.noarch.rpm

yum install postgresql  
yum install postgresql-devel

PATH=$PATH:/usr/pgsql-9.1/bin/

pip install psycopg2

回答 13

对于运行OS X的用户,此解决方案对我有用:

1)安装Postgres.app:

http://www.postgresql.org/download/macosx/

2)然后打开终端并运行以下命令,将其显示为{{version}}的位置替换为Postgres版本号:

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/{{version}}/bin

例如

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.4/bin

For those running OS X, this solution worked for me:

1) Install Postgres.app:

http://www.postgresql.org/download/macosx/

2) Then open the Terminal and run this command, replacing where it says {{version}} with the Postgres version number:

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/{{version}}/bin

e.g.

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.4/bin


回答 14

尝试将其添加到PATH:

PATH=$PATH:/usr/pgsql-9.1/bin/ ./pip install psycopg2

Try to add it to PATH:

PATH=$PATH:/usr/pgsql-9.1/bin/ ./pip install psycopg2

回答 15

Ali的解决方案对我有用,但是我在查找bin文件夹位置时遇到了麻烦。在Mac OS X上查找路径的快速方法是打开psql(顶部菜单栏中有一个快速链接)。这将打开一个单独的终端窗口,在第二行,您的Postgres安装路径将如下所示:

My-MacBook-Pro:~ Me$ /Applications/Postgres93.app/Contents/MacOS/bin/psql ; exit;

您的pg_config文件在该bin文件夹中。因此,在安装psycopg2之前,请设置pg_config文件的路径:

PATH=$PATH:/Applications/Postgres93.app/Contents/MacOS/bin/

或较新版本:

PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.3/bin

然后安装psycopg2。

Ali’s solution worked for me but I was having trouble finding the bin folder location. A quick way to find the path on Mac OS X is to open psql (there’s a quick link in the top menu bar). This will open a separate terminal window and on the second line the path of your Postgres installation will appear like so:

My-MacBook-Pro:~ Me$ /Applications/Postgres93.app/Contents/MacOS/bin/psql ; exit;

Your pg_config file is in that bin folder. Therefore, before installing psycopg2 set the path of the pg_config file:

PATH=$PATH:/Applications/Postgres93.app/Contents/MacOS/bin/

or for newer version:

PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.3/bin

Then install psycopg2.


回答 16

在安装psycopg2之前,您需要升级您的pip。使用此命令

pip install --upgrade pip

You need to upgrade your pip before installing psycopg2. Use this command

pip install --upgrade pip

回答 17

我将把这个留给下一个不幸的灵魂,尽管所有提供的解决方案都无法解决这个问题。只需使用sudo pip3 install psycopg2-binary

I’m going to leave this here for the next unfortunate soul who can’t get around this problem despite all the provided solutions. Simply use sudo pip3 install psycopg2-binary


回答 18

刚刚通过以下方法解决了Cent OS 7中的问题:

export PATH=$PATH:/usr/pgsql-9.5/bin

确保您的PostgreSql版本与上面的正确版本匹配。

Just solved the problem in Cent OS 7 by:

export PATH=$PATH:/usr/pgsql-9.5/bin

Make sure the version in the path above matches your installed PostgreSQL version.


回答 19

在Mac OS X上,如果您使用的是Postgres App(http://postgresapp.com/):

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/latest/bin

无需在此命令中指定Postgres的版本。它将始终指向最新。

并做

pip install psycopg2

PS:如果更改未反映出您可能需要重新启动终端/命令提示符

资源

On Mac OS X, if you are using the Postgres app (http://postgresapp.com/):

export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/latest/bin

There is no need to specify the Postgres version in this command; the path always points to the latest version.

and do

pip install psycopg2

P.S.: If the change doesn’t take effect, you may need to restart the Terminal/command prompt.

Source


回答 20

安装python-psycopg2在Arch Linux上为我解决了这个问题:

pacman -S python-psycopg2

Installing python-psycopg2 solved it for me on Arch Linux:

pacman -S python-psycopg2

回答 21

在Windows上,您可能需要安装PsycopgWindows端口,该端口psycopg的文档中建议使用。

On Windows, you may want to install the Windows port of Psycopg, which is recommended in psycopg’s documentation.


回答 22

在MacOS上,最简单的解决方案是将正确的二进制文件符号链接到Postgres软件包下。

sudo ln -s /Applications/Postgres.app/Contents/Versions/latest/bin/pg_config /usr/local/bin/pg_config

这是相当无害的,并且如果需要,所有应用程序都可以在系统范围内使用它。

On MacOS, the simplest solution will be to symlink the correct binary, that is under the Postgres package.

sudo ln -s /Applications/Postgres.app/Contents/Versions/latest/bin/pg_config /usr/local/bin/pg_config

This is fairly harmless, and all the applications will be able to use it system wide, if required.


回答 23

sudo yum install postgresql-devel (centos6X)

pip install psycopg2==2.5.2

sudo yum install postgresql-devel (centos6X)

pip install psycopg2==2.5.2


回答 24

在这里,为了确保OS X的完整性:如果从MacPorts安装PostgreSQL,则pg_config将位于 /opt/local/lib/postgresql94/bin/pg_config

安装MacPorts时,它已经添加了 /opt/local/bin到PATH中。

因此,这将解决问题: $ sudo ln -s /opt/local/lib/postgresql94/bin/pg_config /opt/local/bin/pg_config

现在pip install psycopg2将可以pg_config毫无问题地运行。

Here, for OS X completeness: if you install PostgreSQL from MacPorts, pg_config will be in /opt/local/lib/postgresql94/bin/pg_config.

When you installed MacPorts, it already added /opt/local/bin to your PATH.

So, this will fix the problem: $ sudo ln -s /opt/local/lib/postgresql94/bin/pg_config /opt/local/bin/pg_config

Now pip install psycopg2 will be able to run pg_config without issues.


回答 25

对于在macOS Catalina上使用zsh shell并且安装了postgres应用程序的用户:

打开~/.zshrc文件,并添加以下行:

export PATH="/Applications/Postgres.app/Contents/Versions/latest/bin:$PATH"

然后关闭所有终端,重新打开它们,您将解决问题。

如果您不想关闭终端,只需在要继续使用的终端中输入source ~/.zshrc即可。

To those on macOS Catalina using the zsh shell who have also installed the postgres app:

Open your ~/.zshrc file, and add the following line:

export PATH="/Applications/Postgres.app/Contents/Versions/latest/bin:$PATH"

Then close all your terminals, reopen them, and you’ll have resolved your problem.

If you don’t want to close your terminals, simply enter source ~/.zshrc in whatever terminal you’d like to keep working on.


回答 26

对于mac用户,扩展您的path变量以包括PostgreSQL这样的export PATH=$PATH:/Library/PostgreSQL/12/bin

For mac users, extend your path variable to include PostgreSQL like this export PATH=$PATH:/Library/PostgreSQL/12/bin.


回答 27

这是我设法安装psycopg2的方法

$ wget http://initd.org/psycopg/tarballs/PSYCOPG-2-5/psycopg2-2.5.3.tar.gz
$ tar -xzf psycopg2-2.5.3.tar.gz
$ cd psycopg2-2.5.3
$ pip install .

This is how I managed to install psycopg2

$ wget http://initd.org/psycopg/tarballs/PSYCOPG-2-5/psycopg2-2.5.3.tar.gz
$ tar -xzf psycopg2-2.5.3.tar.gz
$ cd psycopg2-2.5.3
$ pip install .

回答 28

我敢肯定,您会遇到与我相同的“问题”,因此,我将为您提供极为简单的解决方案…

在您的情况下,您需要添加到$ PATH(或作为命令参数)的实际路径是:

/usr/pgsql-9.1/bin/pg_config

而不是

/usr/pgsql-9.1/bin

例如,如果您随后运行python setup.py脚本,则可以这样运行它:

python setup.py build_ext --pg-config /usr/pgsql-9.1/bin/pg_config build

可能为时已晚,但仍然是最简单的解决方案。

之后编辑:

在进一步测试下,我发现如果您最初以以下形式将路径添加到pg_config

/usr/pgsql-9.1/bin

(在../bin之后没有/ pg_config)并运行pip install命令,它将起作用。

但是,如果您随后决定按照说明运行python setup.py,则必须在….. / bin之后使用/ pg_config指定路径,即

python setup.py build_ext --pg-config /usr/pgsql-9.1/bin/pg_config build

I am pretty sure you’ve experienced the same “problem” I did, so I’ll offer you the extremely easy solution…

In your case, the actual path that you need to add to $PATH (or as a command param) is:

/usr/pgsql-9.1/bin/pg_config

not

/usr/pgsql-9.1/bin

E.g. if you run the python setup.py script afterwards, you would run it like this:

python setup.py build_ext --pg-config /usr/pgsql-9.1/bin/pg_config build

Probably too late, but still the easiest solution.

LATER EDIT:

Under further test I found out that if you initially add the path to pg_config in the form

/usr/pgsql-9.1/bin

(without /pg_config after …../bin) and run the pip install command it will work.

However, if you then decide to follow the indication to run python setup.py, you will have to specify the path with /pg_config after …../bin, i.e.

python setup.py build_ext --pg-config /usr/pgsql-9.1/bin/pg_config build

回答 29

对于CentOS/RedHat,请确保/etc/alternatives/pgsql-pg_config是一个未损坏的符号链接

for CentOS/RedHat make sure that /etc/alternatives/pgsql-pg_config is a non-broken symlink


如何检查变量的类型是否为字符串?

问题:如何检查变量的类型是否为字符串?

有没有一种方法可以检查python中变量的类型是否为string,例如:

isinstance(x,int);

对于整数值?

Is there a way to check if the type of a variable in python is a string, like:

isinstance(x,int);

for integer values?


回答 0

在Python 2.x中,您可以

isinstance(s, basestring)

basestring是str和unicode的抽象超类。它可用于测试对象是否是str或unicode的实例。


在Python 3.x中,正确的测试是

isinstance(s, str)

在Python 3中,bytes类不被视为字符串类型。

In Python 2.x, you would do

isinstance(s, basestring)

basestring is the abstract superclass of str and unicode. It can be used to test whether an object is an instance of str or unicode.


In Python 3.x, the correct test is

isinstance(s, str)

The bytes class isn’t considered a string type in Python 3.
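
To make the bytes point concrete, here is a small sketch for Python 3 only. The helper name is_text is hypothetical and just wraps the isinstance test from above.

```python
def is_text(value):
    """Return True only for str instances (Python 3 semantics)."""
    return isinstance(value, str)


data = b"caf\xc3\xa9"        # raw UTF-8 bytes, not text
text = data.decode("utf-8")  # decoding produces a str

print(is_text(data))                # False: bytes is not a string type
print(is_text(text))                # True
print(is_text(bytearray(b"abc")))   # False
```

If a function should accept both, decode the bytes at the boundary rather than widening the check.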


回答 1

我知道这是一个古老的话题,但是作为第一个显示在google上的话题,鉴于我没有找到满意的答案,因此我将其留在此处以供将来参考:

six是一个Python 2和3兼容性库,它已经解决了这个问题。然后,您可以执行以下操作:

import six

if isinstance(value, six.string_types):
    pass # It's a string !!

检查代码,您会发现:

import sys

PY3 = sys.version_info[0] == 3

if PY3:
    string_types = str,
else:
    string_types = basestring,

I know this is an old topic, but being the first one shown on google and given that I don’t find any of the answers satisfactory, I’ll leave this here for future reference:

six is a Python 2 and 3 compatibility library which already covers this issue. You can then do something like this:

import six

if isinstance(value, six.string_types):
    pass # It's a string !!

Inspecting the code, this is what you find:

import sys

PY3 = sys.version_info[0] == 3

if PY3:
    string_types = str,
else:
    string_types = basestring,

回答 2

在Python 3.x或Python 2.7.6中

if type(x) == str:

In Python 3.x or Python 2.7.6

if type(x) == str:

回答 3

你可以做:

var = 1
if type(var) == int:
   print('your variable is an integer')

要么:

var2 = 'this is variable #2'
if type(var2) == str:
    print('your variable is a string')
else:
    print('your variable IS NOT a string')

希望这可以帮助!

you can do:

var = 1
if type(var) == int:
   print('your variable is an integer')

or:

var2 = 'this is variable #2'
if type(var2) == str:
    print('your variable is a string')
else:
    print('your variable IS NOT a string')

hope this helps!


回答 4

如果要检查的内容多于整数和字符串,还可以使用types模块。 http://docs.python.org/library/types.html

The types module also exists if you are checking more than ints and strings. http://docs.python.org/library/types.html


回答 5

根据下面更好的答案进行了编辑。往下看大约3个答案,了解basestring的妙处。

旧答案:当心unicode字符串,您可以从多个地方获得unicode字符串,包括Windows中的所有COM调用。

if isinstance(target, str) or isinstance(target, unicode):

Edit based on better answer below. Go down about 3 answers and find out about the coolness of basestring.

Old answer: Watch out for unicode strings, which you can get from several places, including all COM calls in Windows.

if isinstance(target, str) or isinstance(target, unicode):

回答 6

由于basestring未在Python3中定义,因此此小技巧可能有助于使代码兼容:

try: # check whether python knows about 'basestring'
   basestring
except NameError: # no, it doesn't (it's Python3); use 'str' instead
   basestring=str

之后,您可以在Python2和Python3上运行以下测试

isinstance(myvar, basestring)

since basestring isn’t defined in Python3, this little trick might help to make the code compatible:

try: # check whether python knows about 'basestring'
   basestring
except NameError: # no, it doesn't (it's Python3); use 'str' instead
   basestring=str

after that you can run the following test on both Python2 and Python3

isinstance(myvar, basestring)

回答 7

Python 2/3包括unicode

from __future__ import unicode_literals
from builtins import str  #  pip install future
isinstance('asdf', str)   #  True
isinstance(u'asdf', str)  #  True

http://python-future.org/overview.html

Python 2 / 3 including unicode

from __future__ import unicode_literals
from builtins import str  #  pip install future
isinstance('asdf', str)   #  True
isinstance(u'asdf', str)  #  True

http://python-future.org/overview.html


回答 8

我还要注意,如果要检查变量的类型是否为特定类型,可以将变量的类型与已知对象的类型进行比较。

对于字符串,您可以使用此

type(s) == type('')

Note also that if you want to check whether a variable is of a specific type, you can compare its type to the type of a known object.

For strings, you can use this:

type(s) == type('')

回答 9

其他人在这里提供了很多好的建议,但是我看不到一个很好的跨平台摘要。对于任何Python程序来说,以下内容都是不错的选择:

import sys

def isstring(s):
    # if we use Python 3
    if (sys.version_info[0] >= 3):
        return isinstance(s, str)
    # we use Python 2
    return isinstance(s, basestring)

在此函数中,我们使用isinstance(object, classinfo)来查看输入在Python 3中是否为str,或在Python 2中是否为basestring。

Lots of good suggestions provided by others here, but I don’t see a good cross-platform summary. The following should be a good drop in for any Python program:

import sys

def isstring(s):
    # if we use Python 3
    if (sys.version_info[0] >= 3):
        return isinstance(s, str)
    # we use Python 2
    return isinstance(s, basestring)

In this function, we use isinstance(object, classinfo) to see if our input is a str in Python 3 or a basestring in Python 2.


回答 10

不使用basestring的Python 2替代方法:

isinstance(s, (str, unicode))

但由于unicode未定义(在Python 3中),因此在Python 3中仍然无法使用。

Alternative way for Python 2, without using basestring:

isinstance(s, (str, unicode))

But still won’t work in Python 3 since unicode isn’t defined (in Python 3).


回答 11

所以,

您可以使用很多选项来检查变量是否为字符串:

a = "my string"
type(a) == str # first 
a.__class__ == str # second
isinstance(a, str) # third
str(a) == a # fourth
type(a) == type('') # fifth

此顺序是有意的。

So,

You have plenty of options to check whether your variable is string or not:

a = "my string"
type(a) == str # first 
a.__class__ == str # second
isinstance(a, str) # third
str(a) == a # fourth
type(a) == type('') # fifth

The order is intentional.
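
These options are not interchangeable. The sketch below (Python 3) shows where they disagree; the classes Name and AlwaysEqual are hypothetical and exist only for this illustration.

```python
class Name(str):
    """Hypothetical str subclass."""


class AlwaysEqual:
    """Hypothetical non-string object that claims to equal everything."""

    def __eq__(self, other):
        return True


name = Name("ada")
print(type(name) == str)       # False: an exact type match rejects subclasses
print(name.__class__ == str)   # False: same reason
print(isinstance(name, str))   # True: subclasses are accepted

odd = AlwaysEqual()
print(isinstance(odd, str))    # False
print(str(odd) == odd)         # True: the equality test is fooled
```

In this sketch isinstance is the only check that accepts the subclass and also rejects the impostor, which is one reason it is usually preferred.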


回答 12

a = '1000' # also tested for 'abc100', 'a100bc', '100abc'

isinstance(a, str) or isinstance(a, unicode)

返回True

type(a) in [str, unicode]

返回True

a = '1000' # also tested for 'abc100', 'a100bc', '100abc'

isinstance(a, str) or isinstance(a, unicode)

returns True

type(a) in [str, unicode]

returns True


回答 13

这是我对同时支持Python 2和Python 3以及这些要求的回答:

  • 用最少的Py2兼容代码以Py3代码编写。
  • 稍后删除Py2兼容代码而不会受到干扰。即仅旨在删除,不修改Py3代码。
  • 避免使用 six或类似的compat模块,因为它们倾向于隐藏试图实现的目标。
  • 面向未来的潜在Py4。

import sys
PY2 = sys.version_info.major == 2

# Check if string (lenient for byte-strings on Py2):
isinstance('abc', basestring if PY2 else str)

# Check if strictly a string (unicode-string):
isinstance('abc', unicode if PY2 else str)

# Check if either string (unicode-string) or byte-string:
isinstance('abc', basestring if PY2 else (str, bytes))

# Check for byte-string (Py3 and Py2.7):
isinstance('abc', bytes)

Here is my answer to support both Python 2 and Python 3 along with these requirements:

  • Written in Py3 code with minimal Py2 compat code.
  • Remove Py2 compat code later without disruption. I.e. aim for deletion only, no modification to Py3 code.
  • Avoid using six or similar compat module as they tend to hide away what is trying to be achieved.
  • Future-proof for a potential Py4.

import sys
PY2 = sys.version_info.major == 2

# Check if string (lenient for byte-strings on Py2):
isinstance('abc', basestring if PY2 else str)

# Check if strictly a string (unicode-string):
isinstance('abc', unicode if PY2 else str)

# Check if either string (unicode-string) or byte-string:
isinstance('abc', basestring if PY2 else (str, bytes))

# Check for byte-string (Py3 and Py2.7):
isinstance('abc', bytes)

回答 14

如果您不想依赖外部库,那么这对于Python 2.7+和Python 3(http://ideone.com/uB4Kdc)都适用:

# your code goes here
s = ["test"];
#s = "test";
isString = False;

if(isinstance(s, str)):
    isString = True;
try:
    if(isinstance(s, basestring)):
        isString = True;
except NameError:
    pass;

if(isString):
    print("String");
else:
    print("Not String");

If you do not want to depend on external libs, this works both for Python 2.7+ and Python 3 (http://ideone.com/uB4Kdc):

# your code goes here
s = ["test"];
#s = "test";
isString = False;

if(isinstance(s, str)):
    isString = True;
try:
    if(isinstance(s, basestring)):
        isString = True;
except NameError:
    pass;

if(isString):
    print("String");
else:
    print("Not String");

回答 15

您可以简单地使用isinstance函数来确保输入数据的类型为str或unicode。以下示例将帮助您轻松理解。

>>> isinstance('my string', str)
True
>>> isinstance(12, str)
False
>>> isinstance('my string', unicode)
False
>>> isinstance(u'my string',  unicode)
True

You can simply use the isinstance function to make sure that the input data is of type str or unicode (Python 2). The examples below should make this clear.

>>> isinstance('my string', str)
True
>>> isinstance(12, str)
False
>>> isinstance('my string', unicode)
False
>>> isinstance(u'my string',  unicode)
True

回答 16

s = '123'
issubclass(s.__class__, str)
s = '123'
issubclass(s.__class__, str)

回答 17

这是我的方法:

if type(x) == type(str()):

This is how I do it:

if type(x) == type(str()):

回答 18

我见过:

hasattr(s, 'endswith') 

I’ve seen:

hasattr(s, 'endswith') 

回答 19

>>> thing = 'foo'
>>> type(thing).__name__ == 'str' or type(thing).__name__ == 'unicode'
True
>>> thing = 'foo'
>>> type(thing).__name__ == 'str' or type(thing).__name__ == 'unicode'
True

有什么办法可以杀死线程吗?

问题:有什么办法可以杀死线程吗?

是否可以在不设置/检查任何标志/信号灯/等的情况下终止正在运行的线程?

Is it possible to terminate a running thread without setting/checking any flags/semaphores/etc.?


回答 0

用Python和任何语言突然终止线程通常是一个错误的模式。考虑以下情况:

  • 线程持有必须正确关闭的关键资源
  • 该线程创建了其他几个必须同时终止的线程。

如果您负担得起的话(如果您要管理自己的线程),处理此问题的一种好方法是有一个exit_request标志,每个线程定期检查一次,以查看是否该退出。

例如:

import threading

class StoppableThread(threading.Thread):
    """Thread class with a stop() method. The thread itself has to check
    regularly for the stopped() condition."""

    def __init__(self,  *args, **kwargs):
        super(StoppableThread, self).__init__(*args, **kwargs)
        self._stop_event = threading.Event()

    def stop(self):
        self._stop_event.set()

    def stopped(self):
        return self._stop_event.is_set()

在此代码中,您应该stop()在希望退出线程时调用该线程,然后使用来等待线程正确退出join()。线程应定期检查停止标志。

但是,在某些情况下,您确实需要杀死线程。一个示例是当您包装一个忙于长时间调用的外部库并且想要中断它时。

以下代码允许(有一些限制)在Python线程中引发Exception:

import ctypes
import inspect
import threading

def _async_raise(tid, exctype):
    '''Raises an exception in the threads with id tid'''
    if not inspect.isclass(exctype):
        raise TypeError("Only types can be raised (not instances)")
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid),
                                                     ctypes.py_object(exctype))
    if res == 0:
        raise ValueError("invalid thread id")
    elif res != 1:
        # "if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

class ThreadWithExc(threading.Thread):
    '''A thread class that supports raising exception in the thread from
       another thread.
    '''
    def _get_my_tid(self):
        """determines this (self's) thread id

        CAREFUL : this function is executed in the context of the caller
        thread, to get the identity of the thread represented by this
        instance.
        """
        if not self.is_alive():
            raise threading.ThreadError("the thread is not active")

        # do we have it cached?
        if hasattr(self, "_thread_id"):
            return self._thread_id

        # no, look for it in the _active dict
        for tid, tobj in threading._active.items():
            if tobj is self:
                self._thread_id = tid
                return tid

        # TODO: in python 2.6, there's a simpler way to do : self.ident

        raise AssertionError("could not determine the thread's id")

    def raiseExc(self, exctype):
        """Raises the given exception type in the context of this thread.

        If the thread is busy in a system call (time.sleep(),
        socket.accept(), ...), the exception is simply ignored.

        If you are sure that your exception should terminate the thread,
        one way to ensure that it works is:

            t = ThreadWithExc( ... )
            ...
            t.raiseExc( SomeException )
            while t.is_alive():
                time.sleep( 0.1 )
                t.raiseExc( SomeException )

        If the exception is to be caught by the thread, you need a way to
        check that your thread has caught it.

        CAREFUL : this function is executed in the context of the
        caller thread, to raise an exception in the context of the
        thread represented by this instance.
        """
        _async_raise( self._get_my_tid(), exctype )

(基于Tomer Filiba的Killable Threads。有关return值的引用PyThreadState_SetAsyncExc似乎来自旧版本的Python。)

如文档中所述,这不是灵丹妙药,因为如果线程在Python解释器之外很忙,它将无法捕获中断。

此代码的一个很好的用法模式是让线程捕获特定的异常并执行清理。这样,您可以中断任务并仍然进行适当的清理。

It is generally a bad pattern to kill a thread abruptly, in Python and in any language. Think of the following cases:

  • the thread is holding a critical resource that must be closed properly
  • the thread has created several other threads that must be killed as well.

The nice way of handling this, if you can afford it (if you are managing your own threads), is to have an exit_request flag that each thread checks at regular intervals to see whether it is time to exit.

For example:

import threading

class StoppableThread(threading.Thread):
    """Thread class with a stop() method. The thread itself has to check
    regularly for the stopped() condition."""

    def __init__(self,  *args, **kwargs):
        super(StoppableThread, self).__init__(*args, **kwargs)
        self._stop_event = threading.Event()

    def stop(self):
        self._stop_event.set()

    def stopped(self):
        return self._stop_event.is_set()

In this code, you should call stop() on the thread when you want it to exit, and wait for the thread to exit properly using join(). The thread should check the stop flag at regular intervals.
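
As a usage sketch of the class above: the worker below checks the flag on every pass through its loop, and the caller stops it and then joins it. The Ticker subclass and its timings are assumptions made up for illustration; StoppableThread is repeated so the example runs on its own.

```python
import threading
import time


class StoppableThread(threading.Thread):
    """Same idea as the class above, repeated so this sketch is self-contained."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._stop_event = threading.Event()

    def stop(self):
        self._stop_event.set()

    def stopped(self):
        return self._stop_event.is_set()


class Ticker(StoppableThread):
    """Hypothetical worker that counts until it is asked to stop."""

    def __init__(self):
        super().__init__()
        self.ticks = 0

    def run(self):
        while not self.stopped():
            self.ticks += 1
            # Waiting on the event instead of time.sleep() lets stop()
            # wake the thread immediately.
            self._stop_event.wait(0.01)


worker = Ticker()
worker.start()
time.sleep(0.1)          # let it do some work
worker.stop()            # request exit
worker.join(timeout=2)   # wait for a clean exit
print(worker.is_alive(), worker.ticks)
```

The thread only exits promptly if run() returns to the check often enough; a single long blocking call inside the loop delays the stop by that much.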

There are cases, however, when you really need to kill a thread. An example is when you are wrapping an external library that blocks in long calls and you want to interrupt it.

The following code allows you (with some restrictions) to raise an exception in a Python thread:

import ctypes
import inspect
import threading

def _async_raise(tid, exctype):
    '''Raises an exception in the threads with id tid'''
    if not inspect.isclass(exctype):
        raise TypeError("Only types can be raised (not instances)")
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid),
                                                     ctypes.py_object(exctype))
    if res == 0:
        raise ValueError("invalid thread id")
    elif res != 1:
        # "if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

class ThreadWithExc(threading.Thread):
    '''A thread class that supports raising exception in the thread from
       another thread.
    '''
    def _get_my_tid(self):
        """determines this (self's) thread id

        CAREFUL : this function is executed in the context of the caller
        thread, to get the identity of the thread represented by this
        instance.
        """
        if not self.is_alive():
            raise threading.ThreadError("the thread is not active")

        # do we have it cached?
        if hasattr(self, "_thread_id"):
            return self._thread_id

        # no, look for it in the _active dict
        for tid, tobj in threading._active.items():
            if tobj is self:
                self._thread_id = tid
                return tid

        # TODO: in python 2.6, there's a simpler way to do : self.ident

        raise AssertionError("could not determine the thread's id")

    def raiseExc(self, exctype):
        """Raises the given exception type in the context of this thread.

        If the thread is busy in a system call (time.sleep(),
        socket.accept(), ...), the exception is simply ignored.

        If you are sure that your exception should terminate the thread,
        one way to ensure that it works is:

            t = ThreadWithExc( ... )
            ...
            t.raiseExc( SomeException )
            while t.is_alive():
                time.sleep( 0.1 )
                t.raiseExc( SomeException )

        If the exception is to be caught by the thread, you need a way to
        check that your thread has caught it.

        CAREFUL : this function is executed in the context of the
        caller thread, to raise an exception in the context of the
        thread represented by this instance.
        """
        _async_raise( self._get_my_tid(), exctype )

(Based on Killable Threads by Tomer Filiba. The quote about the return value of PyThreadState_SetAsyncExc appears to be from an old version of Python.)

As noted in the documentation, this is not a magic bullet because if the thread is busy outside the Python interpreter, it will not catch the interruption.

A good usage pattern of this code is to have the thread catch a specific exception and perform the cleanup. That way, you can interrupt a task and still have proper cleanup.
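
The pattern just described might look like the sketch below. It assumes CPython 3.7 or later (where the thread id argument is an unsigned long); the Interrupted exception, the worker function and the async_raise helper are hypothetical names chosen for this example, and the same restrictions as above apply: the exception is only delivered while the thread is executing Python bytecode.

```python
import ctypes
import threading
import time


class Interrupted(Exception):
    """Hypothetical exception used only to interrupt the worker."""


cleaned_up = threading.Event()


def worker():
    try:
        while True:
            time.sleep(0.01)   # short calls, so control returns to Python often
    except Interrupted:
        pass                   # interruption is expected; fall through to cleanup
    finally:
        cleaned_up.set()       # stands in for releasing locks, closing files, ...


def async_raise(thread, exctype):
    """Ask CPython to raise exctype inside the given thread."""
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exctype))
    if res == 0:
        raise ValueError("invalid thread id")
    if res > 1:
        # More than one thread state was affected: revert and report.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread.ident), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")


t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
async_raise(t, Interrupted)
t.join(timeout=2)
print(t.is_alive(), cleaned_up.is_set())
```

Because the worker catches only Interrupted, any other error still propagates normally, while the finally block runs in both cases.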


回答 1

没有官方的API可以这样做。

您需要使用平台API杀死线程,例如pthread_kill或TerminateThread。您可以通过pythonwin或ctypes访问此类API。

请注意,这本质上是不安全的。如果被杀死的线程在被杀死时具有GIL,则可能会导致无法收集的垃圾(来自成为垃圾的堆栈帧的局部变量),并可能导致死锁。

There is no official API to do that, no.

You need to use platform API to kill the thread, e.g. pthread_kill, or TerminateThread. You can access such API e.g. through pythonwin, or through ctypes.

Notice that this is inherently unsafe. It will likely lead to uncollectable garbage (from local variables of the stack frames that become garbage), and may lead to deadlocks, if the thread being killed has the GIL at the point when it is killed.


回答 2

multiprocessing.Process可以调用p.terminate()来终止

在我想杀死一个线程,但又不想使用标志/锁/信号/信号量/事件/任何东西的情况下,我会将线程提升为完整进程。对于仅使用几个线程的代码,开销并不是那么糟糕。

例如,这样做很方便,可以轻松终止执行阻塞I / O的助手“线程”

转换是微不足道的:在相关代码中,将所有threading.Thread替换为multiprocessing.Process,将所有queue.Queue替换为multiprocessing.Queue,并在想要杀死其子进程p的父进程中添加所需的p.terminate()调用

请参阅Python文档multiprocessing

A multiprocessing.Process can be terminated with p.terminate().

In the cases where I want to kill a thread, but do not want to use flags/locks/signals/semaphores/events/whatever, I promote the threads to full blown processes. For code that makes use of just a few threads the overhead is not that bad.

E.g. this comes in handy for easily terminating helper “threads” that execute blocking I/O.

The conversion is trivial: in the related code, replace all threading.Thread with multiprocessing.Process and all queue.Queue with multiprocessing.Queue, and add the required calls to p.terminate() in the parent process that wants to kill its child p.

See the Python documentation for multiprocessing.


回答 3

如果试图终止整个程序,则可以将线程设置为“守护程序”。参见 Thread.daemon

If you are trying to terminate the whole program you can set the thread as a “daemon”. see Thread.daemon


回答 4

正如其他人提到的那样,规范是设置停止标志。对于轻量级的东西(没有Thread的子类,没有全局变量),可以选择使用lambda回调。(请注意中的括号if stop()。)

import threading
import time

def do_work(id, stop):
    print("I am thread", id)
    while True:
        print("I am thread {} doing something".format(id))
        if stop():
            print("  Exiting loop.")
            break
    print("Thread {}, signing off".format(id))


def main():
    stop_threads = False
    workers = []
    for id in range(0,3):
        tmp = threading.Thread(target=do_work, args=(id, lambda: stop_threads))
        workers.append(tmp)
        tmp.start()
    time.sleep(3)
    print('main: done sleeping; time to stop the threads.')
    stop_threads = True
    for worker in workers:
        worker.join()
    print('Finis.')

if __name__ == '__main__':
    main()

替换print()pr()始终刷新(sys.stdout.flush())的函数可以提高Shell输出的精度。

(仅在Windows / Eclipse / Python3.3上测试过)

As others have mentioned, the norm is to set a stop flag. For something lightweight (no subclassing of Thread, no global variable), a lambda callback is an option. (Note the parentheses in if stop().)

import threading
import time

def do_work(id, stop):
    print("I am thread", id)
    while True:
        print("I am thread {} doing something".format(id))
        if stop():
            print("  Exiting loop.")
            break
    print("Thread {}, signing off".format(id))


def main():
    stop_threads = False
    workers = []
    for id in range(0,3):
        tmp = threading.Thread(target=do_work, args=(id, lambda: stop_threads))
        workers.append(tmp)
        tmp.start()
    time.sleep(3)
    print('main: done sleeping; time to stop the threads.')
    stop_threads = True
    for worker in workers:
        worker.join()
    print('Finis.')

if __name__ == '__main__':
    main()

Replacing print() with a pr() function that always flushes (sys.stdout.flush()) may improve the precision of the shell output.

(Only tested on Windows/Eclipse/Python3.3)


回答 5

这基于thread2-可杀死的线程(Python配方)

您需要调用PyThreadState_SetAsyncExc(),该函数仅可通过ctypes使用。

该功能仅在Python 2.7.3上进行了测试,但可能与其他最新的2.x版本一起使用。

import ctypes

def terminate_thread(thread):
    """Terminates a python thread from another thread.

    :param thread: a threading.Thread instance
    """
    if not thread.is_alive():
        return

    exc = ctypes.py_object(SystemExit)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_long(thread.ident), exc)
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # """if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"""
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(thread.ident), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

This is based on thread2 — killable threads (Python recipe)

You need to call PyThreadState_SetAsyncExc(), which is only available through ctypes.

This has only been tested on Python 2.7.3, but it is likely to work with other recent 2.x releases.

import ctypes

def terminate_thread(thread):
    """Terminates a python thread from another thread.

    :param thread: a threading.Thread instance
    """
    if not thread.is_alive():
        return

    exc = ctypes.py_object(SystemExit)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_long(thread.ident), exc)
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # """if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"""
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(thread.ident), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

回答 6

如果不配合使用线程,切勿强行杀死它。

杀死线程会使try/finally块所建立的一切保证失效,因此您可能会留下未释放的锁、未关闭的文件等。

唯一可以断言强行杀死线程是个好主意的方法是快速杀死程序,但绝不要杀死单个线程。

You should never forcibly kill a thread without cooperating with it.

Killing a thread removes any guarantees that try/finally blocks set up so you might leave locks locked, files open, etc.

The only case where forcibly killing threads is arguably a good idea is when you want to kill the whole program quickly, never for individual threads.


回答 7

在Python中,您根本无法直接杀死线程。

如果您实际上并不需要Thread(!),则可以使用 multiprocessing软件包,而不是使用threading软件包来做。在这里,要杀死一个进程,您可以简单地调用方法:

yourProcess.terminate()  # kill the process!

Python将杀死您的进程(在Unix上通过SIGTERM信号,而在Windows上通过TerminateProcess()调用)。在使用队列或管道时,请注意使用它!(它可能会损坏队列/管道中的数据)

请注意,multiprocessing.Event和multiprocessing.Semaphore的工作方式分别与threading.Event和threading.Semaphore完全相同。实际上,前者是后者的克隆。

如果您确实需要使用线程,则无法直接杀死它。但是,您可以使用“守护程序线程”。实际上,在Python中,可以将Thread标记为守护程序

yourThread.daemon = True  # set the Thread as a "daemon thread"

当没有活动的非守护线程时,主程序将退出。换句话说,当您的主线程(当然,这是一个非守护程序线程)完成其操作时,即使仍有一些守护程序线程在工作,程序也会退出。

请注意,有必要daemonstart()调用该方法之前设置一个线程!

当然,daemon甚至可以使用multiprocessing。在这里,当主进程退出时,它将尝试终止其所有守护进程。

最后,请注意,sys.exit()os.kill()不是选择。

In Python, you simply cannot kill a Thread directly.

If you do NOT really need to have a Thread (!), what you can do, instead of using the threading package, is to use the multiprocessing package. Here, to kill a process, you can simply call the method:

yourProcess.terminate()  # kill the process!

Python will kill your process (on Unix through the SIGTERM signal, while on Windows through the TerminateProcess() call). Be careful when using it together with a Queue or a Pipe! (It may corrupt the data in the Queue/Pipe.)

Note that multiprocessing.Event and multiprocessing.Semaphore work in exactly the same way as threading.Event and threading.Semaphore, respectively. In fact, the former are clones of the latter.

If you REALLY need to use a Thread, there is no way to kill it directly. What you can do, however, is to use a “daemon thread”. In fact, in Python, a Thread can be flagged as daemon:

yourThread.daemon = True  # set the Thread as a "daemon thread"

The main program will exit when no alive non-daemon threads are left. In other words, when your main thread (which is, of course, a non-daemon thread) finishes its operations, the program will exit even if there are still some daemon threads working.

Note that it is necessary to set a Thread as daemon before the start() method is called!
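
To make the ordering requirement concrete, here is a small sketch (the worker function and the release event are invented for the example). It sets the daemon flag through the constructor, then shows that changing it on a started thread is refused with a RuntimeError.

```python
import threading

release = threading.Event()

def worker():
    release.wait()               # keep the thread alive until we let it go

# The flag is set before start(), here through the constructor.
t = threading.Thread(target=worker, daemon=True)
t.start()

try:
    t.daemon = False             # too late: the thread is already running
    changed_after_start = True
except RuntimeError:
    changed_after_start = False

release.set()
t.join()
print("could change daemon flag after start():", changed_after_start)
```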

Of course you can, and should, use daemon even with multiprocessing. Here, when the main process exits, it attempts to terminate all of its daemonic child processes.

Finally, please note that sys.exit() and os.kill() are not options for killing a single thread.


回答 8

您可以通过在将退出线程的线程中安装跟踪来杀死线程。请参阅附件链接以获取一种可能的实现。

在Python中杀死线程

You can kill a thread by installing a trace function in it that makes the thread exit. See the attached link for one possible implementation.

Kill a thread in Python


回答 9

如果你是显式调用time.sleep()为你的线程(比如查询一些外部的服务)的一部分,在菲利普的方法的改进是使用了超时的eventwait()方法,无论你sleep()

例如:

import threading

class KillableThread(threading.Thread):
    def __init__(self, sleep_interval=1):
        super().__init__()
        self._kill = threading.Event()
        self._interval = sleep_interval

    def run(self):
        while True:
            print("Do Something")

            # If no kill signal is set, sleep for the interval,
            # If kill signal comes in while sleeping, immediately
            #  wake up and handle
            is_killed = self._kill.wait(self._interval)
            if is_killed:
                break

        print("Killing Thread")

    def kill(self):
        self._kill.set()

然后运行

t = KillableThread(sleep_interval=5)
t.start()
# Every 5 seconds it prints:
#: Do Something
t.kill()
#: Killing Thread

使用wait()而不是sleep()ing并定期检查事件的优点是您可以在更长的睡眠间隔内进行编程,线程几乎立即停止(原本应该是sleep()ing),并且我认为处理退出的代码明显更简单。

If you are explicitly calling time.sleep() as part of your thread (say, polling some external service), an improvement upon Phillipe’s method is to use the timeout of the Event’s wait() method wherever you would otherwise sleep().

For example:

import threading

class KillableThread(threading.Thread):
    def __init__(self, sleep_interval=1):
        super().__init__()
        self._kill = threading.Event()
        self._interval = sleep_interval

    def run(self):
        while True:
            print("Do Something")

            # If no kill signal is set, sleep for the interval,
            # If kill signal comes in while sleeping, immediately
            #  wake up and handle
            is_killed = self._kill.wait(self._interval)
            if is_killed:
                break

        print("Killing Thread")

    def kill(self):
        self._kill.set()

Then to run it

t = KillableThread(sleep_interval=5)
t.start()
# Every 5 seconds it prints:
#: Do Something
t.kill()
#: Killing Thread

The advantage of using wait() instead of sleep()ing and regularly checking the event is that you can program in longer intervals of sleep, the thread is stopped almost immediately (when you would otherwise be sleep()ing) and in my opinion, the code for handling exit is significantly simpler.
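
To put a number on the "almost immediately" claim, here is a small timing sketch (the poller function and the 30 second interval are arbitrary choices for the example). Even with a long interval, the thread wakes as soon as the event is set.

```python
import threading
import time

stop = threading.Event()
log = []

def poller(interval):
    # wait() returns False on timeout (keep polling) and True once the event is set.
    while not stop.wait(interval):
        log.append("poll")
    log.append("stopped")

t = threading.Thread(target=poller, args=(30.0,))
start = time.monotonic()
t.start()
time.sleep(0.1)
stop.set()                       # wakes the thread out of its 30 second wait
t.join()
elapsed = time.monotonic() - start
print("stopped after %.2f seconds" % elapsed)
```

With time.sleep(30) in place of stop.wait(30.0), the join() would block for the rest of the interval.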


回答 10

如果您不杀死线程,那就更好了。一种方法可能是在线程的循环中引入“ try”块,并在您要停止线程时引发异常(例如,break / return / …会停止for / while / …)。我已经在我的应用程序上使用了它,并且可以使用…

It is better not to kill a thread. One approach is to put a “try” block around the thread’s loop and raise an exception when you want the thread to stop (much as a break or return would stop your for/while loop). I’ve used this in my app and it works.
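
The answer gives no code, so the following is only one possible reading of it, with invented names (StopRequested, do_one_step, worker). The loop body raises an exception once a stop has been requested, and a try block around the loop catches it, which also works when the check sits several calls deep.

```python
import threading
import time

class StopRequested(Exception):
    """Hypothetical exception used only to unwind the worker loop."""

def do_one_step(stop_flag):
    # The check can live deep inside helper functions; raising unwinds all of them.
    if stop_flag.is_set():
        raise StopRequested()
    time.sleep(0.01)             # stand-in for real work

def worker(stop_flag, log):
    try:
        while True:
            do_one_step(stop_flag)
    except StopRequested:
        log.append("stopped cleanly")

stop_flag = threading.Event()
log = []
t = threading.Thread(target=worker, args=(stop_flag, log))
t.start()
time.sleep(0.05)
stop_flag.set()
t.join()
print(log)
```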


回答 11

绝对有可能实现Thread.stop以下示例代码中所示的方法:

import sys
import threading
import time


class StopThread(StopIteration):
    pass

threading.SystemExit = SystemExit, StopThread


class Thread2(threading.Thread):

    def stop(self):
        self.__stop = True

    def _bootstrap(self):
        if threading._trace_hook is not None:
            raise ValueError('Cannot run thread with tracing!')
        self.__stop = False
        sys.settrace(self.__trace)
        super()._bootstrap()

    def __trace(self, frame, event, arg):
        if self.__stop:
            raise StopThread()
        return self.__trace


class Thread3(threading.Thread):

    def _bootstrap(self, stop_thread=False):
        def stop():
            nonlocal stop_thread
            stop_thread = True
        self.stop = stop

        def tracer(*_):
            if stop_thread:
                raise StopThread()
            return tracer
        sys.settrace(tracer)
        super()._bootstrap()

###############################################################################


def main():
    test1 = Thread2(target=printer)
    test1.start()
    time.sleep(1)
    test1.stop()
    test1.join()
    test2 = Thread2(target=speed_test)
    test2.start()
    time.sleep(1)
    test2.stop()
    test2.join()
    test3 = Thread3(target=speed_test)
    test3.start()
    time.sleep(1)
    test3.stop()
    test3.join()


def printer():
    while True:
        print(time.time() % 1)
        time.sleep(0.1)


def speed_test(count=0):
    try:
        while True:
            count += 1
    except StopThread:
        print('Count =', count)

if __name__ == '__main__':
    main()

Thread3类似乎比快大约33%的运行代码Thread2类。

It is definitely possible to implement a Thread.stop method as shown in the following example code:

import sys
import threading
import time


class StopThread(StopIteration):
    pass

threading.SystemExit = SystemExit, StopThread


class Thread2(threading.Thread):

    def stop(self):
        self.__stop = True

    def _bootstrap(self):
        if threading._trace_hook is not None:
            raise ValueError('Cannot run thread with tracing!')
        self.__stop = False
        sys.settrace(self.__trace)
        super()._bootstrap()

    def __trace(self, frame, event, arg):
        if self.__stop:
            raise StopThread()
        return self.__trace


class Thread3(threading.Thread):

    def _bootstrap(self, stop_thread=False):
        def stop():
            nonlocal stop_thread
            stop_thread = True
        self.stop = stop

        def tracer(*_):
            if stop_thread:
                raise StopThread()
            return tracer
        sys.settrace(tracer)
        super()._bootstrap()

###############################################################################


def main():
    test1 = Thread2(target=printer)
    test1.start()
    time.sleep(1)
    test1.stop()
    test1.join()
    test2 = Thread2(target=speed_test)
    test2.start()
    time.sleep(1)
    test2.stop()
    test2.join()
    test3 = Thread3(target=speed_test)
    test3.start()
    time.sleep(1)
    test3.stop()
    test3.join()


def printer():
    while True:
        print(time.time() % 1)
        time.sleep(0.1)


def speed_test(count=0):
    try:
        while True:
            count += 1
    except StopThread:
        print('Count =', count)

if __name__ == '__main__':
    main()

The Thread3 class appears to run code approximately 33% faster than the Thread2 class.


回答 12

from ctypes import *
pthread = cdll.LoadLibrary("libpthread-2.15.so")
pthread.pthread_cancel(c_ulong(t.ident))

t是您的Thread对象。

阅读python源代码(Modules/threadmodule.cPython/thread_pthread.h),您可以看到的Thread.ident是一种pthread_t类型,因此您可以pthread在python use中做任何事情libpthread

from ctypes import *
pthread = cdll.LoadLibrary("libpthread-2.15.so")
pthread.pthread_cancel(c_ulong(t.ident))

t is your Thread object.

If you read the Python source (Modules/threadmodule.c and Python/thread_pthread.h), you can see that Thread.ident is a pthread_t, so you can do anything pthread can do from Python by using libpthread.


回答 13

可以使用以下解决方法杀死线程:

kill_threads = False

def doSomething():
    global kill_threads
    while True:
        if kill_threads:
            thread.exit()
        ......
        ......

thread.start_new_thread(doSomething, ())

这甚至可以用于终止从主线程终止其代码在另一个模块中编写的线程。我们可以在该模块中声明一个全局变量,并使用它终止该模块中产生的线程。

我通常使用它在程序出口处终止所有线程。这可能不是终止线程的理想方法,但可能会有所帮助。

The following workaround can be used to kill a thread:

kill_threads = False

def doSomething():
    global kill_threads
    while True:
        if kill_threads:
            thread.exit()
        ......
        ......

thread.start_new_thread(doSomething, ())

This can even be used from the main thread to terminate threads whose code is written in another module. We can declare a global variable in that module and use it to terminate the thread(s) spawned in that module.

I usually use this to terminate all the threads at the program exit. This might not be the perfect way to terminate thread/s but could help.


回答 14

我玩这个游戏很晚,但是我一直在努力解决类似的问题,以下内容似乎可以为我很好地解决问题,并且让守护子线程退出时让我做一些基本的线程状态检查和清理工作:

import threading
import time
import atexit

def do_work():

  i = 0
  @atexit.register
  def goodbye():
    print ("'CLEANLY' kill sub-thread with value: %s [THREAD: %s]" %
           (i, threading.currentThread().ident))

  while True:
    print i
    i += 1
    time.sleep(1)

t = threading.Thread(target=do_work)
t.daemon = True
t.start()

def after_timeout():
  print "KILL MAIN THREAD: %s" % threading.currentThread().ident
  raise SystemExit

threading.Timer(2, after_timeout).start()

Yield:

0
1
KILL MAIN THREAD: 140013208254208
'CLEANLY' kill sub-thread with value: 2 [THREAD: 140013674317568]

I’m way late to this game, but I’ve been wrestling with a similar question and the following appears to both resolve the issue perfectly for me AND lets me do some basic thread state checking and cleanup when the daemonized sub-thread exits:

import threading
import time
import atexit

def do_work():

  i = 0
  @atexit.register
  def goodbye():
    print ("'CLEANLY' kill sub-thread with value: %s [THREAD: %s]" %
           (i, threading.currentThread().ident))

  while True:
    print i
    i += 1
    time.sleep(1)

t = threading.Thread(target=do_work)
t.daemon = True
t.start()

def after_timeout():
  print "KILL MAIN THREAD: %s" % threading.currentThread().ident
  raise SystemExit

threading.Timer(2, after_timeout).start()

Yields:

0
1
KILL MAIN THREAD: 140013208254208
'CLEANLY' kill sub-thread with value: 2 [THREAD: 140013674317568]

回答 15

我想补充的一件事是,如果您在线程lib Python中阅读了官方文档,建议您避免使用“恶魔”线程,如果您不希望线程突然结束,请使用Paolo Rovelli 提到的标志。

根据官方文档:

守护程序线程在关闭时突然停止。它们的资源(例如打开的文件,数据库事务等)可能无法正确释放。如果您希望线程正常停止,请将它们设置为非守护进程,并使用适当的信令机制(例如事件)。

我认为创建守护线程取决于您的应用程序,但总的来说(我认为)最好避免杀死它们或使其成为守护线程。在多处理中,您可以is_alive()用来检查过程状态并“终止”以完成过程(还可以避免GIL问题)。但是,有时在Windows中执行代码时会发现更多问题。

永远记住,如果您有“活动线程”,Python解释器将运行以等待它们。(因为这个守护进程可以帮助您,如果没关系突然结束)。

One thing I want to add is that the official documentation of Python’s threading library recommends avoiding daemonic threads (the daemon flag that Paolo Rovelli mentioned) when you don’t want threads to end abruptly.

From official documentation:

Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signaling mechanism such as an Event.

I think that whether to create daemonic threads depends on your application, but in general (and in my opinion) it’s better to avoid killing threads or making them daemonic. With multiprocessing you can use is_alive() to check a process’s status and terminate() to finish it (you also avoid GIL problems). However, you may sometimes run into more problems when you execute your code on Windows.

And always remember that if you have live (non-daemon) threads, the Python interpreter will keep running to wait for them. (This is where daemonic threads can help, if it doesn’t matter that they end abruptly.)


回答 16

有一个为此目的而建立的库stopit。尽管此处列出的某些注意事项仍然适用,但至少此库提供了一种常规的,可重复的技术来实现所述目标。

There is a library built for this purpose, stopit. Although some of the same cautions listed herein still apply, at least this library presents a regular, repeatable technique for achieving the stated goal.


回答 17

虽然它已经很老了,对于某些人来说可能是一个方便的解决方案:

一个扩展了线程的模块功能的小模块-允许一个线程在另一个线程的上下文中引发异常。通过提高SystemExit,您最终可以杀死python线程。

import threading
import ctypes     

def _async_raise(tid, excobj):
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, ctypes.py_object(excobj))
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # """if it returns a number greater than one, you're in trouble, 
        # and you should call it again with exc=NULL to revert the effect"""
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, 0)
        raise SystemError("PyThreadState_SetAsyncExc failed")

class Thread(threading.Thread):
    def raise_exc(self, excobj):
        assert self.isAlive(), "thread must be started"
        for tid, tobj in threading._active.items():
            if tobj is self:
                _async_raise(tid, excobj)
                return

        # the thread was alive when we entered the loop, but was not found 
        # in the dict, hence it must have been already terminated. should we raise
        # an exception here? silently ignore?

    def terminate(self):
        # must raise the SystemExit type, instead of a SystemExit() instance
        # due to a bug in PyThreadState_SetAsyncExc
        self.raise_exc(SystemExit)

因此,它允许“线程在另一个线程的上下文中引发异常”,这样,终止的线程可以处理终止,而无需定期检查中止标志。

但是,根据其原始来源,此代码存在一些问题。

  • 仅在执行python字节码时才会引发异常。如果您的线程调用了本机/内置阻止函数,则仅当执行返回到python代码时才会引发异常。
    • 如果内置函数在内部调用PyErr_Clear()也会存在一个问题,这将有效地取消您的未决异常。您可以尝试再次提高它。
  • 只有异常类型可以安全地引发。异常实例可能会导致意外行为,因此受到限制。
  • 我要求在内置线程模块中公开此功能,但是由于ctypes已成为标准库(从2.5版开始),并且此
    功能不太可能与实现无关,因此可以不公开

While it’s rather old, this might be a handy solution for some:

A little module that extends the threading’s module functionality — allows one thread to raise exceptions in the context of another thread. By raising SystemExit, you can finally kill python threads.

import threading
import ctypes     

def _async_raise(tid, excobj):
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, ctypes.py_object(excobj))
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # """if it returns a number greater than one, you're in trouble, 
        # and you should call it again with exc=NULL to revert the effect"""
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, 0)
        raise SystemError("PyThreadState_SetAsyncExc failed")

class Thread(threading.Thread):
    def raise_exc(self, excobj):
        assert self.isAlive(), "thread must be started"
        for tid, tobj in threading._active.items():
            if tobj is self:
                _async_raise(tid, excobj)
                return

        # the thread was alive when we entered the loop, but was not found 
        # in the dict, hence it must have been already terminated. should we raise
        # an exception here? silently ignore?

    def terminate(self):
        # must raise the SystemExit type, instead of a SystemExit() instance
        # due to a bug in PyThreadState_SetAsyncExc
        self.raise_exc(SystemExit)

So, it allows a “thread to raise exceptions in the context of another thread” and in this way, the terminated thread can handle the termination without regularly checking an abort flag.

However, according to its original source, there are some issues with this code.

  • The exception will be raised only when executing python bytecode. If your thread calls a native/built-in blocking function, the exception will be raised only when execution returns to the python code.
    • There is also an issue if the built-in function internally calls PyErr_Clear(), which would effectively cancel your pending exception. You can try to raise it again.
  • Only exception types can be raised safely. Exception instances are likely to cause unexpected behavior, and are thus restricted.
  • I asked to expose this function in the built-in thread module, but since ctypes has become a standard library (as of 2.5), and this
    feature is not likely to be implementation-agnostic, it may be kept
    unexposed.
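
For readers on Python 3, here is a hedged adaptation of the same idea, not the original recipe. It assumes CPython 3.7 or later, where the thread id argument is an unsigned long, and uses thread.ident instead of the private threading._active dict. The worker sleeps in short steps so that it returns to Python bytecode often, which is when the pending exception is delivered (the first caveat in the list above).

```python
import ctypes
import threading
import time

def async_raise(thread, exc_type):
    """Ask CPython to raise exc_type (a class, not an instance) in thread."""
    tid = ctypes.c_ulong(thread.ident)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, ctypes.py_object(exc_type))
    if res == 0:
        raise ValueError("nonexistent thread id")
    if res > 1:
        # More than one thread state was affected: revert and report failure.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, None)
        raise SystemError("PyThreadState_SetAsyncExc failed")

outcome = []

def worker():
    try:
        while True:
            time.sleep(0.01)     # short sleeps, so bytecode runs again soon
    except SystemExit:
        outcome.append("SystemExit seen")

t = threading.Thread(target=worker)
t.start()
time.sleep(0.1)
async_raise(t, SystemExit)
t.join(timeout=5)
print(outcome, "alive:", t.is_alive())
```

This relies on a CPython implementation detail, and all the cautions from the other answers still apply.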

回答 18

这似乎与Windows 7上的pywin32一起使用

my_thread = threading.Thread()
my_thread.start()
my_thread._Thread__stop()

This seems to work with pywin32 on windows 7

my_thread = threading.Thread()
my_thread.start()
my_thread._Thread__stop()

回答 19

ØMQ项目的创始人之一Pieter Hintjens 说,使用ØMQ并避免使用同步原语(例如锁,互斥体,事件等),是编写多线程程序的最安全的方法:

http://zguide.zeromq.org/py:all#Multithreading-with-ZeroMQ

这包括告诉子线程应该取消其工作。这可以通过为该线程配备一个ØMQ套接字并在该套接字上轮询以获取一条消息指出该线程应取消的信息来完成。

该链接还提供了有关带有ØMQ的多线程python代码的示例。

Pieter Hintjens, one of the founders of the ØMQ project, says that using ØMQ and avoiding synchronization primitives like locks, mutexes, events, etc. is the sanest and most secure way to write multi-threaded programs:

http://zguide.zeromq.org/py:all#Multithreading-with-ZeroMQ

This includes telling a child thread, that it should cancel its work. This would be done by equipping the thread with a ØMQ-socket and polling on that socket for a message saying that it should cancel.

The link also provides an example on multi-threaded python code with ØMQ.
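
The message-passing idea can be sketched without ØMQ. The example below swaps in the standard library’s queue.Queue for the ØMQ socket, so it is not ØMQ code, and the "CANCEL" message is an invented convention. The worker’s only communication channel is its inbox, and cancellation is just another message.

```python
import queue
import threading

def worker(inbox, results):
    while True:
        message = inbox.get()            # blocks until a message arrives
        if message == "CANCEL":          # invented sentinel meaning "stop working"
            results.append("cancelled")
            break
        results.append(message * 2)      # stand-in for real work

inbox = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(inbox, results))
t.start()
for item in (1, 2, 3):
    inbox.put(item)
inbox.put("CANCEL")
t.join()
print(results)   # [2, 4, 6, 'cancelled']
```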


回答 20

假设您要具有多个具有相同功能的线程,这是恕我直言,最简单的实现是通过id停止一个线程:

import time
from threading import Thread

def doit(id=0):
    doit.stop=0
    print("start id:%d"%id)
    while 1:
        time.sleep(1)
        print(".")
        if doit.stop==id:
            doit.stop=0
            break
    print("end thread %d"%id)

t5=Thread(target=doit, args=(5,))
t6=Thread(target=doit, args=(6,))

t5.start() ; t6.start()
time.sleep(2)
doit.stop =5  #kill t5
time.sleep(2)
doit.stop =6  #kill t6

不错的是,您可以拥有多个相同和不同的功能,并通过以下方式将其全部停止 functionname.stop

如果您只想使用该函数的一个线程,则无需记住ID。如果doit.stop> 0 ,则停止。

Assuming that you want to have multiple threads of the same function, this is IMHO the easiest implementation to stop one by id:

import time
from threading import Thread

def doit(id=0):
    doit.stop=0
    print("start id:%d"%id)
    while 1:
        time.sleep(1)
        print(".")
        if doit.stop==id:
            doit.stop=0
            break
    print("end thread %d"%id)

t5=Thread(target=doit, args=(5,))
t6=Thread(target=doit, args=(6,))

t5.start() ; t6.start()
time.sleep(2)
doit.stop =5  #kill t5
time.sleep(2)
doit.stop =6  #kill t6

The nice thing here is that you can have multiple threads of the same function and of different functions, and stop them all through functionname.stop.

If you want to have only one thread of the function, then you don’t need to remember the id. Just stop if doit.stop > 0.


回答 21

只是基于@SCB的想法(这正是我所需要的)来创建具有自定义函数的KillableThread子类:

import time
from threading import Thread, Event

class KillableThread(Thread):
    def __init__(self, sleep_interval=1, target=None, name=None, args=(), kwargs=None):
        super().__init__(None, target, name, args, kwargs)
        self._kill = Event()
        self._interval = sleep_interval
        print(self._target)

    def run(self):
        while True:
            # Call custom function with arguments
            self._target(*self._args)

            # If no kill signal is set, sleep for the interval,
            # If kill signal comes in while sleeping, immediately
            #  wake up and handle
            is_killed = self._kill.wait(self._interval)
            if is_killed:
                break

        print("Killing Thread")

    def kill(self):
        self._kill.set()

if __name__ == '__main__':

    def print_msg(msg):
        print(msg)

    # args must be a tuple, hence the trailing comma
    t = KillableThread(10, print_msg, args=("hello world",))
    t.start()
    time.sleep(6)
    print("About to kill thread")
    t.kill()

自然,就像@SBC一样,线程不必等待运行新循环来停止。在此示例中,您将在“即将杀死线程”之后看到“杀死线程”消息,而不是再等待4秒钟才能完成线程(因为我们已经睡了6秒钟)。

KillableThread构造函数中的第二个参数是您的自定义函数(此处为print_msg)。Args参数是在此处调用函数((“ hello world”))时将使用的参数。

Just to build up on @SCB’s idea (which was exactly what I needed) to create a KillableThread subclass with a customized function:

import time
from threading import Thread, Event

class KillableThread(Thread):
    def __init__(self, sleep_interval=1, target=None, name=None, args=(), kwargs=None):
        super().__init__(None, target, name, args, kwargs)
        self._kill = Event()
        self._interval = sleep_interval
        print(self._target)

    def run(self):
        while True:
            # Call custom function with arguments
            self._target(*self._args)

            # If no kill signal is set, sleep for the interval,
            # If kill signal comes in while sleeping, immediately
            #  wake up and handle
            is_killed = self._kill.wait(self._interval)
            if is_killed:
                break

        print("Killing Thread")

    def kill(self):
        self._kill.set()

if __name__ == '__main__':

    def print_msg(msg):
        print(msg)

    # args must be a tuple, hence the trailing comma
    t = KillableThread(10, print_msg, args=("hello world",))
    t.start()
    time.sleep(6)
    print("About to kill thread")
    t.kill()

Naturally, like with @SCB’s version, the thread doesn’t wait for the next loop iteration to stop. In this example, you would see the “Killing Thread” message printed right after “About to kill thread” instead of waiting 4 more seconds for the thread to complete (since we have slept for 6 seconds already).

The second argument of the KillableThread constructor is your custom function (print_msg here). The args argument holds the arguments that will be passed to that function; it must be a tuple, here ("hello world",).


回答 22

如@Kozyarchuk的答案中所述,安装跟踪有效。由于此答案不包含任何代码,因此下面是一个可以使用的示例:

import sys, threading, time 

class TraceThread(threading.Thread): 
    def __init__(self, *args, **keywords): 
        threading.Thread.__init__(self, *args, **keywords) 
        self.killed = False
    def start(self): 
        self._run = self.run 
        self.run = self.settrace_and_run
        threading.Thread.start(self) 
    def settrace_and_run(self): 
        sys.settrace(self.globaltrace) 
        self._run()
    def globaltrace(self, frame, event, arg): 
        return self.localtrace if event == 'call' else None
    def localtrace(self, frame, event, arg): 
        if self.killed and event == 'line': 
            raise SystemExit() 
        return self.localtrace 

def f(): 
    while True: 
        print('1') 
        time.sleep(2)
        print('2') 
        time.sleep(2)
        print('3') 
        time.sleep(2)

t = TraceThread(target=f) 
t.start() 
time.sleep(2.5) 
t.killed = True

打印1并打印后停止23不打印。

As mentioned in @Kozyarchuk’s answer, installing trace works. Since this answer contained no code, here is a working ready-to-use example:

import sys, threading, time 

class TraceThread(threading.Thread): 
    def __init__(self, *args, **keywords): 
        threading.Thread.__init__(self, *args, **keywords) 
        self.killed = False
    def start(self): 
        self._run = self.run 
        self.run = self.settrace_and_run
        threading.Thread.start(self) 
    def settrace_and_run(self): 
        sys.settrace(self.globaltrace) 
        self._run()
    def globaltrace(self, frame, event, arg): 
        return self.localtrace if event == 'call' else None
    def localtrace(self, frame, event, arg): 
        if self.killed and event == 'line': 
            raise SystemExit() 
        return self.localtrace 

def f(): 
    while True: 
        print('1') 
        time.sleep(2)
        print('2') 
        time.sleep(2)
        print('3') 
        time.sleep(2)

t = TraceThread(target=f) 
t.start() 
time.sleep(2.5) 
t.killed = True

It stops after having printed 1 and 2. 3 is not printed.


回答 23

您可以在进程中执行命令,然后使用进程ID将其杀死。我需要在两个线程之间进行同步,其中一个线程不会自行返回。

import os
import signal
import subprocess
from threading import Thread

processIds = []

def executeRecord(command):
    print(command)

    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    processIds.append(process.pid)
    print(processIds[0])

    #Command that doesn't return by itself
    process.stdout.read().decode("utf-8")
    return;


def recordThread(command, timeOut):

    thread = Thread(target=executeRecord, args=(command,))
    thread.start()
    thread.join(timeOut)

    os.kill(processIds.pop(), signal.SIGINT)

    return;

You can execute your command in a process and then kill it using the process id. I needed to sync between two thread one of which doesn’t return by itself.

import os
import signal
import subprocess
from threading import Thread

processIds = []

def executeRecord(command):
    print(command)

    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    processIds.append(process.pid)
    print(processIds[0])

    #Command that doesn't return by itself
    process.stdout.read().decode("utf-8")
    return;


def recordThread(command, timeOut):

    thread = Thread(target=executeRecord, args=(command,))
    thread.start()
    thread.join(timeOut)

    os.kill(processIds.pop(), signal.SIGINT)

    return;
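
A shorter variant of the same idea, offered as a sketch rather than a replacement: the Popen object can enforce the time budget itself through wait(timeout=...), without a helper thread or a shared pid list. The child command here is just a Python one-liner that sleeps, standing in for a command that doesn’t return by itself.

```python
import subprocess
import sys

# Stand-in for "a command that doesn't return by itself".
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

try:
    proc.wait(timeout=0.5)       # give the command a short time budget
except subprocess.TimeoutExpired:
    proc.terminate()             # SIGTERM on POSIX, TerminateProcess() on Windows
    proc.wait(timeout=10)        # reap the child so it does not linger

print("exit status:", proc.returncode)
```

The exact exit status is platform dependent; on POSIX a negative value means the child was ended by that signal number.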

回答 24

使用setDaemon(True)启动子线程。

def bootstrap(_filename):
    mb = ModelBootstrap(filename=_filename) # Has many Daemon threads. All get stopped automatically when main thread is stopped.

t = threading.Thread(target=bootstrap,args=('models.conf',))
t.setDaemon(False)

while True:
    t.start()
    time.sleep(10) # I am just allowing the sub-thread to run for 10 sec. You can listen on an event to stop execution.
    print('Thread stopped')
    break

Start the sub thread with setDaemon(True).

def bootstrap(_filename):
    mb = ModelBootstrap(filename=_filename) # Has many Daemon threads. All get stopped automatically when main thread is stopped.

t = threading.Thread(target=bootstrap,args=('models.conf',))
t.setDaemon(False)

while True:
    t.start()
    time.sleep(10) # I am just allowing the sub-thread to run for 10 sec. You can listen on an event to stop execution.
    print('Thread stopped')
    break

回答 25

这是一个错误的答案,请参阅评论

方法如下:

from threading import *

...

for thread in enumerate():
    if thread.isAlive():
        try:
            thread._Thread__stop()
        except:
            print(str(thread.getName()) + ' could not be terminated')

给它几秒钟,然后应该停止线程。还要检查thread._Thread__delete()方法。

thread.quit()为了方便起见,我建议使用一种方法。例如,如果您的线程中有一个套接字,建议quit()您在套接字句柄类中创建一个方法,终止该套接字,然后在的thread._Thread__stop()内部运行quit()

This is a bad answer, see the comments

Here’s how to do it:

from threading import *

...

for thread in enumerate():
    if thread.isAlive():
        try:
            thread._Thread__stop()
        except:
            print(str(thread.getName()) + ' could not be terminated')

Give it a few seconds then your thread should be stopped. Check also the thread._Thread__delete() method.

I’d recommend a thread.quit() method for convenience. For example if you have a socket in your thread, I’d recommend creating a quit() method in your socket-handle class, terminate the socket, then run a thread._Thread__stop() inside of your quit().


回答 26

如果您确实需要杀死子任务的能力,请使用替代实现。multiprocessing并且gevent都支持滥杀“线程”。

Python的线程不支持取消。想都别想。您的代码很可能会死锁,损坏或泄漏内存,或者具有其他意想不到的“有趣”难以调试的效果,这种情况很少且不确定地发生。

If you really need the ability to kill a sub-task, use an alternate implementation. multiprocessing and gevent both support indiscriminately killing a “thread”.

Python’s threading does not support cancellation. Do not even try. Your code is very likely to deadlock, corrupt or leak memory, or have other unintended “interesting” hard-to-debug effects which happen rarely and nondeterministically.


在Python中将十六进制字符串转换为int

问题:在Python中将十六进制字符串转换为int

如何在Python中将十六进制字符串转换为int?

我可能将其命名为“ 0xffff”或“ ffff”。

How do I convert a hex string to an int in Python?

I may have it as "0xffff" or just "ffff".


回答 0

如果没有 0x前缀,则需要显式指定基数,否则无法告诉:

x = int("deadbeef", 16)

使用 0x前缀,Python可以自动区分十六进制和十进制。

>>> print int("0xdeadbeef", 0)
3735928559
>>> print int("10", 0)
10

(您必须指定0作为基准才能调用此前缀猜测行为;省略第二个参数意味着假定基准为10。)

Without the 0x prefix, you need to specify the base explicitly, otherwise there’s no way to tell:

x = int("deadbeef", 16)

With the 0x prefix, Python can distinguish hex and decimal automatically.

>>> print int("0xdeadbeef", 0)
3735928559
>>> print int("10", 0)
10

(You must specify 0 as the base in order to invoke this prefix-guessing behavior; omitting the second parameter means to assume base-10.)
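
The session above uses the Python 2 print statement. For reference, here is a Python 3 sketch of the same behaviour (the variable names are only for illustration), including what happens when base 0 is given a string without a prefix.

```python
explicit = int("deadbeef", 16)         # no prefix, so the base must be given
prefixed = int("0xdeadbeef", 16)       # the 0x prefix is also accepted with base 16
guessed_hex = int("0xdeadbeef", 0)     # base 0: infer the base from the prefix
guessed_dec = int("10", 0)             # no prefix means decimal
guessed_bin = int("0b101", 0)
guessed_oct = int("0o17", 0)

try:
    int("deadbeef", 0)                 # nothing to infer the base from
    unprefixed_ok = True
except ValueError:
    unprefixed_ok = False

print(explicit, prefixed, guessed_hex, guessed_dec, guessed_bin, guessed_oct)
print("unprefixed hex accepted with base 0:", unprefixed_ok)
```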


回答 1

int(hexString, 16) 可以解决问题,并且可以使用和不使用0x前缀:

>>> int("a", 16)
10
>>> int("0xa",16)
10

int(hexString, 16) does the trick, and works with and without the 0x prefix:

>>> int("a", 16)
10
>>> int("0xa",16)
10

回答 2

对于任何给定的字符串s:

int(s, 16)

For any given string s:

int(s, 16)

回答 3

在Python中将十六进制字符串转换为int

我可能有它"0xffff"或只是它"ffff"

要将字符串转换为int,请将字符串int与要转换的基数一起传递给。

两个字符串都可以通过以下方式进行转换:

>>> string_1 = "0xffff"
>>> string_2 = "ffff"
>>> int(string_1, 16)
65535
>>> int(string_2, 16)
65535

int推断

如果您将0作为基数,int则将从字符串中的前缀推断基数。

>>> int(string_1, 0)
65535

如果没有十六进制前缀0xint没有足够的信息与猜测:

>>> int(string_2, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 0: 'ffff'

文字:

如果您要输入源代码或解释器,Python将为您进行转换:

>>> integer = 0xffff
>>> integer
65535

这将无法使用,ffff因为Python会认为您正在尝试编写合法的Python名称:

>>> integer = ffff
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'ffff' is not defined

Python数字以数字字符开头,而Python名称不能以数字字符开头。

Convert hex string to int in Python

I may have it as "0xffff" or just "ffff".

To convert a string to an int, pass the string to int along with the base you are converting from.

Both strings will suffice for conversion in this way:

>>> string_1 = "0xffff"
>>> string_2 = "ffff"
>>> int(string_1, 16)
65535
>>> int(string_2, 16)
65535

Letting int infer

If you pass 0 as the base, int will infer the base from the prefix in the string.

>>> int(string_1, 0)
65535

Without the hexadecimal prefix, 0x, int does not have enough information with which to guess:

>>> int(string_2, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 0: 'ffff'

literals:

If you’re typing into source code or an interpreter, Python will make the conversion for you:

>>> integer = 0xffff
>>> integer
65535

This won’t work with ffff because Python will think you’re trying to write a legitimate Python name instead:

>>> integer = ffff
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'ffff' is not defined

Python numbers start with a numeric character, while Python names cannot start with a numeric character.


回答 4

在上述Dan的答案中加上:如果为int()函数提供了十六进制字符串,则必须将基数指定为16,否则它不会认为您给了它有效的值。对于字符串中不包含的十六进制数字,无需指定基数16。

print int(0xdeadbeef) # valid

myHex = "0xdeadbeef"
print int(myHex) # invalid, raises ValueError
print int(myHex , 16) # valid

Adding to Dan’s answer above: if you supply the int() function with a hex string, you will have to specify the base as 16 or it will not think you gave it a valid value. Specifying base 16 is unnecessary for hex numbers not contained in strings.

print int(0xdeadbeef) # valid

myHex = "0xdeadbeef"
print int(myHex) # invalid, raises ValueError
print int(myHex , 16) # valid

回答 5

最坏的方法:

>>> def hex_to_int(x):
...     return eval("0x" + x)
...
>>> hex_to_int("c0ffee")
12648430

请不要这样做!

在Python中使用eval是不好的做法吗?

The worst way:

>>> def hex_to_int(x):
...     return eval("0x" + x)
...
>>> hex_to_int("c0ffee")
12648430

Please don’t do this!

See the related question "Is using eval in Python a bad practice?"
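
For comparison, here is a sketch of the same helper built on int instead of eval. Unlike eval, int only parses digits and never executes its input. The hostile-looking string below is a made-up example.

```python
def hex_to_int(x):
    """Convert an unprefixed hex string such as "c0ffee" to an int."""
    # int() only parses digits; it cannot run code embedded in the string.
    return int(x, 16)


print(hex_to_int("c0ffee"))  # 12648430

try:
    # With the eval version, this string would be executed as code.
    hex_to_int("0 or __import__('os').getcwd()")
except ValueError:
    print("rejected non-hex input")
```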


Answer 6

Or ast.literal_eval (this is safe, unlike eval):

ast.literal_eval("0xffff")

Demo:

>>> import ast
>>> ast.literal_eval("0xffff")
65535
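
Note that ast.literal_eval accepts any Python literal, not only integers, and it needs the 0x prefix to recognize hex. A minimal sketch of a wrapper that checks the result type follows; the name literal_to_int is hypothetical.

```python
import ast


def literal_to_int(text):
    """Parse an integer literal such as "0xffff" with ast.literal_eval.

    Hypothetical helper: literal_eval accepts any Python literal, so the
    result type is checked explicitly.
    """
    value = ast.literal_eval(text)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("not an integer literal: %r" % (text,))
    return value


print(literal_to_int("0xffff"))  # 65535

# An unprefixed name, a quoted string, and a float are all rejected.
for bad in ("ffff", "'0xffff'", "1.5"):
    try:
        literal_to_int(bad)
    except ValueError:
        print("rejected:", bad)
```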

Answer 7

The '%x' % formatting operator goes the other way, from an int back to a hex string, and its result can be used in an assignment as well as in print. (Assuming Python 3.0 and later.)

Example:

a = int('0x100', 16)
print(a)         # 256
print('%x' % a)  # 100
b = a
print(b)         # 256
c = '%x' % a     # c is the string '100'
print(c)         # 100
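
The round trip between an int and its hex text can also be written with format(), f-strings, or hex(). The sketch below uses only built-ins and assumes Python 3.6 or later for the f-strings.

```python
a = int("0x100", 16)

# Three equivalent ways to get lowercase hex digits without a prefix.
print("%x" % a)        # 100
print(format(a, "x"))  # 100
print(f"{a:x}")        # 100

# hex() and the "#x" format both include the 0x prefix.
print(hex(a))          # 0x100
print(f"{a:#x}")       # 0x100

# Either form converts back to the same integer.
assert int(hex(a), 16) == a
assert int("%x" % a, 16) == a
```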

Answer 8

If you are using the Python interpreter, you can type 0x followed by your hex value, and the interpreter will convert it for you automatically.

>>> 0xffff
65535

Answer 9

Handles hex, octal, binary, int, and float

Using the standard prefixes (i.e. 0x, 0b, 0o, and a bare leading 0 for old-style octal), this function converts any suitable string to a number. I answered this here: https://stackoverflow.com/a/58997070/2464381 but here is the needed function.

def to_number(n):
    '''Convert any number representation to a number.

    This covers: float, decimal, hex, octal, and binary numbers.
    '''
    s = str(n)
    try:
        # Base 0 infers the base from a 0x, 0o, or 0b prefix.
        return int(s, 0)
    except ValueError:
        try:
            # Python 3 doesn't accept "010" as a valid octal. You must use the
            # '0o' prefix.
            return int('0o' + s, 0)
        except ValueError:
            # Not an integer in any supported base, so try float.
            return float(s)

Answer 10

In Python 2.7, int('deadbeef', 10) does not work: it raises ValueError, because the letters a to f are not valid base-10 digits.

The following works for me:

>>> a = int('deadbeef', 16)
>>> float(a)
3735928559.0

Answer 11

With the '0x' prefix, you might also use the eval function.

For example:

>>> a = '0xff'
>>> eval(a)
255