How do I get the last items of a list in Python?

Question: How do I get the last items of a list in Python?

I need the last 9 numbers of a list and I’m sure there is a way to do it with slicing, but I can’t seem to get it. I can get the first 9 like this:

num_list[0:9]

Answer 0

You can use negative integers with the slicing operator for that. Here’s an example using the Python interactive interpreter:

>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
>>> a[-9:]
[4, 5, 6, 7, 8, 9, 10, 11, 12]

The important line is a[-9:].


Answer 1

A negative index counts from the end of the list, so:

num_list[-9:]
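
A few edge cases of this slice are worth keeping in mind. The following is only a sketch: the name num_list comes from the question, while the values and the other variable names are made up for illustration. It shows that slicing does not raise on short lists, and that a computed count of zero behaves differently from what one might expect, because -0 is the same as 0:

```python
num_list = list(range(1, 13))  # illustrative data: 1..12

# Last nine items, in their original order.
last_nine = num_list[-9:]

# Slicing is forgiving: asking for more items than exist returns the whole list.
short = [1, 2, 3]
all_of_short = short[-9:]

# Pitfall: -0 == 0, so a computed n of zero yields the full list, not an empty one.
n = 0
not_empty = num_list[-n:]

# Computing the start index explicitly avoids the pitfall.
safe_empty = num_list[len(num_list) - n:]
```

If n can be zero in your code, the explicit len(...) - n form is probably the safer choice.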

Answer 2

Slicing

Python slicing is an incredibly fast operation, and it’s a handy way to quickly access parts of your data.

Slice notation to get the last nine elements from a list (or any other sequence that supports it, like a string) would look like this:

num_list[-9:]

When I see this, I read the part in the brackets as “9th from the end, to the end.” (Actually, I abbreviate it mentally as “-9, on”)

Explanation:

The full notation is

sequence[start:stop:step]

But the colon is what tells Python you’re giving it a slice and not a regular index. That’s why the idiomatic way of copying lists in Python 2 is

list_copy = sequence[:]

And clearing them is with:

del my_list[:]

(Lists get list.copy and list.clear in Python 3.)

Give your slices a descriptive name!

You may find it useful to separate forming the slice from passing it to the list.__getitem__ method (that’s what the square brackets do). Even if you’re not new to it, it keeps your code more readable so that others that may have to read your code can more readily understand what you’re doing.

However, you can’t just assign some integers separated by colons to a variable. You need to use the slice object:

last_nine_slice = slice(-9, None)

The second argument, None, is required so that the first argument is interpreted as the start argument; otherwise it would be interpreted as the stop argument.

You can then pass the slice object to your sequence:

>>> list(range(100))[last_nine_slice]
[91, 92, 93, 94, 95, 96, 97, 98, 99]
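
To illustrate why naming slices can help readability, here is a small sketch. The record layout and the names DATE, CLUSTER and LAST_SIX are hypothetical and only serve the example; the same slice objects work on any sequence, including strings:

```python
# Hypothetical fixed-width record, used only for illustration.
record = "2014-01-01 cluster-A 001000"

# Named slices document what each range of the record means.
DATE = slice(0, 10)
CLUSTER = slice(11, 20)
LAST_SIX = slice(-6, None)  # None as stop means "to the end"

date = record[DATE]
cluster = record[CLUSTER]
budget = int(record[LAST_SIX])
```

Compared with record[-6:] scattered through the code, the named version says what the last six characters are supposed to be.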

islice

islice from the itertools module is another, possibly performant, way to get this. islice doesn’t take negative arguments, so you must first pass your list to reversed, which works for any object that has a __reversed__ special method (lists have one) and for any sequence. Note that the items then come out in reverse order.

>>> from itertools import islice
>>> islice(reversed(range(100)), 0, 9)
<itertools.islice object at 0xffeb87fc>

islice allows for lazy evaluation of the data pipeline, so to materialize the data, pass it to a constructor (like list):

>>> list(islice(reversed(range(100)), 0, 9))
[99, 98, 97, 96, 95, 94, 93, 92, 91]
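
If the original order matters, the result can be reversed once more. This is a minimal sketch under the assumption that the data is a list (the names data, last_nine_reversed and last_nine are illustrative):

```python
from itertools import islice

data = list(range(100))

# reversed() is lazy and islice() stops after nine items,
# so only the tail of the list is visited.
last_nine_reversed = list(islice(reversed(data), 9))

# Reverse once more to get the items back in their original order.
last_nine = last_nine_reversed[::-1]
```

For a plain list this gives the same result as data[-9:], which is usually the simpler spelling.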

Answer 3

The last 9 elements can be read from left to right using numlist[-9:], or from right to left using numlist[:-10:-1], whichever you prefer.

>>> a = list(range(17))
>>> print(a)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>>> print(a[-9:])
[8, 9, 10, 11, 12, 13, 14, 15, 16]
>>> print(a[:-10:-1])
[16, 15, 14, 13, 12, 11, 10, 9, 8]
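
The stop index is exclusive, which is why -10 (and not -9) yields nine items when stepping backwards. Generalized to any count n, the stop becomes -n - 1. The helper name last_n_reversed below is hypothetical, a sketch rather than an established idiom:

```python
def last_n_reversed(seq, n):
    """Return the last n items of seq, last item first (hypothetical helper).

    With a step of -1 the slice starts at the final item and stops
    just before index -n - 1, so exactly n items are taken.
    """
    return seq[:-n - 1:-1]


a = list(range(17))
nine = last_n_reversed(a, 9)
none_at_all = last_n_reversed(a, 0)
too_many = last_n_reversed(a, 100)
from_string = last_n_reversed("abcdef", 2)
```

The same result can be had in two steps with a[-9:][::-1], which some readers may find easier to follow.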

Answer 4

Here are several options for getting the “tail” items of an iterable:

Given

n = 9
iterable = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Desired Output

[2, 3, 4, 5, 6, 7, 8, 9, 10]

Code

We get the latter output using any of the following options:

from collections import deque
import itertools

import more_itertools


# A: Slicing
iterable[-n:]


# B: Implement an itertools recipe
def tail(n, iterable):
    """Return an iterator over the last *n* items of *iterable*.

        >>> t = tail(3, 'ABCDEFG')
        >>> list(t)
        ['E', 'F', 'G']

    """
    return iter(deque(iterable, maxlen=n))
list(tail(n, iterable))


# C: Use an implemented recipe, via more_itertools
list(more_itertools.tail(n, iterable))


# D: islice, via itertools
list(itertools.islice(iterable, len(iterable)-n, None))


# E: Negative islice, via more_itertools
list(more_itertools.islice_extended(iterable, -n, None))

Details

  • A. Traditional Python slicing is inherent to the language. This option works with sequences such as strings, lists and tuples. However, this kind of slicing does not work on iterators, e.g. iter(iterable).
  • B. An itertools recipe. It is generalized to work on any iterable and resolves the iterator issue in the last solution. This recipe must be implemented manually as it is not officially included in the itertools module.
  • C. Many recipes, including the latter tool (B), have been conveniently implemented in third-party packages. Installing and importing these libraries obviates manual implementation. One of these libraries is called more_itertools (install via pip install more-itertools); see more_itertools.tail.
  • D. A member of the itertools library. Note, itertools.islice does not support negative slicing.
  • E. Another tool is implemented in more_itertools that generalizes itertools.islice to support negative slicing; see more_itertools.islice_extended.

Which one do I use?

It depends. In most cases, slicing (option A, as mentioned in other answers) is the simplest option, as it is built into the language and supports most sequence types. For more general iterators, use option B, C or E; option D calls len(), so it only works on sized iterables. Note that options C and E require installing a third-party library, which some users may find useful.
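
To make the difference between options A and B concrete, here is a small sketch using a generator. The generator squares is made up for illustration; tail is the recipe from option B:

```python
from collections import deque


def tail(n, iterable):
    """Return an iterator over the last n items of iterable (option B)."""
    return iter(deque(iterable, maxlen=n))


def squares(limit):
    """Illustrative generator: yields 0, 1, 4, 9, ... for limit values."""
    for i in range(limit):
        yield i * i


# Option B works on a generator, which has no length and no indexing.
last_three = list(tail(3, squares(12)))

# Option A does not: generators cannot be sliced.
try:
    squares(12)[-3:]
    sliced = True
except TypeError:
    sliced = False
```

The deque keeps at most n items in memory at any time, but it still has to consume the whole iterable to reach the end.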


How do I split a multi-line string into multiple lines?

Question: How do I split a multi-line string into multiple lines?

I have a multi-line string literal, and I want to perform an operation on each line, like so:

inputString = """Line 1
Line 2
Line 3"""

I want to do something like the following:

for line in inputString:
    doStuff()

Answer 0

inputString.splitlines()

This gives you a list with one item per line; the splitlines() method is designed to split a string at line boundaries into list elements.
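
Applied to the loop from the question, this might look like the sketch below. The function do_stuff is a placeholder for whatever per-line operation is needed, and the snake_case names are illustrative:

```python
input_string = """Line 1
Line 2
Line 3"""


def do_stuff(line):
    """Placeholder for the per-line operation (hypothetical)."""
    return line.upper()


# Iterating over the list returned by splitlines() visits one line at a time.
results = [do_stuff(line) for line in input_string.splitlines()]

# Iterating over the string itself visits one character at a time,
# which is why the loop in the question does not work as intended.
first_chars = list(input_string)[:4]
```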


Answer 1

Like the others said:

inputString.split('\n')  # --> ['Line 1', 'Line 2', 'Line 3']

The following Python 2 call is identical to the above, but the string module’s functions are deprecated (and were removed entirely in Python 3), so it should be avoided:

import string
string.split(inputString, '\n')  # Python 2 only; --> ['Line 1', 'Line 2', 'Line 3']

Alternatively, if you want each line to include the break sequence (CR,LF,CRLF), use the splitlines method with a True argument:

inputString.splitlines(True)  # --> ['Line 1\n', 'Line 2\n', 'Line 3']
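
One practical use of keeping the line endings is that the original string can be rebuilt exactly. A short sketch, with made-up sample text that mixes CRLF and LF endings:

```python
text = "Line 1\r\nLine 2\nLine 3"

# keepends=True preserves whatever line ending each line had.
with_ends = text.splitlines(keepends=True)

# Without keepends the endings are dropped.
without_ends = text.splitlines()

# Joining the kept-ends pieces reproduces the input exactly.
round_trip = "".join(with_ends)
```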

Answer 2

Use str.splitlines().

splitlines() handles newlines properly, unlike split("\n").

It also has the advantage mentioned by @efotinis of optionally including the newline character in the split result when called with a True argument.


Why you shouldn’t use split("\n"):

\n, in Python, represents a Unix line break (ASCII decimal code 10), independently of the platform where you run it. However, the line-break representation is platform-dependent. On Windows, a line break is two characters, CR and LF (ASCII decimal codes 13 and 10, AKA \r and \n), while on any modern Unix (including OS X), it’s the single character LF.

print, for example, works correctly even if you have a string with line endings that don’t match your platform:

>>> print(" a \n b \r\n c ")
 a 
 b 
 c

However, explicitly splitting on "\n" leaves a stray \r behind wherever the text uses Windows line endings:

>>> " a \n b \r\n c ".split("\n")
[' a ', ' b \r', ' c ']

Even if you use os.linesep, it will only split according to the newline separator of your own platform, and will fail if you’re processing text created on another platform, or text with a bare \n. On Windows, for example:

>>> import os
>>> " a \n b \r\n c ".split(os.linesep)
[' a \n b ', ' c ']

splitlines solves all these problems:

>>> " a \n b \r\n c ".splitlines()
[' a ', ' b ', ' c ']

Reading files in text mode partially mitigates the newline representation problem. In Python 2, text mode only makes a difference on Windows, where it converts between \n and the platform’s \r\n; on Unix systems, files are effectively read as binary, so using split('\n') on a Unix system with a Windows file will lead to undesired behavior. In Python 3, open() in text mode translates \r\n and \r to \n on every platform by default (newline=None). Either way, it’s not unusual to process strings with potentially different newlines from other sources, such as from a socket.
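
Two further differences between the methods are easy to miss: a trailing newline, and the extra line boundaries that str.splitlines() recognizes. A short sketch with made-up sample strings:

```python
text = "a\nb\r\nc\n"

# split("\n") keeps the stray "\r" and produces an empty string
# after the trailing newline.
by_split = text.split("\n")

# splitlines() handles both endings and produces no trailing empty item.
by_splitlines = text.splitlines()

# splitlines() also treats some other characters as line boundaries,
# for example the form feed "\x0c"; split("\n") does not.
form_feed_splitlines = "x\x0cy".splitlines()
form_feed_split = "x\x0cy".split("\n")
```

If the data may contain form feeds or similar characters that should not end a line, that last behavior is worth checking before switching to splitlines().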


Answer 3

It might be overkill in this particular case, but another option is to use StringIO to create a file-like object (io.StringIO in Python 3; the StringIO module in Python 2):

import io

for line in io.StringIO(inputString):
    doStuff()
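
One detail to be aware of with the file-like approach is that each line keeps its trailing newline, just as when reading a real file. A minimal Python 3 sketch (variable names are illustrative):

```python
import io

input_string = "Line 1\nLine 2\nLine 3"

# Iterating over a StringIO yields lines with their "\n" still attached.
raw_lines = list(io.StringIO(input_string))

# Strip the newline if it is not wanted.
lines = [line.rstrip("\n") for line in io.StringIO(input_string)]
```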

Answer 4

The original post asked for code that prints the rows matching some condition, plus the row following each match. My implementation would be this:

text = """1 sfasdf
asdfasdf
2 sfasdf
asdfgadfg
1 asfasdf
sdfasdgf
"""

lines = text.splitlines()
rows_to_print = set()  # {} would create a dict, which cannot be combined with a set using |

for i, line in enumerate(lines):
    if line.startswith('1'):
        rows_to_print |= {i, i + 1}

for i in sorted(rows_to_print):
    if i < len(lines):  # the row after a match may not exist
        print(lines[i])

Answer 5

I wish comments had proper code formatting, because I think @1_CR’s answer deserves more upvotes, and I would like to augment it. Anyway, he led me to the following technique: it uses cStringIO if available. (Note that cStringIO and StringIO are not the same: cStringIO is a built-in, so you cannot subclass it. For basic operations the syntax is identical, though, so you can do this.) Both modules exist only in Python 2; in Python 3, io.StringIO replaces them.

try:
    import cStringIO
    StringIO = cStringIO
except ImportError:
    import StringIO

for line in StringIO.StringIO(variable_with_multiline_string):
    print(line.strip())

How do I change the figure size with subplots?

Question: How do I change the figure size with subplots?

I came across this example on the Matplotlib website. I was wondering if it was possible to increase the figure size.

I tried with

f.figsize(15,15)

but it does nothing.


Answer 0

If you already have the figure object use:

f.set_figheight(15)
f.set_figwidth(15)

But if you use the .subplots() command (as in the examples you’re showing) to create a new figure you can also use:

f, axs = plt.subplots(2,2,figsize=(15,15))
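
Width and height can also be set in a single call with set_size_inches on an existing figure. The sketch below assumes Matplotlib is installed and selects the non-interactive Agg backend only so that it runs without a display; the sizes are arbitrary:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Create the figure with an initial size (in inches).
f, axs = plt.subplots(2, 2, figsize=(15, 15))

# Resize the existing figure: width first, then height.
f.set_size_inches(10, 4)
width, height = f.get_size_inches()

plt.close(f)
```

The reason f.figsize(15,15) in the question has no effect is that figsize is a keyword argument used when creating a figure, not a method of the figure object.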

Answer 1

Alternatively, create a figure() object using the figsize argument and then use add_subplot to add your subplots. E.g.

import matplotlib.pyplot as plt
import numpy as np

f = plt.figure(figsize=(10,3))
ax = f.add_subplot(121)
ax2 = f.add_subplot(122)
x = np.linspace(0,4,1000)
ax.plot(x, np.sin(x))
ax2.plot(x, np.cos(x), 'r:')

Benefits of this method are that the syntax is closer to calls of subplot() instead of subplots(). E.g. subplots doesn’t seem to support using a GridSpec for controlling the spacing of the subplots, but both subplot() and add_subplot() do.


Get a list from a pandas DataFrame column

Question: Get a list from a pandas DataFrame column

I have an Excel document which looks like this:

cluster load_date   budget  actual  fixed_price
A   1/1/2014    1000    4000    Y
A   2/1/2014    12000   10000   Y
A   3/1/2014    36000   2000    Y
B   4/1/2014    15000   10000   N
B   4/1/2014    12000   11500   N
B   4/1/2014    90000   11000   N
C   7/1/2014    22000   18000   N
C   8/1/2014    30000   28960   N
C   9/1/2014    53000   51200   N

I want to be able to return the contents of column 1 – cluster as a list, so I can run a for loop over it, and create an excel worksheet for every cluster.

Is it also possible, to return the contents of a whole row to a list? e.g.

list = [], list[column1] or list[df.ix(row1)]

Answer 0

Pandas DataFrame columns are pandas Series when you pull them out, and you can call x.tolist() on them to turn them into a Python list. Alternatively, you can cast with list(x).

import pandas as pd

data_dict = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
             'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(data_dict)

print(f"DataFrame:\n{df}\n")
print(f"column types:\n{df.dtypes}")

col_one_list = df['one'].tolist()

col_one_arr = df['one'].to_numpy()

print(f"\ncol_one_list:\n{col_one_list}\ntype:{type(col_one_list)}")
print(f"\ncol_one_arr:\n{col_one_arr}\ntype:{type(col_one_arr)}")

Output:

DataFrame:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4

column types:
one    float64
two      int64
dtype: object

col_one_list:
[1.0, 2.0, 3.0, nan]
type:<class 'list'>

col_one_arr:
[ 1.  2.  3. nan]
type:<class 'numpy.ndarray'>

Answer 1

This returns a numpy array:

arr = df["cluster"].to_numpy()

This returns a numpy array of unique values:

unique_arr = df["cluster"].unique()

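You can also use numpy to get the unique values, although the two methods differ: Series.unique() returns the values in order of first appearance, whereas np.unique() returns them sorted:

import numpy as np

arr = df["cluster"].to_numpy()
unique_arr = np.unique(arr)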
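
The ordering difference between the two approaches can be seen in a small sketch. It assumes pandas and numpy are installed; the sample column is made up and deliberately unsorted:

```python
import numpy as np
import pandas as pd

# Illustrative data with clusters in a non-alphabetical order.
df = pd.DataFrame({"cluster": ["B", "A", "B", "C", "A"]})

# Series.unique() keeps the order in which values first appear.
in_order = df["cluster"].unique().tolist()

# np.unique() returns the unique values sorted.
sorted_unique = np.unique(df["cluster"].to_numpy()).tolist()
```

For a loop that creates one worksheet per cluster, either list would do; the choice only affects the order in which the clusters are visited.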

Answer 2

Example conversion:

NumPy array -> pandas DataFrame -> list from one pandas column

NumPy array

import numpy as np
import pandas as pd

data = np.array([[10,20,30], [20,30,60], [30,60,90]])

Convert the NumPy array into a pandas DataFrame

dataPd = pd.DataFrame(data = data)

print(dataPd)
    0   1   2
0  10  20  30
1  20  30  60
2  30  60  90

Convert one pandas column to a list

pdToList = list(dataPd[2])  # the column labels are the integers 0, 1, 2, not strings


Answer 3

As this question has attracted a lot of attention and there are several ways to accomplish your task, let me present several options.

Those are all one-liners by the way ;)

Starting with:

df
  cluster load_date budget actual fixed_price
0       A  1/1/2014   1000   4000           Y
1       A  2/1/2014  12000  10000           Y
2       A  3/1/2014  36000   2000           Y
3       B  4/1/2014  15000  10000           N
4       B  4/1/2014  12000  11500           N
5       B  4/1/2014  90000  11000           N
6       C  7/1/2014  22000  18000           N
7       C  8/1/2014  30000  28960           N
8       C  9/1/2014  53000  51200           N

Overview of potential operations:

ser_aggCol (collapse each column to a list)
cluster          [A, A, A, B, B, B, C, C, C]
load_date      [1/1/2014, 2/1/2014, 3/1/2...
budget         [1000, 12000, 36000, 15000...
actual         [4000, 10000, 2000, 10000,...
fixed_price      [Y, Y, Y, N, N, N, N, N, N]
dtype: object


ser_aggRows (collapse each row to a list)
0     [A, 1/1/2014, 1000, 4000, Y]
1    [A, 2/1/2014, 12000, 10000...
2    [A, 3/1/2014, 36000, 2000, Y]
3    [B, 4/1/2014, 15000, 10000...
4    [B, 4/1/2014, 12000, 11500...
5    [B, 4/1/2014, 90000, 11000...
6    [C, 7/1/2014, 22000, 18000...
7    [C, 8/1/2014, 30000, 28960...
8    [C, 9/1/2014, 53000, 51200...
dtype: object


df_gr (here you get lists for each cluster)
                             load_date                 budget                 actual fixed_price
cluster                                                                                         
A        [1/1/2014, 2/1/2014, 3/1/2...   [1000, 12000, 36000]    [4000, 10000, 2000]   [Y, Y, Y]
B        [4/1/2014, 4/1/2014, 4/1/2...  [15000, 12000, 90000]  [10000, 11500, 11000]   [N, N, N]
C        [7/1/2014, 8/1/2014, 9/1/2...  [22000, 30000, 53000]  [18000, 28960, 51200]   [N, N, N]


a list of separate dataframes for each cluster

df for cluster A
  cluster load_date budget actual fixed_price
0       A  1/1/2014   1000   4000           Y
1       A  2/1/2014  12000  10000           Y
2       A  3/1/2014  36000   2000           Y

df for cluster B
  cluster load_date budget actual fixed_price
3       B  4/1/2014  15000  10000           N
4       B  4/1/2014  12000  11500           N
5       B  4/1/2014  90000  11000           N

df for cluster C
  cluster load_date budget actual fixed_price
6       C  7/1/2014  22000  18000           N
7       C  8/1/2014  30000  28960           N
8       C  9/1/2014  53000  51200           N

just the values of column load_date
0    1/1/2014
1    2/1/2014
2    3/1/2014
3    4/1/2014
4    4/1/2014
5    4/1/2014
6    7/1/2014
7    8/1/2014
8    9/1/2014
Name: load_date, dtype: object


just the values of column number 2
0     1000
1    12000
2    36000
3    15000
4    12000
5    90000
6    22000
7    30000
8    53000
Name: budget, dtype: object


just the values of row number 7
cluster               C
load_date      8/1/2014
budget            30000
actual            28960
fixed_price           N
Name: 7, dtype: object


============================== JUST FOR COMPLETENESS ==============================


you can convert a series to a list
['C', '8/1/2014', '30000', '28960', 'N']
<class 'list'>


you can convert a dataframe to a nested list
[['A', '1/1/2014', '1000', '4000', 'Y'], ['A', '2/1/2014', '12000', '10000', 'Y'], ['A', '3/1/2014', '36000', '2000', 'Y'], ['B', '4/1/2014', '15000', '10000', 'N'], ['B', '4/1/2014', '12000', '11500', 'N'], ['B', '4/1/2014', '90000', '11000', 'N'], ['C', '7/1/2014', '22000', '18000', 'N'], ['C', '8/1/2014', '30000', '28960', 'N'], ['C', '9/1/2014', '53000', '51200', 'N']]
<class 'list'>

the content of a dataframe can be accessed as a numpy.ndarray
[['A' '1/1/2014' '1000' '4000' 'Y']
 ['A' '2/1/2014' '12000' '10000' 'Y']
 ['A' '3/1/2014' '36000' '2000' 'Y']
 ['B' '4/1/2014' '15000' '10000' 'N']
 ['B' '4/1/2014' '12000' '11500' 'N']
 ['B' '4/1/2014' '90000' '11000' 'N']
 ['C' '7/1/2014' '22000' '18000' 'N']
 ['C' '8/1/2014' '30000' '28960' 'N']
 ['C' '9/1/2014' '53000' '51200' 'N']]
<class 'numpy.ndarray'>

Code:

# prefix ser refers to pd.Series object
# prefix df refers to pd.DataFrame object
# prefix lst refers to list object

import pandas as pd
import numpy as np

df=pd.DataFrame([
        ['A',   '1/1/2014',    '1000',    '4000',    'Y'],
        ['A',   '2/1/2014',    '12000',   '10000',   'Y'],
        ['A',   '3/1/2014',    '36000',   '2000',    'Y'],
        ['B',   '4/1/2014',    '15000',   '10000',   'N'],
        ['B',   '4/1/2014',    '12000',   '11500',   'N'],
        ['B',   '4/1/2014',    '90000',   '11000',   'N'],
        ['C',   '7/1/2014',    '22000',   '18000',   'N'],
        ['C',   '8/1/2014',    '30000',   '28960',   'N'],
        ['C',   '9/1/2014',    '53000',   '51200',   'N']
        ], columns=['cluster', 'load_date',   'budget',  'actual',  'fixed_price'])
print('df',df, sep='\n', end='\n\n')

ser_aggCol=df.aggregate(lambda x: [x.tolist()], axis=0).map(lambda x:x[0])
print('ser_aggCol (collapse each column to a list)',ser_aggCol, sep='\n', end='\n\n\n')

ser_aggRows=pd.Series(df.values.tolist()) 
print('ser_aggRows (collapse each row to a list)',ser_aggRows, sep='\n', end='\n\n\n')

df_gr=df.groupby('cluster').agg(lambda x: list(x))
print('df_gr (here you get lists for each cluster)',df_gr, sep='\n', end='\n\n\n')

lst_dfFiltGr=[ df.loc[df['cluster']==val,:] for val in df['cluster'].unique() ]
print('a list of separate dataframes for each cluster', sep='\n', end='\n\n')
for dfTmp in lst_dfFiltGr:
    print('df for cluster '+str(dfTmp.loc[dfTmp.index[0],'cluster']),dfTmp, sep='\n', end='\n\n')

ser_singleColLD=df.loc[:,'load_date']
print('just the values of column load_date',ser_singleColLD, sep='\n', end='\n\n\n')

ser_singleCol2=df.iloc[:,2]
print('just the values of column number 2',ser_singleCol2, sep='\n', end='\n\n\n')

ser_singleRow7=df.iloc[7,:]
print('just the values of row number 7',ser_singleRow7, sep='\n', end='\n\n\n')

print('='*30+' JUST FOR COMPLETENESS '+'='*30, end='\n\n\n')

lst_fromSer=ser_singleRow7.tolist()
print('you can convert a series to a list',lst_fromSer, type(lst_fromSer), sep='\n', end='\n\n\n')

lst_fromDf=df.values.tolist()
print('you can convert a dataframe to a nested list',lst_fromDf, type(lst_fromDf), sep='\n', end='\n\n')

arr_fromDf=df.values
print('the content of a dataframe can be accessed as a numpy.ndarray',arr_fromDf, type(arr_fromDf), sep='\n', end='\n\n')

As pointed out by cs95, other methods should be preferred over the pandas .values attribute from pandas version 0.24 on (see here). I use it here because most people will (as of 2019) still have an older version, which does not support the new recommendations. You can check your version with print(pd.__version__).


回答 4

如果您的列只有一个值,pd.series.tolist()则将产生错误。为确保它适用于所有情况,请使用以下代码:

(
    df
        .filter(['column_name'])
        .values
        .reshape(1, -1)
        .ravel()
        .tolist()
)

If your column will only have one value, something like pd.Series.tolist() will produce an error. To guarantee that it will work for all cases, use the code below:

(
    df
        .filter(['column_name'])
        .values
        .reshape(1, -1)
        .ravel()
        .tolist()
)

回答 5

假设读取excel工作表后数据框的名称为df,获取一个空列表(例如dataList),逐行遍历数据框,然后像以下内容一样追加到您的空列表中:

dataList = [] #empty list
for index, row in df.iterrows(): 
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

要么,

dataList = [] #empty list
for row in df.itertuples(): 
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

不,如果您打印dataList,则将在中获得每一行作为列表dataList

Assuming the name of the dataframe after reading the Excel sheet is df, take an empty list (e.g. dataList), iterate through the dataframe row by row, and append each row to your empty list like this:

dataList = [] #empty list
for index, row in df.iterrows(): 
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

Or,

dataList = [] #empty list
for row in df.itertuples(): 
    mylist = [row.cluster, row.load_date, row.budget, row.actual, row.fixed_price]
    dataList.append(mylist)

Now, if you print dataList, you will get each row as a list within dataList.


回答 6

amount = list()
for col in df.columns:
    val = list(df[col])
    for v in val:
        amount.append(v)

Loop backwards using indices in Python?

Question: Loop backwards using indices in Python?

我正在尝试从100循环到0。如何在Python中执行此操作?

for i in range (100,0) 不起作用。

I am trying to loop from 100 to 0. How do I do this in Python?

for i in range (100,0) doesn’t work.


回答 0

试试看range(100,-1,-1),第三个参数是要使用的增量(在此处记录)。

此处记录 “范围”选项,开始,停止,步骤)

Try range(100,-1,-1), the 3rd argument being the increment to use (documented here).

(“range” options, start, stop, step are documented here)
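As a quick illustration of the start, stop and step arguments mentioned above, here is a minimal sketch. The helper name `countdown` is made up for this example and is not part of the original answer. It relies on the fact that range() excludes its stop value, which is why -1 is needed to include 0.

```python
def countdown(start, stop=0):
    """Return the integers from start down to stop, both inclusive."""
    # range() excludes its stop argument, so go one step past the target.
    return list(range(start, stop - 1, -1))

print(countdown(5))          # [5, 4, 3, 2, 1, 0]
print(list(range(100, 0)))   # [] because the default step is +1
```

The second line shows why the loop in the question never runs: with the default step of +1 there is nothing between 100 and 0 to iterate over.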


回答 1

我认为这是最易读的:

for i in reversed(xrange(101)):
    print i,

In my opinion, this is the most readable:

for i in reversed(xrange(101)):
    print i,

回答 2

for i in range(100, -1, -1):

and some slightly longer (and slower) solutions:

for i in reversed(range(101)):

for i in range(101)[::-1]:

回答 3

通常在Python中,您可以使用负索引从背面开始:

numbers = [10, 20, 30, 40, 50]
for i in xrange(len(numbers)):
    print numbers[-i - 1]

结果:

50
40
30
20
10

Generally in Python, you can use negative indices to start from the back:

numbers = [10, 20, 30, 40, 50]
for i in xrange(len(numbers)):
    print numbers[-i - 1]

Result:

50
40
30
20
10

回答 4

为什么您的代码不起作用

您的代码for i in range (100, 0)很好,除了

step默认情况下,第三个参数()是+1。因此,必须向range()指定第三个参数才能-1向后退一步。

for i in range(100, -1, -1):
    print(i)

注意:这在输出中包括100&0。

有多种方法。

更好的方法

对于pythonic方式,请检查PEP 0322

这是Python3 pythonic示例,可从100打印到0(包括100和0)。

for i in reversed(range(101)):
    print(i)

Why your code didn’t work

Your code for i in range(100, 0) is fine, except that the third parameter (step) is +1 by default. So you have to pass -1 as the third argument to range() to step backwards.

for i in range(100, -1, -1):
    print(i)

NOTE: This includes 100 & 0 in the output.

There are multiple ways.

Better Way

For a more Pythonic way, see PEP 322.

This is a Pythonic Python 3 example that prints from 100 to 0 (including 100 and 0).

for i in reversed(range(101)):
    print(i)
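If the goal is to walk a list backwards while still knowing each index, one option is to iterate over reversed(range(len(...))). This is only a sketch; `items` is a made-up example list, not something from the question.

```python
items = ["a", "b", "c", "d"]

# reversed() accepts a range object directly, so no reversed copy of the list is built.
pairs = [(i, items[i]) for i in reversed(range(len(items)))]
print(pairs)  # [(3, 'd'), (2, 'c'), (1, 'b'), (0, 'a')]
```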

回答 5

另一个解决方案:

z = 10
for x in range (z):
   y = z-x
   print y

结果:

10
9
8
7
6
5
4
3
2
1

提示:如果您使用此方法对列表中的索引进行计数,则您希望从’y’值开始为-1,因为列表索引将从0开始。

Another solution:

z = 10
for x in range (z):
   y = z-x
   print y

Result:

10
9
8
7
6
5
4
3
2
1

Tip: If you are using this method to count back through the indices of a list, you will want to subtract 1 from the 'y' value, as list indices begin at 0.


回答 6

解决您的问题的简单答案可能是这样的:

for i in range(100):
    k = 100 - i
    print(k)

The simple answer to solve your problem could be like this:

for i in range(100):
    k = 100 - i
    print(k)

回答 7

for var in range(10,-1,-1) works


回答 8

简短而甜美。这是我参加codeAcademy类时的解决方案。以rev顺序打印字符串。

def reverse(text):
    string = ""
    for i in range(len(text)-1,-1,-1):
        string += text[i]
    return string    

Short and sweet. This was my solution when doing the Codecademy course. It returns the string in reverse order.

def reverse(text):
    string = ""
    for i in range(len(text)-1,-1,-1):
        string += text[i]
    return string    
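The same backwards index loop can also feed str.join, which builds the result in one step instead of growing a string with +=. This is a sketch of an alternative, not the course's expected solution; `reverse_join` is a hypothetical name.

```python
def reverse_join(text):
    """Collect characters from last to first, then join them once."""
    return "".join(text[i] for i in range(len(text) - 1, -1, -1))

print(reverse_join("hola"))  # aloh
print("hola"[::-1])          # aloh
```

The slice on the last line is the usual shortcut when the exercise does not forbid it.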

回答 9

在您的情况下100 - i,您始终可以增加范围并从变量中减去i in range( 0, 101 )

for i in range( 0, 101 ):
    print 100 - i

You can always iterate over an increasing range and subtract the loop variable from the upper bound: in your case, use 100 - i where i is in range(0, 101).

for i in range( 0, 101 ):
    print 100 - i

回答 10

我在一种代码学院练习中尝试过此操作(在不使用reversed或:: -1的情况下反转字符串中的字符)

def reverse(text):
    chars= []
    l = len(text)
    last = l-1
    for i in range (l):
        chars.append(text[last])
        last-=1

    result= ""   
    for c in chars:
        result += c
    return result
print reverse('hola')

I tried this in one of the Codecademy exercises (reversing the characters in a string without using reversed or [::-1]):

def reverse(text):
    chars= []
    l = len(text)
    last = l-1
    for i in range (l):
        chars.append(text[last])
        last-=1

    result= ""   
    for c in chars:
        result += c
    return result
print reverse('hola')

回答 11

我想同时向后遍历两个列表,所以我需要负索引。这是我的解决方案:

a= [1,3,4,5,2]
for i in range(-1, -len(a) - 1, -1):
    print(i, a[i])

结果:

-1 2
-2 5
-3 4
-4 3
-5 1

I wanted to loop through two lists backwards at the same time, so I needed the negative index. This is my solution:

a= [1,3,4,5,2]
for i in range(-1, -len(a) - 1, -1):
    print(i, a[i])

Result:

-1 2
-2 5
-3 4
-4 3
-5 1
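For the two-list case described above, another possible approach is to pair the reversed iterators with zip, which avoids index arithmetic altogether. The lists `names` and `scores` are invented for this sketch and are assumed to have the same length.

```python
names = ["ann", "bob", "cy"]
scores = [90, 75, 82]

# zip() pairs the two reversed iterators element by element.
backwards = list(zip(reversed(names), reversed(scores)))
print(backwards)  # [('cy', 82), ('bob', 75), ('ann', 90)]
```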

回答 12

哦,好吧,我读错了问题,我想这是关于在数组中向后移动吗?如果是这样,我有这个:

array = ["ty", "rogers", "smith", "davis", "tony", "jack", "john", "jill", "harry", "tom", "jane", "hilary", "jackson", "andrew", "george", "rachel"]


counter = 0   

for loop in range(len(array)):
    if loop <= len(array):
        counter = -1
        reverseEngineering = loop + counter
        print(array[reverseEngineering])

Oh okay, I read the question wrong. I guess it's about going backwards through an array? If so, I have this:

array = ["ty", "rogers", "smith", "davis", "tony", "jack", "john", "jill", "harry", "tom", "jane", "hilary", "jackson", "andrew", "george", "rachel"]

for loop in range(len(array)):
    # loop counts 0, 1, 2, ...; negate it to index from the end (-1, -2, ...)
    reverseIndex = -(loop + 1)
    print(array[reverseIndex])

回答 13

您还可以在python中创建自定义反向机制。可以在任何地方用于循环迭代向后

class Reverse:
    """Iterator for looping over a sequence backwards"""
    def __init__(self, seq):
        self.seq = seq
        self.index = len(seq)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.seq[self.index]


>>> d = [1,2,3,4,5]
>>> for i in Reverse(d):
...   print(i)
... 
5
4
3
2
1

You can also create a custom reverse iterator in Python, which can be used anywhere to loop over a sequence backwards:

class Reverse:
    """Iterator for looping over a sequence backwards"""
    def __init__(self, seq):
        self.seq = seq
        self.index = len(seq)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.seq[self.index]


>>> d = [1,2,3,4,5]
>>> for i in Reverse(d):
...   print(i)
... 
5
4
3
2
1
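The same behaviour can be sketched more compactly with a generator function, which takes care of the iterator protocol automatically. `reverse_iter` is a hypothetical name; for ordinary sequences the built-in reversed() already does this.

```python
def reverse_iter(seq):
    """Yield the items of an indexable sequence from last to first."""
    for index in range(len(seq) - 1, -1, -1):
        yield seq[index]

print(list(reverse_iter([1, 2, 3, 4, 5])))  # [5, 4, 3, 2, 1]
```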

回答 14

a = 10
for i in sorted(range(a), reverse=True):
    print i

Check if all elements in a list are identical

Question: Check if all elements in a list are identical

我需要以下功能:

输入:alist

输出

  • True 如果输入列表中的所有元素使用标准相等运算符求值彼此相等;
  • False 除此以外。

性能:当然,我不希望产生任何不必要的开销。

我认为最好:

  • 遍历列表
  • 比较相邻元素
  • AND所有结果布尔值

但是我不确定最Pythonic的方法是什么。


缺少短路功能只会损害早期输入不相等的长输入(超过50个元素)。如果这种情况经常发生(频率取决于列表的长度),则需要短路。最好的短路算法似乎是@KennyTM checkEqual1。但是,它为此付出了巨大的代价:

  • 性能几乎是同类产品的20倍
  • 短名单上的性能提高了2.5倍

如果没有出现早期输入不相等的长输入(或发生的次数很少),则不需要短路。然后,到目前为止最快的是@Ivo van der Wijk解决方案。

I need the following function:

Input: a list

Output:

  • True if all elements in the input list evaluate as equal to each other using the standard equality operator;
  • False otherwise.

Performance: of course, I prefer not to incur any unnecessary overhead.

I feel it would be best to:

  • iterate through the list
  • compare adjacent elements
  • and AND all the resulting Boolean values

But I’m not sure what’s the most Pythonic way to do that.


The lack of a short-circuit feature only hurts on long inputs (over ~50 elements) that have unequal elements early on. If this occurs often enough (how often depends on how long the lists might be), short-circuiting is required. The best short-circuit algorithm seems to be @KennyTM's checkEqual1. It pays, however, a significant cost for this:

  • up to 20x slower on nearly-identical lists
  • up to 2.5x slower on short lists

If long inputs with early unequal elements don't happen (or happen sufficiently rarely), short-circuiting isn't required. Then, by far the fastest is @Ivo van der Wijk's solution.


回答 0

通用方法:

def checkEqual1(iterator):
    iterator = iter(iterator)
    try:
        first = next(iterator)
    except StopIteration:
        return True
    return all(first == rest for rest in iterator)

单线:

def checkEqual2(iterator):
   return len(set(iterator)) <= 1

也是单线的:

def checkEqual3(lst):
   return lst[1:] == lst[:-1]

这三个版本之间的区别在于:

  1. checkEqual2内容中必须是可哈希的。
  2. checkEqual1并且checkEqual2可以使用任何迭代器,但checkEqual3必须接受序列输入,通常是列表或元组之类的具体容器。
  3. checkEqual1 发现差异后立即停止。
  4. 由于checkEqual1包含更多的Python代码,因此当许多项目在开始时相等时效率较低。
  5. 由于checkEqual2checkEqual3始终执行O(N)复制操作,因此,如果您的大多数输入将返回False,则它们将花费更长的时间。
  6. 对于checkEqual2checkEqual3很难适应从a == b到的比较a is b

timeit 结果,对于Python 2.7和(仅s1,s4,s7,s9应该返回True)

s1 = [1] * 5000
s2 = [1] * 4999 + [2]
s3 = [2] + [1]*4999
s4 = [set([9])] * 5000
s5 = [set([9])] * 4999 + [set([10])]
s6 = [set([10])] + [set([9])] * 4999
s7 = [1,1]
s8 = [1,2]
s9 = []

我们得到

|     | checkEqual1 | checkEqual2 | checkEqual3  | checkEqualIvo | checkEqual6502 |
|-----|-------------|-------------|--------------|---------------|----------------|
| s1  | 1.19   msec | 348    usec | 183     usec | 51.6    usec  | 121     usec   |
| s2  | 1.17   msec | 376    usec | 185     usec | 50.9    usec  | 118     usec   |
| s3  | 4.17   usec | 348    usec | 120     usec | 264     usec  | 61.3    usec   |
|     |             |             |              |               |                |
| s4  | 1.73   msec |             | 182     usec | 50.5    usec  | 121     usec   |
| s5  | 1.71   msec |             | 181     usec | 50.6    usec  | 125     usec   |
| s6  | 4.29   usec |             | 122     usec | 423     usec  | 61.1    usec   |
|     |             |             |              |               |                |
| s7  | 3.1    usec | 1.4    usec | 1.24    usec | 0.932   usec  | 1.92    usec   |
| s8  | 4.07   usec | 1.54   usec | 1.28    usec | 0.997   usec  | 1.79    usec   |
| s9  | 5.91   usec | 1.25   usec | 0.749   usec | 0.407   usec  | 0.386   usec   |

注意:

# http://stackoverflow.com/q/3844948/
def checkEqualIvo(lst):
    return not lst or lst.count(lst[0]) == len(lst)

# http://stackoverflow.com/q/3844931/
def checkEqual6502(lst):
    return not lst or [lst[0]]*len(lst) == lst

General method:

def checkEqual1(iterator):
    iterator = iter(iterator)
    try:
        first = next(iterator)
    except StopIteration:
        return True
    return all(first == rest for rest in iterator)

One-liner:

def checkEqual2(iterator):
   return len(set(iterator)) <= 1

Also one-liner:

def checkEqual3(lst):
   return lst[1:] == lst[:-1]

The differences between the three versions are:

  1. In checkEqual2 the content must be hashable.
  2. checkEqual1 and checkEqual2 can use any iterators, but checkEqual3 must take a sequence input, typically concrete containers like a list or tuple.
  3. checkEqual1 stops as soon as a difference is found.
  4. Since checkEqual1 contains more Python code, it is less efficient when many of the items are equal in the beginning.
  5. Since checkEqual2 and checkEqual3 always perform O(N) copying operations, they will take longer if most of your input will return False.
  6. For checkEqual2 and checkEqual3 it’s harder to adapt comparison from a == b to a is b.

timeit results for Python 2.7 (only s1, s4, s7, s9 should return True):

s1 = [1] * 5000
s2 = [1] * 4999 + [2]
s3 = [2] + [1]*4999
s4 = [set([9])] * 5000
s5 = [set([9])] * 4999 + [set([10])]
s6 = [set([10])] + [set([9])] * 4999
s7 = [1,1]
s8 = [1,2]
s9 = []

we get

|     | checkEqual1 | checkEqual2 | checkEqual3  | checkEqualIvo | checkEqual6502 |
|-----|-------------|-------------|--------------|---------------|----------------|
| s1  | 1.19   msec | 348    usec | 183     usec | 51.6    usec  | 121     usec   |
| s2  | 1.17   msec | 376    usec | 185     usec | 50.9    usec  | 118     usec   |
| s3  | 4.17   usec | 348    usec | 120     usec | 264     usec  | 61.3    usec   |
|     |             |             |              |               |                |
| s4  | 1.73   msec |             | 182     usec | 50.5    usec  | 121     usec   |
| s5  | 1.71   msec |             | 181     usec | 50.6    usec  | 125     usec   |
| s6  | 4.29   usec |             | 122     usec | 423     usec  | 61.1    usec   |
|     |             |             |              |               |                |
| s7  | 3.1    usec | 1.4    usec | 1.24    usec | 0.932   usec  | 1.92    usec   |
| s8  | 4.07   usec | 1.54   usec | 1.28    usec | 0.997   usec  | 1.79    usec   |
| s9  | 5.91   usec | 1.25   usec | 0.749   usec | 0.407   usec  | 0.386   usec   |

Note:

# http://stackoverflow.com/q/3844948/
def checkEqualIvo(lst):
    return not lst or lst.count(lst[0]) == len(lst)

# http://stackoverflow.com/q/3844931/
def checkEqual6502(lst):
    return not lst or [lst[0]]*len(lst) == lst
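Timings like those in the table vary with the machine and Python version, so before re-running the benchmark it can help to confirm that the implementations agree on the expected answers. The sketch below repeats four of the functions so that it is self-contained and uses shortened inputs (an assumption made only to keep the check fast). checkEqual2 is left out because the set-valued inputs are not hashable, which is also why its cells for s4 to s6 are empty.

```python
def checkEqual1(iterator):
    iterator = iter(iterator)
    try:
        first = next(iterator)
    except StopIteration:
        return True
    return all(first == rest for rest in iterator)

def checkEqual3(lst):
    return lst[1:] == lst[:-1]

def checkEqualIvo(lst):
    return not lst or lst.count(lst[0]) == len(lst)

def checkEqual6502(lst):
    return not lst or [lst[0]] * len(lst) == lst

# Shortened versions of the benchmark inputs: (data, expected result).
cases = {
    "s1": ([1] * 50, True),
    "s2": ([1] * 49 + [2], False),
    "s3": ([2] + [1] * 49, False),
    "s4": ([{9}] * 50, True),
    "s6": ([{10}] + [{9}] * 49, False),
    "s7": ([1, 1], True),
    "s8": ([1, 2], False),
    "s9": ([], True),
}

checks = [checkEqual1, checkEqual3, checkEqualIvo, checkEqual6502]
results = {
    name: [bool(check(data)) for check in checks]
    for name, (data, expected) in cases.items()
}
for name, (data, expected) in cases.items():
    assert results[name] == [expected] * len(checks), name
print("all implementations agree")
```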

回答 1

比对序列(不是可迭代对象)使用set()更快的解决方案是仅对第一个元素进行计数。这假设列表是非空的(但是检查起来很麻烦,并自己决定结果应该在空列表中)

x.count(x[0]) == len(x)

一些简单的基准:

>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*5000', number=10000)
1.4383411407470703
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*4999+[2]', number=10000)
1.4765670299530029
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*5000', number=10000)
0.26274609565734863
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*4999+[2]', number=10000)
0.25654196739196777

A solution faster than using set() that works on sequences (not iterables) is to simply count the first element. This assumes the list is non-empty (but that’s trivial to check, and decide yourself what the outcome should be on an empty list)

x.count(x[0]) == len(x)

some simple benchmarks:

>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*5000', number=10000)
1.4383411407470703
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*4999+[2]', number=10000)
1.4765670299530029
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*5000', number=10000)
0.26274609565734863
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*4999+[2]', number=10000)
0.25654196739196777

回答 2

最简单,最优雅的方法如下:

all(x==myList[0] for x in myList)

(是的,这甚至适用于空列表!这是因为这是python具有惰性语义的少数情况之一。)

关于性能,这将尽早失败,因此它是渐近最佳的。

The simplest and most elegant way is as follows:

all(x==myList[0] for x in myList)

(Yes, this even works with the empty list! This is because this is one of the few cases where python has lazy semantics.)

Regarding performance, this will fail at the earliest possible time, so it is asymptotically optimal.


回答 3

一组比较工作:

len(set(the_list)) == 1

使用set删除所有重复的元素。

A set comparison works:

len(set(the_list)) == 1

Using set removes all duplicate elements.


回答 4

您可以将列表转换为集合。集合不能重复。因此,如果原始列表中的所有元素都相同,则该集合将只有一个元素。

if len(sets.Set(input_list)) == 1
// input_list has all identical elements.

You can convert the list to a set. A set cannot have duplicates. So if all the elements in the original list are identical, the set will have just one element.

if len(set(input_list)) == 1:
    pass  # input_list has all identical elements.

回答 5

对于它的价值,它最近出现在python-ideas邮件列表中。事实证明,已经有一个itertools方法可以做到这一点:1

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

据说它的性能非常好,并且具有一些不错的属性。

  1. 短路:一旦找到第一个不等项,它将立即停止消耗可迭代项中的项。
  2. 不需要项目是可哈希的。
  3. 它是惰性的,仅需要O(1)额外的内存来执行检查。

1换句话说,我不能因提出解决方案而功不可没-甚至找不到它也不能功劳。

For what it’s worth, this came up on the python-ideas mailing list recently. It turns out that there is an itertools recipe for doing this already:1

from itertools import groupby

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

Supposedly it performs very nicely and has a few nice properties.

  1. Short-circuits: It will stop consuming items from the iterable as soon as it finds the first non-equal item.
  2. Doesn’t require items to be hashable.
  3. It is lazy and only requires O(1) additional memory to do the check.

1In other words, I can’t take the credit for coming up with the solution — nor can I take credit for even finding it.
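The properties listed above can be checked with a small experiment. The sketch below repeats the recipe so that it runs on its own; the `tracked` generator and the `consumed` list are hypothetical helpers added only to observe how many items are pulled from the input.

```python
from itertools import groupby

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

print(all_equal([[1], [1], [1]]))   # True, lists are unhashable but comparable
print(all_equal([]))                # True

# Record every item that is actually read from the input.
consumed = []

def tracked(values):
    for value in values:
        consumed.append(value)
        yield value

print(all_equal(tracked([1, 2] + [1] * 1000)))  # False
print(len(consumed) < 10)                       # True, it stopped early
```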


回答 6

这是两种简单的方法

使用set()

将列表转换为集合时,将删除重复的元素。因此,如果转换后的集合的长度为1,则意味着所有元素都相同。

len(set(input_list))==1

这是一个例子

>>> a = ['not', 'the', 'same']
>>> b = ['same', 'same', 'same']
>>> len(set(a))==1  # == 3
False
>>> len(set(b))==1  # == 1
True

使用all()

这会将输入列表的第一个元素与列表中的所有其他元素进行比较(等效)。如果相等,则返回True,否则返回False。

all(element==input_list[0] for element in input_list)

这是一个例子

>>> a = [1, 2, 3, 4, 5]
>>> b = [1, 1, 1, 1, 1]
>>> all(number==a[0] for number in a)
False
>>> all(number==b[0] for number in b)
True

PS如果要检查整个列表是否等效于某个值,则可以在input_list [0]中设置该值。

Here are two simple ways of doing this

using set()

When converting the list to a set, duplicate elements are removed. So if the length of the converted set is 1, then this implies that all the elements are the same.

len(set(input_list))==1

Here is an example

>>> a = ['not', 'the', 'same']
>>> b = ['same', 'same', 'same']
>>> len(set(a))==1  # == 3
False
>>> len(set(b))==1  # == 1
True

using all()

This will compare (equivalence) the first element of the input list to every other element in the list. If all are equivalent True will be returned, otherwise False will be returned.

all(element==input_list[0] for element in input_list)

Here is an example

>>> a = [1, 2, 3, 4, 5]
>>> b = [1, 1, 1, 1, 1]
>>> all(number==a[0] for number in a)
False
>>> all(number==b[0] for number in b)
True

P.S. If you are checking whether every element of the list equals a certain value, you can substitute that value for input_list[0].


回答 7

这是另一种选择,比len(set(x))==1长列表(使用短路)快

def constantList(x):
    return x and [x[0]]*len(x) == x

This is another option, faster than len(set(x))==1 for long lists (uses short circuit)

def constantList(x):
    return x and [x[0]]*len(x) == x

回答 8

这是一种简单的方法:

result = mylist and all(mylist[0] == elem for elem in mylist)

这稍微复杂一点,但会产生函数调用开销,但语义会更清楚地说明:

def all_identical(seq):
    if not seq:
        # empty list is False.
        return False
    first = seq[0]
    return all(first == elem for elem in seq)

This is a simple way of doing it:

result = mylist and all(mylist[0] == elem for elem in mylist)

This is slightly more complicated, it incurs function call overhead, but the semantics are more clearly spelled out:

def all_identical(seq):
    if not seq:
        # empty list is False.
        return False
    first = seq[0]
    return all(first == elem for elem in seq)

回答 9

检查所有元素是否等于第一个。

np.allclose(array, array[0])

Check if all elements equal to the first.

np.allclose(array, array[0])


回答 10

怀疑这是“最Python化的”,但类似:

>>> falseList = [1,2,3,4]
>>> trueList = [1, 1, 1]
>>> 
>>> def testList(list):
...   for item in list[1:]:
...     if item != list[0]:
...       return False
...   return True
... 
>>> testList(falseList)
False
>>> testList(trueList)
True

会成功的

Doubt this is the “most Pythonic”, but something like:

>>> falseList = [1,2,3,4]
>>> trueList = [1, 1, 1]
>>> 
>>> def testList(list):
...   for item in list[1:]:
...     if item != list[0]:
...       return False
...   return True
... 
>>> testList(falseList)
False
>>> testList(trueList)
True

would do the trick.


回答 11

如果您对可读性更高(但当然不那么有效)感兴趣,可以尝试:

def compare_lists(list1, list2):
    if len(list1) != len(list2): # Weed out unequal length lists.
        return False
    for item in list1:
        if item not in list2:
            return False
    return True

a_list_1 = ['apple', 'orange', 'grape', 'pear']
a_list_2 = ['pear', 'orange', 'grape', 'apple']

b_list_1 = ['apple', 'orange', 'grape', 'pear']
b_list_2 = ['apple', 'orange', 'banana', 'pear']

c_list_1 = ['apple', 'orange', 'grape']
c_list_2 = ['grape', 'orange']

print compare_lists(a_list_1, a_list_2) # Returns True
print compare_lists(b_list_1, b_list_2) # Returns False
print compare_lists(c_list_1, c_list_2) # Returns False

If you're interested in something a little more readable (but of course not as efficient), you could try:

def compare_lists(list1, list2):
    if len(list1) != len(list2): # Weed out unequal length lists.
        return False
    for item in list1:
        if item not in list2:
            return False
    return True

a_list_1 = ['apple', 'orange', 'grape', 'pear']
a_list_2 = ['pear', 'orange', 'grape', 'apple']

b_list_1 = ['apple', 'orange', 'grape', 'pear']
b_list_2 = ['apple', 'orange', 'banana', 'pear']

c_list_1 = ['apple', 'orange', 'grape']
c_list_2 = ['grape', 'orange']

print compare_lists(a_list_1, a_list_2) # Returns True
print compare_lists(b_list_1, b_list_2) # Returns False
print compare_lists(c_list_1, c_list_2) # Returns False

回答 12

将列表转换为集合,然后找到集合中的元素数。如果结果为1,则其元素相同,如果不相同,则列表中的元素不相同。

list1 = [1,1,1]
len(set(list1)) 
>1

list1 = [1,2,3]
len(set(list1)
>3

Convert the list into the set and then find the number of elements in the set. If the result is 1, it has identical elements and if not, then the elements in the list are not identical.

>>> list1 = [1,1,1]
>>> len(set(list1))
1

>>> list1 = [1,2,3]
>>> len(set(list1))
3

回答 13

关于reduce()与一起使用lambda。这是一个我个人认为比其他答案更好的工作代码。

reduce(lambda x, y: (x[1]==y, y), [2, 2, 2], (True, 2))

返回一个元组,如果所有项目相同或不同,则第一个值为布尔值。

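Regarding using reduce() with lambda: here is working code that I personally think is way nicer than some of the other answers. The accumulator carries the result so far together with the previous item, and its initial value should hold the first element of the list.

from functools import reduce  # needed on Python 3
reduce(lambda x, y: (x[0] and x[1]==y, y), [2, 2, 2], (True, 2))

Returns a tuple whose first value is a boolean indicating whether all items are the same.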


回答 14

我会做:

not any((x[i] != x[i+1] for i in range(0, len(x)-1)))

因为any一旦找到True条件就停止搜索可迭代对象。

I’d do:

not any((x[i] != x[i+1] for i in range(0, len(x)-1)))

as any stops searching the iterable as soon as it finds a True condition.
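The adjacent-pair comparison above can also be sketched without explicit indices by zipping the sequence with itself shifted by one. `all_adjacent_equal` is a hypothetical name. Note that seq[1:] makes a copy, so this version assumes a sliceable sequence such as a list or tuple.

```python
def all_adjacent_equal(seq):
    """True when every element equals its right-hand neighbour."""
    # zip(seq, seq[1:]) yields (seq[0], seq[1]), (seq[1], seq[2]), ...
    return not any(left != right for left, right in zip(seq, seq[1:]))

print(all_adjacent_equal([7, 7, 7]))  # True
print(all_adjacent_equal([7, 7, 8]))  # False
print(all_adjacent_equal([]))         # True
```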


回答 15

>>> a = [1, 2, 3, 4, 5, 6]
>>> z = [(a[x], a[x+1]) for x in range(0, len(a)-1)]
>>> z
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
>>> # Replacing it with the test
>>> z = [(a[x] == a[x+1]) for x in range(0, len(a)-1)]
>>> z
[False, False, False, False, False]
>>> if False in z:
...     print("Not all elements are equal")
...
Not all elements are equal

回答 16

def allTheSame(i):
    j = itertools.groupby(i)
    for k in j: break
    for k in j: return False
    return True

在没有“ all”的Python 2.4中工作。

def allTheSame(i):
    j = itertools.groupby(i)
    for k in j: break
    for k in j: return False
    return True

Works in Python 2.4, which doesn’t have “all”.


回答 17

可以使用地图和lambda

lst = [1,1,1,1,1,1,1,1,1]

print all(map(lambda x: x == lst[0], lst[1:]))

Can use map and lambda

lst = [1,1,1,1,1,1,1,1,1]

print all(map(lambda x: x == lst[0], lst[1:]))

回答 18

或使用diffnumpy的方法:

import numpy as np
def allthesame(l):
    return np.all(np.diff(l)==0)

并调用:

print(allthesame([1,1,1]))

输出:

True

Or use diff method of numpy:

import numpy as np
def allthesame(l):
    return np.all(np.diff(l)==0)

And to call:

print(allthesame([1,1,1]))

Output:

True

回答 19

或者使用numpy的diff方法:

import numpy as np
def allthesame(l):
    return np.unique(l).shape[0]<=1

并调用:

print(allthesame([1,1,1]))

输出:

True

Or use the unique function of numpy:

import numpy as np
def allthesame(l):
    return np.unique(l).shape[0]<=1

And to call:

print(allthesame([1,1,1]))

Output:

True


回答 20

你可以做:

reduce(and_, (x==yourList[0] for x in yourList), True)

python使您像那样导入运算符是很烦人的operator.and_。从python3开始,您还需要import functools.reduce

(您不应该使用此方法,因为如果找到不相等的值,它将不会中断,但是会继续检查整个列表。此处仅作为完整性的答案。)

You can do:

reduce(and_, (x==yourList[0] for x in yourList), True)

It is fairly annoying that python makes you import the operators like operator.and_. As of python3, you will need to also import functools.reduce.

(You should not use this method because it will not break if it finds non-equal values, but will continue examining the entire list. It is just included here as an answer for completeness.)


回答 21

lambda lst: reduce(lambda a,b:(b,b==a[0] and a[1]), lst, (lst[0], True))[1]

The next one will short-circuit:

all(itertools.imap(lambda i:yourlist[i]==yourlist[i+1], xrange(len(yourlist)-1)))

回答 22

将列表更改为一组。然后,如果集合的大小仅为1,则它们必须相同。

if len(set(my_list)) == 1:

Change the list to a set. Then if the size of the set is only 1, they must have been the same.

if len(set(my_list)) == 1:

回答 23

还有一个纯Python递归选项:

 def checkEqual(lst):
    if len(lst)==2 :
        return lst[0]==lst[1]
    else:
        return lst[0]==lst[1] and checkEqual(lst[1:])

但是由于某种原因,它在某些情况下比其他选择要慢两个数量级。从C语言的心态来看,我期望这会更快,但事实并非如此!

另一个缺点是Python中存在递归限制,在这种情况下需要对其进行调整。例如使用this

There is also a pure Python recursive option:

def checkEqual(lst):
    # Lists with fewer than two elements are trivially all-equal.
    if len(lst) < 2:
        return True
    return lst[0] == lst[1] and checkEqual(lst[1:])

However for some reason it is in some cases two orders of magnitude slower than other options. Coming from C language mentality, I expected this to be faster, but it is not!

The other disadvantage is that there is recursion limit in Python which needs to be adjusted in this case. For example using this.
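One likely contributor to the slowdown is that lst[1:] copies the remaining list on every call, so the total work grows quadratically. The sketch below passes an index instead of slicing and then demonstrates the recursion limit mentioned above. The name `check_equal_recursive` and the `start` parameter are assumptions made for this illustration.

```python
import sys

def check_equal_recursive(lst, start=0):
    """Recursive check that passes an index instead of slicing the list."""
    if len(lst) - start < 2:
        return True
    return lst[start] == lst[start + 1] and check_equal_recursive(lst, start + 1)

print(check_equal_recursive([3, 3, 3]))   # True
print(check_equal_recursive([3, 4, 3]))   # False

# A list longer than the recursion limit cannot be checked this way.
too_long = [0] * (sys.getrecursionlimit() + 100)
hit_limit = False
try:
    check_equal_recursive(too_long)
except RecursionError:
    hit_limit = True
print(hit_limit)  # True
```

An iterative loop or all() avoids the limit entirely, which is why the other answers tend to scale better.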


回答 24

您可以.nunique()用来查找列表中唯一项目的数量。

def identical_elements(list):
    series = pd.Series(list)
    if series.nunique() == 1: identical = True
    else:  identical = False
    return identical



identical_elements(['a', 'a'])
Out[427]: True

identical_elements(['a', 'b'])
Out[428]: False

You can use the pandas .nunique() method to find the number of unique items in a list.

import pandas as pd

def identical_elements(values):
    # nunique() counts the distinct values in the Series
    return pd.Series(values).nunique() == 1



identical_elements(['a', 'a'])
Out[427]: True

identical_elements(['a', 'b'])
Out[428]: False

回答 25

您可以使用set。它将设置并删除重复的元素。然后检查其元素是否超过1个。

if len(set(your_list)) <= 1:
    print('all ements are equal')

例:

>>> len(set([5, 5])) <= 1
True

You can use set. It will build a set, which removes repeated elements. Then check that it has no more than 1 element.

if len(set(your_list)) <= 1:
    print('all elements are equal')

Example:

>>> len(set([5, 5])) <= 1
True

How can I explicitly free memory in Python?

Question: How can I explicitly free memory in Python?

我编写了一个Python程序,该程序作用于大型输入文件,以创建代表三角形的数百万个对象。该算法是:

  1. 读取输入文件
  2. 处理文件并创建一个三角形列表,以其顶点表示
  3. 以OFF格式输出顶点:顶点列表,后跟三角形列表。三角形由顶点列表中的索引表示

在打印出三角形之前先打印出完整的顶点列表的OFF要求意味着在将输出写入文件之前,必须将三角形的列表保留在内存中。同时,由于列表的大小,我遇到了内存错误。

告诉Python我不再需要某些数据并且可以释放它们的最佳方法是什么?

I wrote a Python program that acts on a large input file to create a few million objects representing triangles. The algorithm is:

  1. read an input file
  2. process the file and create a list of triangles, represented by their vertices
  3. output the vertices in the OFF format: a list of vertices followed by a list of triangles. The triangles are represented by indices into the list of vertices

The requirement of OFF that I print out the complete list of vertices before I print out the triangles means that I have to hold the list of triangles in memory before I write the output to file. In the meanwhile I’m getting memory errors because of the sizes of the lists.

What is the best way to tell Python that I no longer need some of the data, and it can be freed?


回答 0

根据Python官方文档,您可以使用强制垃圾回收器释放未引用的内存gc.collect()。例:

import gc
gc.collect()

According to Python Official Documentation, you can force the Garbage Collector to release unreferenced memory with gc.collect(). Example:

import gc
gc.collect()

回答 1

不幸的是(取决于您的Python版本和版本),某些类型的对象使用“空闲列表”,这是一种整洁的局部优化,但可能会导致内存碎片,特别是通过为特定类型的对象设置越来越多的“专用”内存来实现。因此无法使用“普通基金”。

确保大量但临时使用内存的唯一真正可靠的方法是在完成后将所有资源都返还给系统,这是让使用发生在子进程中,该进程需要大量内存,然后终止。在这种情况下,操作系统将完成其工作,并乐意回收子进程可能吞没的所有资源。幸运的是,该multiprocessing模块使这种操作(过去很痛苦)在现代版本的Python中还不错。

在您的用例中,似乎子过程累积一些结果并确保这些结果可用于主过程的最佳方法是使用半临时文件(我所说的是半临时文件,而不是那种关闭后会自动消失,只会删除您用完后会明确删除的普通文件)。

Unfortunately (depending on your version and release of Python) some types of objects use “free lists” which are a neat local optimization but may cause memory fragmentation, specifically by making more and more memory “earmarked” for only objects of a certain type and thereby unavailable to the “general fund”.

The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it’s done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates. Under such conditions, the operating system WILL do its job, and gladly recycle all the resources the subprocess may have gobbled up. Fortunately, the multiprocessing module makes this kind of operation (which used to be rather a pain) not too bad in modern versions of Python.

In your use case, it seems that the best way for the subprocesses to accumulate some results and yet ensure those results are available to the main process is to use semi-temporary files (by semi-temporary I mean, NOT the kind of files that automatically go away when closed, just ordinary files that you explicitly delete when you’re all done with them).
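The idea of doing the memory-hungry work in a child process and handing the result back through an ordinary file can be sketched as follows. The answer mentions the multiprocessing module; to stay self-contained, this example instead launches a second interpreter with the standard subprocess module. The names `CHILD_CODE` and `run_in_child`, and the JSON result format, are assumptions made for illustration.

```python
import json
import os
import subprocess
import sys
import tempfile

# Code run by the child: build a large temporary list, write only a small summary.
CHILD_CODE = """
import json, sys
big = list(range(1_000_000))          # memory-hungry intermediate data
with open(sys.argv[1], "w") as out:
    json.dump({"count": len(big), "total": sum(big)}, out)
"""

def run_in_child():
    """Do the heavy work in a separate interpreter and read back its result file."""
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        subprocess.run([sys.executable, "-c", CHILD_CODE, path], check=True)
        with open(path) as result_file:
            return json.load(result_file)
    finally:
        os.remove(path)               # explicit cleanup of the semi-temporary file

print(run_in_child())
```

When the child exits, the operating system reclaims all of its memory, regardless of how the interpreter's allocator behaved while it was running.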


回答 2

The del statement might be of use, but IIRC it isn’t guaranteed to free the memory. The docs are here … and an explanation of why the memory isn’t released is here.

I have heard of people on Linux and Unix-type systems forking a Python process to do some work, getting the results and then killing it.

This article has notes on the Python garbage collector, but I think lack of memory control is the downside of managed memory.


Answer 3

Python is garbage-collected, so if you reduce the size of your list, it will reclaim memory. You can also use the “del” statement to get rid of a variable completely:

biglist = ["blah", "blah", "blah"]
# ... use biglist ...
del biglist  # removes the name; the list is freed once nothing else references it

Answer 4

You can’t explicitly free memory. What you need to do is to make sure you don’t keep references to objects. They will then be garbage collected, freeing the memory.

In your case, when you need large lists, you typically need to reorganize the code, typically using generators/iterators instead. That way you don’t need to have the large lists in memory at all.

http://www.prasannatech.net/2009/07/introduction-python-generators.html
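
A small sketch of the generator approach, with a made-up iter_triangles function standing in for whatever produces the data: each item is created, consumed and discarded in turn, so the full list never exists in memory.

```python
def iter_triangles(n):
    """Yield one (hypothetical) triangle at a time instead of building a list."""
    for i in range(n):
        yield (i, i + 1, i + 2)


# The consumer pulls items one by one; only a single tuple is alive at a time.
total = sum(a + b + c for a, b, c in iter_triangles(1000))
print(total)
```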


Answer 5

(del can be your friend, as it marks objects as deletable when there are no other references to them. Note, however, that the CPython interpreter often keeps this memory for later use, so your operating system might not see the “freed” memory.)

Maybe you would not run into any memory problem in the first place if you used a more compact structure for your data. Lists of numbers are much less memory-efficient than the format used by the standard array module or the third-party numpy module. You would save memory by putting your vertices in a NumPy 3xN array and your triangles in an N-element array.
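
To get a feel for the difference, here is a rough comparison using only the standard library (the array module rather than NumPy, so that nothing needs installing). The variable names are illustrative and the exact byte counts depend on the Python build, so none are quoted here.

```python
import sys
from array import array

coords_list = [float(i) for i in range(1000)]
coords_array = array("d", coords_list)  # "d" = C double, 8 bytes per item

# A list stores pointers to separate float objects, so count both the
# list itself and every float it refers to.
list_bytes = sys.getsizeof(coords_list) + sum(sys.getsizeof(x) for x in coords_list)
# An array stores the raw doubles inline in a single buffer.
array_bytes = sys.getsizeof(coords_array)

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```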


Answer 6

I had a similar problem in reading a graph from a file. The processing included the computation of a 200 000×200 000 float matrix (one line at a time) that did not fit into memory. Trying to free the memory between computations using gc.collect() fixed the memory-related aspect of the problem but it resulted in performance issues: I don’t know why but even though the amount of used memory remained constant, each new call to gc.collect() took some more time than the previous one. So quite quickly the garbage collecting took most of the computation time.

To fix both the memory and performance issues I switched to a multithreading trick I once read somewhere (I’m sorry, I cannot find the related post anymore). Previously I was reading each line of the file in a big for loop, processing it, and running gc.collect() every once in a while to free memory. Now I call a function that reads and processes a chunk of the file in a new thread. Once the thread ends, the memory is automatically freed without the strange performance issue.

Practically it works like this:

from dask import delayed  # this module wraps the multithreading
def f(storage, index, chunk_size):  # the processing function
    # read the chunk of size chunk_size starting at index in the file
    # process it using data in storage if needed
    # append data needed for further computations  to storage 
    return storage

partial_result = delayed([])  # put into the delayed() the constructor for your data structure
# I personally use "delayed(nx.Graph())" since I am creating a networkx Graph
chunk_size = 100  # ideally you want this as big as possible while still enabling the computations to fit in memory
for index in range(0, len(file), chunk_size):
    # tell dask that we will want to apply f to the parameters partial_result, index, chunk_size
    partial_result = delayed(f)(partial_result, index, chunk_size)

    # no computations are done yet !
    # dask will spawn a thread to run f(partial_result, index, chunk_size) once we call partial_result.compute()
    # passing the previous "partial_result" variable in the parameters assures a chunk will only be processed after the previous one is done
    # it also allows you to use the results of the processing of the previous chunks in the file if needed

# this launches all the computations
result = partial_result.compute()

# one thread is spawned for each "delayed" one at a time to compute its result
# dask then closes the thread, which solves the memory freeing issue
# the strange performance issue with gc.collect() is also avoided

Answer 7

Others have posted some ways that you might be able to “coax” the Python interpreter into freeing the memory (or otherwise avoid having memory problems). Chances are you should try their ideas out first. However, I feel it important to give you a direct answer to your question.

There isn’t really any way to directly tell Python to free memory. The fact of the matter is that if you want that low a level of control, you’re going to have to write an extension in C or C++.

That said, there are some tools to help with this:


Answer 8

If you don’t care about vertex reuse, you could have two output files–one for vertices and one for triangles. Then append the triangle file to the vertex file when you are done.


How to define two fields as “unique” as a pair

Question: How to define two fields as “unique” as a pair

Is there a way to define a couple of fields as unique in Django?

I have a table of volumes (of journals) and I don’t want more than one volume with the same volume number for the same journal.

class Volume(models.Model):
    id = models.AutoField(primary_key=True)
    journal_id = models.ForeignKey(Journals, db_column='jid', null=True, verbose_name = "Journal")
    volume_number = models.CharField('Volume Number', max_length=100)
    comments = models.TextField('Comments', max_length=4000, blank=True)

I tried to put unique = True as attribute in the fields journal_id and volume_number but it doesn’t work.


Answer 0

There is a simple solution for you called unique_together which does exactly what you want.

For example:

class MyModel(models.Model):
  field1 = models.CharField(max_length=50)
  field2 = models.CharField(max_length=50)

  class Meta:
    unique_together = ('field1', 'field2',)

And in your case:

class Volume(models.Model):
  id = models.AutoField(primary_key=True)
  journal_id = models.ForeignKey(Journals, db_column='jid', null=True, verbose_name = "Journal")
  volume_number = models.CharField('Volume Number', max_length=100)
  comments = models.TextField('Comments', max_length=4000, blank=True)

  class Meta:
    unique_together = ('journal_id', 'volume_number',)

Answer 1

Django 2.2+

Using UniqueConstraint with the constraints option is preferred over unique_together.

From the Django documentation for unique_together:

Use UniqueConstraint with the constraints option instead.
UniqueConstraint provides more functionality than unique_together.
unique_together may be deprecated in the future.

For example:

class Volume(models.Model):
    id = models.AutoField(primary_key=True)
    journal_id = models.ForeignKey(Journals, on_delete=models.CASCADE, db_column='jid', null=True, verbose_name="Journal")  # on_delete is required since Django 2.0; CASCADE is an assumption
    volume_number = models.CharField('Volume Number', max_length=100)
    comments = models.TextField('Comments', max_length=4000, blank=True)

    class Meta:
        constraints = [
            models.UniqueConstraint(fields=['journal_id', 'volume_number'], name='name of constraint')
        ]

return, return None, and no return at all?

Question: return, return None, and no return at all?

Consider three functions:

def my_func1():
  print "Hello World"
  return None

def my_func2():
  print "Hello World"
  return

def my_func3():
  print "Hello World"

They all appear to return None. Is there any difference in how the return values of these functions behave? Is there any reason to prefer one over the others?


Answer 0

In terms of actual behavior, there is no difference. They all return None and that’s it. However, there is a time and place for each of them. The following guidelines describe how the different forms should be used (or at least how I was taught they should be used), but they are not absolute rules, so you can mix them up if you feel it necessary.

Using return None

This signals that the function is indeed meant to return a value for later use, and in this case it returns None. This value None can then be used elsewhere. return None is never used if there are no other possible return values from the function.

In the following example, we return the person’s mother if the person given is a human. If it’s not a human, we return None since the person doesn’t have a mother (let’s suppose it’s not an animal or something).

def get_mother(person):
    if is_human(person):
        return person.mother
    else:
        return None

Using return

This is used for the same reason as break in loops. The return value doesn’t matter and you only want to exit the whole function. It’s extremely useful in some places, even though you don’t need it that often.

We’ve got 15 prisoners and we know one of them has a knife. We loop through the prisoners one by one to check whether they have a knife. If we find the person with the knife, we can just exit the function, because we know there’s only one knife and there is no reason to check the rest of the prisoners. If we don’t find a prisoner with a knife, we raise an alert. This could be done in many different ways and using return is probably not even the best way, but it’s just an example to show how to use return for exiting a function.

def find_prisoner_with_knife(prisoners):
    for prisoner in prisoners:
        if "knife" in prisoner.items:
            prisoner.move_to_inquisition()
            return # no need to check rest of the prisoners nor raise an alert
    raise_alert()

Note: You should never do var = find_prisoner_with_knife(), since the return value is not meant to be caught.
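
To try the early-exit example, the missing pieces can be filled in with stand-ins. The Prisoner class, the alerts list and this version of raise_alert are assumptions made only so the snippet runs; the original answer leaves them undefined.

```python
class Prisoner:
    """Hypothetical stand-in for the objects the example iterates over."""

    def __init__(self, name, items):
        self.name = name
        self.items = items
        self.in_inquisition = False

    def move_to_inquisition(self):
        self.in_inquisition = True


alerts = []


def raise_alert():
    # Stand-in: record the alert instead of sounding a real alarm.
    alerts.append("knife not found")


def find_prisoner_with_knife(prisoners):
    for prisoner in prisoners:
        if "knife" in prisoner.items:
            prisoner.move_to_inquisition()
            return  # bare return: leave the function early, no value intended
    raise_alert()


cell_block = [Prisoner("A", ["spoon"]), Prisoner("B", ["knife"]), Prisoner("C", [])]
find_prisoner_with_knife(cell_block)
print([p.name for p in cell_block if p.in_inquisition], alerts)
```

The function is called purely for its side effects; nothing is assigned from it, in line with the note above.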

Using no return at all

This will also return None, but that value is not meant to be used or caught. It simply means that the function ended successfully. It’s basically the same as return in void functions in languages such as C++ or Java.

In the following example, we set person’s mother’s name and then the function exits after completing successfully.

def set_mother(person, mother):
    if is_human(person):
        person.mother = mother

Note: You should never do var = set_mother(my_person, my_mother), since the return value is not meant to be caught.


Answer 1

Yes, they are all the same.

We can review the disassembled bytecode to confirm that they’re all doing the exact same thing.

import dis

def f1():
  print "Hello World"
  return None

def f2():
  print "Hello World"
  return

def f3():
  print "Hello World"

dis.dis(f1)
    4   0 LOAD_CONST    1 ('Hello World')
        3 PRINT_ITEM
        4 PRINT_NEWLINE

    5   5 LOAD_CONST    0 (None)
        8 RETURN_VALUE

dis.dis(f2)
    9   0 LOAD_CONST    1 ('Hello World')
        3 PRINT_ITEM
        4 PRINT_NEWLINE

    10  5 LOAD_CONST    0 (None)
        8 RETURN_VALUE

dis.dis(f3)
    14  0 LOAD_CONST    1 ('Hello World')
        3 PRINT_ITEM
        4 PRINT_NEWLINE            
        5 LOAD_CONST    0 (None)
        8 RETURN_VALUE      
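
The listing above comes from Python 2, where print is a statement. As a hedged Python 3 counterpart, the following sketch calls the three functions and inspects the last bytecode instruction of each. The exact opcode names differ between CPython versions, so the snippet prints them rather than assuming a particular listing.

```python
import dis


def f1():
    print("Hello World")
    return None


def f2():
    print("Hello World")
    return


def f3():
    print("Hello World")


functions = (f1, f2, f3)
results = [f() for f in functions]
# Name of the final instruction of each function (a return opcode of some kind).
last_ops = [list(dis.get_instructions(f))[-1].opname for f in functions]
print(results)
print(last_ops)
```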

Answer 2

They each return the same singleton None; there is no functional difference.

I think that it is reasonably idiomatic to leave off the return statement unless you need it to break out of the function early (in which case a bare return is more common), or return something other than None. It also makes sense and seems to be idiomatic to write return None when it is in a function that has another path that returns something other than None. Writing return None out explicitly is a visual cue to the reader that there’s another branch which returns something more interesting (and that calling code will probably need to handle both types of return values).

Often in Python, functions which return None are used like void functions in C: their purpose is generally to operate on the input arguments in place (unless you’re using global data (shudders)). Returning None usually makes it more explicit that the arguments were mutated. This makes it a little clearer why it makes sense to leave off the return statement from a “language conventions” standpoint.

That said, if you’re working in a code base that already has pre-set conventions around these things, I’d definitely follow suit to help the code base stay uniform…
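
The built-ins follow the same convention, which makes for a quick illustration: list.sort() mutates its list in place and returns None, while sorted() leaves its argument alone and returns a new list.

```python
numbers = [3, 1, 2]
result = numbers.sort()    # sorts in place; the return value is None
original = [3, 1, 2]
ordered = sorted(original)  # returns a new list; original is untouched

print(result, numbers)
print(ordered, original)
```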


Answer 3

As others have answered, the result is exactly the same: None is returned in all cases.

The difference is stylistic, but please note that PEP 8 asks for return statements to be used consistently:

Be consistent in return statements. Either all return statements in a function should return an expression, or none of them should. If any return statement returns an expression, any return statements where no value is returned should explicitly state this as return None, and an explicit return statement should be present at the end of the function (if reachable).

Yes:

def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None

def bar(x):
    if x < 0:
        return None
    return math.sqrt(x)

No:

def foo(x):
    if x >= 0:
        return math.sqrt(x)

def bar(x):
    if x < 0:
        return
    return math.sqrt(x)

https://www.python.org/dev/peps/pep-0008/#programming-recommendations


Basically, if you ever return a non-None value in a function, it means the return value has meaning and is meant to be caught by callers. So when you return None, it must also be explicit, to convey that None has meaning in this case: it is one of the possible return values.

If you don’t need a return at all, your function basically works as a procedure instead of a function, so just don’t include the return statement.

If you are writing a procedure-like function and there is an opportunity to return early (i.e. you are already done at that point and don’t need to execute the rest of the function), you may use an empty return to signal to the reader that it is just an early finish of execution, and that the None value returned implicitly doesn’t have any meaning and is not meant to be caught (the procedure-like function always returns None anyway).
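
Both conventions can be shown side by side in a short sketch. The names safe_sqrt and log_value are made up for illustration: the first is a value-returning function that states return None explicitly, the second is a procedure-like function that uses a bare return as an early exit.

```python
import math


def safe_sqrt(x):
    # Value-returning function: every path returns an expression,
    # so the "no result" case is written out as return None.
    if x < 0:
        return None
    return math.sqrt(x)


def log_value(x, log):
    # Procedure-like function: the bare return is only an early exit.
    if x is None:
        return
    log.append(x)


log = []
for value in (9, -4):
    log_value(safe_sqrt(value), log)
print(log)
```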


Answer 4

In terms of functionality these are all the same; the difference between them is in code readability and style (which is important to consider).


How to declare and add items to an array in Python?

Question: How to declare and add items to an array in Python?

I’m trying to add items to an array in python.

I run

array = {}

Then, I try to add something to this array by doing:

array.append(valueToBeInserted)

There doesn’t seem to be a .append method for this. How do I add items to an array?


Answer 0

{} represents an empty dictionary, not an array/list. For lists or arrays, you need [].

To initialize an empty list do this:

my_list = []

or

my_list = list()

To add elements to the list, use append

my_list.append(12)

To extend the list to include the elements from another list use extend

my_list.extend([1,2,3,4])
my_list
--> [12,1,2,3,4]

To remove an element from a list use remove

my_list.remove(2)

Dictionaries represent a collection of key/value pairs also known as an associative array or a map.

To initialize an empty dictionary use {} or dict()

Dictionaries have keys and values

my_dict = {'key':'value', 'another_key' : 0}

To extend a dictionary with the contents of another dictionary you may use the update method

my_dict.update({'third_key' : 1})

To remove a key (and its value) from a dictionary

del my_dict['key']
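
Put together as one runnable snippet, the list and dictionary operations above behave as follows (same variable names as in the answer). Note that remove() deletes the first item equal to the given value, not the item at that index.

```python
my_list = []
my_list.append(12)             # [12]
my_list.extend([1, 2, 3, 4])   # [12, 1, 2, 3, 4]
my_list.remove(2)              # removes the value 2, not index 2

my_dict = {'key': 'value', 'another_key': 0}
my_dict.update({'third_key': 1})
del my_dict['key']             # removes the key together with its value

print(my_list)
print(my_dict)
```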

Answer 1

No. If you do:

array = {}

as in your example, you are using array as a dictionary, not an array. If you need an array, in Python you use a list:

array = []

Then, to add items you do:

array.append('a')

Answer 2

Arrays (called list in Python) use the [] notation. {} is for dict (also called hash tables, associative arrays, etc. in other languages), so you won’t have ‘append’ for a dict.

If you actually want an array (list), use:

array = []
array.append(valueToBeInserted)

Answer 3

Just for the sake of completeness, you can also do this:

array = []
array += [valueToBeInserted]

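If it’s a list of strings, note that += with a bare string also runs, but it adds each character as a separate item; wrap the string in a list to add it as one item:

array += ['string']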
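
A quick check of how += treats its right-hand side (the variable names are illustrative): the list is extended with every element of the iterable on the right, and the elements of a string are its characters.

```python
chars = []
chars += "string"        # extends with each character of the string

whole = []
whole += ["string"]      # extends with the single element of a one-item list

appended = []
appended.append("string")  # append always adds exactly one item

print(chars)
print(whole)
print(appended)
```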

Answer 4

In some languages, such as Java, you define an array using curly braces as follows, but in Python curly braces have a different meaning:

Java:

int[] myIntArray = {1,2,3};
String[] myStringArray = {"a","b","c"};

However, in Python, curly braces are used to define dictionaries, which need key: value pairs, as in {'a':1, 'b':2}.

To actually define an array (which is called a list in Python) you can do:

Python:

mylist = [1,2,3]

or other examples like:

mylist = list()
mylist.append(1)
mylist.append(2)
mylist.append(3)
print(mylist)
# prints [1, 2, 3]

Answer 5

You can also do:

array = numpy.append(array, value)

Note that the numpy.append() function returns a new object, so if you want to modify your initial array, you have to write: array = ...


Answer 6

I believe you are all wrong. You need to do:

array = [] in order to define it, and then:

array.append("hello") to add to it.