>>> arr = numpy.zeros((50,100,25))
>>> arr.shape
# (50, 100, 25)
>>> new_arr = arr.reshape(5000,25)
>>> new_arr.shape
# (5000, 25)
# One shape dimension can be -1.
# In this case, the value is inferred from
# the length of the array and remaining dimensions.
>>> another_arr = arr.reshape(-1, arr.shape[-1])
>>> another_arr.shape
# (5000, 25)
A slight generalization to Alexander’s answer – np.reshape can take -1 as an argument, meaning “total array size divided by product of all other listed dimensions”:
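As a small illustration of that rule (a sketch using the same array shape as above; the variable names merged and rows_kept are just for this example), the -1 can sit in any position and NumPy fills in whatever size makes the total element count match:

```python
import numpy as np

arr = np.zeros((50, 100, 25))

# -1 asks NumPy to infer that axis: 50 * 100 * 25 / 25 = 5000
merged = arr.reshape(-1, arr.shape[-1])

# The same rule applies wherever the -1 is placed: 100 * 25 = 2500 columns
rows_kept = arr.reshape(arr.shape[0], -1)

print(merged.shape)     # (5000, 25)
print(rows_kept.shape)  # (50, 2500)
```

Only one dimension may be -1, since otherwise the missing sizes would be ambiguous.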
reset_index() is a pandas DataFrame method that moves the index values into the DataFrame as columns. The default is drop=False, which keeps the index values as columns.
All you have to do is add .reset_index(inplace=True) after the name of the DataFrame:
This doesn’t really apply to your case, but it could be helpful for others (like myself five minutes ago) to know. If the levels of the MultiIndex share the same name, like this:
             value
Trial Trial
1     0         13
      1          3
      2          4
2     0        NaN
      1         12
3     0         34
df.reset_index(inplace=True) will fail, because the columns that would be created cannot have the same name.
So you first need to rename the MultiIndex levels with df.index = df.index.set_names(['Trial', 'measurement']) to get:
                   value
Trial measurement
1     0               13
      1                3
      2                4
2     0              NaN
      1               12
3     0               34
And then df.reset_index(inplace=True) will work like a charm.
I encountered this problem after grouping by year and month on a datetime column (not the index) called live_date, which meant that both year and month were named live_date.
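A minimal sketch of that situation, assuming a small made-up frame (the values and the name flat are only for illustration). The duplicate level names make reset_index raise, and renaming the levels first avoids it:

```python
import pandas as pd

# Both index levels are called "Trial", as in the table above
index = pd.MultiIndex.from_tuples([(1, 0), (1, 1), (2, 0)], names=["Trial", "Trial"])
df = pd.DataFrame({"value": [13, 3, 4]}, index=index)

try:
    df.reset_index()
except ValueError as exc:
    # pandas refuses to create two columns with the same name
    print("reset_index failed:", exc)

# Give the levels distinct names, then reset
df.index = df.index.set_names(["Trial", "measurement"])
flat = df.reset_index()
print(list(flat.columns))  # ['Trial', 'measurement', 'value']
```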
ravel returns a view of the original array whenever possible. This isn’t visible in the printed output, but if you modify the array returned by ravel, it may modify the entries in the original array. If you modify the entries in an array returned from flatten this will never happen. ravel will often be faster since no memory is copied, but you have to be more careful about modifying the array it returns.
reshape((-1,)) gets a view whenever the strides of the array allow it even if that means you don’t always get a contiguous array.
Both functions return flattened 1D arrays, and each result is a new array object (note the different ids below).
import numpy
a = numpy.array([[1,2],[3,4]])
r = numpy.ravel(a)
f = numpy.ndarray.flatten(a)
print(id(a))
print(id(r))
print(id(f))
print(r)
print(f)
print("\nbase r:", r.base)
print("\nbase f:", f.base)
---returns---
140541099429760
140541099471056
140541099473216
[1 2 3 4]
[1 2 3 4]
base r: [[1 2]
[3 4]]
base f: None
In the example above:
the memory locations of the results are different,
the results look the same,
flatten returns a copy,
ravel returns a view.
How do we check whether something is a copy?
Using the .base attribute of the ndarray. If it’s a view, the base will be the original array; if it is a copy, the base will be None.
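To make the difference visible, here is a short sketch (the variable names are only for illustration) that writes into both results and then checks .base and np.shares_memory:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

r = a.ravel()     # a view here, because `a` is C-contiguous
f = a.flatten()   # always a copy

r[0] = 99         # writes through to `a`
f[1] = -1         # leaves `a` untouched

print(a[0, 0], a[0, 1])        # 99 2
print(r.base is a, f.base)     # True None
print(np.shares_memory(a, r), np.shares_memory(a, f))  # True False

# ravel falls back to a copy when no contiguous view exists, e.g. for a transpose
t = a.T.ravel()
print(np.shares_memory(a, t))  # False
```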
Yes, I know this subject has been covered before (here, here, here, here), but as far as I know, all solutions, except for one, fail on a list like this:
L = [[[1, 2, 3], [4, 5]], 6]
Where the desired output is
[1, 2, 3, 4, 5, 6]
Or perhaps even better, an iterator. The only solution I saw that works for an arbitrary nesting is found in this question:
def flatten(x):
    result = []
    for el in x:
        if hasattr(el, "__iter__") and not isinstance(el, basestring):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

flatten(L)
Is this the best model? Did I overlook something? Any problems?
Answer 0
Using generator functions can make your example a little easier to read and probably boost the performance.
Python 2
import collections

def flatten(l):
    for el in l:
        if isinstance(el, collections.Iterable) and not isinstance(el, basestring):
            for sub in flatten(el):
                yield sub
        else:
            yield el
Generator using recursion and duck typing (updated for Python 3):
def flatten(L):
    for item in L:
        try:
            yield from flatten(item)
        except TypeError:
            yield item

list(flatten([[[1, 2, 3], [4, 5]], 6]))
>>> [1, 2, 3, 4, 5, 6]
Answer 3
Generator version of @unutbu’s non-recursive solution, as requested by @Andrew in a comment:
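import collections

def genflat(l, ltypes=collections.Sequence):
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                # drop an empty sub-sequence and re-examine the same position
                l.pop(i)
                i -= 1
                break
            else:
                # splice the sub-sequence's items in place of the sub-sequence
                l[i:i + 1] = l[i]
        else:
            # reached only without a break, so the previous item is not yielded twice
            yield l[i]
        i += 1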
Slightly simplified version of this generator:
def genflat(l, ltypes=collections.Sequence):
    l = list(l)
    while l:
        while l and isinstance(l[0], ltypes):
            l[0:1] = l[0]
        if l: yield l.pop(0)
Here is my functional version of recursive flatten which handles both tuples and lists, and lets you throw in any mix of positional arguments. Returns a generator which produces the entire sequence in order, arg by arg:
flatten = lambda *n: (e for a in n
for e in (flatten(*a) if isinstance(a, (tuple, list)) else (a,)))
This version of flatten avoids python’s recursion limit (and thus works with arbitrarily deep, nested iterables). It is a generator which can handle strings and arbitrary iterables (even infinite ones).
import itertools as IT
import collections.abc

def flatten(iterable, ltypes=collections.abc.Iterable):
    remainder = iter(iterable)
    while True:
        try:
            first = next(remainder)
        except StopIteration:
            # PEP 479: a StopIteration escaping a generator becomes a RuntimeError
            return
        if isinstance(first, ltypes) and not isinstance(first, (str, bytes)):
            remainder = IT.chain(first, remainder)
        else:
            yield first
Here’s another answer that is even more interesting…
import re

def Flatten(TheList):
    a = str(TheList)
    b, crap = re.subn(r'[\[,\]]', ' ', a)
    c = b.split()
    d = [int(x) for x in c]

    return(d)
Basically, it converts the nested list to a string, uses a regex to strip out the nested syntax, and then converts the result back to a (flattened) list.
Answer 7
def flatten(xs):
    res = []
    def loop(ys):
        for i in ys:
            if isinstance(i, list):
                loop(i)
            else:
                res.append(i)
    loop(xs)
    return res
It’s an iterator, so you need to iterate over it (for example by wrapping it with list or using it in a loop). Internally it uses an iterative approach instead of a recursive one, and it’s written as a C extension, so it can be faster than pure Python approaches:
>>> %timeit list(deepflatten(L))
12.6 µs ± 298 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit list(deepflatten(L, types=list))
8.7 µs ± 139 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit list(flatten(L)) # Cristian - Python 3.x approach from https://stackoverflow.com/a/2158532/5393381
86.4 µs ± 4.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit list(flatten(L)) # Josh Lee - https://stackoverflow.com/a/2158522/5393381
107 µs ± 2.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit list(genflat(L, list)) # Alex Martelli - https://stackoverflow.com/a/2159079/5393381
23.1 µs ± 710 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I’m the author of the iteration_utilities library.
It was fun trying to create a function that could flatten irregular list in Python, but of course that is what Python is for (to make programming fun). The following generator works fairly well with some caveats:
def flatten(iterable):
    try:
        for item in iterable:
            yield from flatten(item)
    except TypeError:
        yield iterable
It will flatten datatypes that you might want left alone (like bytearray, bytes, and str objects). Also, the code relies on the fact that requesting an iterator from a non-iterable raises a TypeError.
>>> L = [[[1, 2, 3], [4, 5]], 6]
>>> def flatten(iterable):
        try:
            for item in iterable:
                yield from flatten(item)
        except TypeError:
            yield iterable

>>> list(flatten(L))
[1, 2, 3, 4, 5, 6]
>>>
Edit:
I disagree with the previous implementation. The problem is that you should not be able to flatten something that is not an iterable. It is confusing and gives the wrong impression of the argument.
>>> list(flatten(123))
[123]
>>>
The following generator is almost the same as the first but does not have the problem of trying to flatten a non-iterable object. It fails as one would expect when an inappropriate argument is given to it.
def flatten(iterable):
    for item in iterable:
        try:
            yield from flatten(item)
        except TypeError:
            yield item
Testing the generator works fine with the list that was provided. However, the new code will raise a TypeError when a non-iterable object is given to it. Examples of the new behavior are shown below.
>>> L = [[[1, 2, 3], [4, 5]], 6]
>>> list(flatten(L))
[1, 2, 3, 4, 5, 6]
>>> list(flatten(123))
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    list(flatten(123))
  File "<pyshell#27>", line 2, in flatten
    for item in iterable:
TypeError: 'int' object is not iterable
>>>
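If you want str, bytes and bytearray objects left alone, one possible variant is sketched below. The name flatten_keep_text and the atomic parameter are made up for this example and are not part of the answer above. It checks for those types first and uses iter() to decide whether an item is a leaf:

```python
def flatten_keep_text(iterable, atomic=(str, bytes, bytearray)):
    """Flatten nested iterables but yield text-like objects whole."""
    for item in iterable:
        if isinstance(item, atomic):
            yield item
            continue
        try:
            sub_items = iter(item)
        except TypeError:
            # item is not iterable, so it is a leaf
            yield item
        else:
            yield from flatten_keep_text(sub_items, atomic)

mixed = [["ab", b"cd"], [1, [2, ("x", 3)]]]
print(list(flatten_keep_text(mixed)))  # ['ab', b'cd', 1, 2, 'x', 3]
```

As with the second generator above, passing a non-iterable such as 123 at the top level still raises a TypeError.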
Answer 10
Although an elegant and very Pythonic answer has already been selected, I would like to present my solution just for review:
def flat(l):
    ret = []
    for i in l:
        if isinstance(i, list) or isinstance(i, tuple):
            ret.extend(flat(i))
        else:
            ret.append(i)
    return ret
I prefer simple answers. No generators. No recursion or recursion limits. Just iteration:
def flatten(TheList):
    listIsNested = True

    while listIsNested:                 #outer loop
        keepChecking = False
        Temp = []

        for element in TheList:         #inner loop
            if isinstance(element,list):
                Temp.extend(element)
                keepChecking = True
            else:
                Temp.append(element)

        listIsNested = keepChecking     #determine if outer loop exits
        TheList = Temp[:]

    return TheList
This works with two loops: an inner for loop and an outer while loop.
The inner for loop iterates through the list. If it finds a list element, it (1) uses list.extend() to flatten that part by one level of nesting and (2) switches keepChecking to True. keepChecking is used to control the outer while loop: if it gets set to True, it triggers the inner loop for another pass.
Those passes keep happening until no more nested lists are found. When a pass finally occurs where none are found, keepChecking never gets tripped to True, which means listIsNested becomes False and the outer while loop exits.
The flattened list is then returned.
Test-run
flatten([1,2,3,4,[100,200,300,[1000,2000,3000]]])
[1, 2, 3, 4, 100, 200, 300, 1000, 2000, 3000]
Answer 12
Here is a simple function that flattens lists of arbitrary depth. No recursion, to avoid stack overflow.
from copy import deepcopy

def flatten_list(nested_list):
    """Flatten an arbitrarily nested list, without recursion (to avoid
    stack overflows). Returns a new list, the original list is unchanged.
    >> list(flatten_list([1, 2, 3, [4], [], [[[[[[[[[5]]]]]]]]]]))
    [1, 2, 3, 4, 5]
    >> list(flatten_list([[1, 2], 3]))
    [1, 2, 3]
    """
    nested_list = deepcopy(nested_list)

    while nested_list:
        sublist = nested_list.pop(0)

        if isinstance(sublist, list):
            nested_list = sublist + nested_list
        else:
            yield sublist
I’m surprised no one has thought of this. Damn recursion: I don’t get the recursive answers that the advanced people here wrote. Anyway, here is my attempt at this. The caveat is that it’s very specific to the OP’s use case.
I didn’t go through all the already available answers here, but here is a one-liner I came up with, borrowing from Lisp’s way of processing lists as first and rest:
def flatten(l): return flatten(l[0]) + (flatten(l[1:]) if len(l) > 1 else []) if type(l) is list else [l]
When trying to answer such a question, you really need to give the limitations of the code you propose as a solution. If it were only about performance I wouldn’t mind too much, but most of the code proposed as a solution (including the accepted answer) fails to flatten any list that has a depth greater than 1000.
When I say most of the code, I mean all code that uses any form of recursion (or calls a standard library function that is recursive). All of it fails because, for every recursive call made, the (call) stack grows by one unit, and the (default) Python call stack has a size of 1000.
If you’re not too familiar with the call stack, then maybe the following will help (otherwise you can just scroll to the Implementation).
Call stack size and recursive programming (dungeon analogy)
Finding the treasure and exit
Imagine you enter a huge dungeon with numbered rooms, looking for a treasure. You don’t know the place but you have some indications on how to find the treasure. Each indication is a riddle (difficulty varies, but you can’t predict how hard they will be). You decide to think a little bit about a strategy to save time, you make two observations:
It’s hard (long) to find the treasure as you’ll have to solve (potentially hard) riddles to get there.
Once the treasure is found, returning to the entrance may be easy: you just have to use the same path in the other direction (though this needs a bit of memory to recall your path).
When entering the dungeon, you notice a small notebook there. You decide to use it to write down every room you exit after solving a riddle (when entering a new room); this way you’ll be able to return to the entrance. That’s a genius idea: you won’t even spend a cent implementing your strategy.
You enter the dungeon and solve the first 1001 riddles with great success, but then comes something you hadn’t planned for: you have no space left in the notebook you borrowed. You decide to abandon your quest, as you prefer not having the treasure to being lost forever inside the dungeon (that looks smart indeed).
Executing a recursive program
Basically, it’s the exact same thing as finding the treasure. The dungeon is the computer’s memory, and your goal now is not to find a treasure but to compute some function (find f(x) for a given x). The indications are simply sub-routines that will help you solve f(x). Your strategy is the same as the call stack strategy: the notebook is the stack, and the rooms are the functions’ return addresses:
x = ["over here", "am", "I"]
y = sorted(x) # You're about to enter a room named `sorted`, note down the current room address here so you can return back: 0x4004f4 (that room address looks weird)
# Seems like you went back from your quest using the return address 0x4004f4
# Let's see what you've collected
print(' '.join(y))
The problem you encountered in the dungeon is the same here: the call stack has a finite size (here 1000), so if you enter too many functions without returning, you will fill the call stack and get an error that looks like “Dear adventurer, I’m very sorry but your notebook is full”: RecursionError: maximum recursion depth exceeded. Note that you don’t need recursion to fill the call stack, but it’s very unlikely that a non-recursive program calls 1000 functions without ever returning. It’s also important to understand that once you return from a function, the call stack is freed from the address used (hence the name “stack”: return addresses are pushed before entering a function and popped when returning). In the special case of a simple recursion (a function f that calls itself once, over and over) you will enter f over and over until the computation is finished (until the treasure is found) and then return from f until you are back at the place where you first called f. The call stack is never freed from anything until the end, where it is freed from all return addresses one after the other.
How to avoid this issue?
That’s actually pretty simple: “don’t use recursion if you don’t know how deep it can go”. That’s not always true, as in some cases tail call recursion can be optimized (TCO). But in Python this is not the case, and even a “well written” recursive function will not optimize stack use. There is an interesting post from Guido about this question: Tail Recursion Elimination.
There is a technique that you can use to make any recursive function iterative, this technique we could call bring your own notebook. For example, in our particular case we simply are exploring a list, entering a room is equivalent to entering a sublist, the question you should ask yourself is how can I get back from a list to its parent list? The answer is not that complex, repeat the following until the stack is empty:
push the current list address and index in a stack when entering a new sublist (note that a list address+index is also an address, therefore we just use the exact same technique used by the call stack);
every time an item is found, yield it (or add them in a list);
once a list is fully explored, go back to the parent list using the stack return address (and index).
Also note that this is equivalent to a DFS in a tree where some nodes are sublists A = [1, 2] and some are simple items: 0, 1, 2, 3, 4 (for L = [0, [1,2], 3, 4]). The tree looks like this:
The DFS traversal pre-order is: L, 0, A, 1, 2, 3, 4. Remember, in order to implement an iterative DFS you also “need” a stack. The implementation I proposed before results in the following states (for the stack and the flat_list):
In this example, the stack maximum size is 2, because the input list (and therefore the tree) has depth 2.
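The three steps above can be sketched with an explicit stack of [list, next_index] frames. This is only an illustration under that assumption (the function name flatten_with_trace is made up here); it also records the largest stack size so the claim about depth can be checked:

```python
def flatten_with_trace(nested):
    """Flatten lists with an explicit stack of [list, next_index] frames."""
    flat_list = []
    stack = [[nested, 0]]
    max_depth = len(stack)
    while stack:
        current, index = stack[-1]
        if index == len(current):
            stack.pop()              # sublist fully explored: return to the parent
            continue
        stack[-1][1] += 1            # remember where to resume in this list
        item = current[index]
        if isinstance(item, list):
            stack.append([item, 0])  # enter the sublist
            max_depth = max(max_depth, len(stack))
        else:
            flat_list.append(item)
    return flat_list, max_depth

flat, depth = flatten_with_trace([0, [1, 2], 3, 4])
print(flat)   # [0, 1, 2, 3, 4]
print(depth)  # 2
```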
Implementation
For the implementation, in python you can simplify a little bit by using iterators instead of simple lists. References to the (sub)iterators will be used to store sublists return addresses (instead of having both the list address and the index). This is not a big difference but I feel this is more readable (and also a bit faster):
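A minimal sketch of such an implementation, assuming the names flatten_iter and is_list_like that the surrounding text refers to; the body illustrates the technique and is not necessarily the exact code of the answer:

```python
def is_list_like(item):
    # Simplest definition: only real lists count as nested containers
    return isinstance(item, list)

def flatten_iter(iterable):
    """Yield leaves depth-first, using a stack of iterators as return addresses."""
    cursor_stack = [iter(iterable)]
    while cursor_stack:
        try:
            item = next(cursor_stack[-1])
        except StopIteration:
            cursor_stack.pop()               # sublist exhausted: back to the parent
            continue
        if is_list_like(item):
            cursor_stack.append(iter(item))  # enter the sublist
        else:
            yield item

print(list(flatten_iter([0, [1, 2], 3, 4])))  # [0, 1, 2, 3, 4]
```

Each iterator remembers its own position, so a single reference replaces the (list, index) pair.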
Also, notice that in is_list_like I have isinstance(item, list), which could be changed to handle more input types; here I just wanted the simplest version, where (iterable) is just a list. But you could also do this:
def is_list_like(item):
    try:
        iter(item)
        return not isinstance(item, str)  # strings are not lists (hmm...)
    except TypeError:
        return False
This considers strings as “simple items”, and therefore flatten_iter([["test", "a"], "b"]) will return ["test", "a", "b"] and not ["t", "e", "s", "t", "a", "b"]. Note that in this case iter(item) is called twice on each item; let’s pretend it’s an exercise for the reader to make this cleaner.
Testing and remarks on other implementations
In the end, remember that you can’t print an infinitely nested list L using print(L), because internally it will use recursive calls to __repr__ (RecursionError: maximum recursion depth exceeded while getting the repr of an object). For the same reason, solutions to flatten involving str will fail with the same error message.
If you need to test your solution, you can use this function to generate a simple nested list:
def build_deep_list(depth):
    """Returns a list of the form $l_{depth} = [depth-1, l_{depth-1}]$
    with $depth > 1$ and $l_0 = [0]$.
    """
    sub_list = [0]
    for d in range(1, depth):
        sub_list = [d, sub_list]
    return sub_list
Which gives: build_deep_list(5) >>> [4, [3, [2, [1, [0]]]]].
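To see the limit in practice, here is a small self-contained sketch. The two flatten functions are illustrative stand-ins written for this test, not any particular answer's code. A list nested deeper than the recursion limit breaks the recursive version, while the explicit-stack version handles it:

```python
import sys

def build_deep_list(depth):
    sub_list = [0]
    for d in range(1, depth):
        sub_list = [d, sub_list]
    return sub_list

def flatten_recursive(items):
    result = []
    for item in items:
        if isinstance(item, list):
            result.extend(flatten_recursive(item))
        else:
            result.append(item)
    return result

def flatten_iterative(items):
    stack = [iter(items)]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()
            continue
        if isinstance(item, list):
            stack.append(iter(item))
        else:
            yield item

depth = sys.getrecursionlimit() + 100   # deeper than the call stack allows
deep = build_deep_list(depth)

try:
    flatten_recursive(deep)
    recursive_failed = False
except RecursionError:
    recursive_failed = True

iterative_result = list(flatten_iterative(deep))
print(recursive_failed)                # True
print(len(iterative_result) == depth)  # True
```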
Answer 16
Here’s the compiler.ast.flatten implementation in 2.7.5:
def flatten(seq):
    l = []
    for elt in seq:
        t = type(elt)
        if t is tuple or t is list:
            for elt2 in flatten(elt):
                l.append(elt2)
        else:
            l.append(elt)
    return l
There are better, faster methods (If you’ve reached here, you have seen them already)
Also note:
Deprecated since version 2.6: The compiler package has been removed in Python 3.
Here is another py2 approach. I’m not sure if it’s the fastest, the most elegant, or the safest …
from collections import Iterable
from itertools import imap, repeat, chain

def flat(seqs, ignore=(int, long, float, basestring)):
    return repeat(seqs, 1) if any(imap(isinstance, repeat(seqs), ignore)) or not isinstance(seqs, Iterable) else chain.from_iterable(imap(flat, seqs))
It can ignore any specific (or derived) type you would like. It returns an iterator, so you can convert it to any specific container such as list, tuple or dict, or simply consume it in order to reduce the memory footprint. For better or worse, it can handle initial non-iterable objects such as int …
Note that most of the heavy lifting is done in C, since as far as I know that’s how itertools is implemented. So while it is recursive, AFAIK it isn’t bounded by the Python recursion depth, since the function calls happen in C. This doesn’t mean you aren’t bounded by memory, though, especially on OS X, where the stack size has a hard limit as of today (OS X Mavericks) …
There is a slightly faster but less portable approach. Only use it if you can assume that the base elements of the input can be explicitly determined; otherwise you’ll get an infinite recursion, and OS X, with its limited stack size, will throw a segmentation fault fairly quickly …
def flat(seqs, ignore={int, long, float, str, unicode}):
    return repeat(seqs, 1) if type(seqs) in ignore or not isinstance(seqs, Iterable) else chain.from_iterable(imap(flat, seqs))
Here we are using sets to check for the type, so it takes O(1) vs. O(number of types) to check whether or not an element should be ignored. Of course, any value whose type is derived from one of the stated ignored types will not be matched, which is why it lists str and unicode explicitly, so use it with caution …
tests:
import random

def test_flat(test_size=2000):
    def increase_depth(value, depth=1):
        for func in xrange(depth):
            value = repeat(value, 1)
        return value

    def random_sub_chaining(nested_values):
        for values in nested_values:
            yield chain((values,), chain.from_iterable(imap(next, repeat(nested_values, random.randint(1, 10)))))

    expected_values = zip(xrange(test_size), imap(str, xrange(test_size)))
    nested_values = random_sub_chaining((increase_depth(value, depth) for depth, value in enumerate(expected_values)))
    assert not any(imap(cmp, chain.from_iterable(expected_values), flat(chain(((),), nested_values, ((),)))))

>>> test_flat()
>>> list(flat([[[1, 2, 3], [4, 5]], 6]))
[1, 2, 3, 4, 5, 6]
>>>
$ uname -a
Darwin Samys-MacBook-Pro.local 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun 3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64
$ python --version
Python 2.7.5
Answer 20
Without using any library:
def flat(l):
    def _flat(l, r):
        if type(l) is not list:
            r.append(l)
        else:
            for i in l:
                r = r + flat(i)
        return r
    return _flat(l, [])

# example
test = [[1], [[2]], [3], [['a','b','c'] , [['z','x','y']], ['d','f','g']], 4]
print flat(test) # prints [1, 2, 3, 'a', 'b', 'c', 'z', 'x', 'y', 'd', 'f', 'g', 4]
Answer 21
Using itertools.chain:
import itertools
from collections.abc import Iterable  # the collections.Iterable alias was removed in Python 3.10

def list_flatten(lst):
    flat_lst = []
    for item in itertools.chain(lst):
        if isinstance(item, Iterable):
            item = list_flatten(item)
            flat_lst.extend(item)
        else:
            flat_lst.append(item)
    return flat_lst
Or without chaining:
def flatten(q, final):
    if not q:
        return
    if isinstance(q, list):
        if not isinstance(q[0], list):
            final.append(q[0])
        else:
            flatten(q[0], final)
        flatten(q[1:], final)
    else:
        final.append(q)
Answer 22
I used recursion to handle nested lists of any depth:
def combine_nlist(nlist, init=0, combiner=lambda x, y: x + y):
    '''
    apply function: combiner to a nested list element by element (treated as a flattened list)
    '''
    current_value = init
    for each_item in nlist:
        if isinstance(each_item, list):
            current_value = combine_nlist(each_item, current_value, combiner)
        else:
            current_value = combiner(current_value, each_item)
    return current_value
So once the function combine_nlist is defined, it is easy to use it for flattening, or you can combine both into one function. I like my solution because it can be applied to any nested list.
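As a sketch of how the flattening could look (the combiner lambda and the variable names below are assumptions chosen for this example), passing an empty list as init and a combiner that appends each leaf turns the fold into a flatten:

```python
def combine_nlist(nlist, init=0, combiner=lambda x, y: x + y):
    """Fold `combiner` over the leaves of a nested list, left to right."""
    current_value = init
    for each_item in nlist:
        if isinstance(each_item, list):
            current_value = combine_nlist(each_item, current_value, combiner)
        else:
            current_value = combiner(current_value, each_item)
    return current_value

nested = [1, [2, [3, 4]], [[5]], 6]

# Default behaviour: sum of all leaves
total = combine_nlist(nested)

# Flattening: start from [] and append each leaf to a new list
flat = combine_nlist(nested, init=[], combiner=lambda acc, item: acc + [item])

print(total)  # 21
print(flat)   # [1, 2, 3, 4, 5, 6]
```

Because acc + [item] builds a new list at every step, this is convenient rather than fast.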
I am aware that there are already many awesome answers, but I wanted to add an answer that uses the functional programming method of solving the question. In this answer I make use of double recursion:
def flatten_list(seq):
    if not seq:
        return []
    elif isinstance(seq[0], list):
        return (flatten_list(seq[0]) + flatten_list(seq[1:]))
    else:
        return [seq[0]] + flatten_list(seq[1:])

print(flatten_list([1, 2, [3, [4], 5], [6, 7]]))
output:
[1, 2, 3, 4, 5, 6, 7]
Answer 25
I’m not sure if this is necessarily quicker or more effective, but this is what I do:
def flatten(lst):
    return eval('[' + str(lst).replace('[', '').replace(']', '') + ']')

L = [[[1, 2, 3], [4, 5]], 6]
print(flatten(L))
The flatten function here turns the list into a string, takes out all of the square brackets, attaches square brackets back onto the ends, and turns it back into a list.
Although, if you knew you would have square brackets in your list in strings, like [[1, 2], "[3, 4] and [5]"], you would have to do something else.
This will flatten a list or dictionary (or list of lists or dictionaries of dictionaries etc). It assumes that the values are strings and it creates a string that concatenates each item with a separator argument. If you wanted you could use the separator to split the result into a list object afterward. It uses recursion if the next value is a list or a string. Use the key argument to tell whether you want the keys or the values (set key to false) from the dictionary object.
def flatten_obj(n_obj, key=True, my_sep=''):
    my_string = ''
    if type(n_obj) == list:
        for val in n_obj:
            my_sep_setter = my_sep if my_string != '' else ''
            if type(val) == list or type(val) == dict:
                my_string += my_sep_setter + flatten_obj(val, key, my_sep)
            else:
                my_string += my_sep_setter + val
    elif type(n_obj) == dict:
        for k, v in n_obj.items():
            my_sep_setter = my_sep if my_string != '' else ''
            d_val = k if key else v
            if type(v) == list or type(v) == dict:
                my_string += my_sep_setter + flatten_obj(v, key, my_sep)
            else:
                my_string += my_sep_setter + d_val
    elif type(n_obj) == str:
        my_sep_setter = my_sep if my_string != '' else ''
        my_string += my_sep_setter + n_obj
        return my_string
    return my_string

print(flatten_obj(['just', 'a', ['test', 'to', 'try'], 'right', 'now', ['or', 'later', 'today'],
                   [{'dictionary_test': 'test'}, {'dictionary_test_two': 'later_today'}, 'my power is 9000']], my_sep=', '))
yields:
just, a, test, to, try, right, now, or, later, today, dictionary_test, dictionary_test_two, my power is 9000
Answer 28
If you like recursion, this might be a solution of interest to you:
def f(E):
    if E == []:
        return []
    elif type(E) != list:
        return [E]
    else:
        a = f(E[0])
        b = f(E[1:])
        a.extend(b)
        return a
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
reduce(lambda x, y: x.extend(y), l)
Error message
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
AttributeError: 'NoneType' object has no attribute 'extend'
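The error comes from list.extend, which modifies the list in place and returns None, so the second reduce step receives None as its accumulator. A small sketch of the failure and of one way around it (written for Python 3, where reduce lives in functools; the names error and flat are just for this example):

```python
from functools import reduce

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# extend returns None, so the lambda hands None to the next step
try:
    reduce(lambda x, y: x.extend(y), [list(s) for s in l])
    error = None
except AttributeError as exc:
    error = exc

# Returning a new list from each step keeps the accumulator a list
flat = reduce(lambda x, y: x + y, l, [])

print(error)  # 'NoneType' object has no attribute 'extend'
print(flat)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```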
Answer 0
Given a list of lists l,
flat_list = [item for sublist in l for item in sublist]
which means:
flat_list = []
for sublist in l:
    for item in sublist:
        flat_list.append(item)
is faster than the shortcuts posted so far. (l is the list to flatten.)
Here is the corresponding function:
flatten = lambda l: [item for sublist in l for item in sublist]
As evidence, you can use the timeit module in the standard library:
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop
Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists — as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.
The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
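The growth can be checked with a small counting model. It is a sketch that assumes every item written into a freshly allocated list counts as one copy, so it also counts the newly added items and gives I * L * (L + 1) / 2, the same quadratic order as the estimate above; the function names are made up for this example:

```python
def copies_by_repeated_concat(num_sublists, items_per_sublist):
    """Count item copies when building the result with result = result + sublist."""
    copied = 0
    result_len = 0
    for _ in range(num_sublists):
        # a new list is allocated and receives every existing item
        # plus the items of the new sublist
        copied += result_len + items_per_sublist
        result_len += items_per_sublist
    return copied

def copies_by_comprehension(num_sublists, items_per_sublist):
    """The comprehension writes each item into the result exactly once."""
    return num_sublists * items_per_sublist

for L in (10, 100, 1000):
    print(L, copies_by_repeated_concat(L, 3), copies_by_comprehension(L, 3))
# 10 165 30
# 100 15150 300
# 1000 1501500 3000
```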
Note from the author: This is inefficient. But fun, because monoids are awesome. It’s not appropriate for production Python code.
>>> sum(l, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
This just sums the elements of iterable passed in the first argument, treating second argument as the initial value of the sum (if not given, 0 is used instead and this case will give you an error).
Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].
Note that this only works on lists of lists. For lists of lists of lists, you’ll need another solution.
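A short sketch of the role of the start value, using the same example list as the timings above. The variable names are only illustrative.

```python
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# With [] as the start value, sum concatenates the sublists.
flat = sum(l, [])

# Without a start value, sum begins with 0, and 0 + [1, 2, 3] is not defined.
try:
    sum(l)
    error = None
except TypeError as exc:
    error = exc

# Only one level is removed: inner lists of a deeper structure survive.
partly_flat = sum([[1, [2, 3]], [4]], [])

print(flat)
print(type(error).__name__)
print(partly_flat)
```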
>>> from matplotlib.cbook import flatten
>>> list(flatten(l))
…Unipath:
>>> from unipath.path import flatten
>>> list(flatten(l))
…Setuptools:
>>> from setuptools.namespaces import flatten
>>> list(flatten(l))
Answer 6
Here is a general approach that applies to numbers, strings, nested lists and mixed containers.
Code
#from typing import Iterable
from collections.abc import Iterable  # importing Iterable from plain collections fails on Python 3.10+
def flatten(items):
    """Yield items from any nested iterable; see Reference."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            for sub_x in flatten(x):
                yield sub_x
        else:
            yield x
Notes:
In Python 3, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
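For reference, a sketch of the same generator written with yield from, as the note suggests. It assumes Python 3.3 or later, and the sample input is made up for illustration.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield items from any nested iterable, treating str and bytes as atoms."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            # Delegate to the recursive generator instead of looping over it.
            yield from flatten(x)
        else:
            yield x


mixed = [1, [2, [3, "ab"]], (4, 5), b"xy"]
print(list(flatten(mixed)))
```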
Just to add some timings (based on Nico Schlömer’s answer, which didn’t include the function presented in this answer):
It’s a log-log plot to accommodate the huge range of values spanned. For qualitative reasoning: lower is better.
The results show that if the iterable contains only a few inner iterables, sum is fastest. For long iterables, however, only itertools.chain.from_iterable, iteration_utilities.deepflatten and the nested comprehension perform reasonably, with itertools.chain.from_iterable being the fastest (as Nico Schlömer already noted).
from itertools import chain
from functools import reduce
from collections.abc import Iterable  # plain collections no longer exports Iterable on Python 3.10+
import operator
from iteration_utilities import deepflatten
def nested_list_comprehension(lsts):
    return [item for sublist in lsts for item in sublist]
def itertools_chain_from_iterable(lsts):
    return list(chain.from_iterable(lsts))
def pythons_sum(lsts):
    return sum(lsts, [])
def reduce_add(lsts):
    return reduce(lambda x, y: x + y, lsts)
def pylangs_flatten(lsts):
    return list(flatten(lsts))
def flatten(items):
    """Yield items from any nested iterable; see REF."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x
def reduce_concat(lsts):
    return reduce(operator.concat, lsts)
def iteration_utilities_deepflatten(lsts):
    return list(deepflatten(lsts, depth=1))
from simple_benchmark import benchmark
b = benchmark(
    [nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
     pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
    arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
    argument_name='number of inner lists'
)
b.plot()
1 Disclaimer: I’m the author of that library
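If the third-party packages are not available, a rough version of the comparison can be sketched with only the standard library. This is not the benchmark that produced the plot: it covers three of the functions, uses timeit instead of simple_benchmark, and the helper name time_candidates is made up here. Absolute numbers will differ between machines.

```python
import timeit
from itertools import chain


def nested_list_comprehension(lsts):
    return [item for sublist in lsts for item in sublist]


def itertools_chain_from_iterable(lsts):
    return list(chain.from_iterable(lsts))


def pythons_sum(lsts):
    return sum(lsts, [])


candidates = [nested_list_comprehension, itertools_chain_from_iterable, pythons_sum]


def time_candidates(num_inner_lists, repeats=10):
    """Return {function name: seconds for `repeats` calls} for one input size."""
    data = [[0] * 5] * num_inner_lists  # same input shape as the benchmark above
    return {
        func.__name__: timeit.timeit(lambda: func(data), number=repeats)
        for func in candidates
    }


for size in (2 ** 4, 2 ** 8):
    print(size, time_candidates(size))
```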
Answer 8
I take my statement back: sum is not the winner. It is faster when the list is small, but its performance degrades significantly with larger lists.
>>> timeit.Timer(
'[item for sublist in l for item in sublist]',
'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10000'
).timeit(100)
2.0440959930419922
The sum version has been running for more than a minute and still hasn’t finished!
There seems to be some confusion about operator.add! When you join two lists, the correct term is concatenation, not addition, so operator.concat is what you need to use.
If you’re thinking functionally, it is as easy as this:
The reason your function didn’t work is that extend modifies the list in place and returns None rather than the list. You can still return x from the lambda, using something like this:
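The snippet that originally followed this sentence is missing here, so the following is only a sketch of what a reduce over operator.concat looks like. The variable name list2d and the explicit [] initializer (which keeps an empty input from raising) are choices made for this example.

```python
import operator
from functools import reduce

list2d = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# operator.concat(a, b) is a + b for sequences; reduce folds it over the sublists.
flat = reduce(operator.concat, list2d, [])

print(flat)
```

Like the other +-based shortcuts, this builds a new intermediate list at every step.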
reduce(lambda x,y: x.extend(y) or x, l)
Note: extend is more efficient than + on lists.
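One caveat worth sketching: without an initial value, reduce uses the first sublist as the accumulator, so extend mutates it. Passing a fresh [] as the third argument avoids that. The variable names below are illustrative only.

```python
from functools import reduce

# Without an initializer the first sublist becomes the accumulator and is mutated.
m = [[1, 2], [3], [4, 5]]
aliased = reduce(lambda acc, sub: acc.extend(sub) or acc, m)

# With a fresh list as the initializer the input is left untouched.
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
flat = reduce(lambda acc, sub: acc.extend(sub) or acc, l, [])

print(aliased is m[0], m[0])
print(flat, l[0])
```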
Answer 13
def flatten(l, a):
    for i in l:
        if isinstance(i, list):
            flatten(i, a)
        else:
            a.append(i)
    return a
print(flatten([[[1, [1, 1, [3, [4, 5]]]], 2, 3], [4, 5], 6], []))
# [1, 1, 1, 3, 4, 5, 2, 3, 4, 5, 6]
Answer 14
Recursive version
x = [1, 2, [3, 4], [5, [6, [7]]], 8, 9, [10]]
def flatten_list(k):
    result = list()
    for i in k:
        if isinstance(i, list):
            # The isinstance() function checks if the object (first argument) is an
            # instance or subclass of classinfo class (second argument)
            result.extend(flatten_list(i))  # Recursive call
        else:
            result.append(i)
    return result
flatten_list(x)
# result = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
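Recursive versions like this are limited by Python's recursion depth for very deeply nested input. As a sketch of one way around that (a different technique from the answer above, with a made-up function name), the recursion can be replaced by an explicit stack of iterators:

```python
def flatten_iterative(nested):
    """Flatten arbitrarily nested lists without recursion."""
    result = []
    stack = [iter(nested)]  # iterators over the lists currently being walked
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                # Descend: walk the inner list first, then resume this iterator.
                stack.append(iter(item))
                break
            result.append(item)
        else:
            # The innermost iterator is exhausted; go back to its parent.
            stack.pop()
    return result


x = [1, 2, [3, 4], [5, [6, [7]]], 8, 9, [10]]
print(flatten_iterative(x))
```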
Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636
Answer 16
The accepted answer did not work for me when dealing with text-based lists of variable lengths. Here is an alternate approach that did work for me.
l = ['aaa', 'bb', 'cccccc', ['xx', 'yyyyyyy']]
Accepted answer that did not work:
flat_list = [item for sublist in l for item in sublist]
print(flat_list)
['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'c', 'xx', 'yyyyyyy']
New proposed solution that did work for me:
flat_list = []
_ = [flat_list.extend(item) if isinstance(item, list) else flat_list.append(item) for item in l if item]
print(flat_list)
['aaa', 'bb', 'cccccc', 'xx', 'yyyyyyy']
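The same idea can also be written as an ordinary loop, which avoids building a throwaway list of None values. This is a sketch with a made-up function name. Unlike the one-liner above, it has no `if item` filter, so falsy elements such as 0 or '' are kept.

```python
def flatten_one_level(items):
    """Flatten one level, leaving non-list elements (such as strings) intact."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


l = ['aaa', 'bb', 'cccccc', ['xx', 'yyyyyyy']]
print(flatten_one_level(l))
```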
A bad feature of Anil’s function above is that it requires the user to always pass an empty list [] as the second argument. This should instead be a default. Because Python evaluates default argument values only once, when the function is defined, a mutable default such as [] would be shared between calls; the list should therefore be created inside the function, not in the argument list.
Here’s a working function:
def list_flatten(l, a=None):
    # check a
    if a is None:
        # initialize with empty list
        a = []
    for i in l:
        if isinstance(i, list):
            list_flatten(i, a)
        else:
            a.append(i)
    return a
Testing:
In [2]: lst = [1, 2, [3], [[4]],[5,[6]]]
In [3]: lst
Out[3]: [1, 2, [3], [[4]], [5, [6]]]
In [11]: list_flatten(lst)
Out[11]: [1, 2, 3, 4, 5, 6]
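To see why the list is created inside the function rather than written as a=[] in the signature, here is a small sketch of the pitfall. The function name flatten_shared_default is hypothetical and exists only to demonstrate the problem.

```python
def flatten_shared_default(l, a=[]):  # deliberately wrong: the default list is shared
    for i in l:
        if isinstance(i, list):
            flatten_shared_default(i, a)
        else:
            a.append(i)
    return a


# Copies are taken so later calls do not change what was recorded.
first = list(flatten_shared_default([1, [2]]))
second = list(flatten_shared_default([3]))

print(first)
print(second)  # still contains the items from the first call
```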
Answer 18
The following seems simplest to me:
>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print(np.concatenate(l))
[1 2 3 4 5 6 7 8 9]
If you are willing to give up some speed (roughly a factor of ten in the timings below) for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():
import numpy
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99
%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop
%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop
%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop
Note: The code below applies to Python 3.3+ because it uses yield from. six is also a third-party package, though it is stable. Alternatively, you could use sys.version.
In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.
However, consider this slightly more complex case:
One element, 6, is just a scalar; it’s not iterable, so the above routes will fail here.
One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don’t want to treat it as such–you want to treat it as a single element.
The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract “1 level down.”
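The three problems can be reproduced one at a time with chain.from_iterable. The small inputs below are made up for illustration and are not the original obj.

```python
from itertools import chain

# Scalar element: chain.from_iterable tries to iterate over 6 and fails.
try:
    list(chain.from_iterable([[1, 2], 6]))
    scalar_error = None
except TypeError as exc:
    scalar_error = exc

# String element: it is iterated character by character.
split_string = list(chain.from_iterable([[1, 2], 'abc']))

# Nested element: only one level of nesting is removed.
one_level = list(chain.from_iterable([[7], [8, [9, 10]]]))

print(type(scalar_error).__name__)
print(split_string)
print(one_level)
```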
You can remedy this as follows:
>>> from collections.abc import Iterable
>>> from six import string_types
>>> def flatten(obj):
...     for i in obj:
...         if isinstance(i, Iterable) and not isinstance(i, string_types):
...             yield from flatten(i)
...         else:
...             yield i
>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]
Here, you check that the sub-element (1) is iterable with Iterable, an ABC from the collections.abc module, but also want to ensure that (2) the element is not “string-like.”
flat_list = []
for i in list_of_list:
    flat_list += i
This code also works fine, because += extends flat_list in place with each sublist. It is very similar to the nested comprehension but needs only one explicit for loop. The total work is still proportional to the number of items, so the gain is in brevity rather than in complexity.
Answer 27
from nltk import flatten
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
flatten(l)
This may not be the most efficient way, but I thought I would post a one-liner (actually a two-liner). Both versions work on arbitrarily nested lists and exploit language features (Python 3.5) and recursion.
def make_list_flat(l):
    flist = []
    flist.extend([l]) if (type(l) is not list) else [flist.extend(make_list_flat(e)) for e in l]
    return flist
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print(flist)
This works in a depth-first manner. The recursion goes down until it finds a non-list element, extends the local variable flist, and then returns it to the parent. Whenever flist is returned, it is used to extend the parent’s flist in the list comprehension. Therefore, at the root, a flat list is returned.
The version above creates several local lists and returns them to extend the parent’s list. One way around this may be to create a global flist, as below.
a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat(l):
    flist.extend([l]) if (type(l) is not list) else [make_list_flat(e) for e in l]
make_list_flat(a)
print(flist)
Although I am not sure at this time about the efficiency.
Answer 29
Another unusual approach that works for hetero- and homogeneous lists of integers:
from typing import List
def flatten(l: list) -> List[int]:
    """Flatten an arbitrarily deep nested list of lists of integers.
    Examples:
        >>> flatten([1, 2, [1, [10]]])
        [1, 2, 1, 10]
    Args:
        l: Union[l, Union[int, List[int]]
    Returns:
        Flattened list of integers
    """
    return [int(i.strip('[ ]')) for i in str(l).split(',')]
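One limitation of the string-based trick is that an empty sublist leaves an empty token behind, and int('') raises a ValueError. A sketch of a variant that skips such tokens follows. The name flatten_ints is made up here, and like the original it only makes sense for lists that contain nothing but integers.

```python
from typing import List


def flatten_ints(l: list) -> List[int]:
    """Flatten nested lists of integers via their string representation."""
    # Strip brackets and spaces from each comma-separated token.
    tokens = (token.strip('[ ]') for token in str(l).split(','))
    # Empty tokens come from empty lists such as [] and are skipped.
    return [int(token) for token in tokens if token]


print(flatten_ints([1, 2, [1, [10]]]))
print(flatten_ints([1, [], [-3, [4]]]))
```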