标签归档:algorithm

表示并解决给定图像的迷宫

问题:表示并解决给定图像的迷宫

代表并解决给定图像的迷宫的最佳方法是什么?

范围问题134的封面图片

给定JPEG图像(如上所示),读入,将其解析为某种数据结构并解决迷宫的最佳方法是什么?我的第一个本能是逐像素读取图像并将其存储在布尔值列表(数组)中:True对于白色像素,False对于非白色像素(可以丢弃颜色)。这种方法的问题在于图像可能不是“像素完美”的。我的意思只是说,如果墙壁上的某处有白色像素,可能会产生意外的路径。

另一种方法(经过一番思考后才想到)是将图像转换为SVG文件-SVG文件是在画布上绘制的路径的列表。这样,可以将路径读入相同种类的列表(布尔值),其中True表示路径或墙壁,False表示可移动的空间。如果转换不是100%准确,并且不能完全连接所有墙,从而产生间隙,则此方法会出现问题。

转换为SVG的另一个问题是这些线不是“完美”的直线。这导致路径是三次贝塞尔曲线。使用由整数索引的布尔值列表(数组),曲线将不易转移,并且必须计算曲线上直线的所有点,但不会与列表索引完全匹配。

我假设虽然其中一种方法可能会(虽然可能不会)起作用,但考虑到如此大的图像,它们的效率很低,并且存在更好的方法。如何做到最好(最有效和/或最低复杂度)?有没有最好的方法?

然后是迷宫的解决。如果我使用前两种方法中的任何一种,则基本上将得到一个矩阵。根据该答案,表示迷宫的一种好方法是使用树,而使用A *算法来解决它的好方法。一个人如何根据图像创建一棵树?有任何想法吗?

TL; DR
解析的最佳方法?变成什么数据结构?所述结构将如何帮助/阻碍解决?

更新
我已尝试使用numpy@Thomas建议的方式实现@Mikhail用Python编写的内容。我认为该算法是正确的,但无法正常运行。(下面的代码。)PNG库是PyPNG

import png, numpy, Queue, operator, itertools

def is_white(coord, image):
  """ Returns whether (x, y) is approx. a white pixel."""
  a = True
  for i in xrange(3):
    if not a: break
    a = image[coord[1]][coord[0] * 3 + i] > 240
  return a

def bfs(s, e, i, visited):
  """ Perform a breadth-first search. """
  frontier = Queue.Queue()
  while s != e:
    for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:
      np = tuple(map(operator.add, s, d))
      if is_white(np, i) and np not in visited:
        frontier.put(np)
    visited.append(s)
    s = frontier.get()
  return visited

def main():
  r = png.Reader(filename = "thescope-134.png")
  rows, cols, pixels, meta = r.asDirect()
  assert meta['planes'] == 3 # ensure the file is RGB
  image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))
  start, end = (402, 985), (398, 27)
  print bfs(start, end, image2d, [])

What is the best way to represent and solve a maze given an image?

The cover image of The Scope Issue 134

Given an JPEG image (as seen above), what’s the best way to read it in, parse it into some data structure and solve the maze? My first instinct is to read the image in pixel by pixel and store it in a list (array) of boolean values: True for a white pixel, and False for a non-white pixel (the colours can be discarded). The issue with this method, is that the image may not be “pixel perfect”. By that I simply mean that if there is a white pixel somewhere on a wall it may create an unintended path.

Another method (which came to me after a bit of thought) is to convert the image to an SVG file – which is a list of paths drawn on a canvas. This way, the paths could be read into the same sort of list (boolean values) where True indicates a path or wall, False indicating a travel-able space. An issue with this method arises if the conversion is not 100% accurate, and does not fully connect all of the walls, creating gaps.

Also an issue with converting to SVG is that the lines are not “perfectly” straight. This results in the paths being cubic bezier curves. With a list (array) of boolean values indexed by integers, the curves would not transfer easily, and all the points that line on the curve would have to be calculated, but won’t exactly match to list indices.

I assume that while one of these methods may work (though probably not) that they are woefully inefficient given such a large image, and that there exists a better way. How is this best (most efficiently and/or with the least complexity) done? Is there even a best way?

Then comes the solving of the maze. If I use either of the first two methods, I will essentially end up with a matrix. According to this answer, a good way to represent a maze is using a tree, and a good way to solve it is using the A* algorithm. How would one create a tree from the image? Any ideas?

TL;DR
Best way to parse? Into what data structure? How would said structure help/hinder solving?

UPDATE
I’ve tried my hand at implementing what @Mikhail has written in Python, using numpy, as @Thomas recommended. I feel that the algorithm is correct, but it’s not working as hoped. (Code below.) The PNG library is PyPNG.

import png, numpy, Queue, operator, itertools

def is_white(coord, image):
  """ Returns whether (x, y) is approx. a white pixel."""
  a = True
  for i in xrange(3):
    if not a: break
    a = image[coord[1]][coord[0] * 3 + i] > 240
  return a

def bfs(s, e, i, visited):
  """ Perform a breadth-first search. """
  frontier = Queue.Queue()
  while s != e:
    for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:
      np = tuple(map(operator.add, s, d))
      if is_white(np, i) and np not in visited:
        frontier.put(np)
    visited.append(s)
    s = frontier.get()
  return visited

def main():
  r = png.Reader(filename = "thescope-134.png")
  rows, cols, pixels, meta = r.asDirect()
  assert meta['planes'] == 3 # ensure the file is RGB
  image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))
  start, end = (402, 985), (398, 27)
  print bfs(start, end, image2d, [])

回答 0

这是一个解决方案。

  1. 将图像转换为灰度(尚未二进制),调整颜色的权重,以使最终的灰度图像大致均匀。您只需在Photoshop中控制图像->调整->黑白中的滑块即可完成此操作。
  2. 通过在Photoshop中的“图像”->“调整”->“阈值”中设置适当的阈值,将图像转换为二进制。
  3. 确保正确选择阈值。使用魔术棒工具,公差为0,点采样,连续,无抗锯齿。检查选择中断处的边不是由错误阈值引入的错误边。实际上,从一开始就可以访问此迷宫的所有内部点。
  4. 在迷宫上添加人工边界,以确保虚拟旅行者不会在它周围走动:)
  5. 以您喜欢的语言实现广度优先搜索(BFS),并从头开始运行它。我更喜欢MATLAB来完成这项任务。正如@Thomas已经提到的那样,无需弄乱图的常规表示。您可以直接使用二值化图像。

这是BFS的MATLAB代码:

function path = solve_maze(img_file)
  %% Init data
  img = imread(img_file);
  img = rgb2gray(img);
  maze = img > 0;
  start = [985 398];
  finish = [26 399];

  %% Init BFS
  n = numel(maze);
  Q = zeros(n, 2);
  M = zeros([size(maze) 2]);
  front = 0;
  back = 1;

  function push(p, d)
    q = p + d;
    if maze(q(1), q(2)) && M(q(1), q(2), 1) == 0
      front = front + 1;
      Q(front, :) = q;
      M(q(1), q(2), :) = reshape(p, [1 1 2]);
    end
  end

  push(start, [0 0]);

  d = [0 1; 0 -1; 1 0; -1 0];

  %% Run BFS
  while back <= front
    p = Q(back, :);
    back = back + 1;
    for i = 1:4
      push(p, d(i, :));
    end
  end

  %% Extracting path
  path = finish;
  while true
    q = path(end, :);
    p = reshape(M(q(1), q(2), :), 1, 2);
    path(end + 1, :) = p;
    if isequal(p, start) 
      break;
    end
  end
end

它确实非常简单和标准,因此在Python或其他任何方式中实现它应该没有困难。

这是答案:

在此处输入图片说明

Here is a solution.

  1. Convert image to grayscale (not yet binary), adjusting weights for the colors so that final grayscale image is approximately uniform. You can do it simply by controlling sliders in Photoshop in Image -> Adjustments -> Black & White.
  2. Convert image to binary by setting appropriate threshold in Photoshop in Image -> Adjustments -> Threshold.
  3. Make sure threshold is selected right. Use the Magic Wand Tool with 0 tolerance, point sample, contiguous, no anti-aliasing. Check that edges at which selection breaks are not false edges introduced by wrong threshold. In fact, all interior points of this maze are accessible from the start.
  4. Add artificial borders on the maze to make sure virtual traveler will not walk around it :)
  5. Implement breadth-first search (BFS) in your favorite language and run it from the start. I prefer MATLAB for this task. As @Thomas already mentioned, there is no need to mess with regular representation of graphs. You can work with binarized image directly.

Here is the MATLAB code for BFS:

function path = solve_maze(img_file)
  %% Init data
  img = imread(img_file);
  img = rgb2gray(img);
  maze = img > 0;
  start = [985 398];
  finish = [26 399];

  %% Init BFS
  n = numel(maze);
  Q = zeros(n, 2);
  M = zeros([size(maze) 2]);
  front = 0;
  back = 1;

  function push(p, d)
    q = p + d;
    if maze(q(1), q(2)) && M(q(1), q(2), 1) == 0
      front = front + 1;
      Q(front, :) = q;
      M(q(1), q(2), :) = reshape(p, [1 1 2]);
    end
  end

  push(start, [0 0]);

  d = [0 1; 0 -1; 1 0; -1 0];

  %% Run BFS
  while back <= front
    p = Q(back, :);
    back = back + 1;
    for i = 1:4
      push(p, d(i, :));
    end
  end

  %% Extracting path
  path = finish;
  while true
    q = path(end, :);
    p = reshape(M(q(1), q(2), :), 1, 2);
    path(end + 1, :) = p;
    if isequal(p, start) 
      break;
    end
  end
end

It is really very simple and standard, there should not be difficulties on implementing this in Python or whatever.

And here is the answer:

Enter image description here


回答 1

该解决方案是用Python编写的。感谢米哈伊尔(Mikhail)提供有关图像准备工作的指导。

动画式广度优先搜索:

BFS的动画版本

完成的迷宫:

完成迷宫

#!/usr/bin/env python

import sys

from Queue import Queue
from PIL import Image

start = (400,984)
end = (398,25)

def iswhite(value):
    if value == (255,255,255):
        return True

def getadjacent(n):
    x,y = n
    return [(x-1,y),(x,y-1),(x+1,y),(x,y+1)]

def BFS(start, end, pixels):

    queue = Queue()
    queue.put([start]) # Wrapping the start tuple in a list

    while not queue.empty():

        path = queue.get() 
        pixel = path[-1]

        if pixel == end:
            return path

        for adjacent in getadjacent(pixel):
            x,y = adjacent
            if iswhite(pixels[x,y]):
                pixels[x,y] = (127,127,127) # see note
                new_path = list(path)
                new_path.append(adjacent)
                queue.put(new_path)

    print "Queue has been exhausted. No answer was found."


if __name__ == '__main__':

    # invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]
    base_img = Image.open(sys.argv[1])
    base_pixels = base_img.load()

    path = BFS(start, end, base_pixels)

    path_img = Image.open(sys.argv[1])
    path_pixels = path_img.load()

    for position in path:
        x,y = position
        path_pixels[x,y] = (255,0,0) # red

    path_img.save(sys.argv[2])

注意:标记为白色的访问像素为灰色。这消除了对访问列表的需求,但是这需要在绘制路径之前从磁盘中第二次加载图像文件(如果您不希望使用最终路径和所有路径的合成图像)。

我使用的迷宫的空白版本。

This solution is written in Python. Thanks Mikhail for the pointers on the image preparation.

An animated Breadth-First Search:

Animated version of BFS

The Completed Maze:

Completed Maze

#!/usr/bin/env python

import sys

from Queue import Queue
from PIL import Image

start = (400,984)
end = (398,25)

def iswhite(value):
    if value == (255,255,255):
        return True

def getadjacent(n):
    x,y = n
    return [(x-1,y),(x,y-1),(x+1,y),(x,y+1)]

def BFS(start, end, pixels):

    queue = Queue()
    queue.put([start]) # Wrapping the start tuple in a list

    while not queue.empty():

        path = queue.get() 
        pixel = path[-1]

        if pixel == end:
            return path

        for adjacent in getadjacent(pixel):
            x,y = adjacent
            if iswhite(pixels[x,y]):
                pixels[x,y] = (127,127,127) # see note
                new_path = list(path)
                new_path.append(adjacent)
                queue.put(new_path)

    print "Queue has been exhausted. No answer was found."


if __name__ == '__main__':

    # invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]
    base_img = Image.open(sys.argv[1])
    base_pixels = base_img.load()

    path = BFS(start, end, base_pixels)

    path_img = Image.open(sys.argv[1])
    path_pixels = path_img.load()

    for position in path:
        x,y = position
        path_pixels[x,y] = (255,0,0) # red

    path_img.save(sys.argv[2])

Note: Marks a white visited pixel grey. This removes the need for a visited list, but this requires a second load of the image file from disk before drawing a path (if you don’t want a composite image of the final path and ALL paths taken).

A blank version of the maze I used.


回答 2

我尝试自己实施A-Star搜索以解决此问题。紧跟着约瑟夫·科恩Joseph Kern)此处给出的框架和算法伪代码的实现:

def AStar(start, goal, neighbor_nodes, distance, cost_estimate):
    def reconstruct_path(came_from, current_node):
        path = []
        while current_node is not None:
            path.append(current_node)
            current_node = came_from[current_node]
        return list(reversed(path))

    g_score = {start: 0}
    f_score = {start: g_score[start] + cost_estimate(start, goal)}
    openset = {start}
    closedset = set()
    came_from = {start: None}

    while openset:
        current = min(openset, key=lambda x: f_score[x])
        if current == goal:
            return reconstruct_path(came_from, goal)
        openset.remove(current)
        closedset.add(current)
        for neighbor in neighbor_nodes(current):
            if neighbor in closedset:
                continue
            if neighbor not in openset:
                openset.add(neighbor)
            tentative_g_score = g_score[current] + distance(current, neighbor)
            if tentative_g_score >= g_score.get(neighbor, float('inf')):
                continue
            came_from[neighbor] = current
            g_score[neighbor] = tentative_g_score
            f_score[neighbor] = tentative_g_score + cost_estimate(neighbor, goal)
    return []

由于A-Star是一种启发式搜索算法,因此您需要提供一个函数,该函数可以估算直到达到目标之前的剩余成本(此处为距离)。除非您对次优解决方案感到满意,否则不要高估成本。此处的保守选择是曼哈顿(或出租车)距离,因为这表示所用冯·诺依曼邻域在网格上两点之间的直线距离。(在这种情况下,永远都不会高估成本。)

但是,这将大大低估手边给定迷宫的实际成本。因此,我添加了其他两个距离度量标准,即欧几里得距离和曼哈顿距离乘以4进行比较。但是,这些可能会高估实际成本,因此可能会产生次优的结果。

这是代码:

import sys
from PIL import Image

def is_blocked(p):
    x,y = p
    pixel = path_pixels[x,y]
    if any(c < 225 for c in pixel):
        return True
def von_neumann_neighbors(p):
    x, y = p
    neighbors = [(x-1, y), (x, y-1), (x+1, y), (x, y+1)]
    return [p for p in neighbors if not is_blocked(p)]
def manhattan(p1, p2):
    return abs(p1[0]-p2[0]) + abs(p1[1]-p2[1])
def squared_euclidean(p1, p2):
    return (p1[0]-p2[0])**2 + (p1[1]-p2[1])**2

start = (400, 984)
goal = (398, 25)

# invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]

path_img = Image.open(sys.argv[1])
path_pixels = path_img.load()

distance = manhattan
heuristic = manhattan

path = AStar(start, goal, von_neumann_neighbors, distance, heuristic)

for position in path:
    x,y = position
    path_pixels[x,y] = (255,0,0) # red

path_img.save(sys.argv[2])

这是可视化结果的一些图像(灵感来自Joseph Kern发布的图像)。动画在主while循环进行10000次迭代后分别显示一个新帧。

广度优先搜索:

广度优先搜索

曼哈顿A级明星距离:

曼哈顿A星距离

A星平方欧几里德距离:

A星平方欧几里德距离

曼哈顿A级明星距离乘以4:

曼哈顿A级明星距离乘以4

结果表明,对于使用的启发式方法,迷宫的探索区域差异很大。因此,平方欧几里德距离甚至会产生与其他度量不同的(次优)路径。

关于A-Star算法在终止之前的运行时间方面的性能,请注意,与距离和成本函数相比,很多评估相加,而广度优先搜索(BFS)只需评估目标的“目标”每个候选职位。这些附加功能评估(A-Star)的成本是否超过了要检查的大量节点(BFS)的成本,尤其是性能是否对您的应用程序来说完全是一个问题,这取决于个人的看法。当然不能普遍回答。

一件事,可以在一般的说一下是否知情的搜索算法(如A-星)可能比穷举搜索(例如,BFS)是更好的选择如下。随着迷宫的维数,即搜索树的分支因子,穷举搜索(穷举搜索)的缺点呈指数增长。随着复杂性的增加,这样做变得越来越不可行,并且在某种程度上,您对任何结果路径都非常满意,无论它是否(近似)最佳。

I tried myself implementing A-Star search for this problem. Followed closely the implementation by Joseph Kern for the framework and the algorithm pseudocode given here:

def AStar(start, goal, neighbor_nodes, distance, cost_estimate):
    def reconstruct_path(came_from, current_node):
        path = []
        while current_node is not None:
            path.append(current_node)
            current_node = came_from[current_node]
        return list(reversed(path))

    g_score = {start: 0}
    f_score = {start: g_score[start] + cost_estimate(start, goal)}
    openset = {start}
    closedset = set()
    came_from = {start: None}

    while openset:
        current = min(openset, key=lambda x: f_score[x])
        if current == goal:
            return reconstruct_path(came_from, goal)
        openset.remove(current)
        closedset.add(current)
        for neighbor in neighbor_nodes(current):
            if neighbor in closedset:
                continue
            if neighbor not in openset:
                openset.add(neighbor)
            tentative_g_score = g_score[current] + distance(current, neighbor)
            if tentative_g_score >= g_score.get(neighbor, float('inf')):
                continue
            came_from[neighbor] = current
            g_score[neighbor] = tentative_g_score
            f_score[neighbor] = tentative_g_score + cost_estimate(neighbor, goal)
    return []

As A-Star is a heuristic search algorithm you need to come up with a function that estimates the remaining cost (here: distance) until the goal is reached. Unless you’re comfortable with a suboptimal solution it should not overestimate the cost. A conservative choice would here be the manhattan (or taxicab) distance as this represents the straight-line distance between two points on the grid for the used Von Neumann neighborhood. (Which, in this case, wouldn’t ever overestimate the cost.)

This would however significantly underestimate the actual cost for the given maze at hand. Therefore I’ve added two other distance metrics squared euclidean distance and the manhattan distance multiplied by four for comparison. These however might overestimate the actual cost, and might therefore yield suboptimal results.

Here’s the code:

import sys
from PIL import Image

def is_blocked(p):
    x,y = p
    pixel = path_pixels[x,y]
    if any(c < 225 for c in pixel):
        return True
def von_neumann_neighbors(p):
    x, y = p
    neighbors = [(x-1, y), (x, y-1), (x+1, y), (x, y+1)]
    return [p for p in neighbors if not is_blocked(p)]
def manhattan(p1, p2):
    return abs(p1[0]-p2[0]) + abs(p1[1]-p2[1])
def squared_euclidean(p1, p2):
    return (p1[0]-p2[0])**2 + (p1[1]-p2[1])**2

start = (400, 984)
goal = (398, 25)

# invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]

path_img = Image.open(sys.argv[1])
path_pixels = path_img.load()

distance = manhattan
heuristic = manhattan

path = AStar(start, goal, von_neumann_neighbors, distance, heuristic)

for position in path:
    x,y = position
    path_pixels[x,y] = (255,0,0) # red

path_img.save(sys.argv[2])

Here are some images for a visualization of the results (inspired by the one posted by Joseph Kern). The animations show a new frame each after 10000 iterations of the main while-loop.

Breadth-First Search:

Breadth-First Search

A-Star Manhattan Distance:

A-Star Manhattan Distance

A-Star Squared Euclidean Distance:

A-Star Squared Euclidean Distance

A-Star Manhattan Distance multiplied by four:

A-Star Manhattan Distance multiplied by four

The results show that the explored regions of the maze differ considerably for the heuristics being used. As such, squared euclidean distance even produces a different (suboptimal) path as the other metrics.

Concerning the performance of the A-Star algorithm in terms of the runtime until termination, note that a lot of evaluation of distance and cost functions add up compared to the Breadth-First Search (BFS) which only needs to evaluate the “goaliness” of each candidate position. Whether or not the cost for these additional function evaluations (A-Star) outweighs the cost for the larger number of nodes to check (BFS) and especially whether or not performance is an issue for your application at all, is a matter of individual perception and can of course not be generally answered.

A thing that can be said in general about whether or not an informed search algorithm (such as A-Star) could be the better choice compared to an exhaustive search (e.g., BFS) is the following. With the number of dimensions of the maze, i.e., the branching factor of the search tree, the disadvantage of an exhaustive search (to search exhaustively) grows exponentially. With growing complexity it becomes less and less feasible to do so and at some point you are pretty much happy with any result path, be it (approximately) optimal or not.


回答 3

树搜索太多。迷宫沿溶液路径固有地可分离。

(感谢Reddit的rainman002向我指出了这一点。)

因此,您可以快速使用连接的组件来识别迷宫墙的连接部分。这会在像素上迭代两次。

如果要将其转换成解决方案路径的理想示意图,则可以将二进制操作与结构化元素一起使用,以填充每个连接区域的“死角”路径。

下面是MATLAB的演示代码。它可以使用调整来更好地清理结果,使其更通用,并使其运行更快。(有时不是2:30 AM。)

% read in and invert the image
im = 255 - imread('maze.jpg');

% sharpen it to address small fuzzy channels
% threshold to binary 15%
% run connected components
result = bwlabel(im2bw(imfilter(im,fspecial('unsharp')),0.15));

% purge small components (e.g. letters)
for i = 1:max(reshape(result,1,1002*800))
    [count,~] = size(find(result==i));
    if count < 500
        result(result==i) = 0;
    end
end

% close dead-end channels
closed = zeros(1002,800);
for i = 1:max(reshape(result,1,1002*800))
    k = zeros(1002,800);
    k(result==i) = 1; k = imclose(k,strel('square',8));
    closed(k==1) = i;
end

% do output
out = 255 - im;
for x = 1:1002
    for y = 1:800
        if closed(x,y) == 0
            out(x,y,:) = 0;
        end
    end
end
imshow(out);

当前代码的结果

Tree search is too much. The maze is inherently separable along the solution path(s).

(Thanks to rainman002 from Reddit for pointing this out to me.)

Because of this, you can quickly use connected components to identify the connected sections of maze wall. This iterates over the pixels twice.

If you want to turn that into a nice diagram of the solution path(s), you can then use binary operations with structuring elements to fill in the “dead end” pathways for each connected region.

Demo code for MATLAB follows. It could use tweaking to clean up the result better, make it more generalizable, and make it run faster. (Sometime when it’s not 2:30 AM.)

% read in and invert the image
im = 255 - imread('maze.jpg');

% sharpen it to address small fuzzy channels
% threshold to binary 15%
% run connected components
result = bwlabel(im2bw(imfilter(im,fspecial('unsharp')),0.15));

% purge small components (e.g. letters)
for i = 1:max(reshape(result,1,1002*800))
    [count,~] = size(find(result==i));
    if count < 500
        result(result==i) = 0;
    end
end

% close dead-end channels
closed = zeros(1002,800);
for i = 1:max(reshape(result,1,1002*800))
    k = zeros(1002,800);
    k(result==i) = 1; k = imclose(k,strel('square',8));
    closed(k==1) = i;
end

% do output
out = 255 - im;
for x = 1:1002
    for y = 1:800
        if closed(x,y) == 0
            out(x,y,:) = 0;
        end
    end
end
imshow(out);

result of current code


回答 4

使用队列进行阈值连续填充。将入口左侧的像素推入队列,然后开始循环。如果排队的像素足够暗,则其颜色为浅灰色(高于阈值),并且所有相邻像素都被推入队列。

from PIL import Image
img = Image.open("/tmp/in.jpg")
(w,h) = img.size
scan = [(394,23)]
while(len(scan) > 0):
    (i,j) = scan.pop()
    (r,g,b) = img.getpixel((i,j))
    if(r*g*b < 9000000):
        img.putpixel((i,j),(210,210,210))
        for x in [i-1,i,i+1]:
            for y in [j-1,j,j+1]:
                scan.append((x,y))
img.save("/tmp/out.png")

解决方案是灰墙和彩色墙之间的走廊。请注意,此迷宫有多种解决方案。而且,这似乎只是起作用。

解

Uses a queue for a threshold continuous fill. Pushes the pixel left of the entrance onto the queue and then starts the loop. If a queued pixel is dark enough, it’s colored light gray (above threshold), and all the neighbors are pushed onto the queue.

from PIL import Image
img = Image.open("/tmp/in.jpg")
(w,h) = img.size
scan = [(394,23)]
while(len(scan) > 0):
    (i,j) = scan.pop()
    (r,g,b) = img.getpixel((i,j))
    if(r*g*b < 9000000):
        img.putpixel((i,j),(210,210,210))
        for x in [i-1,i,i+1]:
            for y in [j-1,j,j+1]:
                scan.append((x,y))
img.save("/tmp/out.png")

Solution is the corridor between gray wall and colored wall. Note this maze has multiple solutions. Also, this merely appears to work.

Solution


回答 5

在这里,您可以去:maze-solver-python(GitHub)

在此处输入图片说明

我玩得很开心,并扩展了约瑟夫·科恩Joseph Kern)的答案。不减损它;我为可能对此感兴趣的其他人做了一些小的补充。

这是一个基于Python的求解器,它使用BFS查找最短路径。当时,我的主要补充是:

  1. 搜索之前将图像清除(即转换为纯黑白)
  2. 自动生成GIF。
  3. 自动生成AVI。

就目前而言,此示例迷宫的起点/终点经过了硬编码,但我计划对其进行扩展,以便您可以选择适当的像素。

Here you go: maze-solver-python (GitHub)

enter image description here

I had fun playing around with this and extended on Joseph Kern‘s answer. Not to detract from it; I just made some minor additions for anyone else who may be interested in playing around with this.

It’s a python-based solver which uses BFS to find the shortest path. My main additions, at the time, are:

  1. The image is cleaned before the search (ie. convert to pure black & white)
  2. Automatically generate a GIF.
  3. Automatically generate an AVI.

As it stands, the start/end-points are hard-coded for this sample maze, but I plan on extending it such that you can pick the appropriate pixels.


回答 6

我会选择矩阵矩阵选项。如果您发现标准的Python列表在此方面效率太低,则可以改用numpy.bool数组。这样,一个1000×1000像素迷宫的存储空间仅为1 MB。

不要为创建任何树或图数据结构而烦恼。那只是思考它的一种方式,但不一定是在内存中表示它的好方法。布尔矩阵既容易编码,也更有效。

然后使用A *算法对其进行求解。对于距离启发式方法,请使用曼哈顿距离(distance_x + distance_y)。

(row, column)坐标元组表示节点。每当算法(Wikipedia伪代码)要求“邻居”时,只需循环遍历四个可能的邻居即可(注意图像的边缘!)。

如果发现它仍然太慢,可以在加载图像之前尝试缩小图像。小心不要在此过程中迷路。

也许也可以在Python中进行1:2的缩减,以检查您实际上没有丢失任何可能的路径。一个有趣的选择,但还需要更多思考。

I’d go for the matrix-of-bools option. If you find that standard Python lists are too inefficient for this, you could use a numpy.bool array instead. Storage for a 1000×1000 pixel maze is then just 1 MB.

Don’t bother with creating any tree or graph data structures. That’s just a way of thinking about it, but not necessarily a good way to represent it in memory; a boolean matrix is both easier to code and more efficient.

Then use the A* algorithm to solve it. For the distance heuristic, use the Manhattan distance (distance_x + distance_y).

Represent nodes by a tuple of (row, column) coordinates. Whenever the algorithm (Wikipedia pseudocode) calls for “neighbours”, it’s a simple matter of looping over the four possible neighbours (mind the edges of the image!).

If you find that it’s still too slow, you could try downscaling the image before you load it. Be careful not to lose any narrow paths in the process.

Maybe it’s possible to do a 1:2 downscaling in Python as well, checking that you don’t actually lose any possible paths. An interesting option, but it needs a bit more thought.


回答 7

这里有一些想法。

(1.图像处理:)

1.1将图像加载为RGB像素图。在C#中,使用是微不足道的system.drawing.bitmap。在没有简单的图像支持的语言中,只需将图像转换为可 移植的pixmap格式(PPM)(Unix文本表示,会生成大文件)或一些您可以轻松阅读的简单二进制文件格式,例如BMPTGA。Unix中的ImageMagick或Windows中的IrfanView

1.2如前所述,您可以通过将每个像素的(R + G + B)/ 3用作灰度指示,然后对该值进行阈值生成黑白表,来简化数据。假设0 =黑色和255 =白色,则接近200的值会去除JPEG伪像。

(2.解决方案:)

2.1深度优先搜索:使用起始位置初始化一个空的堆栈,收集可用的后续动作,随机选择一个并推入堆栈,继续进行直至到达终点或结束。在弹出堆栈的死角回溯中,您需要跟踪在地图上访问过哪些位置,因此当您收集可用的移动时,您绝不会两次走同一条路径。动画非常有趣。

2.2广度优先搜索:之前提到过,与上面类似,但仅使用队列。动画也很有趣。这就像填充图像编辑软件一样。我认为您可以使用此技巧在Photoshop中解决迷宫问题。

2.3 Wall Follower(墙随动件):从几何学上讲,迷宫是一个折叠/盘旋的管子。如果您将手放在墙上,您最终将找到出口;)并非总是如此。有某些假设,例如:完美的迷宫等,例如,某些迷宫包含孤岛。查一下;令人着迷。

(3.评论:)

这是一个棘手的问题。如果以某种简单的形式来表示形式,每个元素都是具有北,东,南和西壁以及已访问标记场的像元类型,则很容易解决迷宫问题。但是鉴于给定的手绘草图,您正在尝试执行此操作,因此会变得凌乱。老实说,我认为尝试使草图合理化会让您发疯。这类似于相当复杂的计算机视觉问题。也许直接进入图像地图可能更容易但更浪费。

Here are some ideas.

(1. Image Processing:)

1.1 Load the image as RGB pixel map. In C# it is trivial using system.drawing.bitmap. In languages with no simple support for imaging, just convert the image to portable pixmap format (PPM) (a Unix text representation, produces large files) or some simple binary file format you can easily read, such as BMP or TGA. ImageMagick in Unix or IrfanView in Windows.

1.2 You may, as mentioned earlier, simplify the data by taking the (R+G+B)/3 for each pixel as an indicator of gray tone and then threshold the value to produce a black and white table. Something close to 200 assuming 0=black and 255=white will take out the JPEG artifacts.

(2. Solutions:)

2.1 Depth-First Search: Init an empty stack with starting location, collect available follow-up moves, pick one at random and push onto the stack, proceed until end is reached or a deadend. On deadend backtrack by popping the stack, you need to keep track of which positions were visited on the map so when you collect available moves you never take the same path twice. Very interesting to animate.

2.2 Breadth-First Search: Mentioned before, similar as above but only using queues. Also interesting to animate. This works like flood-fill in image editing software. I think you may be able to solve a maze in Photoshop using this trick.

2.3 Wall Follower: Geometrically speaking, a maze is a folded/convoluted tube. If you keep your hand on the wall you will eventually find the exit ;) This does not always work. There are certain assumption re: perfect mazes, etc., for instance, certain mazes contain islands. Do look it up; it is fascinating.

(3. Comments:)

This is the tricky one. It is easy to solve mazes if represented in some simple array formal with each element being a cell type with north, east, south and west walls and a visited flag field. However given that you are trying to do this given a hand drawn sketch it becomes messy. I honestly think that trying to rationalize the sketch will drive you nuts. This is akin to computer vision problems which are fairly involved. Perhaps going directly onto the image map may be easier yet more wasteful.


回答 8

这是使用R的解决方案。

### download the image, read it into R, converting to something we can play with...
library(jpeg)
url <- "https://i.stack.imgur.com/TqKCM.jpg"
download.file(url, "./maze.jpg", mode = "wb")
jpg <- readJPEG("./maze.jpg")

### reshape array into data.frame
library(reshape2)
img3 <- melt(jpg, varnames = c("y","x","rgb"))
img3$rgb <- as.character(factor(img3$rgb, levels = c(1,2,3), labels=c("r","g","b")))

## split out rgb values into separate columns
img3 <- dcast(img3, x + y ~ rgb)

RGB到灰度,请参阅:https : //stackoverflow.com/a/27491947/2371031

# convert rgb to greyscale (0, 1)
img3$v <- img3$r*.21 + img3$g*.72 + img3$b*.07
# v: values closer to 1 are white, closer to 0 are black

## strategically fill in some border pixels so the solver doesn't "go around":
img3$v2 <- img3$v
img3[(img3$x == 300 | img3$x == 500) & (img3$y %in% c(0:23,988:1002)),"v2"]  = 0

# define some start/end point coordinates
pts_df <- data.frame(x = c(398, 399),
                     y = c(985, 26))

# set a reference value as the mean of the start and end point greyscale "v"s
ref_val <- mean(c(subset(img3, x==pts_df[1,1] & y==pts_df[1,2])$v,
                  subset(img3, x==pts_df[2,1] & y==pts_df[2,2])$v))

library(sp)
library(gdistance)
spdf3 <- SpatialPixelsDataFrame(points = img3[c("x","y")], data = img3["v2"])
r3 <- rasterFromXYZ(spdf3)

# transition layer defines a "conductance" function between any two points, and the number of connections (4 = Manhatten distances)
# x in the function represents the greyscale values ("v2") of two adjacent points (pixels), i.e., = (x1$v2, x2$v2)
# make function(x) encourages transitions between cells with small changes in greyscale compared to the reference values, such that: 
# when v2 is closer to 0 (black) = poor conductance
# when v2 is closer to 1 (white) = good conductance
tl3 <- transition(r3, function(x) (1/max( abs( (x/ref_val)-1 ) )^2)-1, 4) 

## get the shortest path between start, end points
sPath3 <- shortestPath(tl3, as.numeric(pts_df[1,]), as.numeric(pts_df[2,]), output = "SpatialLines")

## fortify for ggplot
sldf3 <- fortify(SpatialLinesDataFrame(sPath3, data = data.frame(ID = 1)))

# plot the image greyscale with start/end points (red) and shortest path (green)
ggplot(img3) +
  geom_raster(aes(x, y, fill=v2)) +
  scale_fill_continuous(high="white", low="black") +
  scale_y_reverse() +
  geom_point(data=pts_df, aes(x, y), color="red") +
  geom_path(data=sldf3, aes(x=long, y=lat), color="green")

瞧!

正确找到最短路径的解决方案

如果您不填写某些边框像素(哈!),就会发生这种情况。

求解器绕迷宫绕行的解决方案版本

全面披露:在我发现这个问题之前,我自己问和回答了一个非常类似的问题。然后通过SO的魔力,发现这是最重要的“相关问题”之一。我以为我会把这个迷宫作为一个额外的测试用例…我很高兴地发现,我的答案在很少修改的情况下也适用于该应用程序。

Here’s a solution using R.

### download the image, read it into R, converting to something we can play with...
library(jpeg)
url <- "https://i.stack.imgur.com/TqKCM.jpg"
download.file(url, "./maze.jpg", mode = "wb")
jpg <- readJPEG("./maze.jpg")

### reshape array into data.frame
library(reshape2)
img3 <- melt(jpg, varnames = c("y","x","rgb"))
img3$rgb <- as.character(factor(img3$rgb, levels = c(1,2,3), labels=c("r","g","b")))

## split out rgb values into separate columns
img3 <- dcast(img3, x + y ~ rgb)

RGB to greyscale, see: https://stackoverflow.com/a/27491947/2371031

# convert rgb to greyscale (0, 1)
img3$v <- img3$r*.21 + img3$g*.72 + img3$b*.07
# v: values closer to 1 are white, closer to 0 are black

## strategically fill in some border pixels so the solver doesn't "go around":
img3$v2 <- img3$v
img3[(img3$x == 300 | img3$x == 500) & (img3$y %in% c(0:23,988:1002)),"v2"]  = 0

# define some start/end point coordinates
pts_df <- data.frame(x = c(398, 399),
                     y = c(985, 26))

# set a reference value as the mean of the start and end point greyscale "v"s
ref_val <- mean(c(subset(img3, x==pts_df[1,1] & y==pts_df[1,2])$v,
                  subset(img3, x==pts_df[2,1] & y==pts_df[2,2])$v))

library(sp)
library(gdistance)
spdf3 <- SpatialPixelsDataFrame(points = img3[c("x","y")], data = img3["v2"])
r3 <- rasterFromXYZ(spdf3)

# transition layer defines a "conductance" function between any two points, and the number of connections (4 = Manhatten distances)
# x in the function represents the greyscale values ("v2") of two adjacent points (pixels), i.e., = (x1$v2, x2$v2)
# make function(x) encourages transitions between cells with small changes in greyscale compared to the reference values, such that: 
# when v2 is closer to 0 (black) = poor conductance
# when v2 is closer to 1 (white) = good conductance
tl3 <- transition(r3, function(x) (1/max( abs( (x/ref_val)-1 ) )^2)-1, 4) 

## get the shortest path between start, end points
sPath3 <- shortestPath(tl3, as.numeric(pts_df[1,]), as.numeric(pts_df[2,]), output = "SpatialLines")

## fortify for ggplot
sldf3 <- fortify(SpatialLinesDataFrame(sPath3, data = data.frame(ID = 1)))

# plot the image greyscale with start/end points (red) and shortest path (green)
ggplot(img3) +
  geom_raster(aes(x, y, fill=v2)) +
  scale_fill_continuous(high="white", low="black") +
  scale_y_reverse() +
  geom_point(data=pts_df, aes(x, y), color="red") +
  geom_path(data=sldf3, aes(x=long, y=lat), color="green")

Voila!

solution that correctly finds shortest path

This is what happens if you don’t fill in some border pixels (Ha!)…

solution version where the solver goes around the maze

Full disclosure: I asked and answered a very similar question myself before I found this one. Then through the magic of SO, found this one as one of the top “Related Questions”. I thought I’d use this maze as an additional test case… I was very pleased to find that my answer there also works for this application with very little modification.


回答 9

好的解决方案是,而不是按像素查找邻居,而是按单元格完成,因为走廊可以有15px,因此在同一走廊中可以执行左右移动等操作,而如果这样做的话就好像位移是一个多维数据集,这将是一个简单的操作,例如UP,DOWN,LEFT或RIGHT

the good solution would be that instead of finding the neighbors by pixel, it would be done by cell, because a corridor can have 15px so in the same corridor it can take actions like left or right, while if it was done as if the displacement was a cube it would be a simple action like UP,DOWN,LEFT OR RIGHT


检查列表中的所有元素是否相同

问题:检查列表中的所有元素是否相同

我需要以下功能:

输入:alist

输出

  • True 如果输入列表中的所有元素使用标准相等运算符求值彼此相等;
  • False 除此以外。

性能:当然,我不希望产生任何不必要的开销。

我认为最好:

  • 遍历列表
  • 比较相邻元素
  • AND所有结果布尔值

但是我不确定最Pythonic的方法是什么。


缺少短路功能只会损害早期输入不相等的长输入(超过50个元素)。如果这种情况经常发生(频率取决于列表的长度),则需要短路。最好的短路算法似乎是@KennyTM checkEqual1。但是,它为此付出了巨大的代价:

  • 性能几乎是同类产品的20倍
  • 短名单上的性能提高了2.5倍

如果没有出现早期输入不相等的长输入(或发生的次数很少),则不需要短路。然后,到目前为止最快的是@Ivo van der Wijk解决方案。

I need the following function:

Input: a list

Output:

  • True if all elements in the input list evaluate as equal to each other using the standard equality operator;
  • False otherwise.

Performance: of course, I prefer not to incur any unnecessary overhead.

I feel it would be best to:

  • iterate through the list
  • compare adjacent elements
  • and AND all the resulting Boolean values

But I’m not sure what’s the most Pythonic way to do that.


The lack of short-circuit feature only hurts on a long input (over ~50 elements) that have unequal elements early on. If this occurs often enough (how often depends on how long the lists might be), the short-circuit is required. The best short-circuit algorithm seems to be @KennyTM checkEqual1. It pays, however, a significant cost for this:

  • up to 20x in performance nearly-identical lists
  • up to 2.5x in performance on short lists

If the long inputs with early unequal elements don’t happen (or happen sufficiently rarely), short-circuit isn’t required. Then, by far the fastest is @Ivo van der Wijk solution.


回答 0

通用方法:

def checkEqual1(iterator):
    iterator = iter(iterator)
    try:
        first = next(iterator)
    except StopIteration:
        return True
    return all(first == rest for rest in iterator)

单线:

def checkEqual2(iterator):
   return len(set(iterator)) <= 1

也是单线的:

def checkEqual3(lst):
   return lst[1:] == lst[:-1]

这三个版本之间的区别在于:

  1. checkEqual2内容中必须是可哈希的。
  2. checkEqual1并且checkEqual2可以使用任何迭代器,但checkEqual3必须接受序列输入,通常是列表或元组之类的具体容器。
  3. checkEqual1 发现差异后立即停止。
  4. 由于checkEqual1包含更多的Python代码,因此当许多项目在开始时相等时效率较低。
  5. 由于checkEqual2checkEqual3始终执行O(N)复制操作,因此,如果您的大多数输入将返回False,则它们将花费更长的时间。
  6. 对于checkEqual2checkEqual3很难适应从a == b到的比较a is b

timeit 结果,对于Python 2.7和(仅s1,s4,s7,s9应该返回True)

s1 = [1] * 5000
s2 = [1] * 4999 + [2]
s3 = [2] + [1]*4999
s4 = [set([9])] * 5000
s5 = [set([9])] * 4999 + [set([10])]
s6 = [set([10])] + [set([9])] * 4999
s7 = [1,1]
s8 = [1,2]
s9 = []

我们得到

      | checkEqual1 | checkEqual2 | checkEqual3  | checkEqualIvo | checkEqual6502 |
|-----|-------------|-------------|--------------|---------------|----------------|
| s1  | 1.19   msec | 348    usec | 183     usec | 51.6    usec  | 121     usec   |
| s2  | 1.17   msec | 376    usec | 185     usec | 50.9    usec  | 118     usec   |
| s3  | 4.17   usec | 348    usec | 120     usec | 264     usec  | 61.3    usec   |
|     |             |             |              |               |                |
| s4  | 1.73   msec |             | 182     usec | 50.5    usec  | 121     usec   |
| s5  | 1.71   msec |             | 181     usec | 50.6    usec  | 125     usec   |
| s6  | 4.29   usec |             | 122     usec | 423     usec  | 61.1    usec   |
|     |             |             |              |               |                |
| s7  | 3.1    usec | 1.4    usec | 1.24    usec | 0.932   usec  | 1.92    usec   |
| s8  | 4.07   usec | 1.54   usec | 1.28    usec | 0.997   usec  | 1.79    usec   |
| s9  | 5.91   usec | 1.25   usec | 0.749   usec | 0.407   usec  | 0.386   usec   |

注意:

# http://stackoverflow.com/q/3844948/
def checkEqualIvo(lst):
    return not lst or lst.count(lst[0]) == len(lst)

# http://stackoverflow.com/q/3844931/
def checkEqual6502(lst):
    return not lst or [lst[0]]*len(lst) == lst

General method:

def checkEqual1(iterator):
    iterator = iter(iterator)
    try:
        first = next(iterator)
    except StopIteration:
        return True
    return all(first == rest for rest in iterator)

One-liner:

def checkEqual2(iterator):
   return len(set(iterator)) <= 1

Also one-liner:

def checkEqual3(lst):
   return lst[1:] == lst[:-1]

The difference between the 3 versions are that:

  1. In checkEqual2 the content must be hashable.
  2. checkEqual1 and checkEqual2 can use any iterators, but checkEqual3 must take a sequence input, typically concrete containers like a list or tuple.
  3. checkEqual1 stops as soon as a difference is found.
  4. Since checkEqual1 contains more Python code, it is less efficient when many of the items are equal in the beginning.
  5. Since checkEqual2 and checkEqual3 always perform O(N) copying operations, they will take longer if most of your input will return False.
  6. For checkEqual2 and checkEqual3 it’s harder to adapt comparison from a == b to a is b.

timeit result, for Python 2.7 and (only s1, s4, s7, s9 should return True)

s1 = [1] * 5000
s2 = [1] * 4999 + [2]
s3 = [2] + [1]*4999
s4 = [set([9])] * 5000
s5 = [set([9])] * 4999 + [set([10])]
s6 = [set([10])] + [set([9])] * 4999
s7 = [1,1]
s8 = [1,2]
s9 = []

we get

      | checkEqual1 | checkEqual2 | checkEqual3  | checkEqualIvo | checkEqual6502 |
|-----|-------------|-------------|--------------|---------------|----------------|
| s1  | 1.19   msec | 348    usec | 183     usec | 51.6    usec  | 121     usec   |
| s2  | 1.17   msec | 376    usec | 185     usec | 50.9    usec  | 118     usec   |
| s3  | 4.17   usec | 348    usec | 120     usec | 264     usec  | 61.3    usec   |
|     |             |             |              |               |                |
| s4  | 1.73   msec |             | 182     usec | 50.5    usec  | 121     usec   |
| s5  | 1.71   msec |             | 181     usec | 50.6    usec  | 125     usec   |
| s6  | 4.29   usec |             | 122     usec | 423     usec  | 61.1    usec   |
|     |             |             |              |               |                |
| s7  | 3.1    usec | 1.4    usec | 1.24    usec | 0.932   usec  | 1.92    usec   |
| s8  | 4.07   usec | 1.54   usec | 1.28    usec | 0.997   usec  | 1.79    usec   |
| s9  | 5.91   usec | 1.25   usec | 0.749   usec | 0.407   usec  | 0.386   usec   |

Note:

# http://stackoverflow.com/q/3844948/
def checkEqualIvo(lst):
    return not lst or lst.count(lst[0]) == len(lst)

# http://stackoverflow.com/q/3844931/
def checkEqual6502(lst):
    return not lst or [lst[0]]*len(lst) == lst

回答 1

比对序列(不是可迭代对象)使用set()更快的解决方案是仅对第一个元素进行计数。这假设列表是非空的(但是检查起来很麻烦,并自己决定结果应该在空列表中)

x.count(x[0]) == len(x)

一些简单的基准:

>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*5000', number=10000)
1.4383411407470703
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*4999+[2]', number=10000)
1.4765670299530029
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*5000', number=10000)
0.26274609565734863
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*4999+[2]', number=10000)
0.25654196739196777

A solution faster than using set() that works on sequences (not iterables) is to simply count the first element. This assumes the list is non-empty (but that’s trivial to check, and decide yourself what the outcome should be on an empty list)

x.count(x[0]) == len(x)

some simple benchmarks:

>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*5000', number=10000)
1.4383411407470703
>>> timeit.timeit('len(set(s1))<=1', 's1=[1]*4999+[2]', number=10000)
1.4765670299530029
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*5000', number=10000)
0.26274609565734863
>>> timeit.timeit('s1.count(s1[0])==len(s1)', 's1=[1]*4999+[2]', number=10000)
0.25654196739196777

回答 2

最简单,最优雅的方法如下:

all(x==myList[0] for x in myList)

(是的,这甚至适用于空列表!这是因为这是python具有惰性语义的少数情况之一。)

关于性能,这将尽早失败,因此它是渐近最佳的。

The simplest and most elegant way is as follows:

all(x==myList[0] for x in myList)

(Yes, this even works with the empty list! This is because this is one of the few cases where python has lazy semantics.)

Regarding performance, this will fail at the earliest possible time, so it is asymptotically optimal.


回答 3

一组比较工作:

len(set(the_list)) == 1

使用set删除所有重复的元素。

A set comparison work:

len(set(the_list)) == 1

Using set removes all duplicate elements.


回答 4

您可以将列表转换为集合。集合不能重复。因此,如果原始列表中的所有元素都相同,则该集合将只有一个元素。

if len(sets.Set(input_list)) == 1
// input_list has all identical elements.

You can convert the list to a set. A set cannot have duplicates. So if all the elements in the original list are identical, the set will have just one element.

if len(sets.Set(input_list)) == 1
// input_list has all identical elements.

回答 5

对于它的价值,它最近出现在python-ideas邮件列表中。事实证明,已经有一个itertools方法可以做到这一点:1

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

据说它的性能非常好,并且具有一些不错的属性。

  1. 短路:一旦找到第一个不等项,它将立即停止消耗可迭代项中的项。
  2. 不需要项目是可哈希的。
  3. 它是惰性的,仅需要O(1)额外的内存来执行检查。

1换句话说,我不能因提出解决方案而功不可没-甚至找不到它也不能功劳。

For what it’s worth, this came up on the python-ideas mailing list recently. It turns out that there is an itertools recipe for doing this already:1

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

Supposedly it performs very nicely and has a few nice properties.

  1. Short-circuits: It will stop consuming items from the iterable as soon as it finds the first non-equal item.
  2. Doesn’t require items to be hashable.
  3. It is lazy and only requires O(1) additional memory to do the check.

1In other words, I can’t take the credit for coming up with the solution — nor can I take credit for even finding it.


回答 6

这是两种简单的方法

使用set()

将列表转换为集合时,将删除重复的元素。因此,如果转换后的集合的长度为1,则意味着所有元素都相同。

len(set(input_list))==1

这是一个例子

>>> a = ['not', 'the', 'same']
>>> b = ['same', 'same', 'same']
>>> len(set(a))==1  # == 3
False
>>> len(set(b))==1  # == 1
True

使用all()

这会将输入列表的第一个元素与列表中的所有其他元素进行比较(等效)。如果相等,则返回True,否则返回False。

all(element==input_list[0] for element in input_list)

这是一个例子

>>> a = [1, 2, 3, 4, 5]
>>> b = [1, 1, 1, 1, 1]
>>> all(number==a[0] for number in a)
False
>>> all(number==b[0] for number in b)
True

PS如果要检查整个列表是否等效于某个值,则可以在input_list [0]中设置该值。

Here are two simple ways of doing this

using set()

When converting the list to a set, duplicate elements are removed. So if the length of the converted set is 1, then this implies that all the elements are the same.

len(set(input_list))==1

Here is an example

>>> a = ['not', 'the', 'same']
>>> b = ['same', 'same', 'same']
>>> len(set(a))==1  # == 3
False
>>> len(set(b))==1  # == 1
True

using all()

This will compare (equivalence) the first element of the input list to every other element in the list. If all are equivalent True will be returned, otherwise False will be returned.

all(element==input_list[0] for element in input_list)

Here is an example

>>> a = [1, 2, 3, 4, 5]
>>> b = [1, 1, 1, 1, 1]
>>> all(number==a[0] for number in a)
False
>>> all(number==b[0] for number in b)
True

P.S If you are checking to see if the whole list is equivalent to a certain value, you can suibstitue the value in for input_list[0].


回答 7

这是另一种选择,比len(set(x))==1长列表(使用短路)快

def constantList(x):
    return x and [x[0]]*len(x) == x

This is another option, faster than len(set(x))==1 for long lists (uses short circuit)

def constantList(x):
    return x and [x[0]]*len(x) == x

回答 8

这是一种简单的方法:

result = mylist and all(mylist[0] == elem for elem in mylist)

这稍微复杂一点,但会产生函数调用开销,但语义会更清楚地说明:

def all_identical(seq):
    if not seq:
        # empty list is False.
        return False
    first = seq[0]
    return all(first == elem for elem in seq)

This is a simple way of doing it:

result = mylist and all(mylist[0] == elem for elem in mylist)

This is slightly more complicated, it incurs function call overhead, but the semantics are more clearly spelled out:

def all_identical(seq):
    if not seq:
        # empty list is False.
        return False
    first = seq[0]
    return all(first == elem for elem in seq)

回答 9

检查所有元素是否等于第一个。

np.allclose(array, array[0])

Check if all elements equal to the first.

np.allclose(array, array[0])


回答 10

怀疑这是“最Python化的”,但类似:

>>> falseList = [1,2,3,4]
>>> trueList = [1, 1, 1]
>>> 
>>> def testList(list):
...   for item in list[1:]:
...     if item != list[0]:
...       return False
...   return True
... 
>>> testList(falseList)
False
>>> testList(trueList)
True

会成功的

Doubt this is the “most Pythonic”, but something like:

>>> falseList = [1,2,3,4]
>>> trueList = [1, 1, 1]
>>> 
>>> def testList(list):
...   for item in list[1:]:
...     if item != list[0]:
...       return False
...   return True
... 
>>> testList(falseList)
False
>>> testList(trueList)
True

would do the trick.


回答 11

如果您对可读性更高(但当然不那么有效)感兴趣,可以尝试:

def compare_lists(list1, list2):
    if len(list1) != len(list2): # Weed out unequal length lists.
        return False
    for item in list1:
        if item not in list2:
            return False
    return True

a_list_1 = ['apple', 'orange', 'grape', 'pear']
a_list_2 = ['pear', 'orange', 'grape', 'apple']

b_list_1 = ['apple', 'orange', 'grape', 'pear']
b_list_2 = ['apple', 'orange', 'banana', 'pear']

c_list_1 = ['apple', 'orange', 'grape']
c_list_2 = ['grape', 'orange']

print compare_lists(a_list_1, a_list_2) # Returns True
print compare_lists(b_list_1, b_list_2) # Returns False
print compare_lists(c_list_1, c_list_2) # Returns False

If you’re interested in something a little more readable (but of course not as efficient,) you could try:

def compare_lists(list1, list2):
    if len(list1) != len(list2): # Weed out unequal length lists.
        return False
    for item in list1:
        if item not in list2:
            return False
    return True

a_list_1 = ['apple', 'orange', 'grape', 'pear']
a_list_2 = ['pear', 'orange', 'grape', 'apple']

b_list_1 = ['apple', 'orange', 'grape', 'pear']
b_list_2 = ['apple', 'orange', 'banana', 'pear']

c_list_1 = ['apple', 'orange', 'grape']
c_list_2 = ['grape', 'orange']

print compare_lists(a_list_1, a_list_2) # Returns True
print compare_lists(b_list_1, b_list_2) # Returns False
print compare_lists(c_list_1, c_list_2) # Returns False

回答 12

将列表转换为集合,然后找到集合中的元素数。如果结果为1,则其元素相同,如果不相同,则列表中的元素不相同。

list1 = [1,1,1]
len(set(list1)) 
>1

list1 = [1,2,3]
len(set(list1)
>3

Convert the list into the set and then find the number of elements in the set. If the result is 1, it has identical elements and if not, then the elements in the list are not identical.

list1 = [1,1,1]
len(set(list1)) 
>1

list1 = [1,2,3]
len(set(list1)
>3

回答 13

关于reduce()与一起使用lambda。这是一个我个人认为比其他答案更好的工作代码。

reduce(lambda x, y: (x[1]==y, y), [2, 2, 2], (True, 2))

返回一个元组,如果所有项目相同或不同,则第一个值为布尔值。

Regarding using reduce() with lambda. Here is a working code that I personally think is way nicer than some of the other answers.

reduce(lambda x, y: (x[1]==y, y), [2, 2, 2], (True, 2))

Returns a tuple where the first value is the boolean if all items are same or not.


回答 14

我会做:

not any((x[i] != x[i+1] for i in range(0, len(x)-1)))

因为any一旦找到True条件就停止搜索可迭代对象。

I’d do:

not any((x[i] != x[i+1] for i in range(0, len(x)-1)))

as any stops searching the iterable as soon as it finds a True condition.


回答 15

>>> a = [1, 2, 3, 4, 5, 6]
>>> z = [(a[x], a[x+1]) for x in range(0, len(a)-1)]
>>> z
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
# Replacing it with the test
>>> z = [(a[x] == a[x+1]) for x in range(0, len(a)-1)]
>>> z
[False, False, False, False, False]
>>> if False in z : Print "All elements are not equal"
>>> a = [1, 2, 3, 4, 5, 6]
>>> z = [(a[x], a[x+1]) for x in range(0, len(a)-1)]
>>> z
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
# Replacing it with the test
>>> z = [(a[x] == a[x+1]) for x in range(0, len(a)-1)]
>>> z
[False, False, False, False, False]
>>> if False in z : Print "All elements are not equal"

回答 16

def allTheSame(i):
    j = itertools.groupby(i)
    for k in j: break
    for k in j: return False
    return True

在没有“ all”的Python 2.4中工作。

def allTheSame(i):
    j = itertools.groupby(i)
    for k in j: break
    for k in j: return False
    return True

Works in Python 2.4, which doesn’t have “all”.


回答 17

可以使用地图和lambda

lst = [1,1,1,1,1,1,1,1,1]

print all(map(lambda x: x == lst[0], lst[1:]))

Can use map and lambda

lst = [1,1,1,1,1,1,1,1,1]

print all(map(lambda x: x == lst[0], lst[1:]))

回答 18

或使用diffnumpy的方法:

import numpy as np
def allthesame(l):
    return np.all(np.diff(l)==0)

并调用:

print(allthesame([1,1,1]))

输出:

True

Or use diff method of numpy:

import numpy as np
def allthesame(l):
    return np.all(np.diff(l)==0)

And to call:

print(allthesame([1,1,1]))

Output:

True

回答 19

或者使用numpy的diff方法:

import numpy as np
def allthesame(l):
    return np.unique(l).shape[0]<=1

并调用:

print(allthesame([1,1,1]))

输出:

真正

Or use diff method of numpy:

import numpy as np
def allthesame(l):
    return np.unique(l).shape[0]<=1

And to call:

print(allthesame([1,1,1]))

Output:

True


回答 20

你可以做:

reduce(and_, (x==yourList[0] for x in yourList), True)

python使您像那样导入运算符是很烦人的operator.and_。从python3开始,您还需要import functools.reduce

(您不应该使用此方法,因为如果找到不相等的值,它将不会中断,但是会继续检查整个列表。此处仅作为完整性的答案。)

You can do:

reduce(and_, (x==yourList[0] for x in yourList), True)

It is fairly annoying that python makes you import the operators like operator.and_. As of python3, you will need to also import functools.reduce.

(You should not use this method because it will not break if it finds non-equal values, but will continue examining the entire list. It is just included here as an answer for completeness.)


回答 21

lambda lst: reduce(lambda a,b:(b,b==a[0] and a[1]), lst, (lst[0], True))[1]

下一个将短路短路:

all(itertools.imap(lambda i:yourlist[i]==yourlist[i+1], xrange(len(yourlist)-1)))
lambda lst: reduce(lambda a,b:(b,b==a[0] and a[1]), lst, (lst[0], True))[1]

The next one will short short circuit:

all(itertools.imap(lambda i:yourlist[i]==yourlist[i+1], xrange(len(yourlist)-1)))

回答 22

将列表更改为一组。然后,如果集合的大小仅为1,则它们必须相同。

if len(set(my_list)) == 1:

Change the list to a set. Then if the size of the set is only 1, they must have been the same.

if len(set(my_list)) == 1:

回答 23

还有一个纯Python递归选项:

 def checkEqual(lst):
    if len(lst)==2 :
        return lst[0]==lst[1]
    else:
        return lst[0]==lst[1] and checkEqual(lst[1:])

但是由于某种原因,它在某些情况下比其他选择要慢两个数量级。从C语言的心态来看,我期望这会更快,但事实并非如此!

另一个缺点是Python中存在递归限制,在这种情况下需要对其进行调整。例如使用this

There is also a pure Python recursive option:

 def checkEqual(lst):
    if len(lst)==2 :
        return lst[0]==lst[1]
    else:
        return lst[0]==lst[1] and checkEqual(lst[1:])

However for some reason it is in some cases two orders of magnitude slower than other options. Coming from C language mentality, I expected this to be faster, but it is not!

The other disadvantage is that there is recursion limit in Python which needs to be adjusted in this case. For example using this.


回答 24

您可以.nunique()用来查找列表中唯一项目的数量。

def identical_elements(list):
    series = pd.Series(list)
    if series.nunique() == 1: identical = True
    else:  identical = False
    return identical



identical_elements(['a', 'a'])
Out[427]: True

identical_elements(['a', 'b'])
Out[428]: False

You can use .nunique() to find number of unique items in a list.

def identical_elements(list):
    series = pd.Series(list)
    if series.nunique() == 1: identical = True
    else:  identical = False
    return identical



identical_elements(['a', 'a'])
Out[427]: True

identical_elements(['a', 'b'])
Out[428]: False

回答 25

您可以使用set。它将设置并删除重复的元素。然后检查其元素是否超过1个。

if len(set(your_list)) <= 1:
    print('all ements are equal')

例:

>>> len(set([5, 5])) <= 1
True

you can use set. It will make a set and remove repetitive elements. Then check that it has no more than 1 element.

if len(set(your_list)) <= 1:
    print('all ements are equal')

Example:

>>> len(set([5, 5])) <= 1
True

如何生成列表的所有排列?

问题:如何生成列表的所有排列?

如何在Python中生成列表的所有排列,而与列表中元素的类型无关?

例如:

permutations([])
[]

permutations([1])
[1]

permutations([1, 2])
[1, 2]
[2, 1]

permutations([1, 2, 3])
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]

How do you generate all the permutations of a list in Python, independently of the type of elements in that list?

For example:

permutations([])
[]

permutations([1])
[1]

permutations([1, 2])
[1, 2]
[2, 1]

permutations([1, 2, 3])
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]

回答 0

从Python 2.6开始(如果您使用的是Python 3),您可以使用标准库工具:itertools.permutations

import itertools
list(itertools.permutations([1, 2, 3]))

如果您出于某种原因使用旧版Python(<2.6),或者只是想知道它的工作原理,那么这是一种不错的方法,摘自 http://code.activestate.com/recipes/252178/

def all_perms(elements):
    if len(elements) <=1:
        yield elements
    else:
        for perm in all_perms(elements[1:]):
            for i in range(len(elements)):
                # nb elements[0:1] works in both string and list contexts
                yield perm[:i] + elements[0:1] + perm[i:]

的文档中列出了几种其他方法itertools.permutations。这是一个:

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return

另一个基于itertools.product

def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)

Starting with Python 2.6 (and if you’re on Python 3) you have a standard-library tool for this: itertools.permutations.

import itertools
list(itertools.permutations([1, 2, 3]))

If you’re using an older Python (<2.6) for some reason or are just curious to know how it works, here’s one nice approach, taken from http://code.activestate.com/recipes/252178/:

def all_perms(elements):
    if len(elements) <=1:
        yield elements
    else:
        for perm in all_perms(elements[1:]):
            for i in range(len(elements)):
                # nb elements[0:1] works in both string and list contexts
                yield perm[:i] + elements[0:1] + perm[i:]

A couple of alternative approaches are listed in the documentation of itertools.permutations. Here’s one:

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return

And another, based on itertools.product:

def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)

回答 1

Python 2.6及更高版本中:

import itertools
itertools.permutations([1,2,3])

(作为生成器返回。用于list(permutations(l))作为列表返回。)

And in Python 2.6 onwards:

import itertools
itertools.permutations([1,2,3])

(returned as a generator. Use list(permutations(l)) to return as a list.)


回答 2

以下代码仅适用于Python 2.6及更高版本

首先,导入itertools

import itertools

排列(顺序很重要):

print list(itertools.permutations([1,2,3,4], 2))
[(1, 2), (1, 3), (1, 4),
(2, 1), (2, 3), (2, 4),
(3, 1), (3, 2), (3, 4),
(4, 1), (4, 2), (4, 3)]

组合(顺序无关紧要):

print list(itertools.combinations('123', 2))
[('1', '2'), ('1', '3'), ('2', '3')]

笛卡尔积(具有多个可迭代项):

print list(itertools.product([1,2,3], [4,5,6]))
[(1, 4), (1, 5), (1, 6),
(2, 4), (2, 5), (2, 6),
(3, 4), (3, 5), (3, 6)]

笛卡尔积(本身具有一个可迭代的):

print list(itertools.product([1,2], repeat=3))
[(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2),
(2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2)]

The following code with Python 2.6 and above ONLY

First, import itertools:

import itertools

Permutation (order matters):

print list(itertools.permutations([1,2,3,4], 2))
[(1, 2), (1, 3), (1, 4),
(2, 1), (2, 3), (2, 4),
(3, 1), (3, 2), (3, 4),
(4, 1), (4, 2), (4, 3)]

Combination (order does NOT matter):

print list(itertools.combinations('123', 2))
[('1', '2'), ('1', '3'), ('2', '3')]

Cartesian product (with several iterables):

print list(itertools.product([1,2,3], [4,5,6]))
[(1, 4), (1, 5), (1, 6),
(2, 4), (2, 5), (2, 6),
(3, 4), (3, 5), (3, 6)]

Cartesian product (with one iterable and itself):

print list(itertools.product([1,2], repeat=3))
[(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2),
(2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2)]

回答 3

def permutations(head, tail=''):
    if len(head) == 0: print tail
    else:
        for i in range(len(head)):
            permutations(head[0:i] + head[i+1:], tail+head[i])

称为:

permutations('abc')
def permutations(head, tail=''):
    if len(head) == 0: print tail
    else:
        for i in range(len(head)):
            permutations(head[0:i] + head[i+1:], tail+head[i])

called as:

permutations('abc')

回答 4

#!/usr/bin/env python

def perm(a, k=0):
   if k == len(a):
      print a
   else:
      for i in xrange(k, len(a)):
         a[k], a[i] = a[i] ,a[k]
         perm(a, k+1)
         a[k], a[i] = a[i], a[k]

perm([1,2,3])

输出:

[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 2, 1]
[3, 1, 2]

当我交换列表的内容时,需要一个可变的序列类型作为输入。例如,perm(list("ball"))将工作,perm("ball")不会因为您不能更改字符串。

此Python实现受Horowitz,Sahni和Rajasekeran的《计算机算法》一书中介绍的算法的启发

#!/usr/bin/env python

def perm(a, k=0):
   if k == len(a):
      print a
   else:
      for i in xrange(k, len(a)):
         a[k], a[i] = a[i] ,a[k]
         perm(a, k+1)
         a[k], a[i] = a[i], a[k]

perm([1,2,3])

Output:

[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 2, 1]
[3, 1, 2]

As I’m swapping the content of the list it’s required a mutable sequence type as input. E.g. perm(list("ball")) will work and perm("ball") won’t because you can’t change a string.

This Python implementation is inspired by the algorithm presented in the book Computer Algorithms by Horowitz, Sahni and Rajasekeran.


回答 5

此解决方案实现了一个生成器,以避免将所有排列保留在内存中:

def permutations (orig_list):
    if not isinstance(orig_list, list):
        orig_list = list(orig_list)

    yield orig_list

    if len(orig_list) == 1:
        return

    for n in sorted(orig_list):
        new_list = orig_list[:]
        pos = new_list.index(n)
        del(new_list[pos])
        new_list.insert(0, n)
        for resto in permutations(new_list[1:]):
            if new_list[:1] + resto <> orig_list:
                yield new_list[:1] + resto

This solution implements a generator, to avoid holding all the permutations on memory:

def permutations (orig_list):
    if not isinstance(orig_list, list):
        orig_list = list(orig_list)

    yield orig_list

    if len(orig_list) == 1:
        return

    for n in sorted(orig_list):
        new_list = orig_list[:]
        pos = new_list.index(n)
        del(new_list[pos])
        new_list.insert(0, n)
        for resto in permutations(new_list[1:]):
            if new_list[:1] + resto <> orig_list:
                yield new_list[:1] + resto

回答 6

以实用的风格

def addperm(x,l):
    return [ l[0:i] + [x] + l[i:]  for i in range(len(l)+1) ]

def perm(l):
    if len(l) == 0:
        return [[]]
    return [x for y in perm(l[1:]) for x in addperm(l[0],y) ]

print perm([ i for i in range(3)])

结果:

[[0, 1, 2], [1, 0, 2], [1, 2, 0], [0, 2, 1], [2, 0, 1], [2, 1, 0]]

In a functional style

def addperm(x,l):
    return [ l[0:i] + [x] + l[i:]  for i in range(len(l)+1) ]

def perm(l):
    if len(l) == 0:
        return [[]]
    return [x for y in perm(l[1:]) for x in addperm(l[0],y) ]

print perm([ i for i in range(3)])

The result:

[[0, 1, 2], [1, 0, 2], [1, 2, 0], [0, 2, 1], [2, 0, 1], [2, 1, 0]]

回答 7

以下代码是给定列表的就地排列,实现为生成器。由于仅返回对列表的引用,因此不应在生成器外部修改列表。该解决方案是非递归的,因此使用低内存。输入列表中元素的多个副本也可以很好地工作。

def permute_in_place(a):
    a.sort()
    yield list(a)

    if len(a) <= 1:
        return

    first = 0
    last = len(a)
    while 1:
        i = last - 1

        while 1:
            i = i - 1
            if a[i] < a[i+1]:
                j = last - 1
                while not (a[i] < a[j]):
                    j = j - 1
                a[i], a[j] = a[j], a[i] # swap the values
                r = a[i+1:last]
                r.reverse()
                a[i+1:last] = r
                yield list(a)
                break
            if i == first:
                a.reverse()
                return

if __name__ == '__main__':
    for n in range(5):
        for a in permute_in_place(range(1, n+1)):
            print a
        print

    for a in permute_in_place([0, 0, 1, 1, 1]):
        print a
    print

The following code is an in-place permutation of a given list, implemented as a generator. Since it only returns references to the list, the list should not be modified outside the generator. The solution is non-recursive, so uses low memory. Work well also with multiple copies of elements in the input list.

def permute_in_place(a):
    a.sort()
    yield list(a)

    if len(a) <= 1:
        return

    first = 0
    last = len(a)
    while 1:
        i = last - 1

        while 1:
            i = i - 1
            if a[i] < a[i+1]:
                j = last - 1
                while not (a[i] < a[j]):
                    j = j - 1
                a[i], a[j] = a[j], a[i] # swap the values
                r = a[i+1:last]
                r.reverse()
                a[i+1:last] = r
                yield list(a)
                break
            if i == first:
                a.reverse()
                return

if __name__ == '__main__':
    for n in range(5):
        for a in permute_in_place(range(1, n+1)):
            print a
        print

    for a in permute_in_place([0, 0, 1, 1, 1]):
        print a
    print

回答 8

我认为一种很明显的方式可能是:

def permutList(l):
    if not l:
            return [[]]
    res = []
    for e in l:
            temp = l[:]
            temp.remove(e)
            res.extend([[e] + r for r in permutList(temp)])

    return res

A quite obvious way in my opinion might be also:

def permutList(l):
    if not l:
            return [[]]
    res = []
    for e in l:
            temp = l[:]
            temp.remove(e)
            res.extend([[e] + r for r in permutList(temp)])

    return res

回答 9

list2Perm = [1, 2.0, 'three']
listPerm = [[a, b, c]
            for a in list2Perm
            for b in list2Perm
            for c in list2Perm
            if ( a != b and b != c and a != c )
            ]
print listPerm

输出:

[
    [1, 2.0, 'three'], 
    [1, 'three', 2.0], 
    [2.0, 1, 'three'], 
    [2.0, 'three', 1], 
    ['three', 1, 2.0], 
    ['three', 2.0, 1]
]
list2Perm = [1, 2.0, 'three']
listPerm = [[a, b, c]
            for a in list2Perm
            for b in list2Perm
            for c in list2Perm
            if ( a != b and b != c and a != c )
            ]
print listPerm

Output:

[
    [1, 2.0, 'three'], 
    [1, 'three', 2.0], 
    [2.0, 1, 'three'], 
    [2.0, 'three', 1], 
    ['three', 1, 2.0], 
    ['three', 2.0, 1]
]

回答 10

我使用了基于阶乘数系统的算法-对于长度为n的列表,您可以逐项组合每个排列项,并从每个阶段剩下的项中进行选择。第一项有n个选择,第二项有n-1个,最后一项只有n个,因此可以将阶乘数字系统中数字的数字用作索引。这样,数字0到n!-1对应于字典顺序中所有可能的排列。

from math import factorial
def permutations(l):
    permutations=[]
    length=len(l)
    for x in xrange(factorial(length)):
        available=list(l)
        newPermutation=[]
        for radix in xrange(length, 0, -1):
            placeValue=factorial(radix-1)
            index=x/placeValue
            newPermutation.append(available.pop(index))
            x-=index*placeValue
        permutations.append(newPermutation)
    return permutations

permutations(range(3))

输出:

[[0, 1, 2], [0, 2, 1], [1, 0, 2], [1, 2, 0], [2, 0, 1], [2, 1, 0]]

该方法是非递归的,但是在我的计算机上它会稍微慢一些,并且当n!时xrange会引发错误!太大而无法转换为C长整数(对我来说n = 13)。当我需要它时就足够了,但是从长远来看这不是itertools.permutations。

I used an algorithm based on the factorial number system– For a list of length n, you can assemble each permutation item by item, selecting from the items left at each stage. You have n choices for the first item, n-1 for the second, and only one for the last, so you can use the digits of a number in the factorial number system as the indices. This way the numbers 0 through n!-1 correspond to all possible permutations in lexicographic order.

from math import factorial
def permutations(l):
    permutations=[]
    length=len(l)
    for x in xrange(factorial(length)):
        available=list(l)
        newPermutation=[]
        for radix in xrange(length, 0, -1):
            placeValue=factorial(radix-1)
            index=x/placeValue
            newPermutation.append(available.pop(index))
            x-=index*placeValue
        permutations.append(newPermutation)
    return permutations

permutations(range(3))

output:

[[0, 1, 2], [0, 2, 1], [1, 0, 2], [1, 2, 0], [2, 0, 1], [2, 1, 0]]

This method is non-recursive, but it is slightly slower on my computer and xrange raises an error when n! is too large to be converted to a C long integer (n=13 for me). It was enough when I needed it, but it’s no itertools.permutations by a long shot.


回答 11

请注意,此算法具有n factorial时间复杂度,其中n是输入列表的长度

打印运行结果:

global result
result = [] 

def permutation(li):
if li == [] or li == None:
    return

if len(li) == 1:
    result.append(li[0])
    print result
    result.pop()
    return

for i in range(0,len(li)):
    result.append(li[i])
    permutation(li[:i] + li[i+1:])
    result.pop()    

例:

permutation([1,2,3])

输出:

[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]

Note that this algorithm has an n factorial time complexity, where n is the length of the input list

Print the results on the run:

global result
result = [] 

def permutation(li):
if li == [] or li == None:
    return

if len(li) == 1:
    result.append(li[0])
    print result
    result.pop()
    return

for i in range(0,len(li)):
    result.append(li[i])
    permutation(li[:i] + li[i+1:])
    result.pop()    

Example:

permutation([1,2,3])

Output:

[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]

回答 12

正如tzwenn的回答,确实可以迭代每个排列的第一个元素。但是,以这种方式编写此解决方案效率更高:

def all_perms(elements):
    if len(elements) <= 1:
        yield elements  # Only permutation possible = no permutation
    else:
        # Iteration over the first element in the result permutation:
        for (index, first_elmt) in enumerate(elements):
            other_elmts = elements[:index]+elements[index+1:]
            for permutation in all_perms(other_elmts): 
                yield [first_elmt] + permutation

该解决方案的速度提高了约30%,这显然归功于递归以len(elements) <= 1代替0yield就像Riccardo Reyes的解决方案一样,它使用生成器函数(通过),因此内存效率也更高。

One can indeed iterate over the first element of each permutation, as in tzwenn’s answer. It is however more efficient to write this solution this way:

def all_perms(elements):
    if len(elements) <= 1:
        yield elements  # Only permutation possible = no permutation
    else:
        # Iteration over the first element in the result permutation:
        for (index, first_elmt) in enumerate(elements):
            other_elmts = elements[:index]+elements[index+1:]
            for permutation in all_perms(other_elmts): 
                yield [first_elmt] + permutation

This solution is about 30 % faster, apparently thanks to the recursion ending at len(elements) <= 1 instead of 0. It is also much more memory-efficient, as it uses a generator function (through yield), like in Riccardo Reyes’s solution.


回答 13

这是受Haskell实现的启发,该实现使用列表理解:

def permutation(list):
    if len(list) == 0:
        return [[]]
    else:
        return [[x] + ys for x in list for ys in permutation(delete(list, x))]

def delete(list, item):
    lc = list[:]
    lc.remove(item)
    return lc

This is inspired by the Haskell implementation using list comprehension:

def permutation(list):
    if len(list) == 0:
        return [[]]
    else:
        return [[x] + ys for x in list for ys in permutation(delete(list, x))]

def delete(list, item):
    lc = list[:]
    lc.remove(item)
    return lc

回答 14

常规执行(无收益-将在内存中做所有事情):

def getPermutations(array):
    if len(array) == 1:
        return [array]
    permutations = []
    for i in range(len(array)): 
        # get all perm's of subarray w/o current item
        perms = getPermutations(array[:i] + array[i+1:])  
        for p in perms:
            permutations.append([array[i], *p])
    return permutations

收益实施:

def getPermutations(array):
    if len(array) == 1:
        yield array
    else:
        for i in range(len(array)):
            perms = getPermutations(array[:i] + array[i+1:])
            for p in perms:
                yield [array[i], *p]

基本思想是遍历数组中所有元素的第1个位置,然后在第2个位置中遍历所有其余元素,而第1个位置没有选择元素,依此类推。您可以使用recursion进行操作,其中停止条件是到达由1个元素组成的数组-在这种情况下,您将返回该数组。

在此处输入图片说明

Regular implementation (no yield – will do everything in memory):

def getPermutations(array):
    if len(array) == 1:
        return [array]
    permutations = []
    for i in range(len(array)): 
        # get all perm's of subarray w/o current item
        perms = getPermutations(array[:i] + array[i+1:])  
        for p in perms:
            permutations.append([array[i], *p])
    return permutations

Yield implementation:

def getPermutations(array):
    if len(array) == 1:
        yield array
    else:
        for i in range(len(array)):
            perms = getPermutations(array[:i] + array[i+1:])
            for p in perms:
                yield [array[i], *p]

The basic idea is to go over all the elements in the array for the 1st position, and then in 2nd position go over all the rest of the elements without the chosen element for the 1st, etc. You can do this with recursion, where the stop criteria is getting to an array of 1 element – in which case you return that array.

enter image description here


回答 15

为了提高性能,从Knuth(p22)那里得到了一个麻木的解决方案:

from numpy import empty, uint8
from math import factorial

def perms(n):
    f = 1
    p = empty((2*n-1, factorial(n)), uint8)
    for i in range(n):
        p[i, :f] = i
        p[i+1:2*i+1, :f] = p[:i, :f]  # constitution de blocs
        for j in range(i):
            p[:i+1, f*(j+1):f*(j+2)] = p[j+1:j+i+2, :f]  # copie de blocs
        f = f*(i+1)
    return p[:n, :]

复制大块内存可以节省时间-比list(itertools.permutations(range(n))以下方法快20倍:

In [1]: %timeit -n10 list(permutations(range(10)))
10 loops, best of 3: 815 ms per loop

In [2]: %timeit -n100 perms(10) 
100 loops, best of 3: 40 ms per loop

For performance, a numpy solution inspired by Knuth, (p22) :

from numpy import empty, uint8
from math import factorial

def perms(n):
    f = 1
    p = empty((2*n-1, factorial(n)), uint8)
    for i in range(n):
        p[i, :f] = i
        p[i+1:2*i+1, :f] = p[:i, :f]  # constitution de blocs
        for j in range(i):
            p[:i+1, f*(j+1):f*(j+2)] = p[j+1:j+i+2, :f]  # copie de blocs
        f = f*(i+1)
    return p[:n, :]

Copying large blocs of memory saves time – it’s 20x faster than list(itertools.permutations(range(n)) :

In [1]: %timeit -n10 list(permutations(range(10)))
10 loops, best of 3: 815 ms per loop

In [2]: %timeit -n100 perms(10) 
100 loops, best of 3: 40 ms per loop

回答 16

from __future__ import print_function

def perm(n):
    p = []
    for i in range(0,n+1):
        p.append(i)
    while True:
        for i in range(1,n+1):
            print(p[i], end=' ')
        print("")
        i = n - 1
        found = 0
        while (not found and i>0):
            if p[i]<p[i+1]:
                found = 1
            else:
                i = i - 1
        k = n
        while p[i]>p[k]:
            k = k - 1
        aux = p[i]
        p[i] = p[k]
        p[k] = aux
        for j in range(1,(n-i)/2+1):
            aux = p[i+j]
            p[i+j] = p[n-j+1]
            p[n-j+1] = aux
        if not found:
            break

perm(5)
from __future__ import print_function

def perm(n):
    p = []
    for i in range(0,n+1):
        p.append(i)
    while True:
        for i in range(1,n+1):
            print(p[i], end=' ')
        print("")
        i = n - 1
        found = 0
        while (not found and i>0):
            if p[i]<p[i+1]:
                found = 1
            else:
                i = i - 1
        k = n
        while p[i]>p[k]:
            k = k - 1
        aux = p[i]
        p[i] = p[k]
        p[k] = aux
        for j in range(1,(n-i)/2+1):
            aux = p[i+j]
            p[i+j] = p[n-j+1]
            p[n-j+1] = aux
        if not found:
            break

perm(5)

回答 17

这是一种适用于列表的算法,无需创建新的中间列表,类似于https://stackoverflow.com/a/108651/184528上Ber的解决方案。

def permute(xs, low=0):
    if low + 1 >= len(xs):
        yield xs
    else:
        for p in permute(xs, low + 1):
            yield p        
        for i in range(low + 1, len(xs)):        
            xs[low], xs[i] = xs[i], xs[low]
            for p in permute(xs, low + 1):
                yield p        
            xs[low], xs[i] = xs[i], xs[low]

for p in permute([1, 2, 3, 4]):
    print p

您可以在此处亲自尝试该代码:http : //repl.it/J9v

Here is an algorithm that works on a list without creating new intermediate lists similar to Ber’s solution at https://stackoverflow.com/a/108651/184528.

def permute(xs, low=0):
    if low + 1 >= len(xs):
        yield xs
    else:
        for p in permute(xs, low + 1):
            yield p        
        for i in range(low + 1, len(xs)):        
            xs[low], xs[i] = xs[i], xs[low]
            for p in permute(xs, low + 1):
                yield p        
            xs[low], xs[i] = xs[i], xs[low]

for p in permute([1, 2, 3, 4]):
    print p

You can try the code out for yourself here: http://repl.it/J9v


回答 18

递归的美丽:

>>> import copy
>>> def perm(prefix,rest):
...      for e in rest:
...              new_rest=copy.copy(rest)
...              new_prefix=copy.copy(prefix)
...              new_prefix.append(e)
...              new_rest.remove(e)
...              if len(new_rest) == 0:
...                      print new_prefix + new_rest
...                      continue
...              perm(new_prefix,new_rest)
... 
>>> perm([],['a','b','c','d'])
['a', 'b', 'c', 'd']
['a', 'b', 'd', 'c']
['a', 'c', 'b', 'd']
['a', 'c', 'd', 'b']
['a', 'd', 'b', 'c']
['a', 'd', 'c', 'b']
['b', 'a', 'c', 'd']
['b', 'a', 'd', 'c']
['b', 'c', 'a', 'd']
['b', 'c', 'd', 'a']
['b', 'd', 'a', 'c']
['b', 'd', 'c', 'a']
['c', 'a', 'b', 'd']
['c', 'a', 'd', 'b']
['c', 'b', 'a', 'd']
['c', 'b', 'd', 'a']
['c', 'd', 'a', 'b']
['c', 'd', 'b', 'a']
['d', 'a', 'b', 'c']
['d', 'a', 'c', 'b']
['d', 'b', 'a', 'c']
['d', 'b', 'c', 'a']
['d', 'c', 'a', 'b']
['d', 'c', 'b', 'a']

The beauty of recursion:

>>> import copy
>>> def perm(prefix,rest):
...      for e in rest:
...              new_rest=copy.copy(rest)
...              new_prefix=copy.copy(prefix)
...              new_prefix.append(e)
...              new_rest.remove(e)
...              if len(new_rest) == 0:
...                      print new_prefix + new_rest
...                      continue
...              perm(new_prefix,new_rest)
... 
>>> perm([],['a','b','c','d'])
['a', 'b', 'c', 'd']
['a', 'b', 'd', 'c']
['a', 'c', 'b', 'd']
['a', 'c', 'd', 'b']
['a', 'd', 'b', 'c']
['a', 'd', 'c', 'b']
['b', 'a', 'c', 'd']
['b', 'a', 'd', 'c']
['b', 'c', 'a', 'd']
['b', 'c', 'd', 'a']
['b', 'd', 'a', 'c']
['b', 'd', 'c', 'a']
['c', 'a', 'b', 'd']
['c', 'a', 'd', 'b']
['c', 'b', 'a', 'd']
['c', 'b', 'd', 'a']
['c', 'd', 'a', 'b']
['c', 'd', 'b', 'a']
['d', 'a', 'b', 'c']
['d', 'a', 'c', 'b']
['d', 'b', 'a', 'c']
['d', 'b', 'c', 'a']
['d', 'c', 'a', 'b']
['d', 'c', 'b', 'a']

回答 19

此算法是最有效的算法,它避免了在递归调用中进行数组传递和操作,在Python 2、3中有效:

def permute(items):
    length = len(items)
    def inner(ix=[]):
        do_yield = len(ix) == length - 1
        for i in range(0, length):
            if i in ix: #avoid duplicates
                continue
            if do_yield:
                yield tuple([items[y] for y in ix + [i]])
            else:
                for p in inner(ix + [i]):
                    yield p
    return inner()

用法:

for p in permute((1,2,3)):
    print(p)

(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)

This algorithm is the most effective one, it avoids of array passing and manipulation in recursive calls, works in Python 2, 3:

def permute(items):
    length = len(items)
    def inner(ix=[]):
        do_yield = len(ix) == length - 1
        for i in range(0, length):
            if i in ix: #avoid duplicates
                continue
            if do_yield:
                yield tuple([items[y] for y in ix + [i]])
            else:
                for p in inner(ix + [i]):
                    yield p
    return inner()

Usage:

for p in permute((1,2,3)):
    print(p)

(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)

回答 20

def pzip(c, seq):
    result = []
    for item in seq:
        for i in range(len(item)+1):
            result.append(item[i:]+c+item[:i])
    return result


def perm(line):
    seq = [c for c in line]
    if len(seq) <=1 :
        return seq
    else:
        return pzip(seq[0], perm(seq[1:]))
def pzip(c, seq):
    result = []
    for item in seq:
        for i in range(len(item)+1):
            result.append(item[i:]+c+item[:i])
    return result


def perm(line):
    seq = [c for c in line]
    if len(seq) <=1 :
        return seq
    else:
        return pzip(seq[0], perm(seq[1:]))

回答 21

另一种方法(无库)

def permutation(input):
    if len(input) == 1:
        return input if isinstance(input, list) else [input]

    result = []
    for i in range(len(input)):
        first = input[i]
        rest = input[:i] + input[i + 1:]
        rest_permutation = permutation(rest)
        for p in rest_permutation:
            result.append(first + p)
    return result

输入可以是字符串或列表

print(permutation('abcd'))
print(permutation(['a', 'b', 'c', 'd']))

ANOTHER APPROACH (without libs)

def permutation(input):
    if len(input) == 1:
        return input if isinstance(input, list) else [input]

    result = []
    for i in range(len(input)):
        first = input[i]
        rest = input[:i] + input[i + 1:]
        rest_permutation = permutation(rest)
        for p in rest_permutation:
            result.append(first + p)
    return result

Input can be a string or a list

print(permutation('abcd'))
print(permutation(['a', 'b', 'c', 'd']))

回答 22

免责声明:软件包作者提供的无形插件。:)

猪手包是从大多数实现不同,它产生不实际包含的排列,而是描述的排列和各位置之间的映射关系的顺序,使其能够工作,排列非常大名单“,如图所示伪名单在这个演示中,执行了一个非常瞬时的操作并在“包含”字母中所有字母排列的伪列表中进行查找,而没有使用比典型的Web页面更多的内存或处理。

无论如何,要生成排列列表,我们可以执行以下操作。

import trotter

my_permutations = trotter.Permutations(3, [1, 2, 3])

print(my_permutations)

for p in my_permutations:
    print(p)

输出:

包含[1、2、3]的6个3排列的伪列表。
[1,2,3]
[1、3、2]
[3,1,2]
[3,2,1]
[2,3,1]
[2,1,3]

Disclaimer: shapeless plug by package author. :)

The trotter package is different from most implementations in that it generates pseudo lists that don’t actually contain permutations but rather describe mappings between permutations and respective positions in an ordering, making it possible to work with very large ‘lists’ of permutations, as shown in this demo which performs pretty instantaneous operations and look-ups in a pseudo-list ‘containing’ all the permutations of the letters in the alphabet, without using more memory or processing than a typical web page.

In any case, to generate a list of permutations, we can do the following.

import trotter

my_permutations = trotter.Permutations(3, [1, 2, 3])

print(my_permutations)

for p in my_permutations:
    print(p)

Output:

A pseudo-list containing 6 3-permutations of [1, 2, 3].
[1, 2, 3]
[1, 3, 2]
[3, 1, 2]
[3, 2, 1]
[2, 3, 1]
[2, 1, 3]

回答 23

生成所有可能的排列

我正在使用python3.4:

def calcperm(arr, size):
    result = set([()])
    for dummy_idx in range(size):
        temp = set()
        for dummy_lst in result:
            for dummy_outcome in arr:
                if dummy_outcome not in dummy_lst:
                    new_seq = list(dummy_lst)
                    new_seq.append(dummy_outcome)
                    temp.add(tuple(new_seq))
        result = temp
    return result

测试用例:

lst = [1, 2, 3, 4]
#lst = ["yellow", "magenta", "white", "blue"]
seq = 2
final = calcperm(lst, seq)
print(len(final))
print(final)

Generate all possible permutations

I’m using python3.4:

def calcperm(arr, size):
    result = set([()])
    for dummy_idx in range(size):
        temp = set()
        for dummy_lst in result:
            for dummy_outcome in arr:
                if dummy_outcome not in dummy_lst:
                    new_seq = list(dummy_lst)
                    new_seq.append(dummy_outcome)
                    temp.add(tuple(new_seq))
        result = temp
    return result

Test Cases:

lst = [1, 2, 3, 4]
#lst = ["yellow", "magenta", "white", "blue"]
seq = 2
final = calcperm(lst, seq)
print(len(final))
print(final)

回答 24

为了节省大家的搜索和实验时间,以下是Python中的非递归置换解决方案,该解决方案也适用于Numba(自0.41版起):

@numba.njit()
def permutations(A, k):
    r = [[i for i in range(0)]]
    for i in range(k):
        r = [[a] + b for a in A for b in r if (a in b)==False]
    return r
permutations([1,2,3],3)
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

给人印象的表现:

%timeit permutations(np.arange(5),5)

243 µs ± 11.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
time: 406 ms

%timeit list(itertools.permutations(np.arange(5),5))
15.9 µs ± 8.61 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
time: 12.9 s

因此,仅当必须从njitted函数调用它时才使用此版本,否则,请选择itertools实现。

To save you folks possible hours of searching and experimenting, here’s the non-recursive permutaions solution in Python which also works with Numba (as of v. 0.41):

@numba.njit()
def permutations(A, k):
    r = [[i for i in range(0)]]
    for i in range(k):
        r = [[a] + b for a in A for b in r if (a in b)==False]
    return r
permutations([1,2,3],3)
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

To give an impression about performance:

%timeit permutations(np.arange(5),5)

243 µs ± 11.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
time: 406 ms

%timeit list(itertools.permutations(np.arange(5),5))
15.9 µs ± 8.61 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
time: 12.9 s

So use this version only if you have to call it from njitted function, otherwise prefer itertools implementation.


回答 25

我看到这些递归函数内部进行了很多迭代,而并非完全是函数递归…

因此对于那些甚至无法遵守一个循环的人来说,这是一个总的,完全不必要的完全递归解决方案

def all_insert(x, e, i=0):
    return [x[0:i]+[e]+x[i:]] + all_insert(x,e,i+1) if i<len(x)+1 else []

def for_each(X, e):
    return all_insert(X[0], e) + for_each(X[1:],e) if X else []

def permute(x):
    return [x] if len(x) < 2 else for_each( permute(x[1:]) , x[0])


perms = permute([1,2,3])

I see a lot of iteration going on inside these recursive functions, not exactly pure recursion…

so for those of you who cannot abide by even a single loop, here’s a gross, totally unnecessary fully recursive solution

def all_insert(x, e, i=0):
    return [x[0:i]+[e]+x[i:]] + all_insert(x,e,i+1) if i<len(x)+1 else []

def for_each(X, e):
    return all_insert(X[0], e) + for_each(X[1:],e) if X else []

def permute(x):
    return [x] if len(x) < 2 else for_each( permute(x[1:]) , x[0])


perms = permute([1,2,3])

回答 26

另一个解决方案:

def permutation(flag, k =1 ):
    N = len(flag)
    for i in xrange(0, N):
        if flag[i] != 0:
            continue
        flag[i] = k 
        if k == N:
            print flag
        permutation(flag, k+1)
        flag[i] = 0

permutation([0, 0, 0])

Another solution:

def permutation(flag, k =1 ):
    N = len(flag)
    for i in xrange(0, N):
        if flag[i] != 0:
            continue
        flag[i] = k 
        if k == N:
            print flag
        permutation(flag, k+1)
        flag[i] = 0

permutation([0, 0, 0])

回答 27

我的Python解决方案:

def permutes(input,offset):
    if( len(input) == offset ):
        return [''.join(input)]

    result=[]        
    for i in range( offset, len(input) ):
         input[offset], input[i] = input[i], input[offset]
         result = result + permutes(input,offset+1)
         input[offset], input[i] = input[i], input[offset]
    return result

# input is a "string"
# return value is a list of strings
def permutations(input):
    return permutes( list(input), 0 )

# Main Program
print( permutations("wxyz") )

My Python Solution:

def permutes(input,offset):
    if( len(input) == offset ):
        return [''.join(input)]

    result=[]        
    for i in range( offset, len(input) ):
         input[offset], input[i] = input[i], input[offset]
         result = result + permutes(input,offset+1)
         input[offset], input[i] = input[i], input[offset]
    return result

# input is a "string"
# return value is a list of strings
def permutations(input):
    return permutes( list(input), 0 )

# Main Program
print( permutations("wxyz") )

回答 28

def permutation(word, first_char=None):
    if word == None or len(word) == 0: return []
    if len(word) == 1: return [word]

    result = []
    first_char = word[0]
    for sub_word in permutation(word[1:], first_char):
        result += insert(first_char, sub_word)
    return sorted(result)

def insert(ch, sub_word):
    arr = [ch + sub_word]
    for i in range(len(sub_word)):
        arr.append(sub_word[i:] + ch + sub_word[:i])
    return arr


assert permutation(None) == []
assert permutation('') == []
assert permutation('1')  == ['1']
assert permutation('12') == ['12', '21']

print permutation('abc')

输出:[‘abc’,’acb’,’bac’,’bca’,’cab’,’cba’]

def permutation(word, first_char=None):
    if word == None or len(word) == 0: return []
    if len(word) == 1: return [word]

    result = []
    first_char = word[0]
    for sub_word in permutation(word[1:], first_char):
        result += insert(first_char, sub_word)
    return sorted(result)

def insert(ch, sub_word):
    arr = [ch + sub_word]
    for i in range(len(sub_word)):
        arr.append(sub_word[i:] + ch + sub_word[:i])
    return arr


assert permutation(None) == []
assert permutation('') == []
assert permutation('1')  == ['1']
assert permutation('12') == ['12', '21']

print permutation('abc')

Output: [‘abc’, ‘acb’, ‘bac’, ‘bca’, ‘cab’, ‘cba’]


回答 29

使用 Counter

from collections import Counter

def permutations(nums):
    ans = [[]]
    cache = Counter(nums)

    for idx, x in enumerate(nums):
        result = []
        for items in ans:
            cache1 = Counter(items)
            for id, n in enumerate(nums):
                if cache[n] != cache1[n] and items + [n] not in result:
                    result.append(items + [n])

        ans = result
    return ans
permutations([1, 2, 2])
> [[1, 2, 2], [2, 1, 2], [2, 2, 1]]

Using Counter

from collections import Counter

def permutations(nums):
    ans = [[]]
    cache = Counter(nums)

    for idx, x in enumerate(nums):
        result = []
        for items in ans:
            cache1 = Counter(items)
            for id, n in enumerate(nums):
                if cache[n] != cache1[n] and items + [n] not in result:
                    result.append(items + [n])

        ans = result
    return ans
permutations([1, 2, 2])
> [[1, 2, 2], [2, 1, 2], [2, 2, 1]]


删除列表中的重复项

问题:删除列表中的重复项

我几乎需要编写一个程序来检查列表中是否有重复项,如果删除了重复项,则将其删除并返回一个新列表,其中包含未重复/删除的项。这就是我所拥有的,但老实说我不知道​​该怎么办。

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t

Pretty much I need to write a program to check if a list has any duplicates and if it does it removes them and returns a new list with the items that weren’t duplicated/removed. This is what I have but to be honest I do not know what to do.

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t

回答 0

获取唯一项目集合的常用方法是使用set。集是不同对象的无序集合。要从任何迭代创建集合,只需将其传递给内置函数即可。如果以后再次需要真实列表,则可以类似地将集合传递给set()list()函数。

以下示例应涵盖您尝试做的所有事情:

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

从示例结果可以看出,原始订单未得到维护。如上所述,集合本身是无序集合,因此顺序丢失。将集合转换回列表时,将创建任意顺序。

维持秩序

如果订单对您很重要,那么您将不得不使用其他机制。一个非常常见的解决方案是OrderedDict在插入期间依靠保持键的顺序:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

从Python 3.7开始,内置字典也保证可以保持插入顺序,因此,如果您使用的是Python 3.7或更高版本(或CPython 3.6),也可以直接使用它:

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

请注意,这可能会产生一些开销,先创建字典,然后再从中创建列表。如果您实际上不需要保留订单,则通常最好使用一组,特别是因为它可以为您提供更多操作。请查看此问题,以获取更多详细信息以及删除重复项时保留订单的其他方法。


最后请注意,解决方案setOrderedDict/ dict解决方案都要求您的项目是可哈希的。这通常意味着它们必须是不变的。如果必须处理不可散列的项目(例如列表对象),则必须使用慢速方法,在这种方法中,您基本上必须将每个项目与嵌套循环中的所有其他项目进行比较。

The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.

The following example should cover whatever you are trying to do:

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

As you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When converting a set back to a list, an arbitrary order is created.

Maintaining order

If order is important to you, then you will have to use a different mechanism. A very common solution for this is to rely on OrderedDict to keep the order of keys during insertion:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Starting with Python 3.7, the built-in dictionary is guaranteed to maintain the insertion order as well, so you can also use that directly if you are on Python 3.7 or later (or CPython 3.6):

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Note that this may have some overhead of creating a dictionary first, and then creating a list from it. If you don’t actually need to preserve the order, you’re often better off using a set, especially because it gives you a lot more operations to work with. Check out this question for more details and alternative ways to preserve the order when removing duplicates.


Finally note that both the set as well as the OrderedDict/dict solutions require your items to be hashable. This usually means that they have to be immutable. If you have to deal with items that are not hashable (e.g. list objects), then you will have to use a slow approach in which you will basically have to compare every item with every other item in a nested loop.


回答 1

在Python 2.7中,从迭代器中删除重复项并同时保持其原始顺序的新方法是:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

在Python 3.5中,OrderedDict具有C实现。我的时间表明,这是Python 3.5各种方法中最快也是最短的。

在Python 3.6中,常规字典变得有序且紧凑。(此功能适用于CPython和PyPy,但在其他实现中可能不存在)。这为我们提供了一种在保留订单的同时进行重复数据删除的最快方法:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

在Python 3.7中,保证常规dict在所有实现中都排序。 因此,最短,最快的解决方案是:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 2.7, the new way of removing duplicates from an iterable while keeping it in the original order is:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.5, the OrderedDict has a C implementation. My timings show that this is now both the fastest and shortest of the various approaches for Python 3.5.

In Python 3.6, the regular dict became both ordered and compact. (This feature is holds for CPython and PyPy but may not present in other implementations). That gives us a new fastest way of deduping while retaining order:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.7, the regular dict is guaranteed to both ordered across all implementations. So, the shortest and fastest solution is:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

回答 2

这是单线的:list(set(source_list))会成功的。

A set是不可能重复的东西。

更新:保留订单的方法有两行:

from collections import OrderedDict
OrderedDict((x, True) for x in source_list).keys()

在这里,我们使用一个事实,即OrderedDict记住键的插入顺序,并且在更新特定键的值时不会更改它。我们插入True作为值,但是我们可以插入任何东西,只是不使用值。(也set很像dict带有忽略值的a 。)

It’s a one-liner: list(set(source_list)) will do the trick.

A set is something that can’t possibly have duplicates.

Update: an order-preserving approach is two lines:

from collections import OrderedDict
OrderedDict((x, True) for x in source_list).keys()

Here we use the fact that OrderedDict remembers the insertion order of keys, and does not change it when a value at a particular key is updated. We insert True as values, but we could insert anything, values are just not used. (set works a lot like a dict with ignored values, too.)


回答 3

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> s = []
>>> for i in t:
       if i not in s:
          s.append(i)
>>> s
[1, 2, 3, 5, 6, 7, 8]
>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> s = []
>>> for i in t:
       if i not in s:
          s.append(i)
>>> s
[1, 2, 3, 5, 6, 7, 8]

回答 4

如果您不关心订单,请执行以下操作:

def remove_duplicates(l):
    return list(set(l))

set保证A 没有重复项。

If you don’t care about the order, just do this:

def remove_duplicates(l):
    return list(set(l))

A set is guaranteed to not have duplicates.


回答 5

制作一个新列表,其中保留重复项中第一个元素的顺序 L

newlist=[ii for n,ii in enumerate(L) if ii not in L[:n]]

例如,if L=[1, 2, 2, 3, 4, 2, 4, 3, 5]那么newlist将是[1,2,3,4,5]

这会在添加每个新元素之前检查它是否没有出现在列表中。而且它不需要进口。

To make a new list retaining the order of first elements of duplicates in L

newlist=[ii for n,ii in enumerate(L) if ii not in L[:n]]

for example if L=[1, 2, 2, 3, 4, 2, 4, 3, 5] then newlist will be [1,2,3,4,5]

This checks each new element has not appeared previously in the list before adding it. Also it does not need imports.


回答 6

一位同事已将接受的答案作为他的代码的一部分发送给我,以供今天进行代码审查。尽管我当然很欣赏所提问题的优雅之处,但我对这种表现并不满意。我已经尝试过此解决方案(我使用set来减少查找时间)

def ordered_set(in_list):
    out_list = []
    added = set()
    for val in in_list:
        if not val in added:
            out_list.append(val)
            added.add(val)
    return out_list

为了比较效率,我使用了100个整数的随机样本-62个是唯一的

from random import randint
x = [randint(0,100) for _ in xrange(100)]

In [131]: len(set(x))
Out[131]: 62

这是测量结果

In [129]: %timeit list(OrderedDict.fromkeys(x))
10000 loops, best of 3: 86.4 us per loop

In [130]: %timeit ordered_set(x)
100000 loops, best of 3: 15.1 us per loop

好吧,如果将集合从解决方案中删除,会发生什么?

def ordered_set(inlist):
    out_list = []
    for val in inlist:
        if not val in out_list:
            out_list.append(val)
    return out_list

结果不如OrderedDict差,但仍然是原始解决方案的3倍以上

In [136]: %timeit ordered_set(x)
10000 loops, best of 3: 52.6 us per loop

A colleague have sent the accepted answer as part of his code to me for a codereview today. While I certainly admire the elegance of the answer in question, I am not happy with the performance. I have tried this solution (I use set to reduce lookup time)

def ordered_set(in_list):
    out_list = []
    added = set()
    for val in in_list:
        if not val in added:
            out_list.append(val)
            added.add(val)
    return out_list

To compare efficiency, I used a random sample of 100 integers – 62 were unique

from random import randint
x = [randint(0,100) for _ in xrange(100)]

In [131]: len(set(x))
Out[131]: 62

Here are the results of the measurements

In [129]: %timeit list(OrderedDict.fromkeys(x))
10000 loops, best of 3: 86.4 us per loop

In [130]: %timeit ordered_set(x)
100000 loops, best of 3: 15.1 us per loop

Well, what happens if set is removed from the solution?

def ordered_set(inlist):
    out_list = []
    for val in inlist:
        if not val in out_list:
            out_list.append(val)
    return out_list

The result is not as bad as with the OrderedDict, but still more than 3 times of the original solution

In [136]: %timeit ordered_set(x)
10000 loops, best of 3: 52.6 us per loop

回答 7

也有使用Pandas和Numpy的解决方案。它们都返回numpy数组,因此.tolist()如果需要列表,则必须使用该函数。

t=['a','a','b','b','b','c','c','c']
t2= ['c','c','b','b','b','a','a','a']

熊猫解决方案

使用熊猫功能unique()

import pandas as pd
pd.unique(t).tolist()
>>>['a','b','c']
pd.unique(t2).tolist()
>>>['c','b','a']

脾气暴躁的解决方案

使用numpy函数unique()

import numpy as np
np.unique(t).tolist()
>>>['a','b','c']
np.unique(t2).tolist()
>>>['a','b','c']

请注意,numpy.unique()也对值进行排序。因此,列表t2按排序返回。如果您想保留订单,请按照以下答案进行操作

_, idx = np.unique(t2, return_index=True)
t2[np.sort(idx)].tolist()
>>>['c','b','a']

与其他解决方案相比,该解决方案并不那么优雅,但是与pandas.unique()相比,numpy.unique()还可让您检查嵌套数组在一个选定轴上是否唯一。

There are also solutions using Pandas and Numpy. They both return numpy array so you have to use the function .tolist() if you want a list.

t=['a','a','b','b','b','c','c','c']
t2= ['c','c','b','b','b','a','a','a']

Pandas solution

Using Pandas function unique():

import pandas as pd
pd.unique(t).tolist()
>>>['a','b','c']
pd.unique(t2).tolist()
>>>['c','b','a']

Numpy solution

Using numpy function unique().

import numpy as np
np.unique(t).tolist()
>>>['a','b','c']
np.unique(t2).tolist()
>>>['a','b','c']

Note that numpy.unique() also sort the values. So the list t2 is returned sorted. If you want to have the order preserved use as in this answer:

_, idx = np.unique(t2, return_index=True)
t2[np.sort(idx)].tolist()
>>>['c','b','a']

The solution is not so elegant compared to the others, however, compared to pandas.unique(), numpy.unique() allows you also to check if nested arrays are unique along one selected axis.


回答 8

另一种方式:

>>> seq = [1,2,3,'a', 'a', 1,2]
>> dict.fromkeys(seq).keys()
['a', 1, 2, 3]

Another way of doing:

>>> seq = [1,2,3,'a', 'a', 1,2]
>> dict.fromkeys(seq).keys()
['a', 1, 2, 3]

回答 9

简单易行:

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanlist = []
[cleanlist.append(x) for x in myList if x not in cleanlist]

输出:

>>> cleanlist 
[1, 2, 3, 5, 6, 7, 8]

Simple and easy:

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanlist = []
[cleanlist.append(x) for x in myList if x not in cleanlist]

Output:

>>> cleanlist 
[1, 2, 3, 5, 6, 7, 8]

回答 10

在这个答案中,将分为两个部分:两个独特的解决方案,以及特定解决方案的速度图表。

删除重复项

这些答案大多数都只删除可哈希的重复项,但是这个问题并不意味着它不仅需要可哈希项,这意味着我将提供一些不需要哈希项的解决方案。

collections.Counter是标准库中的强大工具,可能对此非常理想。只有另一种解决方案甚至包含Counter。但是,该解决方案也仅限于可哈希键。

为了在Counter中允许不可散列的键,我制作了一个Container类,它将尝试获取对象的默认散列函数,但是如果失败,它将尝试其标识函数。它还定义了一个eq和一个哈希方法。这应该足以允许我们的解决方案中使用不可散列的项目。不可哈希对象将被视为可哈希对象。但是,此哈希函数对不可哈希对象使用标识,这意味着两个不可哈希的相等对象将不起作用。我建议您重写此方法,并将其更改为使用等效可变类型的哈希(例如使用hash(tuple(my_list))ifmy_list是列表)。

我还提出了两种解决方案。另一个解决方案是使用OrderedDict和Counter的子类(称为“ OrderedCounter”)来保持商品的顺序。现在,这里是功能:

from collections import OrderedDict, Counter

class Container:
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, obj):
        return self.obj == obj
    def __hash__(self):
        try:
            return hash(self.obj)
        except:
            return id(self.obj)

class OrderedCounter(Counter, OrderedDict):
     'Counter that remembers the order elements are first encountered'

     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

     def __reduce__(self):
         return self.__class__, (OrderedDict(self),)

def remd(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

def oremd(sequence):
    cnt = OrderedCounter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

remd是非排序排序,oremd是排序排序。您可以清楚地分辨出哪一个速度更快,但无论如何我都会解释。无序排序略快。由于不需要排序,因此它保留的数据较少。

现在,我还想显示每个答案的速度比较。所以,我现在就开始做。

哪个功能最快?

为了删除重复项,我从一些答案中收集了10个函数。我计算了每个函数的速度,并使用matplotlib.pyplot将其放入图表中。

我将其分为三轮。可哈希对象是可以被哈希处理的任何对象,不可哈希对象是不能被哈希处理的任何对象。有序序列是保留顺序的序列,无序序列不保留顺序。现在,这里还有一些术语:

“无序哈希”适用于任何删除重复项的方法,这些方法不一定必须保持顺序。它不必为无法哈希​​的文件工作,但是可以。

Ordered Hashable适用于将项目的顺序保留在列表中的任何方法,但是它不一定适用于unhashables,但是可以。

Ordered Unhashable是保留列表中项目顺序并适用于unhashable的任何方法。

在y轴上是花费的秒数。

在x轴上是应用该功能的编号。

我们通过以下理解为无序哈希和有序哈希生成序列: [list(range(x)) + list(range(x)) for x in range(0, 1000, 10)]

对于订购的不可哈希值: [[list(range(y)) + list(range(y)) for y in range(x)] for x in range(0, 1000, 10)]

请注意,该范围内有一个“台阶”,因为没有它,这将花费10倍的时间。另外,由于我个人的观点,我认为它看起来似乎更容易阅读。

另请注意,图例上的键是我试图猜测为功能最重要的部分。至于什么功能最差或最好?该图说明了一切。

解决之后,下面是图表。

无序哈希

在此处输入图片说明 (放大) 在此处输入图片说明

有序哈希

在此处输入图片说明 (放大) 在此处输入图片说明

有序的不可哈希

在此处输入图片说明 (放大) 在此处输入图片说明

In this answer, will be two sections: Two unique solutions, and a graph of speed for specific solutions.

Removing Duplicate Items

Most of these answers only remove duplicate items which are hashable, but this question doesn’t imply it doesn’t just need hashable items, meaning I’ll offer some solutions which don’t require hashable items.

collections.Counter is a powerful tool in the standard library which could be perfect for this. There’s only one other solution which even has Counter in it. However, that solution is also limited to hashable keys.

To allow unhashable keys in Counter, I made a Container class, which will try to get the object’s default hash function, but if it fails, it will try its identity function. It also defines an eq and a hash method. This should be enough to allow unhashable items in our solution. Unhashable objects will be treated as if they are hashable. However, this hash function uses identity for unhashable objects, meaning two equal objects that are both unhashable won’t work. I suggest you overriding this, and changing it to use the hash of an equivalent mutable type (like using hash(tuple(my_list)) if my_list is a list).

I also made two solutions. Another solution which keeps the order of the items, using a subclass of both OrderedDict and Counter which is named ‘OrderedCounter’. Now, here are the functions:

from collections import OrderedDict, Counter

class Container:
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, obj):
        return self.obj == obj
    def __hash__(self):
        try:
            return hash(self.obj)
        except:
            return id(self.obj)

class OrderedCounter(Counter, OrderedDict):
     'Counter that remembers the order elements are first encountered'

     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

     def __reduce__(self):
         return self.__class__, (OrderedDict(self),)

def remd(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

def oremd(sequence):
    cnt = OrderedCounter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

remd is non-ordered sorting, oremd is ordered sorting. You can clearly tell which one is faster, but I’ll explain anyways. The non-ordered sorting is slightly faster. It keeps less data, since it doesn’t need order.

Now, I also wanted to show the speed comparisons of each answer. So, I’ll do that now.

Which Function is the Fastest?

For removing duplicates, I gathered 10 functions from a few answers. I calculated the speed of each function and put it into a graph using matplotlib.pyplot.

I divided this into three rounds of graphing. A hashable is any object which can be hashed, an unhashable is any object which cannot be hashed. An ordered sequence is a sequence which preserves order, an unordered sequence does not preserve order. Now, here are a few more terms:

Unordered Hashable was for any method which removed duplicates, which didn’t necessarily have to keep the order. It didn’t have to work for unhashables, but it could.

Ordered Hashable was for any method which kept the order of the items in the list, but it didn’t have to work for unhashables, but it could.

Ordered Unhashable was any method which kept the order of the items in the list, and worked for unhashables.

On the y-axis is the amount of seconds it took.

On the x-axis is the number the function was applied to.

We generated sequences for unordered hashables and ordered hashables with the following comprehension: [list(range(x)) + list(range(x)) for x in range(0, 1000, 10)]

For ordered unhashables: [[list(range(y)) + list(range(y)) for y in range(x)] for x in range(0, 1000, 10)]

Note there is a ‘step’ in the range because without it, this would’ve taken 10x as long. Also because in my personal opinion, I thought it might’ve looked a little easier to read.

Also note the keys on the legend are what I tried to guess as the most vital parts of the function. As for what function does the worst or best? The graph speaks for itself.

With that settled, here are the graphs.

Unordered Hashables

enter image description here (Zoomed in) enter image description here

Ordered Hashables

enter image description here (Zoomed in) enter image description here

Ordered Unhashables

enter image description here (Zoomed in) enter image description here


回答 11

我的清单上有一个字典,所以我不能使用上述方法。我得到了错误:

TypeError: unhashable type:

因此,如果您关心订单和/或某些项目无法散列。然后,您可能会发现这很有用:

def make_unique(original_list):
    unique_list = []
    [unique_list.append(obj) for obj in original_list if obj not in unique_list]
    return unique_list

有些人可能认为列表理解有副作用不是一个好的解决方案。这是一个替代方案:

def make_unique(original_list):
    unique_list = []
    map(lambda x: unique_list.append(x) if (x not in unique_list) else False, original_list)
    return unique_list

I had a dict in my list, so I could not use the above approach. I got the error:

TypeError: unhashable type:

So if you care about order and/or some items are unhashable. Then you might find this useful:

def make_unique(original_list):
    unique_list = []
    [unique_list.append(obj) for obj in original_list if obj not in unique_list]
    return unique_list

Some may consider list comprehension with a side effect to not be a good solution. Here’s an alternative:

def make_unique(original_list):
    unique_list = []
    map(lambda x: unique_list.append(x) if (x not in unique_list) else False, original_list)
    return unique_list

回答 12

所有的保持阶接近我在这里看到迄今要么使用比较幼稚(具有为O(n ^ 2)在最佳的时间复杂度)或重重量OrderedDicts/ set+ list的组合被限制于可哈希输入。这是独立于哈希的O(nlogn)解决方案:

更新添加了key参数,文档和Python 3兼容性。

# from functools import reduce <-- add this import on Python 3

def uniq(iterable, key=lambda x: x):
    """
    Remove duplicates from an iterable. Preserves order. 
    :type iterable: Iterable[Ord => A]
    :param iterable: an iterable of objects of any orderable type
    :type key: Callable[A] -> (Ord => B)
    :param key: optional argument; by default an item (A) is discarded 
    if another item (B), such that A == B, has already been encountered and taken. 
    If you provide a key, this condition changes to key(A) == key(B); the callable 
    must return orderable objects.
    """
    # Enumerate the list to restore order lately; reduce the sorted list; restore order
    def append_unique(acc, item):
        return acc if key(acc[-1][1]) == key(item[1]) else acc.append(item) or acc 
    srt_enum = sorted(enumerate(iterable), key=lambda item: key(item[1]))
    return [item[1] for item in sorted(reduce(append_unique, srt_enum, [srt_enum[0]]))] 

All the order-preserving approaches I’ve seen here so far either use naive comparison (with O(n^2) time-complexity at best) or heavy-weight OrderedDicts/set+list combinations that are limited to hashable inputs. Here is a hash-independent O(nlogn) solution:

Update added the key argument, documentation and Python 3 compatibility.

# from functools import reduce <-- add this import on Python 3

def uniq(iterable, key=lambda x: x):
    """
    Remove duplicates from an iterable. Preserves order. 
    :type iterable: Iterable[Ord => A]
    :param iterable: an iterable of objects of any orderable type
    :type key: Callable[A] -> (Ord => B)
    :param key: optional argument; by default an item (A) is discarded 
    if another item (B), such that A == B, has already been encountered and taken. 
    If you provide a key, this condition changes to key(A) == key(B); the callable 
    must return orderable objects.
    """
    # Enumerate the list to restore order lately; reduce the sorted list; restore order
    def append_unique(acc, item):
        return acc if key(acc[-1][1]) == key(item[1]) else acc.append(item) or acc 
    srt_enum = sorted(enumerate(iterable), key=lambda item: key(item[1]))
    return [item[1] for item in sorted(reduce(append_unique, srt_enum, [srt_enum[0]]))] 

回答 13

如果您想保留订单,并且不使用任何外部模块,则可以通过以下简便方法进行操作:

>>> t = [1, 9, 2, 3, 4, 5, 3, 6, 7, 5, 8, 9]
>>> list(dict.fromkeys(t))
[1, 9, 2, 3, 4, 5, 6, 7, 8]

注意:此方法保留了外观顺序,因此,如前所述,因为它是第一次出现,所以后面将有九个。但是,这与您得到的结果相同

from collections import OrderedDict
ulist=list(OrderedDict.fromkeys(l))

但它更短,并且运行更快。

之所以fromkeys可行,是因为每次函数尝试创建一个新键时,如果该值已经存在,它将简单地覆盖它。但是,这根本不会影响字典,因为fromkeys会创建一个字典,其中所有键都具有value None,因此有效地它消除了所有重复项。

If you want to preserve the order, and not use any external modules here is an easy way to do this:

>>> t = [1, 9, 2, 3, 4, 5, 3, 6, 7, 5, 8, 9]
>>> list(dict.fromkeys(t))
[1, 9, 2, 3, 4, 5, 6, 7, 8]

Note: This method preserves the order of appearance, so, as seen above, nine will come after one because it was the first time it appeared. This however, is the same result as you would get with doing

from collections import OrderedDict
ulist=list(OrderedDict.fromkeys(l))

but it is much shorter, and runs faster.

This works because each time the fromkeys function tries to create a new key, if the value already exists it will simply overwrite it. This wont affect the dictionary at all however, as fromkeys creates a dictionary where all keys have the value None, so effectively it eliminates all duplicates this way.


回答 14

您也可以这样做:

>>> t = [1, 2, 3, 3, 2, 4, 5, 6]
>>> s = [x for i, x in enumerate(t) if i == t.index(x)]
>>> s
[1, 2, 3, 4, 5, 6]

上面的工作原理是该index方法仅返回元素的第一个索引。重复元素具有更高的索引。请参考这里

list.index(x [,start [,end]])
在值为x的第一项列表中返回从零开始的索引。如果没有这样的项目,则引发ValueError。

You could also do this:

>>> t = [1, 2, 3, 3, 2, 4, 5, 6]
>>> s = [x for i, x in enumerate(t) if i == t.index(x)]
>>> s
[1, 2, 3, 4, 5, 6]

The reason that above works is that index method returns only the first index of an element. Duplicate elements have higher indices. Refer to here:

list.index(x[, start[, end]])
Return zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item.


回答 15

尝试使用集合:

import sets
t = sets.Set(['a', 'b', 'c', 'd'])
t1 = sets.Set(['a', 'b', 'c'])

print t | t1
print t - t1

Try using sets:

import sets
t = sets.Set(['a', 'b', 'c', 'd'])
t1 = sets.Set(['a', 'b', 'c'])

print t | t1
print t - t1

回答 16

通过保留订单来减少变体:

假设我们有清单:

l = [5, 6, 6, 1, 1, 2, 2, 3, 4]

减少变体(无效):

>>> reduce(lambda r, v: v in r and r or r + [v], l, [])
[5, 6, 1, 2, 3, 4]

速度提高5倍,但功能更先进

>>> reduce(lambda r, v: v in r[1] and r or (r[0].append(v) or r[1].add(v)) or r, l, ([], set()))[0]
[5, 6, 1, 2, 3, 4]

说明:

default = (list(), set())
# user list to keep order
# use set to make lookup faster

def reducer(result, item):
    if item not in result[1]:
        result[0].append(item)
        result[1].add(item)
    return result

reduce(reducer, l, default)[0]

Reduce variant with ordering preserve:

Assume that we have list:

l = [5, 6, 6, 1, 1, 2, 2, 3, 4]

Reduce variant (unefficient):

>>> reduce(lambda r, v: v in r and r or r + [v], l, [])
[5, 6, 1, 2, 3, 4]

5 x faster but more sophisticated

>>> reduce(lambda r, v: v in r[1] and r or (r[0].append(v) or r[1].add(v)) or r, l, ([], set()))[0]
[5, 6, 1, 2, 3, 4]

Explanation:

default = (list(), set())
# user list to keep order
# use set to make lookup faster

def reducer(result, item):
    if item not in result[1]:
        result[0].append(item)
        result[1].add(item)
    return result

reduce(reducer, l, default)[0]

回答 17

从列表中删除重复项的最佳方法是使用python中可用的set()函数,再次将其转换为列表

In [2]: some_list = ['a','a','v','v','v','c','c','d']
In [3]: list(set(some_list))
Out[3]: ['a', 'c', 'd', 'v']

Best approach of removing duplicates from a list is using set() function, available in python, again converting that set into list

In [2]: some_list = ['a','a','v','v','v','c','c','d']
In [3]: list(set(some_list))
Out[3]: ['a', 'c', 'd', 'v']

回答 18

您可以使用以下功能:

def rem_dupes(dup_list): 
    yooneeks = [] 
    for elem in dup_list: 
        if elem not in yooneeks: 
            yooneeks.append(elem) 
    return yooneeks

范例

my_list = ['this','is','a','list','with','dupicates','in', 'the', 'list']

用法:

rem_dupes(my_list)

[‘this’,’is’,’a’,’list’,’with’,’dupicates,’in’,’the’]

You can use the following function:

def rem_dupes(dup_list): 
    yooneeks = [] 
    for elem in dup_list: 
        if elem not in yooneeks: 
            yooneeks.append(elem) 
    return yooneeks

Example:

my_list = ['this','is','a','list','with','dupicates','in', 'the', 'list']

Usage:

rem_dupes(my_list)

[‘this’, ‘is’, ‘a’, ‘list’, ‘with’, ‘dupicates’, ‘in’, ‘the’]


回答 19

还有许多其他答案建议使用不同的方法来执行此操作,但是它们都是批处理操作,其中一些会放弃原始订单。根据您的需要,这可能没问题,但是如果您要按每个值的第一个实例的顺序迭代这些值,并且想要即时删除所有重复项,而一次删除所有重复项,则可以使用此生成器:

def uniqify(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

这将返回一个生成器/迭代器,因此您可以在可以使用迭代器的任何地方使用它。

for unique_item in uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]):
    print(unique_item, end=' ')

print()

输出:

1 2 3 4 5 6 7 8

如果您确实想要a list,则可以执行以下操作:

unique_list = list(uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]))

print(unique_list)

输出:

[1, 2, 3, 4, 5, 6, 7, 8]

There are many other answers suggesting different ways to do this, but they’re all batch operations, and some of them throw away the original order. That might be okay depending on what you need, but if you want to iterate over the values in the order of the first instance of each value, and you want to remove the duplicates on-the-fly versus all at once, you could use this generator:

def uniqify(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

This returns a generator/iterator, so you can use it anywhere that you can use an iterator.

for unique_item in uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]):
    print(unique_item, end=' ')

print()

Output:

1 2 3 4 5 6 7 8

If you do want a list, you can do this:

unique_list = list(uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]))

print(unique_list)

Output:

[1, 2, 3, 4, 5, 6, 7, 8]

回答 20

不使用设置

data=[1, 2, 3, 1, 2, 5, 6, 7, 8]
uni_data=[]
for dat in data:
    if dat not in uni_data:
        uni_data.append(dat)

print(uni_data) 

Without using set

data=[1, 2, 3, 1, 2, 5, 6, 7, 8]
uni_data=[]
for dat in data:
    if dat not in uni_data:
        uni_data.append(dat)

print(uni_data) 

回答 21

您可以使用set删除重复项:

mylist = list(set(mylist))

但是请注意,结果将是无序的。如果这是一个问题:

mylist.sort()

You can use set to remove duplicates:

mylist = list(set(mylist))

But note the results will be unordered. If that’s an issue:

mylist.sort()

回答 22

还有一种更好的方法是

import pandas as pd

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanList = pd.Series(myList).drop_duplicates().tolist()
print(cleanList)

#> [1, 2, 3, 5, 6, 7, 8]

并且订单保持不变。

One more better approach could be,

import pandas as pd

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanList = pd.Series(myList).drop_duplicates().tolist()
print(cleanList)

#> [1, 2, 3, 5, 6, 7, 8]

and the order remains preserved.


回答 23

这个人关心订单的过程没有太多麻烦(OrderdDict等)。可能不是最Python的方式,也不是最短的方式,但是可以解决这个问题:

def remove_duplicates(list):
    ''' Removes duplicate items from a list '''
    singles_list = []
    for element in list:
        if element not in singles_list:
            singles_list.append(element)
    return singles_list

This one cares about the order without too much hassle (OrderdDict & others). Probably not the most Pythonic way, nor shortest way, but does the trick:

def remove_duplicates(list):
    ''' Removes duplicate items from a list '''
    singles_list = []
    for element in list:
        if element not in singles_list:
            singles_list.append(element)
    return singles_list

回答 24

下面的代码很容易删除列表中的重复项

def remove_duplicates(x):
    a = []
    for i in x:
        if i not in a:
            a.append(i)
    return a

print remove_duplicates([1,2,2,3,3,4])

它返回[1,2,3,4]

below code is simple for removing duplicate in list

def remove_duplicates(x):
    a = []
    for i in x:
        if i not in a:
            a.append(i)
    return a

print remove_duplicates([1,2,2,3,3,4])

it returns [1,2,3,4]


回答 25

这是最快的pythonic解决方案,适用于其他答复中列出的解决方案。

使用短路评估的实施细节可以使用列表理解,这足够快。visited.add(item)始终返回None结果,其结果为False,因此的右侧or始终是该表达式的结果。

自己计时

def deduplicate(sequence):
    visited = set()
    adder = visited.add  # get rid of qualification overhead
    out = [adder(item) or item for item in sequence if item not in visited]
    return out

Here’s the fastest pythonic solution comaring to others listed in replies.

Using implementation details of short-circuit evaluation allows to use list comprehension, which is fast enough. visited.add(item) always returns None as a result, which is evaluated as False, so the right-side of or would always be the result of such an expression.

Time it yourself

def deduplicate(sequence):
    visited = set()
    adder = visited.add  # get rid of qualification overhead
    out = [adder(item) or item for item in sequence if item not in visited]
    return out

回答 26

使用set

a = [0,1,2,3,4,3,3,4]
a = list(set(a))
print a

使用独特的

import numpy as np
a = [0,1,2,3,4,3,3,4]
a = np.unique(a).tolist()
print a

Using set :

a = [0,1,2,3,4,3,3,4]
a = list(set(a))
print a

Using unique :

import numpy as np
a = [0,1,2,3,4,3,3,4]
a = np.unique(a).tolist()
print a

回答 27

不幸。此处的大多数答案要么不保留顺序,要么太长。这是一个简单的订单保留答案。

s = [1,2,3,4,5,2,5,6,7,1,3,9,3,5]
x=[]

[x.append(i) for i in s if i not in x]
print(x)

这将为您x删除重复项,但保留顺序。

Unfortunately. Most answers here either do not preserve the order or are too long. Here is a simple, order preserving answer.

s = [1,2,3,4,5,2,5,6,7,1,3,9,3,5]
x=[]

[x.append(i) for i in s if i not in x]
print(x)

This will give you x with duplicates removed but preserving the order.


回答 28

Python 3中非常简单的方法:

>>> n = [1, 2, 3, 4, 1, 1]
>>> n
[1, 2, 3, 4, 1, 1]
>>> m = sorted(list(set(n)))
>>> m
[1, 2, 3, 4]

Very simple way in Python 3:

>>> n = [1, 2, 3, 4, 1, 1]
>>> n
[1, 2, 3, 4, 1, 1]
>>> m = sorted(list(set(n)))
>>> m
[1, 2, 3, 4]

回答 29

Python内置类型的魔力

在python中,仅通过python的内置类型,即可轻松处理此类复杂情况。

让我告诉你怎么做!

方法1:一般情况

删除列表中重复元素并仍然保持排序顺序的方式(1行代码

line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)

您将得到结果

[1, 2, 3, 5, 6, 7, 8]

方法2:特例

TypeError: unhashable type: 'list'

处理不可散列的特殊情况(3行代码

line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]

tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list

print (new_line)

您将得到结果:

[
  ['16.4966155686595', '-27.59776154691', '52.3786295521147'], 
  ['17.6508629295574', '-27.143305738671', '47.534955022564'], 
  ['18.8051102904552', '-26.688849930432', '42.6912804930134'], 
  ['19.5504702331098', '-26.205884452727', '37.7709192714727'], 
  ['20.2929416861422', '-25.722717575124', '32.8500163147157']
]

由于元组是可哈希的,因此您可以轻松地在列表和元组之间转换数据

The Magic of Python Built-in type

In python, it is very easy to process the complicated cases like this and only by python’s built-in type.

Let me show you how to do !

Method 1: General Case

The way (1 line code) to remove duplicated element in list and still keep sorting order

line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)

You will get the result

[1, 2, 3, 5, 6, 7, 8]

Method 2: Special Case

TypeError: unhashable type: 'list'

The special case to process unhashable (3 line codes)

line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]

tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list

print (new_line)

You will get the result :

[
  ['16.4966155686595', '-27.59776154691', '52.3786295521147'], 
  ['17.6508629295574', '-27.143305738671', '47.534955022564'], 
  ['18.8051102904552', '-26.688849930432', '42.6912804930134'], 
  ['19.5504702331098', '-26.205884452727', '37.7709192714727'], 
  ['20.2929416861422', '-25.722717575124', '32.8500163147157']
]

Because tuple is hashable and you can convert data between list and tuple easily


Leetcode-master 刷题攻略:200W道经典题目刷题顺序,共60w字的详细图解,视频难点剖析,50余张思维导图

一些闲话:

  1. 介绍:本项目是一套完整的刷题计划,旨在帮助大家少走弯路,循序渐进学算法,关注作者
  2. Pdf版本「代码随想录」算法精讲 PDF 版本那就是。
  3. 刷题顺序:自述文件已经将刷题顺序排好了,按照顺序一道一道刷就可以。
  4. 学习社区:一起学习打卡/面试技巧/如何选择Offer/大厂内推/职场规则/简历修改/技术分享/程序人生。欢迎加入「代码随想录」学习社区那就是。
  5. 提交代码:本项目统一使用C++语言进行讲解,但已经有JAVA、Python、Go、JavaScript等等多语言版本,感谢这里的每一位贡献者,如果你也想贡献代码点亮你的头像,点击这里了解提交代码的方式.
  6. 转载须知:以下所有文章皆为我(程序员Carl)的原创.引用本项目文章请注明出处,发现恶意抄袭或搬运,会动用法律武器维护自己的权益.让我们一起维护一个良好的技术创作环境!


LeetCode刷题攻略

刷题攻略的背景

很多刚开始刷题的同学都有一个困惑:面对leetcode上近两千道题目,从何刷起.

大家平时刷题感觉效率低,浪费的时间主要在三点:

  • 找题
  • 找到了不应该现阶段做的题
  • 没有全套的优质题解可以参考

其实我之前在知乎上回答过这个问题,回答内容大概是按照如下类型来刷数组->链表->哈希表->字符串->栈与队列->树->回溯->贪心->动态规划->图论->高级数据结构,再从简单刷起,做了几个类型题目之后,再慢慢做中等题目、困难题目.

但我能设身处地的感受到:即使有这样一个整体规划,对于一位初学者甚至算法老手寻找合适自己的题目也是很困难,时间成本很高,而且题目还不一定就是经典题目.

对于刷题,我们都是想用最短的时间按照循序渐进的难度顺序把经典题目都做一遍,这样效率才是最高的!

所以我整理了LeetCode刷题攻略:一个超级详细的刷题顺序,每道题目都是我精心筛选,都是经典题目高频面试题,大家只要按照这个顺序刷就可以了,你没看错,自述已经把题目顺序都排好了,文章顺序就是刷题顺序!挨个刷就可以,不用自己再去题海里选题了!

而且每道题目我都写了的详细题解(图文并茂,难点配有视频),力扣上我的题解都是排在对应题目的首页,质量是有目共睹的.

那么现在我把刷题顺序都整理出来,是为了帮助更多的学习算法的同学少走弯路!

如果你在刷LeetCode,强烈建议先按照本攻略刷题顺序来刷,刷完了你会发现对整个知识体系有一个质的飞跃,不用在题海茫然的寻找方向.

最新文章会首发在公众号“代码随想录”,扫码看看吧,你会发现相见恨晚!

如何使用该刷题攻略

电脑端还看不到留言,大家可以在公众号「代码随想录」,左下角有“刷题攻略”,这是手机版刷题攻略,看完就会发现有很多录友(代码随想录的朋友们)在文章下留言打卡,这份刷题顺序和题解已经陪伴了上万录友了,同时也说明文章的质量是经过上万人的考验!

欢迎每一位学习算法的小伙伴加入到这个学习阵营来!

目前已经更新了,数组->链表->哈希表->字符串->栈与队列->树->回溯->贪心,八个专题了,正在讲解动态规划!

在刷题攻略中,每个专题开始都有理论基础篇,并不像是教科书般的理论介绍,而是从实战中归纳需要的基础知识.每个专题结束都有总结篇,最这个专题的归纳总结.

如果你是算法老手,这篇攻略也是复习的最佳资料,如果把每个系列对应的总结篇,快速过一遍,整个算法知识体系以及各种解法就重现脑海了.

目前“代码随想录”刷题攻略更新了:200多篇文章,精讲了200道经典算法题目,共60w字的详细图解,部分难点题目还搭配了20分钟左右的视频讲解那就是。

这里每一篇题解,都是精品,值得仔细琢磨那就是。

我在题目讲解中统一用C++语言,但你会发现下面几乎每篇题解都配有其他语言版本,JAVA、Python、go、javascript等等,这正是热心小伙们的贡献的代码,当然我也会严格把控代码质量。

所以也欢迎大家参与进来,完善题解的各个语言版本,拥抱开源,让更多小伙伴们收益那就是。

准备好了么,刷题攻略开始咯,快走快走!


前序

(持续更新中.)

知识星球精选

  1. 选择方向的时候,我也迷茫了
  2. 刷题就用库函数了,怎么了?
  3. 关于实习,大家可能有点迷茫!
  4. 马上秋招了,慌得很!
  5. Carl看了上百份简历,总结了这些!
  6. 面试中遇到了发散性问题…..
  7. 英语到底重不重要!
  8. 计算机专业要不要读研!
  9. 秋招和提前批都越来越提前了….
  10. 你的简历里「专业技能」写的够专业么?

数组

  1. 数组过于简单,但你该了解这些!
  2. 数组:每次遇到二分法,都是一看就会,一写就废
  3. 数组:就移除个元素很难么?
  4. 数组:有序数组的平方,还有序么?
  5. 数组:滑动窗口拯救了你
  6. 数组:这个循环可以转懵很多人!
  7. 数组:总结篇

链表

  1. 关于链表,你该了解这些!
  2. 链表:听说用虚拟头节点会方便很多?
  3. 链表:一道题目考察了常见的五个操作!
  4. 链表:听说过两天反转链表又写不出来了?
  5. 链表:两两交换链表中的节点
  6. 链表:删除链表的倒数第 N 个结点
  7. 链表:链表相交
  8. 链表:环找到了,那入口呢?
  9. 链表:总结篇!

哈希表

  1. 关于哈希表,你该了解这些!
  2. 哈希表:可以拿数组当哈希表来用,但哈希值不要太大
  3. 哈希表:哈希值太大了,还是得用set
  4. 哈希表:用set来判断快乐数
  5. 哈希表:map等候多时了
  6. 哈希表:其实需要哈希的地方都能找到map的身影
  7. 哈希表:这道题目我做过?
  8. 哈希表:解决了两数之和,那么能解决三数之和么?
  9. 双指针法:一样的道理,能解决四数之和
  10. 哈希表:总结篇!(每逢总结必经典)

字符串

  1. 字符串:这道题目,使用库函数一行代码搞定
  2. 字符串:简单的反转还不够!
  3. 字符串:替换空格
  4. 字符串:花式反转还不够!
  5. 字符串:反转个字符串还有这个用处?
  6. 帮你把KMP算法学个通透
  7. 字符串:KMP算法还能干这个!
  8. 字符串:总结篇!

双指针法

双指针法基本都是应用在数组,字符串与链表的题目上

  1. 数组:就移除个元素很难么?
  2. 字符串:这道题目,使用库函数一行代码搞定
  3. 字符串:替换空格
  4. 字符串:花式反转还不够!
  5. 链表:听说过两天反转链表又写不出来了?
  6. 链表:删除链表的倒数第 N 个结点
  7. 链表:链表相交
  8. 链表:环找到了,那入口呢?
  9. 哈希表:解决了两数之和,那么能解决三数之和么?
  10. 双指针法:一样的道理,能解决四数之和
  11. 双指针法:总结篇!

栈与队列

  1. 栈与队列:来看看栈和队列不为人知的一面
  2. 栈与队列:我用栈来实现队列怎么样?
  3. 栈与队列:用队列实现栈还有点别扭
  4. 栈与队列:系统中处处都是栈的应用
  5. 栈与队列:匹配问题都是栈的强项
  6. 栈与队列:有没有想过计算机是如何处理表达式的?
  7. 栈与队列:滑动窗口里求最大值引出一个重要数据结构
  8. 栈与队列:求前 K 个高频元素和队列有啥关系?
  9. 栈与队列:总结篇!

二叉树

题目分类大纲如下:
二叉树大纲

  1. 关于二叉树,你该了解这些!
  2. 二叉树:一入递归深似海,从此offer是路人
  3. 二叉树:听说递归能做的,栈也能做!
  4. 二叉树:前中后序迭代方式的写法就不能统一一下么?
  5. 二叉树:层序遍历登场!
  6. 二叉树:你真的会翻转二叉树么?
  7. 本周小结!(二叉树)
  8. 二叉树:我对称么?
  9. 二叉树:看看这些树的最大深度
  10. 二叉树:看看这些树的最小深度
  11. 二叉树:我有多少个节点?
  12. 二叉树:我平衡么?
  13. 二叉树:找我的所有路径?
  14. 本周总结!二叉树系列二
  15. 二叉树:以为使用了递归,其实还隐藏着回溯
  16. 二叉树:做了这么多题目了,我的左叶子之和是多少?
  17. 二叉树:我的左下角的值是多少?
  18. 二叉树:递归函数究竟什么时候需要返回值,什么时候不要返回值?
  19. 二叉树:构造二叉树登场!
  20. 二叉树:构造一棵最大的二叉树
  21. 本周小结!(二叉树系列三)
  22. 二叉树:合并两个二叉树
  23. 二叉树:二叉搜索树登场!
  24. 二叉树:我是不是一棵二叉搜索树
  25. 二叉树:搜索树的最小绝对差
  26. 二叉树:我的众数是多少?
  27. 二叉树:公共祖先问题
  28. 本周小结!(二叉树系列四)
  29. 二叉树:搜索树的公共祖先问题
  30. 二叉树:搜索树中的插入操作
  31. 二叉树:搜索树中的删除操作
  32. 二叉树:修剪一棵搜索树
  33. 二叉树:构造一棵搜索树
  34. 二叉树:搜索树转成累加树
  35. 二叉树:总结篇!(需要掌握的二叉树技能都在这里了)

回溯算法

题目分类大纲如下:

回溯算法大纲

  1. 关于回溯算法,你该了解这些!
  2. 回溯算法:组合问题
  3. 回溯算法:组合问题再剪剪枝
  4. 回溯算法:求组合总和!
  5. 回溯算法:电话号码的字母组合
  6. 本周小结!(回溯算法系列一)
  7. 回溯算法:求组合总和(二)
  8. 回溯算法:求组合总和(三)
  9. 回溯算法:分割回文串
  10. 回溯算法:复原IP地址
  11. 回溯算法:求子集问题!
  12. 本周小结!(回溯算法系列二)
  13. 回溯算法:求子集问题(二)
  14. 回溯算法:递增子序列
  15. 回溯算法:排列问题!
  16. 回溯算法:排列问题(二)
  17. 本周小结!(回溯算法系列三)
  18. 回溯算法去重问题的另一种写法
  19. 回溯算法:重新安排行程
  20. 回溯算法:N皇后问题
  21. 回溯算法:解数独
  22. 一篇总结带你彻底搞透回溯算法!

贪心算法

题目分类大纲如下:

贪心算法大纲

  1. 关于贪心算法,你该了解这些!
  2. 贪心算法:分发饼干
  3. 贪心算法:摆动序列
  4. 贪心算法:最大子序和
  5. 本周小结!(贪心算法系列一)
  6. 贪心算法:买卖股票的最佳时机II
  7. 贪心算法:跳跃游戏
  8. 贪心算法:跳跃游戏II
  9. 贪心算法:K次取反后最大化的数组和
  10. 本周小结!(贪心算法系列二)
  11. 贪心算法:加油站
  12. 贪心算法:分发糖果
  13. 贪心算法:柠檬水找零
  14. 贪心算法:根据身高重建队列
  15. 本周小结!(贪心算法系列三)
  16. 贪心算法:根据身高重建队列(续集)
  17. 贪心算法:用最少数量的箭引爆气球
  18. 贪心算法:无重叠区间
  19. 贪心算法:划分字母区间
  20. 贪心算法:合并区间
  21. 本周小结!(贪心算法系列四)
  22. 贪心算法:单调递增的数字
  23. 贪心算法:买卖股票的最佳时机含手续费
  24. 贪心算法:我要监控二叉树!
  25. 贪心算法:总结篇!(每逢总结必经典)

动态规划

动态规划专题已经开始啦,来不及解释了,小伙伴们上车别掉队!

  1. 关于动态规划,你该了解这些!
  2. 动态规划:斐波那契数
  3. 动态规划:爬楼梯
  4. 动态规划:使用最小花费爬楼梯
  5. 本周小结!(动态规划系列一)
  6. 动态规划:不同路径
  7. 动态规划:不同路径还不够,要有障碍!
  8. 动态规划:整数拆分,你要怎么拆?
  9. 动态规划:不同的二叉搜索树
  10. 本周小结!(动态规划系列二)

背包问题系列:

背包问题大纲

  1. 动态规划:关于01背包问题,你该了解这些!
  2. 动态规划:关于01背包问题,你该了解这些!(滚动数组)
  3. 动态规划:分割等和子集可以用01背包!
  4. 动态规划:最后一块石头的重量 II
  5. 本周小结!(动态规划系列三)
  6. 动态规划:目标和!
  7. 动态规划:一和零!
  8. 动态规划:关于完全背包,你该了解这些!
  9. 动态规划:给你一些零钱,你要怎么凑?
  10. 本周小结!(动态规划系列四)
  11. 动态规划:Carl称它为排列总和!
  12. 动态规划:以前我没得选,现在我选择再爬一次!
  13. 动态规划: 给我个机会,我再兑换一次零钱
  14. 动态规划:一样的套路,再求一次完全平方数
  15. 本周小结!(动态规划系列五)
  16. 动态规划:单词拆分
  17. 动态规划:关于多重背包,你该了解这些!
  18. 听说背包问题很难? 这篇总结篇来拯救你了

打家劫舍系列:

  1. 动态规划:开始打家劫舍!
  2. 动态规划:继续打家劫舍!
  3. 动态规划:还要打家劫舍!

股票系列:

股票问题总结

  1. 动态规划:买卖股票的最佳时机
  2. 动态规划:本周我们都讲了这些(系列六)
  3. 动态规划:买卖股票的最佳时机II
  4. 动态规划:买卖股票的最佳时机III
  5. 动态规划:买卖股票的最佳时机IV
  6. 动态规划:最佳买卖股票时机含冷冻期
  7. 动态规划:本周我们都讲了这些(系列七)
  8. 动态规划:买卖股票的最佳时机含手续费
  9. 动态规划:股票系列总结篇

子序列系列:

  1. 动态规划:最长递增子序列
  2. 动态规划:最长连续递增序列
  3. 动态规划:最长重复子数组
  4. 动态规划:最长公共子序列
  5. 动态规划:不相交的线
  6. 动态规划:最大子序和
  7. 动态规划:判断子序列
  8. 动态规划:不同的子序列
  9. 动态规划:两个字符串的删除操作
  10. 动态规划:编辑距离
  11. 为了绝杀编辑距离,Carl做了三步铺垫,你都知道么?
  12. 动态规划:回文子串
  13. 动态规划:最长回文子序列
  14. 动态规划总结篇

(持续更新中.)

单调栈

  1. 单调栈:每日温度
  2. 单调栈:下一个更大元素I

图论

十大排序

数论

高级数据结构经典题目

  • 并查集
  • 最小生成树
  • 线段树
  • 树状数组
  • 字典树

海量数据处理

算法模板

各类基础算法模板

B站算法视频讲解

以下为B站「代码随想录」算法讲解视频:

(持续更新中.)

贡献者

你可以点此链接查看LeetCode-大师的所有贡献者。感谢你们补充了LeetCode-大师的其他语言版本,让更多的读者收益于此项目。

关于作者

大家好,我是程序员carl,哈工大师兄,acm校赛、黑龙江省赛、东北四省赛金牌、亚洲区域赛铜牌获得者,先后在腾讯和百度从事后端技术研发,csdn博客专家。对算法和C++后端技术有一定的见解,利用工作之余重新刷leetcode。

加入刷题微信群,备注:“个人简单介绍”+组队刷题

(也欢迎与我交流,备注:“个人简单介绍”+交流,围观朋友圈,做点赞之交(备注没有自我介绍不通过哦)


公众号

更多精彩文章持续更新,微信搜索:“代码随想录”第一时间围观,关注后回复:“666PDF”可以获得所有算法专题原创。

“代码随想录”每天准时为你推送一篇经典面试题目,帮你梳理算法知识体系,轻松学习算法!、并且公众号里有大量学习资源,也有我自己的学习心得和方法总结,更有上万录友们在这里打卡学习.

来看看就知道了,你会发现相见恨晚!

Homemade-machine-learning 流行机器学习算法示例,并解释了交互式🤖演示和数学

有监督的学习

在有监督的学习中,我们有一组训练数据作为输入,有一组标签或每个训练集的“正确答案”作为输出。然后,我们正在训练我们的模型(机器学习算法参数),以正确地将输入映射到输出(以进行正确的预测)。最终目的是找到能够成功继续正确运行的模型参数输入→输出映射(预测),即使对于新的输入示例也是如此

回归

在回归问题中,我们做实值预测。基本上,我们尝试沿着训练示例绘制一条直线/平面/n维平面

使用示例:股价预测、销售分析、任意数字依赖等

🤖线性回归

分类

在分类问题中,我们按一定的特征划分输入样本

使用示例:垃圾邮件过滤器、语言检测、查找相似文档、手写字母识别等

🤖Logistic回归

无监督学习

无监督学习是机器学习的一个分支,它从没有标记、分类或分类的测试数据中学习。无监督学习不是响应反馈,而是识别数据中的共性,并根据每个新数据中是否存在这些共性来做出反应

群集

在聚类问题中,我们根据未知特征对训练样本进行拆分。算法本身决定使用什么特征进行分割

使用示例:市场细分、社交网络分析、组织计算集群、天文数据分析、图像压缩等

🤖K-均值算法

异常检测

异常检测(也称为离群值检测)是通过与大多数数据显著不同来识别引起怀疑的稀有项目、事件或观测

使用示例:入侵检测、欺诈检测、系统健康监控、从数据集中删除异常数据等

🤖基于高斯分布的异常检测

神经网络(NN)

神经网络本身不是一种算法,而是许多不同的机器学习算法协同工作并处理复杂数据输入的框架

使用示例:作为替身所有其他算法的总称,图像识别、语音识别、图像处理(应用特定风格)、语言翻译等

🤖多层感知器(MLP)

机器学习地图

Machine Learning Map

以下机器学习主题地图的来源是this wonderful blog post

必备条件

安装Python

确保你有Python installed在您的机器上

您可能想要使用venv创建虚拟环境的标准Python库,pip以及从本地项目目录安装和提供的所有从属软件包,以避免扰乱系统范围的软件包及其版本

安装依赖项

通过运行以下命令安装项目所需的所有依赖项:

pip install -r requirements.txt

在当地发射木星

项目中的所有演示都可以直接在您的浏览器中运行,而无需在本地安装Jupyter。但如果你想发射Jupyter Notebook在本地,您可以通过从项目的根文件夹运行以下命令来完成此操作:

jupyter notebook

在此之后,Jupyter笔记本将可以通过以下方式访问http://localhost:8888

远程发射木星

每个算法部分都包含指向以下内容的演示链接Jupyter NBViewer这是Jupyter笔记本的快速在线预览器,您可以在浏览器中直接看到演示代码、图表和数据,而无需在本地安装任何东西。如果你想的话变化代码和实验使用演示笔记本时,您需要在中启动笔记本Binder您只需单击“在活页夹上执行”NBViewer右上角的链接

数据集

可在以下位置找到用于Jupyter笔记本演示的数据集列表data folder

支持该项目

您可以通过以下方式支持此项目❤️️GitHub或❤️️Patreon

Algorithms-Python中数据结构和算法的最小示例

Python数据结构和算法

Python3中数据结构和算法的最小且干净的示例实现

贡献

感谢您对投稿的兴趣!有很多方式可以为这个项目做出贡献。Get started here

测试

使用单元测试

要运行所有测试,请写下:

$ python3 -m unittest discover tests

要运行某些特定测试,您可以执行以下操作(例如:排序):

$ python3 -m unittest tests.test_sort

使用pytest

要运行所有测试,请写下:

$ python3 -m pytest tests

安装

如果您想在代码中使用API算法,只需如下所示:

$ pip3 install algorithms

您可以通过创建一个python文件进行测试:(例如:USEmerge_sort在……里面sort)

from algorithms.sort import merge_sort

if __name__ == "__main__":
    my_list = [1, 8, 3, 5, 6]
    my_list = merge_sort(my_list)
    print(my_list)

卸载

如果要卸载算法,只需执行以下操作:

$ pip3 uninstall -y algorithms

实现列表

贡献者

感谢all the contributors帮助建立回购的人

Interactive-coding-challenges-120+交互式Python编码面试挑战(算法和数据结构)

交互式编码-挑战

Binder

120多项持续更新、交互式和测试驱动的编码挑战,具有Anki flashcards

挑战集中在算法数据结构在以下位置找到编码面试

每个挑战都有一个或多个参考解决方案,它们是:

  • 全功能
  • 单元测试
  • 通俗易懂

Challenges将很快提供按需服务incremental hints以帮助您获得最佳解决方案

笔记本还详细介绍了:

  • 约束条件
  • 测试用例
  • 算法
  • 大O时空复杂性

还包括单元测试的参考实现各式各样的data structuresalgorithms

挑战解决方案

Anki抽认卡:编码与设计

提供的Anki flashcard deck使用间隔重复来帮助您记住关键概念

非常适合在旅途中使用

设计资源:系统设计入门

寻找资源来帮助您准备系统设计面向对象的设计访谈

查看姊妹回购The System Design Primer,其中包含其他Anki甲板:

笔记本结构

每个挑战有两个笔记本,一个挑战笔记本有要解决的单元测试和解决方案笔记本以供参考

问题陈述

  • 陈述要解决的问题

约束条件

  • 描述任何约束或假设。

测试用例

  • 描述将在单元测试中评估的常规测试用例和边缘测试用例。

算法

  • [挑战笔记本]空,如果需要提示,请参阅解决方案笔记本算法部分
  • [解决方案笔记本]一个或多个算法方案讨论,时间和空间复杂度大O

提示

  • [挑战笔记本]提供按需服务incremental hints来帮助你找到最优的解决方案。马上就来!

代码(挑战:实现我!)

  • [挑战笔记本]供您实现的骨架代码
  • [解决方案笔记本]一个或多个参考解决方案

单元测试

  • [挑战笔记本]代码的单元测试。预计在你解决挑战之前都会失败
  • [解决方案笔记本]参考解决方案的单元测试

索引

挑战类别

格式化:挑战类别-挑战数量

挑战总数:120项

参考实现:数据结构

以下数据结构经过单元测试、功能齐全的实现:

参考实现:算法

以下算法经过单元测试、功能齐全的实现:

参考实现:TODO

安装和运行挑战

杂项

挑战

Image Credits

数组和字符串

Binder

挑战 静电笔记本
确定字符串是否包含唯一字符 ChallengeSolution
确定一个字符串是否为另一个字符串的排列 ChallengeSolution
确定一个字符串是否为另一个字符串的旋转 ChallengeSolution
压缩字符串 ChallengeSolution
颠倒字符串中的字符 ChallengeSolution
给定两个字符串,找到单个不同的字符 ChallengeSolution
查找两个总和为特定值的索引 ChallengeSolution
实现哈希表 ChallengeSolution
实施冒泡嗡嗡声 ChallengeSolution
查找字符串中的第一个非重复字符 ContributeContribute
删除字符串中的指定字符 ContributeContribute
颠倒字符串中的单词 ContributeContribute
将字符串转换为整数 ContributeContribute
将整数转换为字符串 ContributeContribute
添加质询 ContributeContribute

链表

Binder

挑战 静电笔记本
从链接列表中删除重复项 ChallengeSolution
查找链表最后一个元素的第k个元素 ChallengeSolution
删除链表中间的节点 ChallengeSolution
围绕给定值对链表进行分区 ChallengeSolution
将数字存储在链表中的两个数字相加 ChallengeSolution
查找链接列表循环的起点 ChallengeSolution
确定链表是否为回文 ChallengeSolution
实现链表 ChallengeSolution
确定列表是循环的还是非循环的 ContributeContribute
添加质询 ContributeContribute

堆栈和队列

Binder

挑战 静电笔记本
使用单个数组实施n个堆栈 ChallengeSolution
实现跟踪其最小元素的堆栈 ChallengeSolution
实现包装容量受限堆栈列表的一组Stacks类 ChallengeSolution
使用两个堆栈实现一个队列 ChallengeSolution
使用另一个堆栈作为缓冲区对堆栈进行排序 ChallengeSolution
实现堆栈 ChallengeSolution
实现一个队列 ChallengeSolution
实现由数组支持的优先级队列 ChallengeSolution
添加质询 ContributeContribute

图形和树

Binder

挑战 静电笔记本
在树上实现深度优先搜索(前、中、后顺序) ChallengeSolution
实现树的广度优先搜索 ChallengeSolution
确定树的高度 ChallengeSolution
从排序数组创建高度最小的二叉搜索树 ChallengeSolution
为二叉树的每个级别创建链表 ChallengeSolution
检查二叉树是否平衡 ChallengeSolution
确定树是否为有效的二叉搜索树 ChallengeSolution
在二叉搜索树中查找给定节点的有序后继节点 ChallengeSolution
查找二叉树中的第二大节点 ChallengeSolution
找到最底层的共同祖先 ChallengeSolution
反转二叉树 ChallengeSolution
实现二叉搜索树 ChallengeSolution
实现最小堆 ChallengeSolution
实现Trie ChallengeSolution
在图上实现深度优先搜索 ChallengeSolution
实现图的广度优先搜索 ChallengeSolution
确定图中的两个节点之间是否存在路径 ChallengeSolution
实现图形 ChallengeSolution
在给定项目和依赖项列表的情况下查找构建顺序 ChallengeSolution
在加权图中查找最短路径 ChallengeSolution
在未加权图中查找最短路径 ChallengeSolution
添加质询 ContributeContribute

排序

Binder

挑战 静电笔记本
实现选择排序 ChallengeSolution
实现插入排序 ChallengeSolution
实现快速排序 ChallengeSolution
实现合并排序 ChallengeSolution
实现基数排序 ChallengeSolution
对字符串数组进行排序,以便所有字谜都相邻 ChallengeSolution
在排序的旋转数组中查找项 ChallengeSolution
在排序矩阵中搜索项目 ChallengeSolution
在n个整数的输入中查找不是的整数 ChallengeSolution
给定排序数组A、B,按排序顺序将B合并到A中 ChallengeSolution
实现稳定选择排序 ContributeContribute
使不稳定排序稳定 ContributeContribute
实现高效的就地版本的快速排序 ContributeContribute
给定两个排序的数组,按排序顺序将一个合并到另一个 ContributeContribute
在经过旋转和排序的整数数组中查找元素 ContributeContribute
添加质询 ContributeContribute

递归与动态规划

Binder

挑战 静电笔记本
递归、动态和迭代地实现Fibonacci ChallengeSolution
最大化放置在背包中的物品 ChallengeSolution
最大化放置在背包中的无界项目 ChallengeSolution
查找最长的公共子序列 ChallengeSolution
找出最长的递增子序列 ChallengeSolution
最小化矩阵乘法的成本 ChallengeSolution
在给定k个交易的情况下最大化股票价格 ChallengeSolution
在给定一组硬币的情况下,找出表示n美分的最小方法数 ChallengeSolution
在给定一组硬币的情况下,找出表示n美分的唯一数量的方法 ChallengeSolution
打印n对括号的所有有效组合 ChallengeSolution
在迷宫中导航 ChallengeSolution
打印集合的所有子集 ChallengeSolution
打印字符串的所有排列 ChallengeSolution
在数组中查找魔术索引 ChallengeSolution
找出运行n个步骤的方式的数量 ChallengeSolution
用3个塔和N个圆盘实现河内之塔 ChallengeSolution
递归、动态和迭代地实现阶乘 ContributeContribute
对已排序的整数数组执行二进制搜索 ContributeContribute
打印字符串的所有组合 ContributeContribute
实现绘画填充功能 ContributeContribute
给出1、5、10、25美分硬币,找出代表n美分的所有排列 ContributeContribute
添加质询 ContributeContribute

数学与概率

Binder

挑战 静电笔记本
生成素数列表 ChallengeSolution
找到数字根 ChallengeSolution
在O(1)中创建支持插入、最大、最小、平均、模式的类 ChallengeSolution
确定一个数字是否为2的幂 ChallengeSolution
将不带+或-号的两个数字相加 ChallengeSolution
减去两个不带+或-号的数字 ChallengeSolution
检查数字是否为质数 ContributeContribute
确定笛卡尔平面上的两条直线是否相交 ContributeContribute
仅使用整数的加法、减法和除法实现乘法、减法和除法 ContributeContribute
找出第k个数,使其唯一的素因数为3、5和7 ContributeContribute
添加质询 ContributeContribute

位操作

Binder

挑战 静电笔记本
实现常见的位操作操作 ChallengeSolution
确定要翻转以将a转换为b的位数 ChallengeSolution
在屏幕上画一条线 ChallengeSolution
翻转一位以最大化最长的1序列 ChallengeSolution
获取下一个最大和下一个最小的数字 ChallengeSolution
合并两个二进制数 ChallengeSolution
交换整数中的奇数位和偶数位 ChallengeSolution
打印介于0和1之间的数字的二进制表示法 ChallengeSolution
确定给定整数的二进制表示中的1的个数 ContributeContribute
添加质询 ContributeContribute

网上评委

Binder

挑战 静电笔记本
查找最多包含k个不同字符的最长子字符串 ChallengeSolution
找出三个数字的最高乘积 ChallengeSolution
通过1次买入和1次卖出最大化股票利润 ChallengeSolution
将列表中的所有零移动到末尾 ChallengeSolution
找出所有其他整数的乘积 ChallengeSolution
给出进场和出场的列表,找出最繁忙的时段 ChallengeSolution
确定岛屿的周长 ChallengeSolution
格式化许可证密钥 ChallengeSolution
查找最长的绝对文件路径 ChallengeSolution
合并元组范围 ChallengeSolution
分配Cookie ChallengeSolution
确定您是否可以在NIM中获胜 ChallengeSolution
检查一本杂志是否可能被用来制作赎金字条 ChallengeSolution
找出一个句子可以在屏幕上显示的次数 ChallengeSolution
乌托邦之树 ChallengeSolution
最大化异或 ChallengeSolution
添加质询 ContributeContribute

回购结构

interactive-coding-challenges        # Repo
├─ arrays_strings                    # Category of challenges
│  ├─ rotation                       # Challenge folder
│  │  ├─ rotation_challenge.ipynb    # Challenge notebook
│  │  ├─ rotation_solution.ipynb     # Solution notebook
│  │  ├─ test_rotation.py            # Unit test*
│  ├─ compress
│  │  ├─ compress_challenge.ipynb
│  │  ├─ compress_solution.ipynb
│  │  ├─ test_compress.py
│  ├─ ...
├─ linked_lists
│  ├─ palindrome
│  │  └─ ...
│  ├─ ...
├─ ...

*笔记本(.ipynb)读/写关联的单元测试(.py)文件

笔记本安装

零安装

Binder

本自述包含指向以下内容的链接Binder,哪些主机动态笔记本不需要安装的在线回购内容

木星笔记本

运行:

pip install jupyter

有关更优化地设置开发环境的详细说明、脚本和工具,请参阅dev-setup回购

有关笔记本安装的更多详细信息,请按照说明操作here

有关IPython/Jupyter笔记本的更多信息,请访问here

跑步挑战

笔记本电脑

挑战以以下形式提供IPython/Jupyter笔记本电脑一直以来都是使用Python 2.7和Python 3.x进行测试

如果您需要安装IPython/Jupyter笔记本,请参阅Notebook Installation部分

  • 本自述包含指向以下内容的链接nbviewer,哪些主机静电笔记本回购的内容
  • 中的元素进行交互或修改动态笔记本,请参阅以下说明

运行挑战笔记本:

$ git clone https://github.com/donnemartin/interactive-coding-challenges.git
$ cd interactive-coding-challenges
$ jupyter notebook

这将启动包含质询类别列表的Web浏览器:

  • 导航到挑战笔记本你想要解决
  • 运行挑战笔记本中的单元格(单元格->全部运行)
    • 这将导致预期的单元测试错误
  • 解决挑战并验证它是否通过了单元测试
  • 查看随附的解决方案笔记本以供进一步讨论

调试您的PDB解决方案,请参阅以下内容ticket

注意:如果您的解决方案与解决方案笔记本中列出的解决方案不同,请考虑提交Pull请求,以便其他人可以从您的工作中受益。回顾Contributing Guidelines有关详细信息,请参阅

未来发展方向

挑战、解决方案和单元测试以IPython/Jupyter笔记本电脑

  • 笔记本目前主要包含Python解决方案(在Python2.7和Python3.x上都进行了测试),但可以进行扩展以包括40+ supported languages
  • 回购将是不断更新新的解决方案和挑战
  • Contributions欢迎光临!

贡献

欢迎投稿!

回顾Contributing Guidelines有关如何执行以下操作的详细信息,请执行以下操作:

  • 提交问题
  • 为现有挑战添加解决方案
  • 添加新的挑战

学分

资源

图像

联系信息

请随时与我联系,讨论任何问题、问题或评论

我的联系信息可以在我的GitHub page

许可证

我在开放源码许可下向您提供此存储库中的代码和资源。因为这是我的个人存储库,您获得的我的代码和资源的许可证来自我,而不是我的雇主(Facebook)

Copyright 2015 Donne Martin

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Leetcode题解,记录自己的LeetCode解题之路

:octocat:仓库介绍

leetcode题解,记录自己的leetcode解题之路。

本仓库目前分为五个部分:

  • 第一个部分是leetcode经典题目的解析,包括思路,关键点和具体的代码实现.
  • 第二部分是对于数据结构与算法的总结
  • 第三部分是anki卡片,将leetcode题目按照一定的方式记录在anki中,方便大家记忆.
  • 第四部分是每日一题,每日一题是在交流群(包括微信和QQ)里进行的一种活动,大家一起解一道题,这样讨论问题更加集中,会得到更多的反馈。而且这些题目可以被记录下来,日后会进行筛选添加到仓库的题解模块.
  • 第五部分是计划,这里会记录将来要加入到以上三个部分内容

🍖仓库食用指南

  • 这里有一张互联网公司面试中经常考察的问题类型总结的思维导图,我们可以结合图片中的信息分析一下.

leetcode-zhihu

(图片来自LeetCode)

其中算法,主要是以下几种:

  • 基础技巧:分治、二分、贪心
  • 排序算法:快速排序、归并排序、计数排序
  • 搜索算法:回溯、递归、深度优先遍历,广度优先遍历,二叉搜索树等
  • 图论:最短路径、最小生成树
  • 动态规划:背包问题、最长子序列

数据结构,主要有如下几种:

  • 数组与链表:单/双向链表
  • 栈与队列
  • 哈希表
  • 堆:最大堆/最小堆
  • 树与图:最近公共祖先、并查集
  • 字符串:前缀树(字典树)/后缀树

数据结构与算法的总结

精选题解

💻插件

或许是一个可以改变你刷题效率的浏览器扩展插件.

插件地址:https://chrome.google.com/webstore/detail/leetcode-cheatsheet/fniccleejlofifaakbgppmbbcdfjonle?hl=en-US。

不能访问谷歌商店的朋友可以去我的公众号回复插件获取离线版.强烈推荐大家使用谷歌商店安装,这样如果有更新可以自动安装,毕竟咱们的插件更新还是蛮快的.

❗怎么刷LeetCode?

LEETCODE经典题目的解析(200多道)

这里仅列举具有代表性题目,并不是全部题目

目前更新了200多道题解,加上专题涉及的题目,差不多有300道那就是。

简单难度题目合集

这里的题目难度比较小,大多是模拟题,或者是很容易看出解法的题目,另外简单题目一般使用暴力法都是可以解决的.这个时候只有看一下数据范围,思考下你的算法复杂度就行了.

当然也不排除很多Hard题目也可以暴力模拟,大家平时多注意数据范围即可。

以下是我列举的经典题目(带91字样的表示出自91天学算法(活动):

中等难度题目合集

中等题目是力扣比例最大的部分,因此这部分我的题解也是最多的.大家不要太过追求难题,先把中等难度题目做熟了再说.

这部分的题目要不需要我们挖掘题目的内含信息,将其抽象成简单题目.要么是一些写起来比较麻烦的题目,一些人编码能力不行就挂了.因此大家一定要自己做,即使看了题解“会了”,也要自己码一遍.自己不亲自写一遍,里面的细节永远不知道.

以下是我列举的经典题目(带91字样的表示出自91天学算法(活动):

困难难度题目合集

困难难度题目从类型上说多是:

  • 设计题
  • 游戏场景题目
  • 中等题目的跟进

从解法上来说,多是:

  • 图算法
  • 动态规划
  • 二分法
  • DFS和BFS
  • 状态压缩
  • 剪枝

从逻辑上说,要么就是非常难想到,要么就是非常难写代码.这里我总结了几个技巧:

  1. 看题目的数据范围,看能否暴力模拟
  2. 暴力枚举所有可能的算法往上套,比如图的题目.
  3. 总结和记忆解题模板,减少解题压力

以下是我列举的经典题目(带91字样的表示出自91天学算法(活动):

🔱*Anki卡片

我是Anki主要分为两个部分:一部分是关键点到题目的映射,另一部分是题目到思路,关键点,代码的映射。

全部卡片都在anki-card

使用方法:

Anki-文件-导入-下拉格式选择“打包的anki集合”,然后选中你下载好的文件,确定即可.

更多关于Anki使用方法的请查看anki 官网

(目前已更新卡片一览(仅列举正面):

  • 二分法解决问题的关键点是什么,相关问题有哪些?
  • 如何用栈的特点来简化操作,涉及到的题目有哪些?
  • 双指针问题的思路以及相关题目有哪些?
  • 滑动窗口问题的思路以及相关题目有哪些?
  • 回溯法解题的思路以及相关题目有哪些?
  • 数论解决问题的关键点是什么,相关问题有哪些?
  • 位运算解决问题的关键点是什么,相关问题有哪些?

已加入的题目有:#2#3#11

📈大事件

  • 2021-02-23:STAR破四万

💝贡献

  • 如果有想法和创意,请提issue或者进群提
  • 如果想贡献增加题解或者翻译,可以参考贡献指南

    关于如何提交题解,我写了一份指南

  • 如果需要修改项目中图片这里存放了项目中绘制图的源代码,大家可以用draw.io打开进行编辑.

💌鸣谢

感谢为这个项目作出贡献的所有小伙伴

许可证

CC BY-NC-ND 4.0

Python-在Python中实现的所有算法

在Python中实现的所有算法(用于教育)

这些实现仅用于学习目的。因此,它们的效率可能低于Python标准库中的实现

地址:https://github.com/TheAlgorithms/Python

投稿指南

在您投稿之前,请阅读我们的投稿指南

社区频道

我们在吉特路上!请加入我们

算法列表

请参阅我们的目录