问题:表示并解决给定图像的迷宫

代表并解决给定图像的迷宫的最佳方法是什么?

范围问题134的封面图片

给定JPEG图像(如上所示),读入,将其解析为某种数据结构并解决迷宫的最佳方法是什么?我的第一个本能是逐像素读取图像并将其存储在布尔值列表(数组)中:True对于白色像素,False对于非白色像素(可以丢弃颜色)。这种方法的问题在于图像可能不是“像素完美”的。我的意思只是说,如果墙壁上的某处有白色像素,可能会产生意外的路径。

另一种方法(经过一番思考后才想到)是将图像转换为SVG文件-SVG文件是在画布上绘制的路径的列表。这样,可以将路径读入相同种类的列表(布尔值),其中True表示路径或墙壁,False表示可移动的空间。如果转换不是100%准确,并且不能完全连接所有墙,从而产生间隙,则此方法会出现问题。

转换为SVG的另一个问题是这些线不是“完美”的直线。这导致路径是三次贝塞尔曲线。使用由整数索引的布尔值列表(数组),曲线将不易转移,并且必须计算曲线上直线的所有点,但不会与列表索引完全匹配。

我假设虽然其中一种方法可能会(虽然可能不会)起作用,但考虑到如此大的图像,它们的效率很低,并且存在更好的方法。如何做到最好(最有效和/或最低复杂度)?有没有最好的方法?

然后是迷宫的解决。如果我使用前两种方法中的任何一种,则基本上将得到一个矩阵。根据该答案,表示迷宫的一种好方法是使用树,而使用A *算法来解决它的好方法。一个人如何根据图像创建一棵树?有任何想法吗?

TL; DR
解析的最佳方法?变成什么数据结构?所述结构将如何帮助/阻碍解决?

更新
我已尝试使用numpy@Thomas建议的方式实现@Mikhail用Python编写的内容。我认为该算法是正确的,但无法正常运行。(下面的代码。)PNG库是PyPNG

import png, numpy, Queue, operator, itertools

def is_white(coord, image):
  """ Returns whether (x, y) is approx. a white pixel."""
  a = True
  for i in xrange(3):
    if not a: break
    a = image[coord[1]][coord[0] * 3 + i] > 240
  return a

def bfs(s, e, i, visited):
  """ Perform a breadth-first search. """
  frontier = Queue.Queue()
  while s != e:
    for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:
      np = tuple(map(operator.add, s, d))
      if is_white(np, i) and np not in visited:
        frontier.put(np)
    visited.append(s)
    s = frontier.get()
  return visited

def main():
  r = png.Reader(filename = "thescope-134.png")
  rows, cols, pixels, meta = r.asDirect()
  assert meta['planes'] == 3 # ensure the file is RGB
  image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))
  start, end = (402, 985), (398, 27)
  print bfs(start, end, image2d, [])

What is the best way to represent and solve a maze given an image?

The cover image of The Scope Issue 134

Given an JPEG image (as seen above), what’s the best way to read it in, parse it into some data structure and solve the maze? My first instinct is to read the image in pixel by pixel and store it in a list (array) of boolean values: True for a white pixel, and False for a non-white pixel (the colours can be discarded). The issue with this method, is that the image may not be “pixel perfect”. By that I simply mean that if there is a white pixel somewhere on a wall it may create an unintended path.

Another method (which came to me after a bit of thought) is to convert the image to an SVG file – which is a list of paths drawn on a canvas. This way, the paths could be read into the same sort of list (boolean values) where True indicates a path or wall, False indicating a travel-able space. An issue with this method arises if the conversion is not 100% accurate, and does not fully connect all of the walls, creating gaps.

Also an issue with converting to SVG is that the lines are not “perfectly” straight. This results in the paths being cubic bezier curves. With a list (array) of boolean values indexed by integers, the curves would not transfer easily, and all the points that line on the curve would have to be calculated, but won’t exactly match to list indices.

I assume that while one of these methods may work (though probably not) that they are woefully inefficient given such a large image, and that there exists a better way. How is this best (most efficiently and/or with the least complexity) done? Is there even a best way?

Then comes the solving of the maze. If I use either of the first two methods, I will essentially end up with a matrix. According to this answer, a good way to represent a maze is using a tree, and a good way to solve it is using the A* algorithm. How would one create a tree from the image? Any ideas?

TL;DR
Best way to parse? Into what data structure? How would said structure help/hinder solving?

UPDATE
I’ve tried my hand at implementing what @Mikhail has written in Python, using numpy, as @Thomas recommended. I feel that the algorithm is correct, but it’s not working as hoped. (Code below.) The PNG library is PyPNG.

import png, numpy, Queue, operator, itertools

def is_white(coord, image):
  """ Returns whether (x, y) is approx. a white pixel."""
  a = True
  for i in xrange(3):
    if not a: break
    a = image[coord[1]][coord[0] * 3 + i] > 240
  return a

def bfs(s, e, i, visited):
  """ Perform a breadth-first search. """
  frontier = Queue.Queue()
  while s != e:
    for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:
      np = tuple(map(operator.add, s, d))
      if is_white(np, i) and np not in visited:
        frontier.put(np)
    visited.append(s)
    s = frontier.get()
  return visited

def main():
  r = png.Reader(filename = "thescope-134.png")
  rows, cols, pixels, meta = r.asDirect()
  assert meta['planes'] == 3 # ensure the file is RGB
  image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))
  start, end = (402, 985), (398, 27)
  print bfs(start, end, image2d, [])

回答 0

这是一个解决方案。

  1. 将图像转换为灰度(尚未二进制),调整颜色的权重,以使最终的灰度图像大致均匀。您只需在Photoshop中控制图像->调整->黑白中的滑块即可完成此操作。
  2. 通过在Photoshop中的“图像”->“调整”->“阈值”中设置适当的阈值,将图像转换为二进制。
  3. 确保正确选择阈值。使用魔术棒工具,公差为0,点采样,连续,无抗锯齿。检查选择中断处的边不是由错误阈值引入的错误边。实际上,从一开始就可以访问此迷宫的所有内部点。
  4. 在迷宫上添加人工边界,以确保虚拟旅行者不会在它周围走动:)
  5. 以您喜欢的语言实现广度优先搜索(BFS),并从头开始运行它。我更喜欢MATLAB来完成这项任务。正如@Thomas已经提到的那样,无需弄乱图的常规表示。您可以直接使用二值化图像。

这是BFS的MATLAB代码:

function path = solve_maze(img_file)
  %% Init data
  img = imread(img_file);
  img = rgb2gray(img);
  maze = img > 0;
  start = [985 398];
  finish = [26 399];

  %% Init BFS
  n = numel(maze);
  Q = zeros(n, 2);
  M = zeros([size(maze) 2]);
  front = 0;
  back = 1;

  function push(p, d)
    q = p + d;
    if maze(q(1), q(2)) && M(q(1), q(2), 1) == 0
      front = front + 1;
      Q(front, :) = q;
      M(q(1), q(2), :) = reshape(p, [1 1 2]);
    end
  end

  push(start, [0 0]);

  d = [0 1; 0 -1; 1 0; -1 0];

  %% Run BFS
  while back <= front
    p = Q(back, :);
    back = back + 1;
    for i = 1:4
      push(p, d(i, :));
    end
  end

  %% Extracting path
  path = finish;
  while true
    q = path(end, :);
    p = reshape(M(q(1), q(2), :), 1, 2);
    path(end + 1, :) = p;
    if isequal(p, start) 
      break;
    end
  end
end

它确实非常简单和标准,因此在Python或其他任何方式中实现它应该没有困难。

这是答案:

在此处输入图片说明

Here is a solution.

  1. Convert image to grayscale (not yet binary), adjusting weights for the colors so that final grayscale image is approximately uniform. You can do it simply by controlling sliders in Photoshop in Image -> Adjustments -> Black & White.
  2. Convert image to binary by setting appropriate threshold in Photoshop in Image -> Adjustments -> Threshold.
  3. Make sure threshold is selected right. Use the Magic Wand Tool with 0 tolerance, point sample, contiguous, no anti-aliasing. Check that edges at which selection breaks are not false edges introduced by wrong threshold. In fact, all interior points of this maze are accessible from the start.
  4. Add artificial borders on the maze to make sure virtual traveler will not walk around it :)
  5. Implement breadth-first search (BFS) in your favorite language and run it from the start. I prefer MATLAB for this task. As @Thomas already mentioned, there is no need to mess with regular representation of graphs. You can work with binarized image directly.

Here is the MATLAB code for BFS:

function path = solve_maze(img_file)
  %% Init data
  img = imread(img_file);
  img = rgb2gray(img);
  maze = img > 0;
  start = [985 398];
  finish = [26 399];

  %% Init BFS
  n = numel(maze);
  Q = zeros(n, 2);
  M = zeros([size(maze) 2]);
  front = 0;
  back = 1;

  function push(p, d)
    q = p + d;
    if maze(q(1), q(2)) && M(q(1), q(2), 1) == 0
      front = front + 1;
      Q(front, :) = q;
      M(q(1), q(2), :) = reshape(p, [1 1 2]);
    end
  end

  push(start, [0 0]);

  d = [0 1; 0 -1; 1 0; -1 0];

  %% Run BFS
  while back <= front
    p = Q(back, :);
    back = back + 1;
    for i = 1:4
      push(p, d(i, :));
    end
  end

  %% Extracting path
  path = finish;
  while true
    q = path(end, :);
    p = reshape(M(q(1), q(2), :), 1, 2);
    path(end + 1, :) = p;
    if isequal(p, start) 
      break;
    end
  end
end

It is really very simple and standard, there should not be difficulties on implementing this in Python or whatever.

And here is the answer:

Enter image description here


回答 1

该解决方案是用Python编写的。感谢米哈伊尔(Mikhail)提供有关图像准备工作的指导。

动画式广度优先搜索:

BFS的动画版本

完成的迷宫:

完成迷宫

#!/usr/bin/env python

import sys

from Queue import Queue
from PIL import Image

start = (400,984)
end = (398,25)

def iswhite(value):
    if value == (255,255,255):
        return True

def getadjacent(n):
    x,y = n
    return [(x-1,y),(x,y-1),(x+1,y),(x,y+1)]

def BFS(start, end, pixels):

    queue = Queue()
    queue.put([start]) # Wrapping the start tuple in a list

    while not queue.empty():

        path = queue.get() 
        pixel = path[-1]

        if pixel == end:
            return path

        for adjacent in getadjacent(pixel):
            x,y = adjacent
            if iswhite(pixels[x,y]):
                pixels[x,y] = (127,127,127) # see note
                new_path = list(path)
                new_path.append(adjacent)
                queue.put(new_path)

    print "Queue has been exhausted. No answer was found."


if __name__ == '__main__':

    # invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]
    base_img = Image.open(sys.argv[1])
    base_pixels = base_img.load()

    path = BFS(start, end, base_pixels)

    path_img = Image.open(sys.argv[1])
    path_pixels = path_img.load()

    for position in path:
        x,y = position
        path_pixels[x,y] = (255,0,0) # red

    path_img.save(sys.argv[2])

注意:标记为白色的访问像素为灰色。这消除了对访问列表的需求,但是这需要在绘制路径之前从磁盘中第二次加载图像文件(如果您不希望使用最终路径和所有路径的合成图像)。

我使用的迷宫的空白版本。

This solution is written in Python. Thanks Mikhail for the pointers on the image preparation.

An animated Breadth-First Search:

Animated version of BFS

The Completed Maze:

Completed Maze

#!/usr/bin/env python

import sys

from Queue import Queue
from PIL import Image

start = (400,984)
end = (398,25)

def iswhite(value):
    if value == (255,255,255):
        return True

def getadjacent(n):
    x,y = n
    return [(x-1,y),(x,y-1),(x+1,y),(x,y+1)]

def BFS(start, end, pixels):

    queue = Queue()
    queue.put([start]) # Wrapping the start tuple in a list

    while not queue.empty():

        path = queue.get() 
        pixel = path[-1]

        if pixel == end:
            return path

        for adjacent in getadjacent(pixel):
            x,y = adjacent
            if iswhite(pixels[x,y]):
                pixels[x,y] = (127,127,127) # see note
                new_path = list(path)
                new_path.append(adjacent)
                queue.put(new_path)

    print "Queue has been exhausted. No answer was found."


if __name__ == '__main__':

    # invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]
    base_img = Image.open(sys.argv[1])
    base_pixels = base_img.load()

    path = BFS(start, end, base_pixels)

    path_img = Image.open(sys.argv[1])
    path_pixels = path_img.load()

    for position in path:
        x,y = position
        path_pixels[x,y] = (255,0,0) # red

    path_img.save(sys.argv[2])

Note: Marks a white visited pixel grey. This removes the need for a visited list, but this requires a second load of the image file from disk before drawing a path (if you don’t want a composite image of the final path and ALL paths taken).

A blank version of the maze I used.


回答 2

我尝试自己实施A-Star搜索以解决此问题。紧跟着约瑟夫·科恩Joseph Kern)此处给出的框架和算法伪代码的实现:

def AStar(start, goal, neighbor_nodes, distance, cost_estimate):
    def reconstruct_path(came_from, current_node):
        path = []
        while current_node is not None:
            path.append(current_node)
            current_node = came_from[current_node]
        return list(reversed(path))

    g_score = {start: 0}
    f_score = {start: g_score[start] + cost_estimate(start, goal)}
    openset = {start}
    closedset = set()
    came_from = {start: None}

    while openset:
        current = min(openset, key=lambda x: f_score[x])
        if current == goal:
            return reconstruct_path(came_from, goal)
        openset.remove(current)
        closedset.add(current)
        for neighbor in neighbor_nodes(current):
            if neighbor in closedset:
                continue
            if neighbor not in openset:
                openset.add(neighbor)
            tentative_g_score = g_score[current] + distance(current, neighbor)
            if tentative_g_score >= g_score.get(neighbor, float('inf')):
                continue
            came_from[neighbor] = current
            g_score[neighbor] = tentative_g_score
            f_score[neighbor] = tentative_g_score + cost_estimate(neighbor, goal)
    return []

由于A-Star是一种启发式搜索算法,因此您需要提供一个函数,该函数可以估算直到达到目标之前的剩余成本(此处为距离)。除非您对次优解决方案感到满意,否则不要高估成本。此处的保守选择是曼哈顿(或出租车)距离,因为这表示所用冯·诺依曼邻域在网格上两点之间的直线距离。(在这种情况下,永远都不会高估成本。)

但是,这将大大低估手边给定迷宫的实际成本。因此,我添加了其他两个距离度量标准,即欧几里得距离和曼哈顿距离乘以4进行比较。但是,这些可能会高估实际成本,因此可能会产生次优的结果。

这是代码:

import sys
from PIL import Image

def is_blocked(p):
    x,y = p
    pixel = path_pixels[x,y]
    if any(c < 225 for c in pixel):
        return True
def von_neumann_neighbors(p):
    x, y = p
    neighbors = [(x-1, y), (x, y-1), (x+1, y), (x, y+1)]
    return [p for p in neighbors if not is_blocked(p)]
def manhattan(p1, p2):
    return abs(p1[0]-p2[0]) + abs(p1[1]-p2[1])
def squared_euclidean(p1, p2):
    return (p1[0]-p2[0])**2 + (p1[1]-p2[1])**2

start = (400, 984)
goal = (398, 25)

# invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]

path_img = Image.open(sys.argv[1])
path_pixels = path_img.load()

distance = manhattan
heuristic = manhattan

path = AStar(start, goal, von_neumann_neighbors, distance, heuristic)

for position in path:
    x,y = position
    path_pixels[x,y] = (255,0,0) # red

path_img.save(sys.argv[2])

这是可视化结果的一些图像(灵感来自Joseph Kern发布的图像)。动画在主while循环进行10000次迭代后分别显示一个新帧。

广度优先搜索:

广度优先搜索

曼哈顿A级明星距离:

曼哈顿A星距离

A星平方欧几里德距离:

A星平方欧几里德距离

曼哈顿A级明星距离乘以4:

曼哈顿A级明星距离乘以4

结果表明,对于使用的启发式方法,迷宫的探索区域差异很大。因此,平方欧几里德距离甚至会产生与其他度量不同的(次优)路径。

关于A-Star算法在终止之前的运行时间方面的性能,请注意,与距离和成本函数相比,很多评估相加,而广度优先搜索(BFS)只需评估目标的“目标”每个候选职位。这些附加功能评估(A-Star)的成本是否超过了要检查的大量节点(BFS)的成本,尤其是性能是否对您的应用程序来说完全是一个问题,这取决于个人的看法。当然不能普遍回答。

一件事,可以在一般的说一下是否知情的搜索算法(如A-星)可能比穷举搜索(例如,BFS)是更好的选择如下。随着迷宫的维数,即搜索树的分支因子,穷举搜索(穷举搜索)的缺点呈指数增长。随着复杂性的增加,这样做变得越来越不可行,并且在某种程度上,您对任何结果路径都非常满意,无论它是否(近似)最佳。

I tried myself implementing A-Star search for this problem. Followed closely the implementation by Joseph Kern for the framework and the algorithm pseudocode given here:

def AStar(start, goal, neighbor_nodes, distance, cost_estimate):
    def reconstruct_path(came_from, current_node):
        path = []
        while current_node is not None:
            path.append(current_node)
            current_node = came_from[current_node]
        return list(reversed(path))

    g_score = {start: 0}
    f_score = {start: g_score[start] + cost_estimate(start, goal)}
    openset = {start}
    closedset = set()
    came_from = {start: None}

    while openset:
        current = min(openset, key=lambda x: f_score[x])
        if current == goal:
            return reconstruct_path(came_from, goal)
        openset.remove(current)
        closedset.add(current)
        for neighbor in neighbor_nodes(current):
            if neighbor in closedset:
                continue
            if neighbor not in openset:
                openset.add(neighbor)
            tentative_g_score = g_score[current] + distance(current, neighbor)
            if tentative_g_score >= g_score.get(neighbor, float('inf')):
                continue
            came_from[neighbor] = current
            g_score[neighbor] = tentative_g_score
            f_score[neighbor] = tentative_g_score + cost_estimate(neighbor, goal)
    return []

As A-Star is a heuristic search algorithm you need to come up with a function that estimates the remaining cost (here: distance) until the goal is reached. Unless you’re comfortable with a suboptimal solution it should not overestimate the cost. A conservative choice would here be the manhattan (or taxicab) distance as this represents the straight-line distance between two points on the grid for the used Von Neumann neighborhood. (Which, in this case, wouldn’t ever overestimate the cost.)

This would however significantly underestimate the actual cost for the given maze at hand. Therefore I’ve added two other distance metrics squared euclidean distance and the manhattan distance multiplied by four for comparison. These however might overestimate the actual cost, and might therefore yield suboptimal results.

Here’s the code:

import sys
from PIL import Image

def is_blocked(p):
    x,y = p
    pixel = path_pixels[x,y]
    if any(c < 225 for c in pixel):
        return True
def von_neumann_neighbors(p):
    x, y = p
    neighbors = [(x-1, y), (x, y-1), (x+1, y), (x, y+1)]
    return [p for p in neighbors if not is_blocked(p)]
def manhattan(p1, p2):
    return abs(p1[0]-p2[0]) + abs(p1[1]-p2[1])
def squared_euclidean(p1, p2):
    return (p1[0]-p2[0])**2 + (p1[1]-p2[1])**2

start = (400, 984)
goal = (398, 25)

# invoke: python mazesolver.py <mazefile> <outputfile>[.jpg|.png|etc.]

path_img = Image.open(sys.argv[1])
path_pixels = path_img.load()

distance = manhattan
heuristic = manhattan

path = AStar(start, goal, von_neumann_neighbors, distance, heuristic)

for position in path:
    x,y = position
    path_pixels[x,y] = (255,0,0) # red

path_img.save(sys.argv[2])

Here are some images for a visualization of the results (inspired by the one posted by Joseph Kern). The animations show a new frame each after 10000 iterations of the main while-loop.

Breadth-First Search:

Breadth-First Search

A-Star Manhattan Distance:

A-Star Manhattan Distance

A-Star Squared Euclidean Distance:

A-Star Squared Euclidean Distance

A-Star Manhattan Distance multiplied by four:

A-Star Manhattan Distance multiplied by four

The results show that the explored regions of the maze differ considerably for the heuristics being used. As such, squared euclidean distance even produces a different (suboptimal) path as the other metrics.

Concerning the performance of the A-Star algorithm in terms of the runtime until termination, note that a lot of evaluation of distance and cost functions add up compared to the Breadth-First Search (BFS) which only needs to evaluate the “goaliness” of each candidate position. Whether or not the cost for these additional function evaluations (A-Star) outweighs the cost for the larger number of nodes to check (BFS) and especially whether or not performance is an issue for your application at all, is a matter of individual perception and can of course not be generally answered.

A thing that can be said in general about whether or not an informed search algorithm (such as A-Star) could be the better choice compared to an exhaustive search (e.g., BFS) is the following. With the number of dimensions of the maze, i.e., the branching factor of the search tree, the disadvantage of an exhaustive search (to search exhaustively) grows exponentially. With growing complexity it becomes less and less feasible to do so and at some point you are pretty much happy with any result path, be it (approximately) optimal or not.


回答 3

树搜索太多。迷宫沿溶液路径固有地可分离。

(感谢Reddit的rainman002向我指出了这一点。)

因此,您可以快速使用连接的组件来识别迷宫墙的连接部分。这会在像素上迭代两次。

如果要将其转换成解决方案路径的理想示意图,则可以将二进制操作与结构化元素一起使用,以填充每个连接区域的“死角”路径。

下面是MATLAB的演示代码。它可以使用调整来更好地清理结果,使其更通用,并使其运行更快。(有时不是2:30 AM。)

% read in and invert the image
im = 255 - imread('maze.jpg');

% sharpen it to address small fuzzy channels
% threshold to binary 15%
% run connected components
result = bwlabel(im2bw(imfilter(im,fspecial('unsharp')),0.15));

% purge small components (e.g. letters)
for i = 1:max(reshape(result,1,1002*800))
    [count,~] = size(find(result==i));
    if count < 500
        result(result==i) = 0;
    end
end

% close dead-end channels
closed = zeros(1002,800);
for i = 1:max(reshape(result,1,1002*800))
    k = zeros(1002,800);
    k(result==i) = 1; k = imclose(k,strel('square',8));
    closed(k==1) = i;
end

% do output
out = 255 - im;
for x = 1:1002
    for y = 1:800
        if closed(x,y) == 0
            out(x,y,:) = 0;
        end
    end
end
imshow(out);

当前代码的结果

Tree search is too much. The maze is inherently separable along the solution path(s).

(Thanks to rainman002 from Reddit for pointing this out to me.)

Because of this, you can quickly use connected components to identify the connected sections of maze wall. This iterates over the pixels twice.

If you want to turn that into a nice diagram of the solution path(s), you can then use binary operations with structuring elements to fill in the “dead end” pathways for each connected region.

Demo code for MATLAB follows. It could use tweaking to clean up the result better, make it more generalizable, and make it run faster. (Sometime when it’s not 2:30 AM.)

% read in and invert the image
im = 255 - imread('maze.jpg');

% sharpen it to address small fuzzy channels
% threshold to binary 15%
% run connected components
result = bwlabel(im2bw(imfilter(im,fspecial('unsharp')),0.15));

% purge small components (e.g. letters)
for i = 1:max(reshape(result,1,1002*800))
    [count,~] = size(find(result==i));
    if count < 500
        result(result==i) = 0;
    end
end

% close dead-end channels
closed = zeros(1002,800);
for i = 1:max(reshape(result,1,1002*800))
    k = zeros(1002,800);
    k(result==i) = 1; k = imclose(k,strel('square',8));
    closed(k==1) = i;
end

% do output
out = 255 - im;
for x = 1:1002
    for y = 1:800
        if closed(x,y) == 0
            out(x,y,:) = 0;
        end
    end
end
imshow(out);

result of current code


回答 4

使用队列进行阈值连续填充。将入口左侧的像素推入队列,然后开始循环。如果排队的像素足够暗,则其颜色为浅灰色(高于阈值),并且所有相邻像素都被推入队列。

from PIL import Image
img = Image.open("/tmp/in.jpg")
(w,h) = img.size
scan = [(394,23)]
while(len(scan) > 0):
    (i,j) = scan.pop()
    (r,g,b) = img.getpixel((i,j))
    if(r*g*b < 9000000):
        img.putpixel((i,j),(210,210,210))
        for x in [i-1,i,i+1]:
            for y in [j-1,j,j+1]:
                scan.append((x,y))
img.save("/tmp/out.png")

解决方案是灰墙和彩色墙之间的走廊。请注意,此迷宫有多种解决方案。而且,这似乎只是起作用。

解

Uses a queue for a threshold continuous fill. Pushes the pixel left of the entrance onto the queue and then starts the loop. If a queued pixel is dark enough, it’s colored light gray (above threshold), and all the neighbors are pushed onto the queue.

from PIL import Image
img = Image.open("/tmp/in.jpg")
(w,h) = img.size
scan = [(394,23)]
while(len(scan) > 0):
    (i,j) = scan.pop()
    (r,g,b) = img.getpixel((i,j))
    if(r*g*b < 9000000):
        img.putpixel((i,j),(210,210,210))
        for x in [i-1,i,i+1]:
            for y in [j-1,j,j+1]:
                scan.append((x,y))
img.save("/tmp/out.png")

Solution is the corridor between gray wall and colored wall. Note this maze has multiple solutions. Also, this merely appears to work.

Solution


回答 5

在这里,您可以去:maze-solver-python(GitHub)

在此处输入图片说明

我玩得很开心,并扩展了约瑟夫·科恩Joseph Kern)的答案。不减损它;我为可能对此感兴趣的其他人做了一些小的补充。

这是一个基于Python的求解器,它使用BFS查找最短路径。当时,我的主要补充是:

  1. 搜索之前将图像清除(即转换为纯黑白)
  2. 自动生成GIF。
  3. 自动生成AVI。

就目前而言,此示例迷宫的起点/终点经过了硬编码,但我计划对其进行扩展,以便您可以选择适当的像素。

Here you go: maze-solver-python (GitHub)

enter image description here

I had fun playing around with this and extended on Joseph Kern‘s answer. Not to detract from it; I just made some minor additions for anyone else who may be interested in playing around with this.

It’s a python-based solver which uses BFS to find the shortest path. My main additions, at the time, are:

  1. The image is cleaned before the search (ie. convert to pure black & white)
  2. Automatically generate a GIF.
  3. Automatically generate an AVI.

As it stands, the start/end-points are hard-coded for this sample maze, but I plan on extending it such that you can pick the appropriate pixels.


回答 6

我会选择矩阵矩阵选项。如果您发现标准的Python列表在此方面效率太低,则可以改用numpy.bool数组。这样,一个1000×1000像素迷宫的存储空间仅为1 MB。

不要为创建任何树或图数据结构而烦恼。那只是思考它的一种方式,但不一定是在内存中表示它的好方法。布尔矩阵既容易编码,也更有效。

然后使用A *算法对其进行求解。对于距离启发式方法,请使用曼哈顿距离(distance_x + distance_y)。

(row, column)坐标元组表示节点。每当算法(Wikipedia伪代码)要求“邻居”时,只需循环遍历四个可能的邻居即可(注意图像的边缘!)。

如果发现它仍然太慢,可以在加载图像之前尝试缩小图像。小心不要在此过程中迷路。

也许也可以在Python中进行1:2的缩减,以检查您实际上没有丢失任何可能的路径。一个有趣的选择,但还需要更多思考。

I’d go for the matrix-of-bools option. If you find that standard Python lists are too inefficient for this, you could use a numpy.bool array instead. Storage for a 1000×1000 pixel maze is then just 1 MB.

Don’t bother with creating any tree or graph data structures. That’s just a way of thinking about it, but not necessarily a good way to represent it in memory; a boolean matrix is both easier to code and more efficient.

Then use the A* algorithm to solve it. For the distance heuristic, use the Manhattan distance (distance_x + distance_y).

Represent nodes by a tuple of (row, column) coordinates. Whenever the algorithm (Wikipedia pseudocode) calls for “neighbours”, it’s a simple matter of looping over the four possible neighbours (mind the edges of the image!).

If you find that it’s still too slow, you could try downscaling the image before you load it. Be careful not to lose any narrow paths in the process.

Maybe it’s possible to do a 1:2 downscaling in Python as well, checking that you don’t actually lose any possible paths. An interesting option, but it needs a bit more thought.


回答 7

这里有一些想法。

(1.图像处理:)

1.1将图像加载为RGB像素图。在C#中,使用是微不足道的system.drawing.bitmap。在没有简单的图像支持的语言中,只需将图像转换为可 移植的pixmap格式(PPM)(Unix文本表示,会生成大文件)或一些您可以轻松阅读的简单二进制文件格式,例如BMPTGA。Unix中的ImageMagick或Windows中的IrfanView

1.2如前所述,您可以通过将每个像素的(R + G + B)/ 3用作灰度指示,然后对该值进行阈值生成黑白表,来简化数据。假设0 =黑色和255 =白色,则接近200的值会去除JPEG伪像。

(2.解决方案:)

2.1深度优先搜索:使用起始位置初始化一个空的堆栈,收集可用的后续动作,随机选择一个并推入堆栈,继续进行直至到达终点或结束。在弹出堆栈的死角回溯中,您需要跟踪在地图上访问过哪些位置,因此当您收集可用的移动时,您绝不会两次走同一条路径。动画非常有趣。

2.2广度优先搜索:之前提到过,与上面类似,但仅使用队列。动画也很有趣。这就像填充图像编辑软件一样。我认为您可以使用此技巧在Photoshop中解决迷宫问题。

2.3 Wall Follower(墙随动件):从几何学上讲,迷宫是一个折叠/盘旋的管子。如果您将手放在墙上,您最终将找到出口;)并非总是如此。有某些假设,例如:完美的迷宫等,例如,某些迷宫包含孤岛。查一下;令人着迷。

(3.评论:)

这是一个棘手的问题。如果以某种简单的形式来表示形式,每个元素都是具有北,东,南和西壁以及已访问标记场的像元类型,则很容易解决迷宫问题。但是鉴于给定的手绘草图,您正在尝试执行此操作,因此会变得凌乱。老实说,我认为尝试使草图合理化会让您发疯。这类似于相当复杂的计算机视觉问题。也许直接进入图像地图可能更容易但更浪费。

Here are some ideas.

(1. Image Processing:)

1.1 Load the image as RGB pixel map. In C# it is trivial using system.drawing.bitmap. In languages with no simple support for imaging, just convert the image to portable pixmap format (PPM) (a Unix text representation, produces large files) or some simple binary file format you can easily read, such as BMP or TGA. ImageMagick in Unix or IrfanView in Windows.

1.2 You may, as mentioned earlier, simplify the data by taking the (R+G+B)/3 for each pixel as an indicator of gray tone and then threshold the value to produce a black and white table. Something close to 200 assuming 0=black and 255=white will take out the JPEG artifacts.

(2. Solutions:)

2.1 Depth-First Search: Init an empty stack with starting location, collect available follow-up moves, pick one at random and push onto the stack, proceed until end is reached or a deadend. On deadend backtrack by popping the stack, you need to keep track of which positions were visited on the map so when you collect available moves you never take the same path twice. Very interesting to animate.

2.2 Breadth-First Search: Mentioned before, similar as above but only using queues. Also interesting to animate. This works like flood-fill in image editing software. I think you may be able to solve a maze in Photoshop using this trick.

2.3 Wall Follower: Geometrically speaking, a maze is a folded/convoluted tube. If you keep your hand on the wall you will eventually find the exit ;) This does not always work. There are certain assumption re: perfect mazes, etc., for instance, certain mazes contain islands. Do look it up; it is fascinating.

(3. Comments:)

This is the tricky one. It is easy to solve mazes if represented in some simple array formal with each element being a cell type with north, east, south and west walls and a visited flag field. However given that you are trying to do this given a hand drawn sketch it becomes messy. I honestly think that trying to rationalize the sketch will drive you nuts. This is akin to computer vision problems which are fairly involved. Perhaps going directly onto the image map may be easier yet more wasteful.


回答 8

这是使用R的解决方案。

### download the image, read it into R, converting to something we can play with...
library(jpeg)
url <- "https://i.stack.imgur.com/TqKCM.jpg"
download.file(url, "./maze.jpg", mode = "wb")
jpg <- readJPEG("./maze.jpg")

### reshape array into data.frame
library(reshape2)
img3 <- melt(jpg, varnames = c("y","x","rgb"))
img3$rgb <- as.character(factor(img3$rgb, levels = c(1,2,3), labels=c("r","g","b")))

## split out rgb values into separate columns
img3 <- dcast(img3, x + y ~ rgb)

RGB到灰度,请参阅:https : //stackoverflow.com/a/27491947/2371031

# convert rgb to greyscale (0, 1)
img3$v <- img3$r*.21 + img3$g*.72 + img3$b*.07
# v: values closer to 1 are white, closer to 0 are black

## strategically fill in some border pixels so the solver doesn't "go around":
img3$v2 <- img3$v
img3[(img3$x == 300 | img3$x == 500) & (img3$y %in% c(0:23,988:1002)),"v2"]  = 0

# define some start/end point coordinates
pts_df <- data.frame(x = c(398, 399),
                     y = c(985, 26))

# set a reference value as the mean of the start and end point greyscale "v"s
ref_val <- mean(c(subset(img3, x==pts_df[1,1] & y==pts_df[1,2])$v,
                  subset(img3, x==pts_df[2,1] & y==pts_df[2,2])$v))

library(sp)
library(gdistance)
spdf3 <- SpatialPixelsDataFrame(points = img3[c("x","y")], data = img3["v2"])
r3 <- rasterFromXYZ(spdf3)

# transition layer defines a "conductance" function between any two points, and the number of connections (4 = Manhatten distances)
# x in the function represents the greyscale values ("v2") of two adjacent points (pixels), i.e., = (x1$v2, x2$v2)
# make function(x) encourages transitions between cells with small changes in greyscale compared to the reference values, such that: 
# when v2 is closer to 0 (black) = poor conductance
# when v2 is closer to 1 (white) = good conductance
tl3 <- transition(r3, function(x) (1/max( abs( (x/ref_val)-1 ) )^2)-1, 4) 

## get the shortest path between start, end points
sPath3 <- shortestPath(tl3, as.numeric(pts_df[1,]), as.numeric(pts_df[2,]), output = "SpatialLines")

## fortify for ggplot
sldf3 <- fortify(SpatialLinesDataFrame(sPath3, data = data.frame(ID = 1)))

# plot the image greyscale with start/end points (red) and shortest path (green)
ggplot(img3) +
  geom_raster(aes(x, y, fill=v2)) +
  scale_fill_continuous(high="white", low="black") +
  scale_y_reverse() +
  geom_point(data=pts_df, aes(x, y), color="red") +
  geom_path(data=sldf3, aes(x=long, y=lat), color="green")

瞧!

正确找到最短路径的解决方案

如果您不填写某些边框像素(哈!),就会发生这种情况。

求解器绕迷宫绕行的解决方案版本

全面披露:在我发现这个问题之前,我自己问和回答了一个非常类似的问题。然后通过SO的魔力,发现这是最重要的“相关问题”之一。我以为我会把这个迷宫作为一个额外的测试用例…我很高兴地发现,我的答案在很少修改的情况下也适用于该应用程序。

Here’s a solution using R.

### download the image, read it into R, converting to something we can play with...
library(jpeg)
url <- "https://i.stack.imgur.com/TqKCM.jpg"
download.file(url, "./maze.jpg", mode = "wb")
jpg <- readJPEG("./maze.jpg")

### reshape array into data.frame
library(reshape2)
img3 <- melt(jpg, varnames = c("y","x","rgb"))
img3$rgb <- as.character(factor(img3$rgb, levels = c(1,2,3), labels=c("r","g","b")))

## split out rgb values into separate columns
img3 <- dcast(img3, x + y ~ rgb)

RGB to greyscale, see: https://stackoverflow.com/a/27491947/2371031

# convert rgb to greyscale (0, 1)
img3$v <- img3$r*.21 + img3$g*.72 + img3$b*.07
# v: values closer to 1 are white, closer to 0 are black

## strategically fill in some border pixels so the solver doesn't "go around":
img3$v2 <- img3$v
img3[(img3$x == 300 | img3$x == 500) & (img3$y %in% c(0:23,988:1002)),"v2"]  = 0

# define some start/end point coordinates
pts_df <- data.frame(x = c(398, 399),
                     y = c(985, 26))

# set a reference value as the mean of the start and end point greyscale "v"s
ref_val <- mean(c(subset(img3, x==pts_df[1,1] & y==pts_df[1,2])$v,
                  subset(img3, x==pts_df[2,1] & y==pts_df[2,2])$v))

library(sp)
library(gdistance)
spdf3 <- SpatialPixelsDataFrame(points = img3[c("x","y")], data = img3["v2"])
r3 <- rasterFromXYZ(spdf3)

# transition layer defines a "conductance" function between any two points, and the number of connections (4 = Manhatten distances)
# x in the function represents the greyscale values ("v2") of two adjacent points (pixels), i.e., = (x1$v2, x2$v2)
# make function(x) encourages transitions between cells with small changes in greyscale compared to the reference values, such that: 
# when v2 is closer to 0 (black) = poor conductance
# when v2 is closer to 1 (white) = good conductance
tl3 <- transition(r3, function(x) (1/max( abs( (x/ref_val)-1 ) )^2)-1, 4) 

## get the shortest path between start, end points
sPath3 <- shortestPath(tl3, as.numeric(pts_df[1,]), as.numeric(pts_df[2,]), output = "SpatialLines")

## fortify for ggplot
sldf3 <- fortify(SpatialLinesDataFrame(sPath3, data = data.frame(ID = 1)))

# plot the image greyscale with start/end points (red) and shortest path (green)
ggplot(img3) +
  geom_raster(aes(x, y, fill=v2)) +
  scale_fill_continuous(high="white", low="black") +
  scale_y_reverse() +
  geom_point(data=pts_df, aes(x, y), color="red") +
  geom_path(data=sldf3, aes(x=long, y=lat), color="green")

Voila!

solution that correctly finds shortest path

This is what happens if you don’t fill in some border pixels (Ha!)…

solution version where the solver goes around the maze

Full disclosure: I asked and answered a very similar question myself before I found this one. Then through the magic of SO, found this one as one of the top “Related Questions”. I thought I’d use this maze as an additional test case… I was very pleased to find that my answer there also works for this application with very little modification.


回答 9

好的解决方案是,而不是按像素查找邻居,而是按单元格完成,因为走廊可以有15px,因此在同一走廊中可以执行左右移动等操作,而如果这样做的话就好像位移是一个多维数据集,这将是一个简单的操作,例如UP,DOWN,LEFT或RIGHT

the good solution would be that instead of finding the neighbors by pixel, it would be done by cell, because a corridor can have 15px so in the same corridor it can take actions like left or right, while if it was done as if the displacement was a cube it would be a simple action like UP,DOWN,LEFT OR RIGHT


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。