Tag archive: image-processing

Peak detection in a 2D array

Question: Peak detection in a 2D array

I'm helping a veterinary clinic measure pressure under a dog's paw. I use Python for my data analysis and now I'm stuck trying to divide the paws into (anatomical) subregions.

I made a 2D array of each paw, which consists of the maximal values for each sensor that has been loaded by the paw over time. Here's an example of one paw, where I used Excel to draw the areas I want to 'detect'. These are 2 by 2 boxes around the sensors with local maxima that together have the largest sum.

So I tried some experimenting and decided to simply look for the maxima of each column and row (I can't look in one direction due to the shape of the paw). This seems to 'detect' the location of the separate toes fairly well, but it also marks neighboring sensors.

So what would be the best way to tell Python which of these maximums are the ones I want?

Note: The 2×2 squares can’t overlap, since they have to be separate toes!

Also, I chose 2×2 for convenience; any more advanced solution is welcome, but I'm simply a human movement scientist, so I'm neither a real programmer nor a mathematician, so please keep it 'simple'.

Here’s a version that can be loaded with np.loadtxt


Results

So I tried @jextee's solution (see the results below). As you can see, it works very well on the front paws, but less well for the hind legs.

More specifically, it can't recognize the small peak that's the fourth toe. This is obviously inherent to the fact that the loop searches top-down towards the lowest value, without taking the location into account.

Would anyone know how to tweak @jextee’s algorithm, so that it might be able to find the 4th toe too?

Since I haven’t processed any other trials yet, I can’t supply any other samples. But the data I gave before were the averages of each paw. This file is an array with the maximal data of 9 paws in the order they made contact with the plate.

This image shows how they were spatially spread out over the plate.

Update:

I have set up a blog for anyone interested, and I have set up a SkyDrive with all the raw measurements. So to anyone requesting more data: more power to you!


New update:

So after the help I got with my questions regarding paw detection and paw sorting, I was finally able to check the toe detection for every paw! Turns out it doesn't work well on anything but paws sized like the one in my own example. Of course, in hindsight, it's my own fault for choosing the 2×2 so arbitrarily.

Here’s a nice example of where it goes wrong: a nail is being recognized as a toe and the ‘heel’ is so wide, it gets recognized twice!

The paw is too large, so taking a 2×2 size with no overlap causes some toes to be detected twice. Conversely, in small dogs it often fails to find a 5th toe, which I suspect is caused by the 2×2 area being too large.

After trying the current solution on all my measurements I came to the staggering conclusion that for nearly all my small dogs it didn’t find a 5th toe and that in over 50% of the impacts for the large dogs it would find more!

So clearly I need to change it. My own guess was changing the size of the neighborhood to something smaller for small dogs and larger for large dogs. But generate_binary_structure wouldn’t let me change the size of the array.

Therefore, I’m hoping that anyone else has a better suggestion for locating the toes, perhaps having the toe area scale with the paw size?


Answer 0

I detected the peaks using a local maximum filter. Here is the result on your first dataset of 4 paws:

I also ran it on the second dataset of 9 paws and it worked as well.

Here is how you do it:

import numpy as np
from scipy.ndimage import maximum_filter, generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp

#for some reason I had to reshape. Numpy ignored the shape header.
paws_data = np.loadtxt("paws.txt").reshape(4,11,14)

#getting a list of images
paws = [p.squeeze() for p in np.vsplit(paws_data,4)]


def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """

    # define an 8-connected neighborhood
    neighborhood = generate_binary_structure(2,2)

    #apply the local maximum filter; all pixels of maximal value
    #in their neighborhood are set to 1
    local_max = maximum_filter(image, footprint=neighborhood)==image
    #local_max is a mask that contains the peaks we are 
    #looking for, but also the background.
    #In order to isolate the peaks we must remove the background from the mask.

    #we create the mask of the background
    background = (image==0)

    #a little technicality: we must erode the background in order to 
    #successfully subtract it from local_max, otherwise a line will
    #appear along the background border (artifact of the local maximum filter)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)

    #we obtain the final mask, containing only peaks, 
    #by removing the background from the local_max mask (xor operation)
    detected_peaks = local_max ^ eroded_background

    return detected_peaks


#applying the detection and plotting results
for i, paw in enumerate(paws):
    detected_peaks = detect_peaks(paw)
    pp.subplot(4,2,(2*i+1))
    pp.imshow(paw)
    pp.subplot(4,2,(2*i+2) )
    pp.imshow(detected_peaks)

pp.show()

All you need to do afterwards is use scipy.ndimage.measurements.label on the mask to label all distinct objects. Then you'll be able to play with them individually.
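
For instance, a minimal sketch of that labelling step (assuming the paw and detected_peaks arrays from the loop above, and the current scipy.ndimage namespace) could look like this:

from scipy import ndimage

# label each connected group of peak pixels in the boolean mask
labeled, num_objects = ndimage.label(detected_peaks)
# one (row, col) centre of mass per labelled object, weighted by the pressure values
centers = ndimage.center_of_mass(paw, labeled, range(1, num_objects + 1))
print(num_objects, centers)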

Note that the method works well because the background is not noisy. If it were, you would detect a bunch of other unwanted peaks in the background. Another important factor is the size of the neighborhood. You will need to adjust it if the peak size changes (they should remain roughly proportional).


Answer 1

Solution

Data file: paw.txt. Source code:

from numpy import loadtxt, arange
from operator import itemgetter

n = 5  # how many fingers are we looking for

d = loadtxt("paw.txt")
width, height = d.shape

# Create an array where every element is a sum of 2x2 squares.

fourSums = d[:-1,:-1] + d[1:,:-1] + d[1:,1:] + d[:-1,1:]

# Find positions of the fingers.

# Pair each sum with its position number (from 0 to width*height-1),

pairs = zip(arange(width*height), fourSums.flatten())

# Sort by descending sum value, filter overlapping squares

def drop_overlapping(pairs):
    no_overlaps = []
    def does_not_overlap(p1, p2):
        i1, i2 = p1[0], p2[0]
        r1, col1 = i1 // (width-1), i1 % (width-1)
        r2, col2 = i2 // (width-1), i2 % (width-1)
        return (max(abs(r1-r2), abs(col1-col2)) >= 2)
    for p in pairs:
        if all(does_not_overlap(p, prev) for prev in no_overlaps):
            no_overlaps.append(p)
    return no_overlaps

pairs2 = drop_overlapping(sorted(pairs, key=itemgetter(1), reverse=True))

# Take the first n with the highest values

positions = pairs2[:n]

# Print results

print(d, "\n")

for i, val in positions:
    row = i // (width-1)
    column = i % (width-1)
    print("sum = %f @ %d,%d (%d)" % (val, row, column, i))
    print(d[row:row+2, column:column+2], "\n")

Output without overlapping squares. It seems that the same areas are selected as in your example.

Some comments

The tricky part is to calculate the sums of all 2×2 squares. I assumed you need all of them, so there may be some overlap. I used slices to cut off the first/last columns and rows from the original 2D array, and then overlaid them all and calculated the sums.

To understand it better, imagine a 3×3 array:

>>> a = arange(9).reshape(3,3) ; a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

Then you can take its slices:

>>> a[:-1,:-1]
array([[0, 1],
       [3, 4]])
>>> a[1:,:-1]
array([[3, 4],
       [6, 7]])
>>> a[:-1,1:]
array([[1, 2],
       [4, 5]])
>>> a[1:,1:]
array([[4, 5],
       [7, 8]])

Now imagine you stack them one above the other and sum the elements at the same positions. These sums are exactly the sums over the 2×2 squares whose top-left corner is in that position:

>>> sums = a[:-1,:-1] + a[1:,:-1] + a[:-1,1:] + a[1:,1:]; sums
array([[ 8, 12],
       [20, 24]])

When you have the sums over 2×2 squares, you can use max to find the maximum, or sort, or sorted to find the peaks.

To remember positions of the peaks I couple every value (the sum) with its ordinal position in a flattened array (see zip). Then I calculate row/column position again when I print the results.

Notes

I allowed the 2×2 squares to overlap. The edited version filters some of them out, so that only non-overlapping squares appear in the results.

Choosing fingers (an idea)

Another problem is how to choose what is likely to be fingers out of all the peaks. I have an idea which may or may not work. I don’t have time to implement it right now, so just pseudo-code.

I noticed that if the front fingers stay on almost a perfect circle, the rear finger should be inside of that circle. Also, the front fingers are more or less equally spaced. We may try to use these heuristic properties to detect the fingers.

Pseudo code:

select the top N finger candidates (not too many, 10 or 12)
consider all possible combinations of 5 out of N (use itertools.combinations)
for each combination of 5 fingers:
    for each finger out of 5:
        fit the best circle to the remaining 4
        => position of the center, radius
        check if the selected finger is inside of the circle
        check if the remaining four are evenly spread
        (for example, consider angles from the center of the circle)
        assign some cost (penalty) to this selection of 4 peaks + a rear finger
        (consider, probably weighted:
             circle fitting error,
             if the rear finger is inside,
             variance in the spreading of the front fingers,
             total intensity of 5 peaks)
choose a combination of 4 peaks + a rear peak with the lowest penalty

This is a brute-force approach. If N is relatively small, then I think it is doable. For N=12, there are C_12^5 = 792 combinations, times 5 ways to select a rear finger, so 3960 cases to evaluate for every paw.
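
This is only pseudo-code in the answer; a hedged sketch of how it could be fleshed out in Python follows (not the answerer's code: the circle fit is a simple algebraic least-squares fit, and the penalty weights are arbitrary placeholders that would need tuning):

import itertools
import numpy as np

def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy), radius, mean residual."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(x))])
    cx, cy, c = np.linalg.lstsq(A, x**2 + y**2, rcond=None)[0]
    r = np.sqrt(max(c + cx**2 + cy**2, 0.0))
    err = np.abs(np.hypot(x - cx, y - cy) - r).mean()
    return (cx, cy), r, err

def penalty(front, rear):
    """Cost of one choice of 4 front toes + 1 rear toe; the weights are arbitrary."""
    front = np.asarray(front, dtype=float)
    (cx, cy), r, err = fit_circle(front)
    rear_outside = np.hypot(rear[0] - cx, rear[1] - cy) >= r        # penalize a rear toe outside the circle
    angles = np.sort(np.arctan2(front[:, 1] - cy, front[:, 0] - cx))
    gaps = np.diff(np.concatenate([angles, [angles[0] + 2 * np.pi]]))
    return err + 1.0 * rear_outside + np.var(gaps)                  # placeholder weighting

def choose_fingers(candidates):
    """candidates: list of (row, col) peak positions, e.g. the 10-12 best 2x2 sums."""
    best_cost, best_combo = min(
        (penalty([p for k, p in enumerate(combo) if k != rear], combo[rear]), combo)
        for combo in itertools.combinations(candidates, 5)
        for rear in range(5))
    return best_combo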


Answer 2

This is an image registration problem. The general strategy is:

  • Have a known example, or some kind of prior on the data.
  • Fit your data to the example, or fit the example to your data.
  • It helps if your data is roughly aligned in the first place.

Here’s a rough and ready approach, “the dumbest thing that could possibly work”:

  • Start with five toe coordinates in roughly the place you expect.
  • With each one, iteratively climb to the top of the hill, i.e. given the current position, move to the maximum neighbouring pixel if its value is greater than the current pixel's. Stop when your toe coordinates have stopped moving.

To counteract the orientation problem, you could have 8 or so initial settings for the basic directions (North, North East, etc). Run each one individually and throw away any results where two or more toes end up at the same pixel. I’ll think about this some more, but this kind of thing is still being researched in image processing – there are no right answers!

Slightly more complex idea: (weighted) K-means clustering. It’s not that bad.

  • Start with five toe coordinates, but now these are “cluster centres”.

Then iterate until convergence:

  • Assign each pixel to the closest cluster (just make a list for each cluster).
  • Calculate the center of mass of each cluster. For each cluster, this is: Sum(coordinate * intensity value) / Sum(intensity value)
  • Move each cluster to the new centre of mass.

This method will almost certainly give much better results, and you get the mass of each cluster which may help in identifying the toes.
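
A minimal sketch of that weighted k-means iteration, assuming image is the 2D pressure array and centers holds five rough initial (row, col) toe guesses (the iteration count is a placeholder):

import numpy as np

def weighted_kmeans(image, centers, n_iter=20):
    """Toy weighted k-means for locating the toes of one paw."""
    rows, cols = np.nonzero(image)                        # only the loaded sensors
    points = np.column_stack([rows, cols]).astype(float)
    weights = image[rows, cols]
    centers = np.asarray(centers, dtype=float)
    for _ in range(n_iter):
        # assign each pixel to the closest cluster centre
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centre to the pressure-weighted centre of mass of its pixels
        for k in range(len(centers)):
            members = labels == k
            if members.any():
                centers[k] = np.average(points[members], axis=0, weights=weights[members])
    return centers, labels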

(Again, you’ve specified the number of clusters up front. With clustering you have to specify the density one way or another: Either choose the number of clusters, appropriate in this case, or choose a cluster radius and see how many you end up with. An example of the latter is mean-shift.)

Sorry about the lack of implementation details or other specifics. I would code this up but I’ve got a deadline. If nothing else has worked by next week let me know and I’ll give it a shot.


Answer 3

Using persistent homology to analyze your data set I get the following result (click to enlarge):

This is the 2D-version of the peak detection method described in this SO answer. The above figure simply shows 0-dimensional persistent homology classes sorted by persistence.

I did upscale the original dataset by a factor of 2 using scipy.misc.imresize(). However, note that I did consider the four paws as one dataset; splitting it into four would make the problem easier.

Methodology. The idea behind this is quite simple: consider the function graph of the function that assigns each pixel its level. It looks like this:

Now consider a water level at height 255 that continuously descends to lower levels. At local maxima, islands pop up (birth). At saddle points two islands merge; we consider the lower island to be merged into the higher island (death). The so-called persistence diagram (of the 0-th dimensional homology classes, our islands) depicts the death values over the birth values of all islands:

The persistence of an island is then the difference between its birth and death level; the vertical distance of a dot to the grey main diagonal. The figure labels the islands in order of decreasing persistence.

The very first picture shows the locations of the births of the islands. This method not only gives the local maxima but also quantifies their "significance" by the above-mentioned persistence. One would then filter out all islands with too low a persistence. However, in your example every island (i.e., every local maximum) is a peak you are looking for.

Python code can be found here.


Answer 4

This problem has been studied in some depth by physicists. There is a good implementation in ROOT. Look at the TSpectrum classes (especially TSpectrum2 for your case) and the documentation for them.

References:

  1. M. Morhac et al.: Background elimination methods for multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 401 (1997) 113-132.
  2. M. Morhac et al.: Efficient one- and two-dimensional Gold deconvolution and its application to gamma-ray spectra decomposition. Nuclear Instruments and Methods in Physics Research A 401 (1997) 385-408.
  3. M. Morhac et al.: Identification of peaks in multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 443 (2000), 108-125.

…and for those who don’t have access to a subscription to NIM:


Answer 5

Here is an idea: you calculate the (discrete) Laplacian of the image. I would expect it to be (negative and) large at maxima, in a way that is more dramatic than in the original images. Thus, maxima could be easier to find.

Here is another idea: if you know the typical size of the high-pressure spots, you can first smooth your image by convolving it with a Gaussian of the same size. This may give you simpler images to process.
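
A hedged sketch of both ideas with scipy.ndimage, loading one paw as in the accepted answer (the sigma and the threshold are guesses to tune):

import numpy as np
from scipy import ndimage

paw = np.loadtxt("paws.txt").reshape(4, 11, 14)[0]   # one paw, loaded as in the accepted answer
smoothed = ndimage.gaussian_filter(paw, sigma=1.5)   # low-pass with a Gaussian of roughly the spot size
lap = ndimage.laplace(smoothed)                      # discrete Laplacian of the smoothed image
peaks_mask = lap < 0.5 * lap.min()                   # strongly negative values sit near the maxima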


Answer 6

Just a couple of ideas off the top of my head:

  • take the gradient (derivative) of the scan, see if that eliminates the false calls
  • take the maximum of the local maxima

You might also want to take a look at OpenCV, it’s got a fairly decent Python API and might have some functions you’d find useful.


Answer 7

I'm sure you have enough to go on by now, but I can't help but suggest using the k-means clustering method. k-means is an unsupervised clustering algorithm which will take your data (in any number of dimensions – I happen to do this in 3D) and arrange it into k clusters with distinct boundaries. It's nice here because you know exactly how many toes these canines (should) have.

Additionally, it’s implemented in Scipy which is really nice (http://docs.scipy.org/doc/scipy/reference/cluster.vq.html).

Here’s an example of what it can do to spatially resolve 3D clusters:

What you want to do is a bit different (2D and includes pressure values), but I still think you could give it a shot.
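
A hedged sketch of how scipy.cluster.vq could be applied here (not from the answer; the pressure values are folded in crudely by repeating each pixel coordinate in proportion to its value):

import numpy as np
from scipy.cluster.vq import kmeans2

paw = np.loadtxt("paws.txt").reshape(4, 11, 14)[0]             # one paw, loaded as in the accepted answer
rows, cols = np.nonzero(paw)                                   # only the loaded sensors
reps = np.maximum(np.round(paw[rows, cols]).astype(int), 1)    # repeat pixels in proportion to pressure
samples = np.repeat(np.column_stack([rows, cols]).astype(float), reps, axis=0)
centroids, labels = kmeans2(samples, 5, minit='points')        # five toes
print(centroids)                                               # candidate toe positions (row, col)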


Answer 8

Thanks for the raw data. I'm on the train and this is as far as I've gotten (my stop is coming up). I massaged your txt file with regexps and plopped it into an html page with some JavaScript for visualization. I'm sharing it here because some, like myself, might find it more readily hackable than Python.

I think a good approach will be scale and rotation invariant, and my next step will be to investigate mixtures of Gaussians (each paw pad being the center of a Gaussian).

    <html>
<head>
    <script type="text/javascript" src="http://vis.stanford.edu/protovis/protovis-r3.2.js"></script> 
    <script type="text/javascript">
    var heatmap = [[[0,0,0,0,0,0,0,4,4,0,0,0,0],
[0,0,0,0,0,7,14,22,18,7,0,0,0],
[0,0,0,0,11,40,65,43,18,7,0,0,0],
[0,0,0,0,14,61,72,32,7,4,11,14,4],
[0,7,14,11,7,22,25,11,4,14,65,72,14],
[4,29,79,54,14,7,4,11,18,29,79,83,18],
[0,18,54,32,18,43,36,29,61,76,25,18,4],
[0,4,7,7,25,90,79,36,79,90,22,0,0],
[0,0,0,0,11,47,40,14,29,36,7,0,0],
[0,0,0,0,4,7,7,4,4,4,0,0,0]
],[
[0,0,0,4,4,0,0,0,0,0,0,0,0],
[0,0,11,18,18,7,0,0,0,0,0,0,0],
[0,4,29,47,29,7,0,4,4,0,0,0,0],
[0,0,11,29,29,7,7,22,25,7,0,0,0],
[0,0,0,4,4,4,14,61,83,22,0,0,0],
[4,7,4,4,4,4,14,32,25,7,0,0,0],
[4,11,7,14,25,25,47,79,32,4,0,0,0],
[0,4,4,22,58,40,29,86,36,4,0,0,0],
[0,0,0,7,18,14,7,18,7,0,0,0,0],
[0,0,0,0,4,4,0,0,0,0,0,0,0],
],[
[0,0,0,4,11,11,7,4,0,0,0,0,0],
[0,0,0,4,22,36,32,22,11,4,0,0,0],
[4,11,7,4,11,29,54,50,22,4,0,0,0],
[11,58,43,11,4,11,25,22,11,11,18,7,0],
[11,50,43,18,11,4,4,7,18,61,86,29,4],
[0,11,18,54,58,25,32,50,32,47,54,14,0],
[0,0,14,72,76,40,86,101,32,11,7,4,0],
[0,0,4,22,22,18,47,65,18,0,0,0,0],
[0,0,0,0,4,4,7,11,4,0,0,0,0],
],[
[0,0,0,0,4,4,4,0,0,0,0,0,0],
[0,0,0,4,14,14,18,7,0,0,0,0,0],
[0,0,0,4,14,40,54,22,4,0,0,0,0],
[0,7,11,4,11,32,36,11,0,0,0,0,0],
[4,29,36,11,4,7,7,4,4,0,0,0,0],
[4,25,32,18,7,4,4,4,14,7,0,0,0],
[0,7,36,58,29,14,22,14,18,11,0,0,0],
[0,11,50,68,32,40,61,18,4,4,0,0,0],
[0,4,11,18,18,43,32,7,0,0,0,0,0],
[0,0,0,0,4,7,4,0,0,0,0,0,0],
],[
[0,0,0,0,0,0,4,7,4,0,0,0,0],
[0,0,0,0,4,18,25,32,25,7,0,0,0],
[0,0,0,4,18,65,68,29,11,0,0,0,0],
[0,4,4,4,18,65,54,18,4,7,14,11,0],
[4,22,36,14,4,14,11,7,7,29,79,47,7],
[7,54,76,36,18,14,11,36,40,32,72,36,4],
[4,11,18,18,61,79,36,54,97,40,14,7,0],
[0,0,0,11,58,101,40,47,108,50,7,0,0],
[0,0,0,4,11,25,7,11,22,11,0,0,0],
[0,0,0,0,0,4,0,0,0,0,0,0,0],
],[
[0,0,4,7,4,0,0,0,0,0,0,0,0],
[0,0,11,22,14,4,0,4,0,0,0,0,0],
[0,0,7,18,14,4,4,14,18,4,0,0,0],
[0,4,0,4,4,0,4,32,54,18,0,0,0],
[4,11,7,4,7,7,18,29,22,4,0,0,0],
[7,18,7,22,40,25,50,76,25,4,0,0,0],
[0,4,4,22,61,32,25,54,18,0,0,0,0],
[0,0,0,4,11,7,4,11,4,0,0,0,0],
],[
[0,0,0,0,7,14,11,4,0,0,0,0,0],
[0,0,0,4,18,43,50,32,14,4,0,0,0],
[0,4,11,4,7,29,61,65,43,11,0,0,0],
[4,18,54,25,7,11,32,40,25,7,11,4,0],
[4,36,86,40,11,7,7,7,7,25,58,25,4],
[0,7,18,25,65,40,18,25,22,22,47,18,0],
[0,0,4,32,79,47,43,86,54,11,7,4,0],
[0,0,0,14,32,14,25,61,40,7,0,0,0],
[0,0,0,0,4,4,4,11,7,0,0,0,0],
],[
[0,0,0,0,4,7,11,4,0,0,0,0,0],
[0,4,4,0,4,11,18,11,0,0,0,0,0],
[4,11,11,4,0,4,4,4,0,0,0,0,0],
[4,18,14,7,4,0,0,4,7,7,0,0,0],
[0,7,18,29,14,11,11,7,18,18,4,0,0],
[0,11,43,50,29,43,40,11,4,4,0,0,0],
[0,4,18,25,22,54,40,7,0,0,0,0,0],
[0,0,4,4,4,11,7,0,0,0,0,0,0],
],[
[0,0,0,0,0,7,7,7,7,0,0,0,0],
[0,0,0,0,7,32,32,18,4,0,0,0,0],
[0,0,0,0,11,54,40,14,4,4,22,11,0],
[0,7,14,11,4,14,11,4,4,25,94,50,7],
[4,25,65,43,11,7,4,7,22,25,54,36,7],
[0,7,25,22,29,58,32,25,72,61,14,7,0],
[0,0,4,4,40,115,68,29,83,72,11,0,0],
[0,0,0,0,11,29,18,7,18,14,4,0,0],
[0,0,0,0,0,4,0,0,0,0,0,0,0],
]
];
</script>
</head>
<body>
    <script type="text/javascript+protovis">    
    for (var a=0; a < heatmap.length; a++) {
    var w = heatmap[a][0].length,
    h = heatmap[a].length;
var vis = new pv.Panel()
    .width(w * 6)
    .height(h * 6)
    .strokeStyle("#aaa")
    .lineWidth(4)
    .antialias(true);
vis.add(pv.Image)
    .imageWidth(w)
    .imageHeight(h)
    .image(pv.Scale.linear()
        .domain(0, 99, 100)
        .range("#000", "#fff", '#ff0a0a')
        .by(function(i, j) { return heatmap[a][j][i]; }));
vis.render();
}
</script>
  </body>
</html>


Answer 9

Physicist's solution:
Define 5 paw markers identified by their positions X_i and initialize them at random positions. Define some energy function combining a reward for markers located on the paw pads with a penalty for overlapping markers; let's say:

E(X_i;S) = -Sum_i(S(X_i)) + alfa * Sum_ij (|X_i - X_j| <= 2*sqrt(2) ? 1 : 0)

(S(X_i) is the mean force in the 2×2 square around X_i, and alfa is a parameter to be tuned experimentally.)

Now time to do some Metropolis-Hastings magic:
1. Select a random marker and move it by one pixel in a random direction.
2. Calculate dE, the difference in energy this move caused.
3. Get a uniform random number from 0-1 and call it r.
4. If dE<0 or exp(-beta*dE)>r, accept the move and go to 1; if not, undo the move and go to 1.
This should be repeated until the markers converge to the paws. Beta controls the scanning-versus-optimizing tradeoff, so it should also be optimized experimentally; it can also be increased steadily with simulation time (simulated annealing).
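
A hedged sketch of that scheme (not the answerer's code; alfa, beta0 and the step count are placeholders, and the overlap test follows the energy function above):

import numpy as np

def anneal_markers(S, n_markers=5, alfa=30.0, beta0=0.1, steps=20000, rng=None):
    """Metropolis-Hastings sketch; S is the 2D pressure array."""
    rng = np.random.default_rng() if rng is None else rng
    H, W = S.shape
    X = rng.integers(0, [H - 1, W - 1], size=(n_markers, 2))   # random initial marker positions

    def pad_force(p):                                          # mean force in the 2x2 square at p
        r, c = p
        return S[r:r + 2, c:c + 2].mean()

    def energy(X):
        award = sum(pad_force(p) for p in X)
        overlap = sum(np.linalg.norm(X[i] - X[j]) <= 2 * np.sqrt(2)
                      for i in range(n_markers) for j in range(i + 1, n_markers))
        return -award + alfa * overlap

    E = energy(X)
    for t in range(steps):
        beta = beta0 * (1 + t / steps)                         # slowly raise beta: simulated annealing
        i = rng.integers(n_markers)
        new = np.clip(X[i] + rng.choice([-1, 0, 1], size=2), 0, [H - 2, W - 2])
        old, X[i] = X[i].copy(), new
        dE = energy(X) - E
        if dE < 0 or np.exp(-beta * dE) > rng.random():
            E += dE                                            # accept the move
        else:
            X[i] = old                                         # undo the move
    return X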


Answer 10

Here's another approach that I used when doing something similar for a large telescope:

1) Search for the highest pixel. Once you have that, search around it for the best-fitting 2×2 (maybe maximizing the 2×2 sum), or do a 2D Gaussian fit inside a sub-region of, say, 4×4 centered on the highest pixel.

2) Then set those 2×2 pixels you have found (or maybe 3×3) around the peak center to zero.

3) Go back to 1) and repeat until the highest peak falls below a noise threshold, or you have all the toes you need.
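
A minimal sketch of that loop, covering only the simple "take the highest pixel" variant without the 2×2/Gaussian refinement (clear and threshold are guesses):

import numpy as np

def iterative_peaks(image, n_toes=5, clear=1, threshold=1.0):
    """Take the highest pixel, blank a small box around it, repeat."""
    img = image.copy()
    peaks = []
    for _ in range(n_toes):
        r, c = np.unravel_index(np.argmax(img), img.shape)
        if img[r, c] < threshold:                 # stop once we are down in the noise
            break
        peaks.append((r, c))
        img[max(r - clear, 0):r + clear + 1, max(c - clear, 0):c + clear + 1] = 0
    return peaks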


Answer 11

It's probably worth trying neural networks if you are able to create some training data… but this needs many samples annotated by hand.


Answer 12

A rough outline…

You'd probably want to use a connected components algorithm to isolate each paw region. Wiki has a decent description of this (with some code) here: http://en.wikipedia.org/wiki/Connected_Component_Labeling

You'll have to make a decision about whether to use 4- or 8-connectedness. Personally, for most problems I prefer 6-connectedness. Anyway, once you've separated out each "paw print" as a connected region, it should be easy enough to iterate through the region and find the maxima. Once you've found the maxima, you could iteratively enlarge the region until you reach a predetermined threshold in order to identify it as a given "toe".

One subtle problem here is that as soon as you start using computer vision techniques to identify something as a right/left/front/rear paw and you start looking at individual toes, you have to start taking rotations, skews, and translations into account. This is accomplished through the analysis of so-called "moments". There are a few different moments to consider in vision applications:

  • central moments: translation invariant
  • normalized moments: scale and translation invariant
  • Hu moments: translation, scale, and rotation invariant

More information about moments can be found by searching for "image moments" on Wiki.


Answer 13

Perhaps you can use something like Gaussian Mixture Models. Here’s a Python package for doing GMMs (just did a Google search) http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/em/
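
The linked package is quite old; a hedged alternative sketch using scikit-learn's GaussianMixture in the same spirit (not from the answer; it reuses the pixel-repetition weighting trick from the k-means sketch above):

import numpy as np
from sklearn.mixture import GaussianMixture

paw = np.loadtxt("paws.txt").reshape(4, 11, 14)[0]             # one paw, loaded as in the accepted answer
rows, cols = np.nonzero(paw)
reps = np.maximum(np.round(paw[rows, cols]).astype(int), 1)    # fold pressure in by repeating pixels
samples = np.repeat(np.column_stack([rows, cols]).astype(float), reps, axis=0)
gmm = GaussianMixture(n_components=5, covariance_type='full', random_state=0).fit(samples)
print(gmm.means_)                                              # candidate toe centres (row, col)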


Answer 14

It seems you can cheat a bit using jetxee's algorithm. He is finding the first three toes fine, and you should be able to guess where the fourth is based on that.


Answer 15

Interesting problem. The solution I would try is the following.

  1. Apply a low pass filter, such as convolution with a 2D gaussian mask. This will give you a bunch of (probably, but not necessarily floating point) values.

  2. Perform a 2D non-maximal suppression using the known approximate radius of each paw pad (or toe).

This should give you the maximal positions without having multiple candidates which are close together. Just to clarify, the radius of the mask in step 1 should also be similar to the radius used in step 2. This radius could be selectable, or the vet could explicitly measure it beforehand (it will vary with age/breed/etc).

Some of the solutions suggested (mean shift, neural nets, and so on) probably will work to some degree, but are overly complicated and probably not ideal.
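
A hedged sketch of those two steps with scipy.ndimage, assuming image is the 2D pressure array and radius approximates the pad radius in pixels:

import numpy as np
from scipy import ndimage

def pad_peaks(image, radius=2):
    """Low-pass filter then 2D non-maximal suppression within the pad radius."""
    # 1. smooth with a Gaussian of roughly the pad size
    smoothed = ndimage.gaussian_filter(image, sigma=radius / 2.0)
    # 2. non-maximal suppression: keep pixels that equal the local maximum
    #    within the same radius (and are not background)
    local_max = ndimage.maximum_filter(smoothed, size=2 * radius + 1)
    mask = (smoothed == local_max) & (image > 0)
    return np.argwhere(mask)            # (row, col) of each surviving peak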


Answer 16

Well, here’s some simple and not terribly efficient code, but for this size of a data set it is fine.

import numpy as np
grid = np.array([[0,0,0,0,0,0,0,0,0,0,0,0,0,0],
              [0,0,0,0,0,0,0,0,0.4,0.4,0.4,0,0,0],
              [0,0,0,0,0.4,1.4,1.4,1.8,0.7,0,0,0,0,0],
              [0,0,0,0,0.4,1.4,4,5.4,2.2,0.4,0,0,0,0],
              [0,0,0.7,1.1,0.4,1.1,3.2,3.6,1.1,0,0,0,0,0],
              [0,0.4,2.9,3.6,1.1,0.4,0.7,0.7,0.4,0.4,0,0,0,0],
              [0,0.4,2.5,3.2,1.8,0.7,0.4,0.4,0.4,1.4,0.7,0,0,0],
              [0,0,0.7,3.6,5.8,2.9,1.4,2.2,1.4,1.8,1.1,0,0,0],
              [0,0,1.1,5,6.8,3.2,4,6.1,1.8,0.4,0.4,0,0,0],
              [0,0,0.4,1.1,1.8,1.8,4.3,3.2,0.7,0,0,0,0,0],
              [0,0,0,0,0,0.4,0.7,0.4,0,0,0,0,0,0]])

arr = []
for i in range(grid.shape[0] - 1):
    for j in range(grid.shape[1] - 1):
        tot = grid[i][j] + grid[i+1][j] + grid[i][j+1] + grid[i+1][j+1]
        arr.append([(i,j), tot])

best = []

arr.sort(key=lambda x: x[1])

for i in range(5):
    best.append(arr.pop())
    badpos = set([(best[-1][0][0]+x, best[-1][0][1]+y)
                  for x in [-1,0,1] for y in [-1,0,1] if x != 0 or y != 0])
    for j in range(len(arr)-1, -1, -1):
        if arr[j][0] in badpos:
            arr.pop(j)


for item in best:
    print(grid[item[0][0]:item[0][0]+2, item[0][1]:item[0][1]+2])

I basically just make an array with the position of the upper-left corner and the sum of each 2×2 square, and sort it by the sum. I then take the 2×2 square with the highest sum out of contention, put it in the best array, and remove all other 2×2 squares that used any part of the square just removed.

It seems to work fine except with the last paw (the one with the smallest sum, on the far right in your first picture): it turns out that there are two other eligible 2×2 squares with a larger sum (and they have an equal sum to each other). One of them still selects a square within your 2×2 square, but the other is off to the left. Fortunately, by luck we seem to be choosing more of the one that you would want, but this may require some other ideas to be used to get what you actually want all of the time.


Answer 17

Just wanna tell you guys there is a nice option to find local maxima in images with Python:

from skimage.feature import peak_local_max

or for skimage 0.8.0:

from skimage.feature.peak import peak_local_max

http://scikit-image.org/docs/0.8.0/api/skimage.feature.peak.html
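
A hedged usage example with a recent scikit-image (min_distance and num_peaks are values to tune, e.g. scaling min_distance with the paw size):

import numpy as np
from skimage.feature import peak_local_max

paw = np.loadtxt("paws.txt").reshape(4, 11, 14)[0]       # one paw, loaded as in the accepted answer
coordinates = peak_local_max(paw, min_distance=2, num_peaks=5)
print(coordinates)                                       # (row, col) of each detected local maximum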


Answer 18

Maybe a naive approach is sufficient here: Build a list of all 2×2 squares on your plane, order them by their sum (in descending order).

First, select the highest-valued square into your “paw list”. Then, iteratively pick 4 of the next-best squares that don’t intersect with any of the previously found squares.


Answer 19

There are several extensive pieces of software available from the astronomy and cosmology community – this is a significant area of research, both historically and currently.

Do not be alarmed if you are not an astronomer – some are easy to use outside the field. For example, you could use astropy/photutils:

https://photutils.readthedocs.io/en/stable/detection.html#local-peak-detection

[It seems a bit rude to repeat their short sample code here.]

An incomplete and slightly biased list of techniques/packages/links that might be of interest is given below – do add more in the comments and I will update this answer as necessary. Of course there is a trade-off of accuracy vs compute resources. [Honestly, there are too many to give code examples in a single answer such as this so I am not sure whether this answer will fly or not.]

Source Extractor https://www.astromatic.net/software/sextractor

MultiNest https://github.com/farhanferoz/MultiNest [+ pyMultiNest]

ASKAP/EMU source-finding challenge: https://arxiv.org/abs/1509.03931

You could also search for Planck and/or WMAP source-extraction challenges.


Answer 20

What if you proceed step by step: first locate the global maximum, process the surrounding points if needed given their value, then set the found region to zero, and repeat for the next one.


Answer 21

I am not sure this answers the question, but it seems like you can just look for the n highest peaks that don’t have neighbors.

Here is the gist. Note that it’s in Ruby, but the idea should be clear.

require 'pp'

NUM_PEAKS = 5
NEIGHBOR_DISTANCE = 1

data = [[1,2,3,4,5],
        [2,6,4,4,6],
        [3,6,7,4,3],
       ]

def tuples(matrix)
  tuples = []
  matrix.each_with_index { |row, ri|
    row.each_with_index { |value, ci|
      tuples << [value, ri, ci]
    }
  }
  tuples
end

def neighbor?(t1, t2, distance = 1)
  [1,2].each { |axis|
    return false if (t1[axis] - t2[axis]).abs > distance
  }
  true
end

# convert the matrix into a sorted list of tuples (value, row, col), highest peaks first
sorted = tuples(data).sort_by { |tuple| tuple.first }.reverse

# the list of peaks that don't have neighbors
non_neighboring_peaks = []

sorted.each { |candidate|
  # always take the highest peak
  if non_neighboring_peaks.empty?
    non_neighboring_peaks << candidate
    puts "took the first peak: #{candidate}"
  else
    # check that this candidate doesn't have any accepted neighbors
    is_ok = true
    non_neighboring_peaks.each { |accepted|
      if neighbor?(candidate, accepted, NEIGHBOR_DISTANCE)
        is_ok = false
        break
      end
    }
    if is_ok
      non_neighboring_peaks << candidate
      puts "took #{candidate}"
    else
      puts "denied #{candidate}"
    end
  end
}

pp non_neighboring_peaks

Computervision-recipes – best practices, code samples and documentation for computer vision

Computer Vision

In recent years, we have seen extraordinary growth in computer vision, with applications in face recognition, image understanding, search, drones, mapping, and semi-autonomous and autonomous vehicles. A key part of many of these applications are visual recognition tasks such as image classification, object detection and image similarity.

This repository provides examples and best practice guidelines for building computer vision systems. The goal of the repository is to build a comprehensive set of tools and examples that leverage recent advances in computer vision algorithms, neural architectures, and the practice of implementing such systems. Rather than creating implementations from scratch, we draw on existing state-of-the-art libraries and build additional utilities around loading image data, optimizing and evaluating models, and scaling up to the cloud. In addition, having worked in this space for many years, we aim to answer common questions, point out frequently observed pitfalls, and show how to use the cloud for training and deployment.

We hope that these examples and utilities can significantly reduce the "time to market" by simplifying the experience from defining the business problem to developing a solution by orders of magnitude. In addition, the example notebooks serve as guidelines and showcase best practices and usage of the tools in a wide variety of languages.

The examples are provided as Jupyter notebooks and common utility functions. All examples use PyTorch as the underlying deep learning library.

Target audience

Our target audience for this repository includes data scientists and machine learning engineers with varying levels of computer vision knowledge, as our content is source-only and targets custom machine learning modelling. The utilities and examples provided are intended to be solution accelerators for real-world vision problems.

Getting started

To get started, navigate to the Setup Guide, which lists instructions on how to set up the compute environment and the dependencies needed to run the notebooks in this repo. Once your environment is set up, navigate to the Scenarios folder and start exploring the notebooks. We recommend starting with the image classification notebooks, since they introduce concepts that are also used by the other scenarios (such as pre-training on ImageNet).

Alternatively, we support Binder: simply click this link to try out our notebooks in a web browser with minimal effort. However, Binder is free and hence only provides limited CPU compute power, with no GPU support. Expect the notebooks to run very slowly (this is somewhat improved by reducing image resolution to e.g. 60 pixels, at the cost of lower accuracy).

Scenarios

Below is a summary of the commonly used computer vision scenarios covered in this repository. For each of the main scenarios ("base"), we provide tools to effectively build your own models. This includes simple tasks such as fine-tuning your own model on your own data, as well as more complex tasks such as hard-negative mining and even model deployment.

Scenario | Support | Description
Classification | base | Image classification is a supervised machine learning technique for learning and predicting the category of a given image.
Similarity | base | Image similarity is a way to compute a similarity score for a given pair of images. Given an image, it allows you to identify the most similar images in a given dataset.
Detection | base | Object detection is a technique that allows you to detect the bounding boxes of objects in an image.
Keypoints | base | Keypoint detection can be used to detect specific points on an object. A pre-trained model is provided to detect human joints for human pose estimation.
Segmentation | base | Image segmentation assigns a category to each pixel in an image.
Action recognition | base | Action recognition identifies the actions performed in video/webcam footage (e.g. "running", "opening a bottle") together with their respective start/end times. We also implement an i3D version of action recognition, which can be found under contrib.
Tracking | base | Tracking allows detecting and tracking multiple objects in a video sequence over time.
Crowd counting | contrib | Counting the number of people in low-crowd-density scenes (e.g. fewer than 10 people) and high-crowd-density scenes (e.g. thousands of people).

We split the supported CV scenarios into two locations: (i) base: code and notebooks in the "utils_cv" and "scenarios" folders follow strict coding guidelines and are well tested and maintained; (ii) contrib: code and other resources in the "contrib" folder, mainly covering less common CV scenarios that use cutting-edge technology. The code in "contrib" is not regularly tested or maintained.

Computer vision on Azure

Note that for certain computer vision problems, you may not need to build your own models. Instead, pre-built or easily customizable solutions exist on Azure which do not require any custom coding or machine learning expertise. We strongly recommend evaluating whether these can sufficiently solve your problem. If these solutions are not applicable, or their accuracy is insufficient, then resorting to more complex and time-consuming custom approaches may be necessary.

The following Microsoft services offer simple solutions to address common computer vision tasks:

  • Vision Services is a set of pre-trained REST APIs which can be called for image tagging, face recognition, OCR, video analytics, and more. These APIs work out of the box and require minimal machine learning expertise, but have limited customization capabilities. See the various demos available to get a feel for the functionality (e.g. Computer Vision). The service can be used through API calls or through SDKs (available in .NET, Python, Java, Node and Go).
  • Custom Vision is a SaaS service for training and deploying a model as a REST API given a user-provided training set. All steps, including image upload, annotation, and model deployment, can be performed using an intuitive UI or through SDKs (.NET, Python, Java, Node and Go). Training image classification or object detection models can be achieved with minimal machine learning expertise. Custom Vision offers more flexibility than using the pre-trained Cognitive Services APIs, but requires the user to bring and annotate their own data.

If you need to train your own model, the following services and links provide additional information that is likely to be useful:

  • Azure Machine Learning service (AzureML) is a service that helps users accelerate the training and deployment of machine learning models. While the AzureML Python SDK is not specific to computer vision workloads, it can be used for scalable and reliable training and for deploying machine learning solutions to the cloud. We leverage Azure Machine Learning in several of the notebooks in this repository (e.g. deployment to Azure Kubernetes Service).
  • Azure AI Reference architectures provide a set of examples (backed by code) of how to build common AI-oriented workloads that leverage multiple cloud components. While not specific to computer vision, these reference architectures cover several machine learning workloads such as model deployment or batch scoring.

Build status

AzureML testing

Build type | Branch | Status | Branch | Status
Linux GPU | master | - | staging | -
Linux CPU | master | - | staging | -
Notebook unit GPU | master | - | staging | -

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Google-images-download – a Python script to download hundreds of images from "Google Images". It is a ready-to-run code!

Google Images Download

Python script for 'searching' and 'downloading' hundreds of Google Images to the local hard disk!

Documentation

Disclaimer

This program lets you download tons of images from Google. Please do not download or use any image that violates its copyright terms. Google Images is a search engine that merely indexes images and allows you to find them. It does not produce its own images and, as such, it owns no copyright on any of them. The original creators of the images own the copyrights.

Images published in the United States are automatically copyrighted by their owners, even if they do not explicitly carry a copyright warning. You may not reproduce copyrighted images without their owner's permission, except in "fair use" cases, or you could risk running into lawyer warnings, cease-and-desist letters, and copyright court cases. Please be very careful before using it! This script/code is only for educational purposes.