分类目录归档:知识问答

Python字典理解

问题:Python字典理解

是否可以在Python(用于键)中创建字典理解?

在没有列表理解的情况下,您可以使用以下内容:

l = []
for n in range(1, 11):
    l.append(n)

我们可以将其简化为列表理解:l = [n for n in range(1, 11)]

但是,说我想将字典的键设置为相同的值。我可以:

d = {}
for n in range(1, 11):
     d[n] = True # same value for each

我已经试过了:

d = {}
d[i for i in range(1, 11)] = True

不过,我得到一个SyntaxErrorfor

另外(我不需要这部分,只是想知道),您能否将字典的键设置为一堆不同的值,例如:

d = {}
for n in range(1, 11):
    d[n] = n

字典理解有可能吗?

d = {}
d[i for i in range(1, 11)] = [x for x in range(1, 11)]

这也引发了SyntaxErrorfor

Is it possible to create a dictionary comprehension in Python (for the keys)?

Without list comprehensions, you can use something like this:

l = []
for n in range(1, 11):
    l.append(n)

We can shorten this to a list comprehension: l = [n for n in range(1, 11)].

However, say I want to set a dictionary’s keys to the same value. I can do:

d = {}
for n in range(1, 11):
     d[n] = True # same value for each

I’ve tried this:

d = {}
d[i for i in range(1, 11)] = True

However, I get a SyntaxError on the for.

In addition (I don’t need this part, but just wondering), can you set a dictionary’s keys to a bunch of different values, like this:

d = {}
for n in range(1, 11):
    d[n] = n

Is this possible with a dictionary comprehension?

d = {}
d[i for i in range(1, 11)] = [x for x in range(1, 11)]

This also raises a SyntaxError on the for.


回答 0

Python 2.7+中字典理解功能,但是它们不能完全按照您尝试的方式工作。就像列表理解一样,他们创建了一个字典。您不能使用它们将键添加到现有字典中。此外,您还必须指定键和值,尽管您当然可以根据需要指定一个哑数值。

>>> d = {n: n**2 for n in range(5)}
>>> print d
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

如果要将它们全部设置为True:

>>> d = {n: True for n in range(5)}
>>> print d
{0: True, 1: True, 2: True, 3: True, 4: True}

您似乎想要的是一种在现有字典上一次设置多个键的方法。没有直接的捷径。您可以像已经显示的那样循环,也可以使用字典理解来创建具有新值的新字典,然后oldDict.update(newDict)将新值合并到旧字典中。

There are dictionary comprehensions in Python 2.7+, but they don’t work quite the way you’re trying. Like a list comprehension, they create a new dictionary; you can’t use them to add keys to an existing dictionary. Also, you have to specify the keys and values, although of course you can specify a dummy value if you like.

>>> d = {n: n**2 for n in range(5)}
>>> print d
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

If you want to set them all to True:

>>> d = {n: True for n in range(5)}
>>> print d
{0: True, 1: True, 2: True, 3: True, 4: True}

What you seem to be asking for is a way to set multiple keys at once on an existing dictionary. There’s no direct shortcut for that. You can either loop like you already showed, or you could use a dictionary comprehension to create a new dict with the new values, and then do oldDict.update(newDict) to merge the new values into the old dict.


回答 1

您可以使用dict.fromkeys类方法…

>>> dict.fromkeys(range(5), True)
{0: True, 1: True, 2: True, 3: True, 4: True}

这是创建所有键都映射到相同值的字典的最快方法。

但是千万不能用可变对象使用

d = dict.fromkeys(range(5), [])
# {0: [], 1: [], 2: [], 3: [], 4: []}
d[1].append(2)
# {0: [2], 1: [2], 2: [2], 3: [2], 4: [2]} !!!

如果您实际上不需要初始化所有键,那么a defaultdict也可能很有用:

from collections import defaultdict
d = defaultdict(True)

要回答第二部分,您需要的是dict理解:

{k: k for k in range(10)}

您可能不应该这样做,但也可以创建一个子类,如果您重写dict它,该子类的工作原理类似于a :defaultdict__missing__

>>> class KeyDict(dict):
...    def __missing__(self, key):
...       #self[key] = key  # Maybe add this also?
...       return key
... 
>>> d = KeyDict()
>>> d[1]
1
>>> d[2]
2
>>> d[3]
3
>>> print(d)
{}

You can use the dict.fromkeys class method …

>>> dict.fromkeys(range(5), True)
{0: True, 1: True, 2: True, 3: True, 4: True}

This is the fastest way to create a dictionary where all the keys map to the same value.

But do not use this with mutable objects:

d = dict.fromkeys(range(5), [])
# {0: [], 1: [], 2: [], 3: [], 4: []}
d[1].append(2)
# {0: [2], 1: [2], 2: [2], 3: [2], 4: [2]} !!!

If you don’t actually need to initialize all the keys, a defaultdict might be useful as well:

from collections import defaultdict
d = defaultdict(True)

To answer the second part, a dict-comprehension is just what you need:

{k: k for k in range(10)}

You probably shouldn’t do this but you could also create a subclass of dict which works somewhat like a defaultdict if you override __missing__:

>>> class KeyDict(dict):
...    def __missing__(self, key):
...       #self[key] = key  # Maybe add this also?
...       return key
... 
>>> d = KeyDict()
>>> d[1]
1
>>> d[2]
2
>>> d[3]
3
>>> print(d)
{}

回答 2

>>> {i:i for i in range(1, 11)}
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10}
>>> {i:i for i in range(1, 11)}
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10}

回答 3

我真的很喜欢@mgilson注释,因为如果您有两个可迭代项,一个与键相对应,另一个与值相对应,则还可以执行以下操作。

keys = ['a', 'b', 'c']
values = [1, 2, 3]
d = dict(zip(keys, values))

给予

d = {‘a’:1,’b’:2,’c’:3}

I really like the @mgilson comment, since if you have a two iterables, one that corresponds to the keys and the other the values, you can also do the following.

keys = ['a', 'b', 'c']
values = [1, 2, 3]
d = dict(zip(keys, values))

giving

d = {‘a’: 1, ‘b’: 2, ‘c’: 3}


回答 4

在元组列表上使用dict(),此解决方案将允许您在每个列表中具有任意值,只要它们的长度相同

i_s = range(1, 11)
x_s = range(1, 11)
# x_s = range(11, 1, -1) # Also works
d = dict([(i_s[index], x_s[index], ) for index in range(len(i_s))])

Use dict() on a list of tuples, this solution will allow you to have arbitrary values in each list, so long as they are the same length

i_s = range(1, 11)
x_s = range(1, 11)
# x_s = range(11, 1, -1) # Also works
d = dict([(i_s[index], x_s[index], ) for index in range(len(i_s))])

回答 5

考虑以下示例,该示例使用字典理解来计算列表中单词的出现

my_list = ['hello', 'hi', 'hello', 'today', 'morning', 'again', 'hello']
my_dict = {k:my_list.count(k) for k in my_list}
print(my_dict)

结果是

{'again': 1, 'hi': 1, 'hello': 3, 'today': 1, 'morning': 1}

Consider this example of counting the occurrence of words in a list using dictionary comprehension

my_list = ['hello', 'hi', 'hello', 'today', 'morning', 'again', 'hello']
my_dict = {k:my_list.count(k) for k in my_list}
print(my_dict)

And the result is

{'again': 1, 'hi': 1, 'hello': 3, 'today': 1, 'morning': 1}

回答 6

列表理解的主要目的是在不更改或破坏原始列表的基础上创建基于另一个列表的新列表。

而不是写作

    l = []
    for n in range(1, 11):
        l.append(n)

要么

    l = [n for n in range(1, 11)]

你应该只写

    l = range(1, 11)

在两个最重要的代码块中,您将创建一个新列表,对其进行迭代并仅返回每个元素。这只是创建列表副本的一种昂贵方法。

要获得一个新的字典,并且根据另一个字典将所有键设置为相同的值,请执行以下操作:

    old_dict = {'a': 1, 'c': 3, 'b': 2}
    new_dict = { key:'your value here' for key in old_dict.keys()}

您收到SyntaxError的原因是,在编写时

    d = {}
    d[i for i in range(1, 11)] = True

您基本上是在说:“将我的密钥’i for range(1,11)’中的i设置为True”和“ i for i in range(1,11)中的”)不是有效的密钥,这只是语法错误。如果将dicts支持的列表作为键,您将执行以下操作

    d[[i for i in range(1, 11)]] = True

并不是

    d[i for i in range(1, 11)] = True

但是列表不可散列,因此您不能将它们用作字典键。

The main purpose of a list comprehension is to create a new list based on another one without changing or destroying the original list.

Instead of writing

    l = []
    for n in range(1, 11):
        l.append(n)

or

    l = [n for n in range(1, 11)]

you should write only

    l = range(1, 11)

In the two top code blocks you’re creating a new list, iterating through it and just returning each element. It’s just an expensive way of creating a list copy.

To get a new dictionary with all keys set to the same value based on another dict, do this:

    old_dict = {'a': 1, 'c': 3, 'b': 2}
    new_dict = { key:'your value here' for key in old_dict.keys()}

You’re receiving a SyntaxError because when you write

    d = {}
    d[i for i in range(1, 11)] = True

you’re basically saying: “Set my key ‘i for i in range(1, 11)’ to True” and “i for i in range(1, 11)” is not a valid key, it’s just a syntax error. If dicts supported lists as keys, you would do something like

    d[[i for i in range(1, 11)]] = True

and not

    d[i for i in range(1, 11)] = True

but lists are not hashable, so you can’t use them as dict keys.


回答 7

您不能像这样散列表。试试这个,它使用元组

d[tuple([i for i in range(1,11)])] = True

you can’t hash a list like that. try this instead, it uses tuples

d[tuple([i for i in range(1,11)])] = True

如何按类别查找元素

问题:如何按类别查找元素

我在使用Beautifulsoup解析具有“ class”属性的HTML元素时遇到了麻烦。代码看起来像这样

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div["class"] == "stylelistrow"):
        print div

脚本完成后的同一行出现错误。

File "./beautifulcoding.py", line 130, in getlanguage
  if (div["class"] == "stylelistrow"):
File "/usr/local/lib/python2.6/dist-packages/BeautifulSoup.py", line 599, in __getitem__
   return self._getAttrMap()[key]
KeyError: 'class'

我如何摆脱这个错误?

I’m having trouble parsing HTML elements with “class” attribute using Beautifulsoup. The code looks like this

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div["class"] == "stylelistrow"):
        print div

I get an error on the same line “after” the script finishes.

File "./beautifulcoding.py", line 130, in getlanguage
  if (div["class"] == "stylelistrow"):
File "/usr/local/lib/python2.6/dist-packages/BeautifulSoup.py", line 599, in __getitem__
   return self._getAttrMap()[key]
KeyError: 'class'

How do I get rid of this error?


回答 0

您可以使用BS3优化搜索以仅找到具有给定类的那些div:

mydivs = soup.findAll("div", {"class": "stylelistrow"})

You can refine your search to only find those divs with a given class using BS3:

mydivs = soup.findAll("div", {"class": "stylelistrow"})

回答 1

从文档中:

从Beautiful Soup 4.1.2开始,您可以使用关键字arguments通过CSS类进行搜索 class_

soup.find_all("a", class_="sister")

在这种情况下将是:

soup.find_all("div", class_="stylelistrow")

它也适用于:

soup.find_all("div", class_="stylelistrowone stylelistrowtwo")

From the documentation:

As of Beautiful Soup 4.1.2, you can search by CSS class using the keyword argument class_:

soup.find_all("a", class_="sister")

Which in this case would be:

soup.find_all("div", class_="stylelistrow")

It would also work for:

soup.find_all("div", class_="stylelistrowone stylelistrowtwo")

回答 2

更新:2016在最新版本的beautifulsoup中,方法“ findAll”已重命名为“ find_all”。链接到官方文档

方法名称列表已更改

因此答案将是

soup.find_all("html_element", class_="your_class_name")

Update: 2016 In the latest version of beautifulsoup, the method ‘findAll’ has been renamed to ‘find_all’. Link to official documentation

List of method names changed

Hence the answer will be

soup.find_all("html_element", class_="your_class_name")

回答 3

特定于BeautifulSoup 3:

soup.findAll('div',
             {'class': lambda x: x 
                       and 'stylelistrow' in x.split()
             }
            )

将找到所有这些:

<div class="stylelistrow">
<div class="stylelistrow button">
<div class="button stylelistrow">

Specific to BeautifulSoup 3:

soup.findAll('div',
             {'class': lambda x: x 
                       and 'stylelistrow' in x.split()
             }
            )

Will find all of these:

<div class="stylelistrow">
<div class="stylelistrow button">
<div class="button stylelistrow">

回答 4

直接的方法是:

soup = BeautifulSoup(sdata)
for each_div in soup.findAll('div',{'class':'stylelist'}):
    print each_div

确保您使用findAll的大小写,而不是findall的大小写

A straight forward way would be :

soup = BeautifulSoup(sdata)
for each_div in soup.findAll('div',{'class':'stylelist'}):
    print each_div

Make sure you take of the casing of findAll, its not findall


回答 5

如何按类别查找元素

我在使用Beautifulsoup解析具有“ class”属性的html元素时遇到了麻烦。

您可以轻松地按一个类别查找,但是如果要按两个类别的相交查找,则要困难一些,

文档(添加重点):

如果要搜索与两个或多个 CSS类匹配的标签,则应使用CSS选择器:

css_soup.select("p.strikeout.body")
# [<p class="body strikeout"></p>]

为了清楚起见,此操作仅选择既是删除线又是正文类的p标签。

要在一组类中查找任何交集(不是交集,而是联合),可以给class_关键字参数提供一个列表(从4.1.2开始):

soup = BeautifulSoup(sdata)
class_list = ["stylelistrow"] # can add any other classes to this list.
# will find any divs with any names in class_list:
mydivs = soup.find_all('div', class_=class_list) 

还要注意,findAll已从camelCase重命名为更多Pythonic find_all

How to find elements by class

I’m having trouble parsing html elements with “class” attribute using Beautifulsoup.

You can easily find by one class, but if you want to find by the intersection of two classes, it’s a little more difficult,

From the documentation (emphasis added):

If you want to search for tags that match two or more CSS classes, you should use a CSS selector:

css_soup.select("p.strikeout.body")
# [<p class="body strikeout"></p>]

To be clear, this selects only the p tags that are both strikeout and body class.

To find for the intersection of any in a set of classes (not the intersection, but the union), you can give a list to the class_ keyword argument (as of 4.1.2):

soup = BeautifulSoup(sdata)
class_list = ["stylelistrow"] # can add any other classes to this list.
# will find any divs with any names in class_list:
mydivs = soup.find_all('div', class_=class_list) 

Also note that findAll has been renamed from the camelCase to the more Pythonic find_all.


回答 6

CSS选择器

单班第一场比赛

soup.select_one('.stylelistrow')

比赛清单

soup.select('.stylelistrow')

复合类(即AND另一类)

soup.select_one('.stylelistrow.otherclassname')
soup.select('.stylelistrow.otherclassname')

复合类名称中的空格例如class = stylelistrow otherclassname用“。”代替。您可以继续添加类。

类列表(或-匹配存在的任何一个

soup.select_one('.stylelistrow, .otherclassname')
soup.select('.stylelistrow, .otherclassname')

bs4 4.7.1 +

innerText包含字符串的特定类

soup.select_one('.stylelistrow:contains("some string")')
soup.select('.stylelistrow:contains("some string")')

具有特定子元素(例如a标签)的特定类

soup.select_one('.stylelistrow:has(a)')
soup.select('.stylelistrow:has(a)')

CSS selectors

single class first match

soup.select_one('.stylelistrow')

list of matches

soup.select('.stylelistrow')

compound class (i.e. AND another class)

soup.select_one('.stylelistrow.otherclassname')
soup.select('.stylelistrow.otherclassname')

Spaces in compound class names e.g. class = stylelistrow otherclassname are replaced with “.”. You can continue to add classes.

list of classes (OR – match whichever present

soup.select_one('.stylelistrow, .otherclassname')
soup.select('.stylelistrow, .otherclassname')

bs4 4.7.1 +

Specific class whose innerText contains a string

soup.select_one('.stylelistrow:contains("some string")')
soup.select('.stylelistrow:contains("some string")')

Specific class which has a certain child element e.g. a tag

soup.select_one('.stylelistrow:has(a)')
soup.select('.stylelistrow:has(a)')

回答 7

从BeautifulSoup 4+开始,

如果您只有一个类名,则只需将类名作为参数传递即可:

mydivs = soup.find_all('div', 'class_name')

或者,如果您有多个类名,只需将类名列表作为参数传递即可:

mydivs = soup.find_all('div', ['class1', 'class2'])

As of BeautifulSoup 4+ ,

If you have a single class name , you can just pass the class name as parameter like :

mydivs = soup.find_all('div', 'class_name')

Or if you have more than one class names , just pass the list of class names as parameter like :

mydivs = soup.find_all('div', ['class1', 'class2'])

回答 8

尝试首先检查div是否具有class属性,如下所示:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs:
    if "class" in div:
        if (div["class"]=="stylelistrow"):
            print div

Try to check if the div has a class attribute first, like this:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs:
    if "class" in div:
        if (div["class"]=="stylelistrow"):
            print div

回答 9

这对我来说可以访问class属性(在beautifulsoup 4上,与文档中所说的相反)。KeyError会返回一个列表,而不是字典。

for hit in soup.findAll(name='span'):
    print hit.contents[1]['class']

This works for me to access the class attribute (on beautifulsoup 4, contrary to what the documentation says). The KeyError comes a list being returned not a dictionary.

for hit in soup.findAll(name='span'):
    print hit.contents[1]['class']

回答 10

以下对我有用

a_tag = soup.find_all("div",class_='full tabpublist')

the following worked for me

a_tag = soup.find_all("div",class_='full tabpublist')

回答 11

这为我工作:

for div in mydivs:
    try:
        clazz = div["class"]
    except KeyError:
        clazz = ""
    if (clazz == "stylelistrow"):
        print div

This worked for me:

for div in mydivs:
    try:
        clazz = div["class"]
    except KeyError:
        clazz = ""
    if (clazz == "stylelistrow"):
        print div

回答 12

或者,我们可以使用lxml,它支持xpath并且非常快!

from lxml import html, etree 

attr = html.fromstring(html_text)#passing the raw html
handles = attr.xpath('//div[@class="stylelistrow"]')#xpath exresssion to find that specific class

for each in handles:
    print(etree.tostring(each))#printing the html as string

Alternatively we can use lxml, it support xpath and very fast!

from lxml import html, etree 

attr = html.fromstring(html_text)#passing the raw html
handles = attr.xpath('//div[@class="stylelistrow"]')#xpath exresssion to find that specific class

for each in handles:
    print(etree.tostring(each))#printing the html as string

回答 13

这应该工作:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div.find(class_ == "stylelistrow"):
        print div

This should work:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div.find(class_ == "stylelistrow"):
        print div

回答 14

其他答案对我不起作用。

在其他答案中,findAll它被用于汤对象本身,但是我需要一种方法,可以通过对从我做完之后获得的对象中提取的特定元素内的对象进行类名查找findAll

如果您要在嵌套的HTML元素中进行搜索以按类名获取对象,请尝试以下操作-

# parse html
page_soup = soup(web_page.read(), "html.parser")

# filter out items matching class name
all_songs = page_soup.findAll("li", "song_item")

# traverse through all_songs
for song in all_songs:

    # get text out of span element matching class 'song_name'
    # doing a 'find' by class name within a specific song element taken out of 'all_songs' collection
    song.find("span", "song_name").text

注意事项:

  1. 我没有明确定义要在’class’属性上进行搜索findAll("li", {"class": "song_item"}),因为它是我要搜索的唯一属性,并且如果您不专门指出要在哪个属性上查找,默认情况下它将搜索class属性。

  2. 当您执行findAllor或时find,生成的对象属于的bs4.element.ResultSet子类list。您可以ResultSet在任意数量的嵌套元素(只要类型为ResultSet)内利用的所有方法进行查找或全部查找。

  3. 我的BS4版本-4.9.1,Python版本-3.8.1

Other answers did not work for me.

In other answers the findAll is being used on the soup object itself, but I needed a way to do a find by class name on objects inside a specific element extracted from the object I obtained after doing findAll.

If you are trying to do a search inside nested HTML elements to get objects by class name, try below –

# parse html
page_soup = soup(web_page.read(), "html.parser")

# filter out items matching class name
all_songs = page_soup.findAll("li", "song_item")

# traverse through all_songs
for song in all_songs:

    # get text out of span element matching class 'song_name'
    # doing a 'find' by class name within a specific song element taken out of 'all_songs' collection
    song.find("span", "song_name").text

Points to note:

  1. I’m not explicitly defining the search to be on ‘class’ attribute findAll("li", {"class": "song_item"}), since it’s the only attribute I’m searching on and it will by default search for class attribute if you don’t exclusively tell which attribute you want to find on.

  2. When you do a findAll or find, the resulting object is of class bs4.element.ResultSet which is a subclass of list. You can utilize all methods of ResultSet, inside any number of nested elements (as long as they are of type ResultSet) to do a find or find all.

  3. My BS4 version – 4.9.1, Python version – 3.8.1


回答 15

以下应该工作

soup.find('span', attrs={'class':'totalcount'})

将“ totalcount”替换为您的类的名称,并将“ span”替换为您要查找的标签。另外,如果您的Class包含多个带空格的名称,只需选择一个并使用即可。

PS这将找到具有给定条件的第一个元素。如果要查找所有元素,则将“ find”替换为“ find_all”。

The following should work

soup.find('span', attrs={'class':'totalcount'})

replace ‘totalcount’ with your class name and ‘span’ with tag you are looking for. Also, if your class contains multiple names with space, just choose one and use.

P.S. This finds the first element with given criteria. If you want to find all elements then replace ‘find’ with ‘find_all’.


用于将PDF转换为文本的Python模块

问题:用于将PDF转换为文本的Python模块

是否有任何Python模块可将PDF文件转换为文本?我尝试了在Activestate中找到的一段使用pypdf 的代码,但是生成的文本之间没有空格,也没有用。

Is there any python module to convert PDF files into text? I tried one piece of code found in Activestate which uses pypdf but the text generated had no space between and was of no use.


回答 0

试试PDFMiner。它可以从PDF文件中以HTML,SGML或“标记的PDF”格式提取文本。

Tagged PDF格式似乎是最干净的格式,而去掉XML标签只会留下纯文本。

Python 3版本在以下位置可用:

Try PDFMiner. It can extract text from PDF files as HTML, SGML or “Tagged PDF” format.

The Tagged PDF format seems to be the cleanest, and stripping out the XML tags leaves just the bare text.

A Python 3 version is available under:


回答 1

Codeape发布以来,PDFMiner软件包已更改。

编辑(再次):

PDFMiner版本已再次更新 20100213

您可以使用以下方法检查已安装的版本:

>>> import pdfminer
>>> pdfminer.__version__
'20100213'

这是更新的版本(带有有关我更改/添加的内容的注释):

def pdf_to_csv(filename):
    from cStringIO import StringIO  #<-- added so you can copy/paste this to try it
    from pdfminer.converter import LTTextItem, TextConverter
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, LTTextItem):
                    (_,_,x,y) = child.bbox                   #<-- changed
                    line = lines[int(-y)]
                    line[x] = child.text.encode(self.codec)  #<-- changed

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8")  #<-- changed 
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       #<-- changed
    parser.set_document(doc)     #<-- added
    doc.set_parser(parser)       #<-- added
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

编辑(再次):

下面是最新版本的更新的PyPI20100619p1。简而言之,我替换LTTextItemLTCharLAParams实例并将其传递给CsvConverter构造函数。

def pdf_to_csv(filename):
    from cStringIO import StringIO  
    from pdfminer.converter import LTChar, TextConverter    #<-- changed
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, LTChar):               #<-- changed
                    (_,_,x,y) = child.bbox                   
                    line = lines[int(-y)]
                    line[x] = child.text.encode(self.codec)

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8", laparams=LAParams())  #<-- changed
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       
    parser.set_document(doc)     
    doc.set_parser(parser)       
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        if page is not None:
            interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

编辑(再过一次):

已更新版本20110515(感谢Oeufcoque Penteano!):

def pdf_to_csv(filename):
    from cStringIO import StringIO  
    from pdfminer.converter import LTChar, TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item._objs:                #<-- changed
                if isinstance(child, LTChar):
                    (_,_,x,y) = child.bbox                   
                    line = lines[int(-y)]
                    line[x] = child._text.encode(self.codec) #<-- changed

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8", laparams=LAParams())
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       
    parser.set_document(doc)     
    doc.set_parser(parser)       
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        if page is not None:
            interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

The PDFMiner package has changed since codeape posted.

EDIT (again):

PDFMiner has been updated again in version 20100213

You can check the version you have installed with the following:

>>> import pdfminer
>>> pdfminer.__version__
'20100213'

Here’s the updated version (with comments on what I changed/added):

def pdf_to_csv(filename):
    from cStringIO import StringIO  #<-- added so you can copy/paste this to try it
    from pdfminer.converter import LTTextItem, TextConverter
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, LTTextItem):
                    (_,_,x,y) = child.bbox                   #<-- changed
                    line = lines[int(-y)]
                    line[x] = child.text.encode(self.codec)  #<-- changed

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8")  #<-- changed 
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       #<-- changed
    parser.set_document(doc)     #<-- added
    doc.set_parser(parser)       #<-- added
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

Edit (yet again):

Here is an update for the latest version in pypi, 20100619p1. In short I replaced LTTextItem with LTChar and passed an instance of LAParams to the CsvConverter constructor.

def pdf_to_csv(filename):
    from cStringIO import StringIO  
    from pdfminer.converter import LTChar, TextConverter    #<-- changed
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, LTChar):               #<-- changed
                    (_,_,x,y) = child.bbox                   
                    line = lines[int(-y)]
                    line[x] = child.text.encode(self.codec)

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8", laparams=LAParams())  #<-- changed
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       
    parser.set_document(doc)     
    doc.set_parser(parser)       
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        if page is not None:
            interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

EDIT (one more time):

Updated for version 20110515 (thanks to Oeufcoque Penteano!):

def pdf_to_csv(filename):
    from cStringIO import StringIO  
    from pdfminer.converter import LTChar, TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item._objs:                #<-- changed
                if isinstance(child, LTChar):
                    (_,_,x,y) = child.bbox                   
                    line = lines[int(-y)]
                    line[x] = child._text.encode(self.codec) #<-- changed

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, codec="utf-8", laparams=LAParams())
        # becuase my test documents are utf-8 (note: utf-8 is the default codec)

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(fp)       
    parser.set_document(doc)     
    doc.set_parser(parser)       
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        if page is not None:
            interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

回答 2

由于这些解决方案都不支持最新版本的PDFMiner,因此我编写了一个简单的解决方案,该解决方案将使用PDFMiner返回pdf文本。这将对那些遇到导入错误的人有用process_pdf

import sys
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO

def pdfparser(data):

    fp = file(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.

    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
        data =  retstr.getvalue()

    print data

if __name__ == '__main__':
    pdfparser(sys.argv[1])  

请参阅以下适用于Python 3的代码:

import sys
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
import io

def pdfparser(data):

    fp = open(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.

    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
        data =  retstr.getvalue()

    print(data)

if __name__ == '__main__':
    pdfparser(sys.argv[1])  

Since none for these solutions support the latest version of PDFMiner I wrote a simple solution that will return text of a pdf using PDFMiner. This will work for those who are getting import errors with process_pdf

import sys
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO

def pdfparser(data):

    fp = file(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.

    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
        data =  retstr.getvalue()

    print data

if __name__ == '__main__':
    pdfparser(sys.argv[1])  

See below code that works for Python 3:

import sys
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
import io

def pdfparser(data):

    fp = open(data, 'rb')
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.

    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
        data =  retstr.getvalue()

    print(data)

if __name__ == '__main__':
    pdfparser(sys.argv[1])  

回答 3

Pdftotext一个开放源代码程序(Xpdf的一部分),您可以从python调用它(不是您所要求的,但可能有用)。我使用它没有问题。我认为Google在Google桌面中使用它。

Pdftotext An open source program (part of Xpdf) which you could call from python (not what you asked for but might be useful). I’ve used it with no problems. I think google use it in google desktop.


回答 4

pyPDF可以正常工作(假设您使用的是格式正确的PDF)。如果您只需要文本(带空格),则可以执行以下操作:

import pyPdf
pdf = pyPdf.PdfFileReader(open(filename, "rb"))
for page in pdf.pages:
    print page.extractText()

您还可以轻松访问元数据,图像数据等。

extractText代码中的注释说明:

按照在内容流中提供的顺序找到所有文本绘制命令,然后提取文本。这对于某些PDF文件效果很好,但对其他PDF文件效果不佳,具体取决于所使用的生成器。将来会对此进行完善。不要依赖此功能产生的文本顺序,因为如果此功能变得更复杂,它将改变。

这是否是一个问题取决于您对文本的处理方式(例如,顺序无关紧要,就可以了,或者如果生成器按显示顺序将文本添加到流中,就可以了) 。我在日常使用中有pyPdf提取代码,没有任何问题。

pyPDF works fine (assuming that you’re working with well-formed PDFs). If all you want is the text (with spaces), you can just do:

import pyPdf
pdf = pyPdf.PdfFileReader(open(filename, "rb"))
for page in pdf.pages:
    print page.extractText()

You can also easily get access to the metadata, image data, and so forth.

A comment in the extractText code notes:

Locate all text drawing commands, in the order they are provided in the content stream, and extract the text. This works well for some PDF files, but poorly for others, depending on the generator used. This will be refined in the future. Do not rely on the order of text coming out of this function, as it will change if this function is made more sophisticated.

Whether or not this is a problem depends on what you’re doing with the text (e.g. if the order doesn’t matter, it’s fine, or if the generator adds text to the stream in the order it will be displayed, it’s fine). I have pyPdf extraction code in daily use, without any problems.


回答 5

您也可以很容易地将pdfminer用作库。您可以访问pdf的内容模型,并且可以创建自己的文本提取。我这样做是使用以下代码将pdf内容转换为以分号分隔的文本。

该函数仅根据TextItem内容对象的y和x坐标对它们进行排序,并输出与一条文本行具有相同y坐标的项目,并用’;’将同一行上的对象分隔开 字符。

使用这种方法,我能够从pdf提取文本,而其他工具无法提取适合进一步解析的内容。我尝试过的其他工具包括pdftotext,ps2ascii和在线工具pdftextonline.com。

pdfminer是pdf抓取的宝贵工具。


def pdf_to_csv(filename):
    from pdflib.page import TextItem, TextConverter
    from pdflib.pdfparser import PDFDocument, PDFParser
    from pdflib.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, TextItem):
                    (_,_,x,y) = child.bbox
                    line = lines[int(-y)]
                    line[x] = child.text

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, "ascii")

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(doc, fp)
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

更新

上面的代码是针对API的旧版本编写的,请参见下面的评论。

You can also quite easily use pdfminer as a library. You have access to the pdf’s content model, and can create your own text extraction. I did this to convert pdf contents to semi-colon separated text, using the code below.

The function simply sorts the TextItem content objects according to their y and x coordinates, and outputs items with the same y coordinate as one text line, separating the objects on the same line with ‘;’ characters.

Using this approach, I was able to extract text from a pdf that no other tool was able to extract content suitable for further parsing from. Other tools I tried include pdftotext, ps2ascii and the online tool pdftextonline.com.

pdfminer is an invaluable tool for pdf-scraping.


def pdf_to_csv(filename):
    from pdflib.page import TextItem, TextConverter
    from pdflib.pdfparser import PDFDocument, PDFParser
    from pdflib.pdfinterp import PDFResourceManager, PDFPageInterpreter

    class CsvConverter(TextConverter):
        def __init__(self, *args, **kwargs):
            TextConverter.__init__(self, *args, **kwargs)

        def end_page(self, i):
            from collections import defaultdict
            lines = defaultdict(lambda : {})
            for child in self.cur_item.objs:
                if isinstance(child, TextItem):
                    (_,_,x,y) = child.bbox
                    line = lines[int(-y)]
                    line[x] = child.text

            for y in sorted(lines.keys()):
                line = lines[y]
                self.outfp.write(";".join(line[x] for x in sorted(line.keys())))
                self.outfp.write("\n")

    # ... the following part of the code is a remix of the 
    # convert() function in the pdfminer/tools/pdf2text module
    rsrc = PDFResourceManager()
    outfp = StringIO()
    device = CsvConverter(rsrc, outfp, "ascii")

    doc = PDFDocument()
    fp = open(filename, 'rb')
    parser = PDFParser(doc, fp)
    doc.initialize('')

    interpreter = PDFPageInterpreter(rsrc, device)

    for i, page in enumerate(doc.get_pages()):
        outfp.write("START PAGE %d\n" % i)
        interpreter.process_page(page)
        outfp.write("END PAGE %d\n" % i)

    device.close()
    fp.close()

    return outfp.getvalue()

UPDATE:

The code above is written against an old version of the API, see my comment below.


回答 6

slate 是一个使从库中使用PDFMiner变得非常简单的项目:

>>> with open('example.pdf') as f:
...    doc = slate.PDF(f)
...
>>> doc
[..., ..., ...]
>>> doc[1]
'Text from page 2...'   

slate is a project that makes it very simple to use PDFMiner from a library:

>>> with open('example.pdf') as f:
...    doc = slate.PDF(f)
...
>>> doc
[..., ..., ...]
>>> doc[1]
'Text from page 2...'   

回答 7

我需要在python模块中将特定的PDF转换为纯文本。在阅读了他们的pdf2txt.py工具后,我使用了PDFMiner 20110515,我编写了以下简单代码段:

from cStringIO import StringIO
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams

def to_txt(pdf_path):
    input_ = file(pdf_path, 'rb')
    output = StringIO()

    manager = PDFResourceManager()
    converter = TextConverter(manager, output, laparams=LAParams())
    process_pdf(manager, converter, input_)

    return output.getvalue() 

I needed to convert a specific PDF to plain text within a python module. I used PDFMiner 20110515, after reading through their pdf2txt.py tool I wrote this simple snippet:

from cStringIO import StringIO
from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams

def to_txt(pdf_path):
    input_ = file(pdf_path, 'rb')
    output = StringIO()

    manager = PDFResourceManager()
    converter = TextConverter(manager, output, laparams=LAParams())
    process_pdf(manager, converter, input_)

    return output.getvalue() 

回答 8

重新利用pdfminer随附的pdf2txt.py代码;您可以使函数采用pdf路径;(可选)输出类型(txt | html | xml | tag),并选择类似命令行pdf2txt {‘-o’:’/path/to/outfile.txt’…}。默认情况下,您可以调用:

convert_pdf(path)

将创建一个文本文件,该文件在文件系统上为原始pdf的同级文件。

def convert_pdf(path, outtype='txt', opts={}):
    import sys
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter, process_pdf
    from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter, TagExtractor
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfdevice import PDFDevice
    from pdfminer.cmapdb import CMapDB

    outfile = path[:-3] + outtype
    outdir = '/'.join(path.split('/')[:-1])

    debug = 0
    # input option
    password = ''
    pagenos = set()
    maxpages = 0
    # output option
    codec = 'utf-8'
    pageno = 1
    scale = 1
    showpageno = True
    laparams = LAParams()
    for (k, v) in opts:
        if k == '-d': debug += 1
        elif k == '-p': pagenos.update( int(x)-1 for x in v.split(',') )
        elif k == '-m': maxpages = int(v)
        elif k == '-P': password = v
        elif k == '-o': outfile = v
        elif k == '-n': laparams = None
        elif k == '-A': laparams.all_texts = True
        elif k == '-D': laparams.writing_mode = v
        elif k == '-M': laparams.char_margin = float(v)
        elif k == '-L': laparams.line_margin = float(v)
        elif k == '-W': laparams.word_margin = float(v)
        elif k == '-O': outdir = v
        elif k == '-t': outtype = v
        elif k == '-c': codec = v
        elif k == '-s': scale = float(v)
    #
    CMapDB.debug = debug
    PDFResourceManager.debug = debug
    PDFDocument.debug = debug
    PDFParser.debug = debug
    PDFPageInterpreter.debug = debug
    PDFDevice.debug = debug
    #
    rsrcmgr = PDFResourceManager()
    if not outtype:
        outtype = 'txt'
        if outfile:
            if outfile.endswith('.htm') or outfile.endswith('.html'):
                outtype = 'html'
            elif outfile.endswith('.xml'):
                outtype = 'xml'
            elif outfile.endswith('.tag'):
                outtype = 'tag'
    if outfile:
        outfp = file(outfile, 'w')
    else:
        outfp = sys.stdout
    if outtype == 'txt':
        device = TextConverter(rsrcmgr, outfp, codec=codec, laparams=laparams)
    elif outtype == 'xml':
        device = XMLConverter(rsrcmgr, outfp, codec=codec, laparams=laparams, outdir=outdir)
    elif outtype == 'html':
        device = HTMLConverter(rsrcmgr, outfp, codec=codec, scale=scale, laparams=laparams, outdir=outdir)
    elif outtype == 'tag':
        device = TagExtractor(rsrcmgr, outfp, codec=codec)
    else:
        return usage()

    fp = file(path, 'rb')
    process_pdf(rsrcmgr, device, fp, pagenos, maxpages=maxpages, password=password)
    fp.close()
    device.close()

    outfp.close()
    return

Repurposing the pdf2txt.py code that comes with pdfminer; you can make a function that will take a path to the pdf; optionally, an outtype (txt|html|xml|tag) and opts like the commandline pdf2txt {‘-o’: ‘/path/to/outfile.txt’ …}. By default, you can call:

convert_pdf(path)

A text file will be created, a sibling on the filesystem to the original pdf.

def convert_pdf(path, outtype='txt', opts={}):
    import sys
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter, process_pdf
    from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter, TagExtractor
    from pdfminer.layout import LAParams
    from pdfminer.pdfparser import PDFDocument, PDFParser
    from pdfminer.pdfdevice import PDFDevice
    from pdfminer.cmapdb import CMapDB

    outfile = path[:-3] + outtype
    outdir = '/'.join(path.split('/')[:-1])

    debug = 0
    # input option
    password = ''
    pagenos = set()
    maxpages = 0
    # output option
    codec = 'utf-8'
    pageno = 1
    scale = 1
    showpageno = True
    laparams = LAParams()
    for (k, v) in opts:
        if k == '-d': debug += 1
        elif k == '-p': pagenos.update( int(x)-1 for x in v.split(',') )
        elif k == '-m': maxpages = int(v)
        elif k == '-P': password = v
        elif k == '-o': outfile = v
        elif k == '-n': laparams = None
        elif k == '-A': laparams.all_texts = True
        elif k == '-D': laparams.writing_mode = v
        elif k == '-M': laparams.char_margin = float(v)
        elif k == '-L': laparams.line_margin = float(v)
        elif k == '-W': laparams.word_margin = float(v)
        elif k == '-O': outdir = v
        elif k == '-t': outtype = v
        elif k == '-c': codec = v
        elif k == '-s': scale = float(v)
    #
    CMapDB.debug = debug
    PDFResourceManager.debug = debug
    PDFDocument.debug = debug
    PDFParser.debug = debug
    PDFPageInterpreter.debug = debug
    PDFDevice.debug = debug
    #
    rsrcmgr = PDFResourceManager()
    if not outtype:
        outtype = 'txt'
        if outfile:
            if outfile.endswith('.htm') or outfile.endswith('.html'):
                outtype = 'html'
            elif outfile.endswith('.xml'):
                outtype = 'xml'
            elif outfile.endswith('.tag'):
                outtype = 'tag'
    if outfile:
        outfp = file(outfile, 'w')
    else:
        outfp = sys.stdout
    if outtype == 'txt':
        device = TextConverter(rsrcmgr, outfp, codec=codec, laparams=laparams)
    elif outtype == 'xml':
        device = XMLConverter(rsrcmgr, outfp, codec=codec, laparams=laparams, outdir=outdir)
    elif outtype == 'html':
        device = HTMLConverter(rsrcmgr, outfp, codec=codec, scale=scale, laparams=laparams, outdir=outdir)
    elif outtype == 'tag':
        device = TagExtractor(rsrcmgr, outfp, codec=codec)
    else:
        return usage()

    fp = file(path, 'rb')
    process_pdf(rsrcmgr, device, fp, pagenos, maxpages=maxpages, password=password)
    fp.close()
    device.close()

    outfp.close()
    return

回答 9

PDFminer在我尝试使用的pdf文件的每一页上给了我一行[第7页,共1页…]。

到目前为止,我最好的答案是pdftoipe或基于Xpdf的c ++代码。

请参阅我的问题,了解pdftoipe的输出是什么样的。

PDFminer gave me perhaps one line [page 1 of 7…] on every page of a pdf file I tried with it.

The best answer I have so far is pdftoipe, or the c++ code it’s based on Xpdf.

see my question for what the output of pdftoipe looks like.


回答 10

另外,还有PDFTextStream,这是一个商业Java库,也可以从Python使用。

Additionally there is PDFTextStream which is a commercial Java library that can also be used from Python.


回答 11

我用过pdftohtml这个-xml参数,用读取结果subprocess.Popen(),它将为您提供pdf 中每个文本片段的x坐标,y坐标,宽度,高度和字体。我认为这也是“证据”可能使用的原因,因为出现了相同的错误消息。

如果您需要处理柱状数据,由于必须发明一种适合pdf文件的算法,它会变得稍微复杂一些。问题在于制作PDF文件的程序实际上不一定以任何逻辑格式对文本进行布局。您可以尝试使用简单的排序算法,该算法有时会起作用,但是可能很少出现“散乱”和“杂散”的情况,这些文字不会按照您认为的顺序排列。所以你必须要有创造力。

我花了大约5个小时才找到一份我正在研究的pdf文件。但现在效果很好。祝好运。

I have used pdftohtml with the -xml argument, read the result with subprocess.Popen(), that will give you x coord, y coord, width, height, and font, of every snippet of text in the pdf. I think this is what ‘evince’ probably uses too because the same error messages spew out.

If you need to process columnar data, it gets slightly more complicated as you have to invent an algorithm that suits your pdf file. The problem is that the programs that make PDF files don’t really necessarily lay out the text in any logical format. You can try simple sorting algorithms and it works sometimes, but there can be little ‘stragglers’ and ‘strays’, pieces of text that don’t get put in the order you thought they would. So you have to get creative.

It took me about 5 hours to figure out one for the pdf’s I was working on. But it works pretty good now. Good luck.


回答 12

今天找到了该解决方案。对我来说很棒。甚至将PDF页面呈现为PNG图像。 http://www.swftools.org/gfx_tutorial.html

Found that solution today. Works great for me. Even rendering PDF pages to PNG images. http://www.swftools.org/gfx_tutorial.html


在Python中更改字符串中的一个字符

问题:在Python中更改字符串中的一个字符

Python中替换字符串中字符的最简单方法是什么?

例如:

text = "abcdefg";
text[1] = "Z";
           ^

What is the easiest way in Python to replace a character in a string?

For example:

text = "abcdefg";
text[1] = "Z";
           ^

回答 0

不要修改字符串。

与他们一起工作;仅在需要时才将它们转换为字符串。

>>> s = list("Hello zorld")
>>> s
['H', 'e', 'l', 'l', 'o', ' ', 'z', 'o', 'r', 'l', 'd']
>>> s[6] = 'W'
>>> s
['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']
>>> "".join(s)
'Hello World'

Python字符串是不可变的(即无法修改)。有很多的原因。除非您别无选择,否则请使用列表,然后将它们变成字符串。

Don’t modify strings.

Work with them as lists; turn them into strings only when needed.

>>> s = list("Hello zorld")
>>> s
['H', 'e', 'l', 'l', 'o', ' ', 'z', 'o', 'r', 'l', 'd']
>>> s[6] = 'W'
>>> s
['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']
>>> "".join(s)
'Hello World'

Python strings are immutable (i.e. they can’t be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings.


回答 1

最快的方法?

有三种方法。对于速度寻求者,我建议使用“方法2”

方法1

由这个答案给出

text = 'abcdefg'
new = list(text)
new[6] = 'W'
''.join(new)

与“方法2”相比,这相当慢

timeit.timeit("text = 'abcdefg'; s = list(text); s[6] = 'W'; ''.join(s)", number=1000000)
1.0411581993103027

方法2(快速方法)

由这个答案给出

text = 'abcdefg'
text = text[:1] + 'Z' + text[2:]

哪个更快:

timeit.timeit("text = 'abcdefg'; text = text[:1] + 'Z' + text[2:]", number=1000000)
0.34651994705200195

方法3:

字节数组:

timeit.timeit("text = 'abcdefg'; s = bytearray(text); s[1] = 'Z'; str(s)", number=1000000)
1.0387420654296875

Fastest method?

There are three ways. For the speed seekers I recommend ‘Method 2’

Method 1

Given by this answer

text = 'abcdefg'
new = list(text)
new[6] = 'W'
''.join(new)

Which is pretty slow compared to ‘Method 2’

timeit.timeit("text = 'abcdefg'; s = list(text); s[6] = 'W'; ''.join(s)", number=1000000)
1.0411581993103027

Method 2 (FAST METHOD)

Given by this answer

text = 'abcdefg'
text = text[:1] + 'Z' + text[2:]

Which is much faster:

timeit.timeit("text = 'abcdefg'; text = text[:1] + 'Z' + text[2:]", number=1000000)
0.34651994705200195

Method 3:

Byte array:

timeit.timeit("text = 'abcdefg'; s = bytearray(text); s[1] = 'Z'; str(s)", number=1000000)
1.0387420654296875

回答 2

new = text[:1] + 'Z' + text[2:]
new = text[:1] + 'Z' + text[2:]

回答 3

Python字符串是不可变的,您可以通过复制来更改它们。
做您想做的最简单的方法可能是:

text = "Z" + text[1:]

text[1:]返回字符串text从位置1到结束,位置从0开始计数,从而“1”是第二个字符。

编辑:您可以对字符串的任何部分使用相同的字符串切片技术

text = text[:1] + "Z" + text[2:]

或者,如果该字母仅出现一次,则可以使用下面建议的搜索和替换技术

Python strings are immutable, you change them by making a copy.
The easiest way to do what you want is probably:

text = "Z" + text[1:]

The text[1:] returns the string in text from position 1 to the end, positions count from 0 so ‘1’ is the second character.

edit: You can use the same string slicing technique for any part of the string

text = text[:1] + "Z" + text[2:]

Or if the letter only appears once you can use the search and replace technique suggested below


回答 4

从python 2.6和python 3开始,您可以使用可变的字节数组(可以与字符串不同,可以逐个元素地更改):

s = "abcdefg"
b_s = bytearray(s)
b_s[1] = "Z"
s = str(b_s)
print s
aZcdefg

编辑:更改为s

edit2:正如两位炼金术士在评论中所述,此代码不适用于unicode。

Starting with python 2.6 and python 3 you can use bytearrays which are mutable (can be changed element-wise unlike strings):

s = "abcdefg"
b_s = bytearray(s)
b_s[1] = "Z"
s = str(b_s)
print s
aZcdefg

edit: Changed str to s

edit2: As Two-Bit Alchemist mentioned in the comments, this code does not work with unicode.


回答 5

就像其他人所说的那样,通常Python字符串应该是不可变的。

但是,如果您使用的是CPython,即python.org的实现,则可以使用ctypes修改内存中的字符串结构。

这是我使用该技术清除字符串的示例。

在python中将数据标记为敏感

为了完整起见,我提到了这一点,这应该是您的最后选择,因为它有点黑。

Like other people have said, generally Python strings are supposed to be immutable.

However, if you are using CPython, the implementation at python.org, it is possible to use ctypes to modify the string structure in memory.

Here is an example where I use the technique to clear a string.

Mark data as sensitive in python

I mention this for the sake of completeness, and this should be your last resort as it is hackish.


回答 6

此代码不是我的。我不记得我在哪里填写网站表格。有趣的是,您可以使用此字符用一个或多个字符替换一个或多个字符。尽管此回复很晚,但像我这样的新手(随时)可能会觉得有用。

更改文字功能。

mytext = 'Hello Zorld'
mytext = mytext.replace('Z', 'W')
print mytext,

This code is not mine. I couldn’t recall the site form where, I took it. Interestingly, you can use this to replace one character or more with one or more charectors. Though this reply is very late, novices like me (anytime) might find it useful.

Change Text function.

mytext = 'Hello Zorld'
mytext = mytext.replace('Z', 'W')
print mytext,

回答 7

实际上,使用字符串,您可以执行以下操作:

oldStr = 'Hello World!'    
newStr = ''

for i in oldStr:  
    if 'a' < i < 'z':    
        newStr += chr(ord(i)-32)     
    else:      
        newStr += i
print(newStr)

'HELLO WORLD!'

基本上,我是将“ +”字符串一起添加到新字符串中:)。

Actually, with strings, you can do something like this:

oldStr = 'Hello World!'    
newStr = ''

for i in oldStr:  
    if 'a' < i < 'z':    
        newStr += chr(ord(i)-32)     
    else:      
        newStr += i
print(newStr)

'HELLO WORLD!'

Basically, I’m “adding”+”strings” together into a new string :).


回答 8

如果您的世界是100%ascii/utf-8(很多用例都放在该框中):

b = bytearray(s, 'utf-8')
# process - e.g., lowercasing: 
#    b[0] = b[i+1] - 32
s = str(b, 'utf-8')

python 3.7.3

if your world is 100% ascii/utf-8(a lot of use cases fit in that box):

b = bytearray(s, 'utf-8')
# process - e.g., lowercasing: 
#    b[0] = b[i+1] - 32
s = str(b, 'utf-8')

python 3.7.3


回答 9

我想添加另一种更改字符串中字符的方式。

>>> text = '~~~~~~~~~~~'
>>> text = text[:1] + (text[1:].replace(text[0], '+', 1))
'~+~~~~~~~~~'

与将字符串转换为list并替换ith值然后再次加入相比,速度有多快?

清单方式

>>> timeit.timeit("text = '~~~~~~~~~~~'; s = list(text); s[1] = '+'; ''.join(s)", number=1000000)
0.8268570480013295

我的解决方案

>>> timeit.timeit("text = '~~~~~~~~~~~'; text=text[:1] + (text[1:].replace(text[0], '+', 1))", number=1000000)
0.588400217000526

I would like to add another way of changing a character in a string.

>>> text = '~~~~~~~~~~~'
>>> text = text[:1] + (text[1:].replace(text[0], '+', 1))
'~+~~~~~~~~~'

How faster it is when compared to turning the string into list and replacing the ith value then joining again?.

List approach

>>> timeit.timeit("text = '~~~~~~~~~~~'; s = list(text); s[1] = '+'; ''.join(s)", number=1000000)
0.8268570480013295

My solution

>>> timeit.timeit("text = '~~~~~~~~~~~'; text=text[:1] + (text[1:].replace(text[0], '+', 1))", number=1000000)
0.588400217000526

在Python中读取.mat文件

问题:在Python中读取.mat文件

是否可以在Python中读取二进制MATLAB .mat文件?

我已经看到SciPy声称支持读取.mat文件,但是我没有成功。我安装了SciPy 0.7.0版,但找不到该loadmat()方法。

Is it possible to read binary MATLAB .mat files in Python?

I’ve seen that SciPy has alleged support for reading .mat files, but I’m unsuccessful with it. I installed SciPy version 0.7.0, and I can’t find the loadmat() method.


回答 0

需要导入,import scipy.io

import scipy.io
mat = scipy.io.loadmat('file.mat')

An import is required, import scipy.io

import scipy.io
mat = scipy.io.loadmat('file.mat')

回答 1

无论是scipy.io.savemat,还是scipy.io.loadmat对于MATLAB阵列的7.3版本的工作。但是好消息是MATLAB版本7.3文件是hdf5数据集。因此,可以使用许多工具(包括NumPy)读取它们。

对于Python,您将需要h5py扩展,该扩展在系统上需要HDF5。

import numpy as np
import h5py
f = h5py.File('somefile.mat','r')
data = f.get('data/variable1')
data = np.array(data) # For converting to a NumPy array

Neither scipy.io.savemat, nor scipy.io.loadmat work for MATLAB arrays version 7.3. But the good part is that MATLAB version 7.3 files are hdf5 datasets. So they can be read using a number of tools, including NumPy.

For Python, you will need the h5py extension, which requires HDF5 on your system.

import numpy as np
import h5py
f = h5py.File('somefile.mat','r')
data = f.get('data/variable1')
data = np.array(data) # For converting to a NumPy array

回答 2

首先将.mat文件另存为:

save('test.mat', '-v7')

之后,在Python中,使用通常的loadmat函数:

import scipy.io as sio
test = sio.loadmat('test.mat')

First save the .mat file as:

save('test.mat', '-v7')

After that, in Python, use the usual loadmat function:

import scipy.io as sio
test = sio.loadmat('test.mat')

回答 3

有一个很好的软件包mat4py,可以很容易地使用安装

pip install mat4py

使用起来很简单(从网站上):

从MAT文件加载数据

该函数loadmat仅使用Python dictlist对象将MAT文件中存储的所有变量加载到简单的Python数据结构中。数字和单元格数组将转换为按行排序的嵌套列表。压缩数组以消除仅包含一个元素的数组。结果数据结构由与JSON格式兼容的简单类型组成。

示例:将MAT文件加载到Python数据结构中:

from mat4py import loadmat

data = loadmat('datafile.mat')

变量datadict带有MAT文件中包含的变量和值的a 。

将Python数据结构保存到MAT文件

可以使用函数将Python数据保存到MAT文件中savemat。数据已经以同样的方式为被结构化的loadmat,也就是说,它应该由简单数据类型,像dictliststrint,和float

示例:将Python数据结构保存到MAT文件中:

from mat4py import savemat

savemat('datafile.mat', data)

参数data应为dict带有变量的a。

There is a nice package called mat4py which can easily be installed using

pip install mat4py

It is straightforward to use (from the website):

Load data from a MAT-file

The function loadmat loads all variables stored in the MAT-file into a simple Python data structure, using only Python’s dict and list objects. Numeric and cell arrays are converted to row-ordered nested lists. Arrays are squeezed to eliminate arrays with only one element. The resulting data structure is composed of simple types that are compatible with the JSON format.

Example: Load a MAT-file into a Python data structure:

from mat4py import loadmat

data = loadmat('datafile.mat')

The variable data is a dict with the variables and values contained in the MAT-file.

Save a Python data structure to a MAT-file

Python data can be saved to a MAT-file, with the function savemat. Data has to be structured in the same way as for loadmat, i.e. it should be composed of simple data types, like dict, list, str, int, and float.

Example: Save a Python data structure to a MAT-file:

from mat4py import savemat

savemat('datafile.mat', data)

The parameter data shall be a dict with the variables.


回答 4

安装了MATLAB 2014b或更高版本后,可以使用适用于PythonMATLAB引擎

import matlab.engine
eng = matlab.engine.start_matlab()
content = eng.load("example.mat", nargout=1)

Having MATLAB 2014b or newer installed, the MATLAB engine for Python could be used:

import matlab.engine
eng = matlab.engine.start_matlab()
content = eng.load("example.mat", nargout=1)

回答 5

读取文件

import scipy.io
mat = scipy.io.loadmat(file_name)

检查MAT变量的类型

print(type(mat))
#OUTPUT - <class 'dict'>

字典中的MATLAB变量分配给这些变量对象

Reading the file

import scipy.io
mat = scipy.io.loadmat(file_name)

Inspecting the type of MAT variable

print(type(mat))
#OUTPUT - <class 'dict'>

The keys inside the dictionary are MATLAB variables, and the values are the objects assigned to those variables.


回答 6

MathWorks本身也提供用于PythonMATLAB引擎。如果您有MATLAB,则可能值得考虑(我自己还没有尝试过,但是它具有比仅读取MATLAB文件更多的功能)。但是,我不知道是否允许将其分发给其他用户(如果这些人拥有MATLAB,这可能不是问题。否则,也许NumPy是正确的选择?)。

另外,如果您想自己掌握所有基础知识,MathWorks将提供有关文件格式结构的详细文档(如果链接发生更改,请尝试使用google matfile_format.pdf或其标题MAT-FILE Format)。它并不像我个人想象的那样复杂,但是显然,这不是最简单的方法。它还取决于.mat您要支持-files的多少功能。

我编写了一个“小”(约700行)Python脚本,该脚本可以读取一些基本的.mat-files。我既不是Python专家,也不是初学者,我花了大约两天时间来编写它(使用上面链接的MathWorks文档)。我学到很多新东西,这很有趣(大部分时间)。当我在工作时编写Python脚本时,恐怕我无法发布它了……但是我可以在这里给出一些建议:

  • 首先阅读文档。
  • 使用十六进制编辑器(例如HxD)并查看.mat要解析的参考文件。
  • 尝试通过将字节保存到.txt文件并注释每行来弄清楚每个字节的含义。
  • 使用类来保存每个数据元素(如miCOMPRESSEDmiMATRIXmxDOUBLE,或miINT32
  • .mat-files’结构是最佳的用于保存在一个树形数据结构中的数据元素; 每个节点都有一个类和子节点

There is also the MATLAB Engine for Python by MathWorks itself. If you have MATLAB, this might be worth considering (I haven’t tried it myself but it has a lot more functionality than just reading MATLAB files). However, I don’t know if it is allowed to distribute it to other users (it is probably not a problem if those persons have MATLAB. Otherwise, maybe NumPy is the right way to go?).

Also, if you want to do all the basics yourself, MathWorks provides (if the link changes, try to google for matfile_format.pdf or its title MAT-FILE Format) a detailed documentation on the structure of the file format. It’s not as complicated as I personally thought, but obviously, this is not the easiest way to go. It also depends on how many features of the .mat-files you want to support.

I’ve written a “small” (about 700 lines) Python script which can read some basic .mat-files. I’m neither a Python expert nor a beginner and it took me about two days to write it (using the MathWorks documentation linked above). I’ve learned a lot of new stuff and it was quite fun (most of the time). As I’ve written the Python script at work, I’m afraid I cannot publish it… But I can give some advice here:

  • First read the documentation.
  • Use a hex editor (such as HxD) and look into a reference .mat-file you want to parse.
  • Try to figure out the meaning of each byte by saving the bytes to a .txt file and annotate each line.
  • Use classes to save each data element (such as miCOMPRESSED, miMATRIX, mxDOUBLE, or miINT32)
  • The .mat-files’ structure is optimal for saving the data elements in a tree data structure; each node has one class and subnodes

回答 7

from os.path import dirname, join as pjoin
import scipy.io as sio
data_dir = pjoin(dirname(sio.__file__), 'matlab', 'tests', 'data')
mat_fname = pjoin(data_dir, 'testdouble_7.4_GLNX86.mat')
mat_contents = sio.loadmat(mat_fname)

您可以使用上面的代码在Python中读取默认保存的.mat文件。

from os.path import dirname, join as pjoin
import scipy.io as sio
data_dir = pjoin(dirname(sio.__file__), 'matlab', 'tests', 'data')
mat_fname = pjoin(data_dir, 'testdouble_7.4_GLNX86.mat')
mat_contents = sio.loadmat(mat_fname)

You can use above code to read the default saved .mat file in Python.


如何检测圣诞树?[关闭]

问题:如何检测圣诞树?[关闭]

可以使用哪些图像处理技术来实现检测以下图像中显示的圣诞树的应用程序?

我正在寻找可以在所有这些图像上使用的解决方案。因此,需要训练haar级联分类器模板匹配的方法不是很有趣。

我正在寻找可以使用任何编程语言编写的东西,只要它仅使用开源技术即可。该解决方案必须使用此问题上共享的图像进行测试。有6个输入图像,答案应显示每个图像的处理结果。最后,对于每个输出图像,必须绘制红线以包围检测到的树。

您将如何以编程方式检测这些图像中的树木?

Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?

I’m searching for solutions that are going to work on all these images. Therefore, approaches that require training haar cascade classifiers or template matching are not very interesting.

I’m looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images that are shared on this question. There are 6 input images and the answer should display the results of processing each of them. Finally, for each output image there must be red lines draw to surround the detected tree.

How would you go about programmatically detecting the trees in these images?


回答 0

我有一种我认为很有趣的方法,与其他方法有所不同。与其他方法相比,我的方法的主要区别在于如何执行图像分割步骤-我使用了来自python scikit-learn 的DBSCAN聚类算法;它经过优化,可找到可能不一定具有单个清晰质心的某种无定形形状。

在最高层,我的方法很简单,可以分解为大约3个步骤。首先,我应用一个阈值(或者实际上是两个单独且不同的阈值的逻辑“或”)。与其他许多答案一样,我假设圣诞树将是场景中较亮的对象之一,因此第一个阈值只是一个简单的单色亮度测试;0-255范围(黑色为0,白色为255)上的值大于220的所有像素将保存到二进制黑白图像。第二个阈值尝试寻找红光和黄光,这在六张图像的左上角和右下角的树木中尤为突出,并且在大多数照片中普遍使用的蓝绿色背景下表现出色。我将rgb图像转换为hsv空间,并要求色相在0.0-1.0范围内小于0.2(大致对应于黄色和绿色之间的边界)或大于0.95(对应于紫色与红色之间的边界),另外我还要求明亮,饱和的颜色:饱和度和值都必须高于0.7。这两个阈值过程的结果在逻辑上“或”在一起,黑白二进制图像的结果矩阵如下所示:

对HSV和单色亮度进行阈值设置后的圣诞树

您可以清楚地看到,每个图像都有一个大的像素簇,大致对应于每棵树的位置,加上一些图像还具有一些其他的小簇,它们对应于某些建筑物的窗户上的灯光,或者对应于背景场景在地平线上。下一步是使计算机识别这些是单独的群集,并使用群集成员ID号正确标记每个像素。

为此,我选择了DBSCAN。相对于其他集群算法,这里有一个很好的视觉比较,可以比较DBSCAN通常的行为。正如我之前说的,它非常适合非晶形形状。此处显示了DBSCAN的输出,其中每个集群以不同的颜色绘制:

DBSCAN集群输出

查看此结果时,需要注意一些事项。首先,DBSCAN要求用户设置一个“接近”参数以调节其行为,该参数有效地控制了一对点必须分开的程度,以便算法声明新的单独簇,而不是将测试点聚结到已经存在的集群。我将此值设置为每个图像对角线大小的0.04倍。由于图像的大小从大约VGA到大约HD 1080不等,因此这种比例相关的定义至关重要。

另一个值得注意的点是,在scikit-learn中实现的DBSCAN算法具有内存限制,对于此示例中的某些较大图像而言,这是相当大的挑战。因此,对于一些较大的图像,我实际上必须每个群集“抽取”(即,仅保留每个第3或第4像素并丢弃其他像素),以保持在此范围内。作为这种剔除处理的结果,在某些较大的图像上很难看到其余的单个稀疏像素。因此,仅出于显示目的,上述图像中的颜色编码像素已被有效地稍微“扩张”了一点,以使其更加突出。出于叙述目的,这纯粹是一种修饰操作;尽管在我的代码中有评论提到此膨胀,

识别并标记了聚类后,第三步也是最后一步很容易:我只是在每个图像中选取最大的聚类(在这种情况下,我选择根据成员像素的总数来衡量“大小”,尽管可以却很容易地使用某种类型的度量标准来衡量物理范围)并计算该集群的凸包。凸包然后成为树的边界。通过此方法计算的六个凸包在下面以红色显示:

带有计算边界的圣诞树

源代码是为Python 2.7.6编写的,它取决于numpyscipymatplotlibscikit-learn。我将其分为两部分。第一部分负责实际的图像处理:

from PIL import Image
import numpy as np
import scipy as sp
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    blue-purple region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q,2] List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7, 
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                                (hsvimg[:,:,1] > satthr)),
                                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[range(0,nsample,int(ceil(float(nsample)/maxpoints)))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull   
    unique_labels = set(labels)
    maxclustpt = 0
    for k in unique_labels:
        class_members = [index[0] for index in np.argwhere(labels == k)]
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = sp.spatial.ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice

第二部分是用户级脚本,该脚本调用第一个文件并生成上面的所有图:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)        
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
figthresh.canvas.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.set_window_title('Trees with Borders')

for ii, name in zip(range(len(fname)), fname):
    # Open the file and convert to rgb image
    rgbimg = np.asarray(Image.open(name))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)    
    unique_labels = set(labels)
    # Generate a unique color for each cluster 
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0],pix[1],ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col, 
                    markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with read borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()

I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed–I used the DBSCAN clustering algorithm from Python’s scikit-learn; it’s optimized for finding somewhat amorphous shapes that may not necessarily have a single clear centroid.

At the top level, my approach is fairly simple and can be broken down into about 3 steps. First I apply a threshold (or actually, the logical “or” of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test; any pixels with values above 220 on a 0-255 scale (where black is 0 and white is 255) are saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and stand out well against the blue-green background which is prevalent in most of the photos. I convert the rgb image to hsv space, and require that the hue is either less than 0.2 on a 0.0-1.0 scale (corresponding roughly to the border between yellow and green) or greater than 0.95 (corresponding to the border between purple and red) and additionally I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically “or”-ed together, and the resulting matrix of black-and-white binary images is shown below:

Christmas trees, after thresholding on HSV as well as monochrome brightness

You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, plus a few of the images also have some other small clusters corresponding either to lights in the windows of some of the buildings, or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and label each pixel correctly with a cluster membership ID number.

For this task I chose DBSCAN. There is a pretty good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown here:

DBSCAN clustering output

There are a few things to be aware of when looking at this result. First is that DBSCAN requires the user to set a “proximity” parameter in order to regulate its behavior, which effectively controls how separated a pair of points must be in order for the algorithm to declare a new separate cluster rather than agglomerating a test point onto an already pre-existing cluster. I set this value to be 0.04 times the size along the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.

Another point worth noting is that the DBSCAN algorithm as it is implemented in scikit-learn has memory limits which are fairly challenging for some of the larger images in this sample. Therefore, for a few of the larger images, I actually had to “decimate” (i.e., retain only every 3rd or 4th pixel and drop the others) each cluster in order to stay within this limit. As a result of this culling process, the remaining individual sparse pixels are difficult to see on some of the larger images. Therefore, for display purposes only, the color-coded pixels in the above images have been effectively “dilated” just slightly so that they stand out better. It’s purely a cosmetic operation for the sake of the narrative; although there are comments mentioning this dilation in my code, rest assured that it has nothing to do with any calculations that actually matter.

Once the clusters are identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure “size” in terms of the total number of member pixels, although one could have just as easily instead used some type of metric that gauges physical extent) and compute the convex hull for that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below in red:

Christmas trees with their calculated borders

The source code is written for Python 2.7.6 and it depends on numpy, scipy, matplotlib and scikit-learn. I’ve divided it into two parts. The first part is responsible for the actual image processing:

from PIL import Image
import numpy as np
import scipy as sp
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    blue-purple region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q,2] List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7, 
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                                (hsvimg[:,:,1] > satthr)),
                                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[range(0,nsample,int(ceil(float(nsample)/maxpoints)))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull   
    unique_labels = set(labels)
    maxclustpt = 0
    for k in unique_labels:
        class_members = [index[0] for index in np.argwhere(labels == k)]
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = sp.spatial.ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice

and the second part is a user-level script which calls the first file and generates all of the plots above:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)        
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
figthresh.canvas.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.set_window_title('Trees with Borders')

for ii, name in zip(range(len(fname)), fname):
    # Open the file and convert to rgb image
    rgbimg = np.asarray(Image.open(name))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)    
    unique_labels = set(labels)
    # Generate a unique color for each cluster 
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0],pix[1],ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col, 
                    markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with read borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()

回答 1

编辑注释:我编辑了这篇文章,以(i)根据要求单独处理每棵树图像,(ii)考虑对象的亮度和形状,以提高结果的质量。


下面介绍一种考虑物体亮度和形状的方法。换句话说,它寻找具有三角形形状且具有明显亮度的物体。它使用Marvin图像处理框架以Java实现。

第一步是颜色阈值。此处的目的是将分析重点放在亮度很高的物体上。

输出图像:

源代码:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);
    }
}
public static void main(String[] args) {
    new ChristmasTree();
}
}

在第二步中,将图像中最亮的点放大以形成形状。该过程的结果是具有明显亮度的物体的可能形状。应用洪水填充分割,可以检测到断开的形状。

输出图像:

源代码:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);

        // 2. Dilate
        invert.process(tree.clone(), tree);
        tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
        dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
        dilation.process(tree.clone(), tree);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
        tree = MarvinColorModelConverter.binaryToRgb(tree);

        // 3. Segment shapes
        MarvinImage trees2 = tree.clone();
        fill(tree, trees2);
        MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");
}

private void fill(MarvinImage imageIn, MarvinImage imageOut){
    boolean found;
    int color= 0xFFFF0000;

    while(true){
        found=false;

        Outerloop:
        for(int y=0; y<imageIn.getHeight(); y++){
            for(int x=0; x<imageIn.getWidth(); x++){
                if(imageOut.getIntComponent0(x, y) == 0){
                    fill.setAttribute("x", x);
                    fill.setAttribute("y", y);
                    fill.setAttribute("color", color);
                    fill.setAttribute("threshold", 120);
                    fill.process(imageIn, imageOut);
                    color = newColor(color);

                    found = true;
                    break Outerloop;
                }
            }
        }

        if(!found){
            break;
        }
    }

}

private int newColor(int color){
    int red = (color & 0x00FF0000) >> 16;
    int green = (color & 0x0000FF00) >> 8;
    int blue = (color & 0x000000FF);

    if(red <= green && red <= blue){
        red+=5;
    }
    else if(green <= red && green <= blue){
        green+=5;
    }
    else{
        blue+=5;
    }

    return 0xFF000000 + (red << 16) + (green << 8) + blue;
}

public static void main(String[] args) {
    new ChristmasTree();
}
}

如输出图像所示,检测到多种形状。在此问题中,图像中只有几个亮点。但是,实施此方法是为了处理更复杂的情况。

在下一步中,将分析每个形状。一种简单的算法可以检测形状类似于三角形的形状。该算法逐行分析对象形状。如果每个形状线的质心几乎相同(给定阈值),并且质量随着y的增加而增加,则对象具有三角形的形状。形状线的质量是该线中属于该形状的像素数。想象一下,您将对象水平切片并分析每个水平段。如果它们彼此居中,并且长度以线性模式从第一段到最后一段增加,则您可能有一个类似于三角形的对象。

源代码:

private int[] detectTrees(MarvinImage image){
    HashSet<Integer> analysed = new HashSet<Integer>();
    boolean found;
    while(true){
        found = false;
        for(int y=0; y<image.getHeight(); y++){
            for(int x=0; x<image.getWidth(); x++){
                int color = image.getIntColor(x, y);

                if(!analysed.contains(color)){
                    if(isTree(image, color)){
                        return getObjectRect(image, color);
                    }

                    analysed.add(color);
                    found=true;
                }
            }
        }

        if(!found){
            break;
        }
    }
    return null;
}

private boolean isTree(MarvinImage image, int color){

    int mass[][] = new int[image.getHeight()][2];
    int yStart=-1;
    int xStart=-1;
    for(int y=0; y<image.getHeight(); y++){
        int mc = 0;
        int xs=-1;
        int xe=-1;
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){
                mc++;

                if(yStart == -1){
                    yStart=y;
                    xStart=x;
                }

                if(xs == -1){
                    xs = x;
                }
                if(x > xe){
                    xe = x;
                }
            }
        }
        mass[y][0] = xs;
        mass[y][3] = xe;
        mass[y][4] = mc;    
    }

    int validLines=0;
    for(int y=0; y<image.getHeight(); y++){
        if
        ( 
            mass[y][5] > 0 &&
            Math.abs(((mass[y][0]+mass[y][6])/2)-xStart) <= 50 &&
            mass[y][7] >= (mass[yStart][8] + (y-yStart)*0.3) &&
            mass[y][9] <= (mass[yStart][10] + (y-yStart)*1.5)
        )
        {
            validLines++;
        }
    }

    if(validLines > 100){
        return true;
    }
    return false;
}

最后,如下图所示,原始图像中突出显示了每个形状类似于三角形且具有明显亮度的位置(在本例中为圣诞树)。

最终输出图像:

最终源代码:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);

        // 2. Dilate
        invert.process(tree.clone(), tree);
        tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
        dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
        dilation.process(tree.clone(), tree);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
        tree = MarvinColorModelConverter.binaryToRgb(tree);

        // 3. Segment shapes
        MarvinImage trees2 = tree.clone();
        fill(tree, trees2);
        MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");

        // 4. Detect tree-like shapes
        int[] rect = detectTrees(trees2);

        // 5. Draw the result
        MarvinImage original = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
        drawBoundary(trees2, original, rect);
        MarvinImageIO.saveImage(original, "./res/trees/new/tree_"+i+"_out_2.jpg");
    }
}

private void drawBoundary(MarvinImage shape, MarvinImage original, int[] rect){
    int yLines[] = new int[6];
    yLines[0] = rect[1];
    yLines[1] = rect[1]+(int)((rect[3]/5));
    yLines[2] = rect[1]+((rect[3]/5)*2);
    yLines[3] = rect[1]+((rect[3]/5)*3);
    yLines[4] = rect[1]+(int)((rect[3]/5)*4);
    yLines[5] = rect[1]+rect[3];

    List<Point> points = new ArrayList<Point>();
    for(int i=0; i<yLines.length; i++){
        boolean in=false;
        Point startPoint=null;
        Point endPoint=null;
        for(int x=rect[0]; x<rect[0]+rect[2]; x++){

            if(shape.getIntColor(x, yLines[i]) != 0xFFFFFFFF){
                if(!in){
                    if(startPoint == null){
                        startPoint = new Point(x, yLines[i]);
                    }
                }
                in = true;
            }
            else{
                if(in){
                    endPoint = new Point(x, yLines[i]);
                }
                in = false;
            }
        }

        if(endPoint == null){
            endPoint = new Point((rect[0]+rect[2])-1, yLines[i]);
        }

        points.add(startPoint);
        points.add(endPoint);
    }

    drawLine(points.get(0).x, points.get(0).y, points.get(1).x, points.get(1).y, 15, original);
    drawLine(points.get(1).x, points.get(1).y, points.get(3).x, points.get(3).y, 15, original);
    drawLine(points.get(3).x, points.get(3).y, points.get(5).x, points.get(5).y, 15, original);
    drawLine(points.get(5).x, points.get(5).y, points.get(7).x, points.get(7).y, 15, original);
    drawLine(points.get(7).x, points.get(7).y, points.get(9).x, points.get(9).y, 15, original);
    drawLine(points.get(9).x, points.get(9).y, points.get(11).x, points.get(11).y, 15, original);
    drawLine(points.get(11).x, points.get(11).y, points.get(10).x, points.get(10).y, 15, original);
    drawLine(points.get(10).x, points.get(10).y, points.get(8).x, points.get(8).y, 15, original);
    drawLine(points.get(8).x, points.get(8).y, points.get(6).x, points.get(6).y, 15, original);
    drawLine(points.get(6).x, points.get(6).y, points.get(4).x, points.get(4).y, 15, original);
    drawLine(points.get(4).x, points.get(4).y, points.get(2).x, points.get(2).y, 15, original);
    drawLine(points.get(2).x, points.get(2).y, points.get(0).x, points.get(0).y, 15, original);
}

private void drawLine(int x1, int y1, int x2, int y2, int length, MarvinImage image){
    int lx1, lx2, ly1, ly2;
    for(int i=0; i<length; i++){
        lx1 = (x1+i >= image.getWidth() ? (image.getWidth()-1)-i: x1);
        lx2 = (x2+i >= image.getWidth() ? (image.getWidth()-1)-i: x2);
        ly1 = (y1+i >= image.getHeight() ? (image.getHeight()-1)-i: y1);
        ly2 = (y2+i >= image.getHeight() ? (image.getHeight()-1)-i: y2);

        image.drawLine(lx1+i, ly1, lx2+i, ly2, Color.red);
        image.drawLine(lx1, ly1+i, lx2, ly2+i, Color.red);
    }
}

private void fillRect(MarvinImage image, int[] rect, int length){
    for(int i=0; i<length; i++){
        image.drawRect(rect[0]+i, rect[1]+i, rect[2]-(i*2), rect[3]-(i*2), Color.red);
    }
}

private void fill(MarvinImage imageIn, MarvinImage imageOut){
    boolean found;
    int color= 0xFFFF0000;

    while(true){
        found=false;

        Outerloop:
        for(int y=0; y<imageIn.getHeight(); y++){
            for(int x=0; x<imageIn.getWidth(); x++){
                if(imageOut.getIntComponent0(x, y) == 0){
                    fill.setAttribute("x", x);
                    fill.setAttribute("y", y);
                    fill.setAttribute("color", color);
                    fill.setAttribute("threshold", 120);
                    fill.process(imageIn, imageOut);
                    color = newColor(color);

                    found = true;
                    break Outerloop;
                }
            }
        }

        if(!found){
            break;
        }
    }

}

private int[] detectTrees(MarvinImage image){
    HashSet<Integer> analysed = new HashSet<Integer>();
    boolean found;
    while(true){
        found = false;
        for(int y=0; y<image.getHeight(); y++){
            for(int x=0; x<image.getWidth(); x++){
                int color = image.getIntColor(x, y);

                if(!analysed.contains(color)){
                    if(isTree(image, color)){
                        return getObjectRect(image, color);
                    }

                    analysed.add(color);
                    found=true;
                }
            }
        }

        if(!found){
            break;
        }
    }
    return null;
}

private boolean isTree(MarvinImage image, int color){

    int mass[][] = new int[image.getHeight()][11];
    int yStart=-1;
    int xStart=-1;
    for(int y=0; y<image.getHeight(); y++){
        int mc = 0;
        int xs=-1;
        int xe=-1;
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){
                mc++;

                if(yStart == -1){
                    yStart=y;
                    xStart=x;
                }

                if(xs == -1){
                    xs = x;
                }
                if(x > xe){
                    xe = x;
                }
            }
        }
        mass[y][0] = xs;
        mass[y][12] = xe;
        mass[y][13] = mc;   
    }

    int validLines=0;
    for(int y=0; y<image.getHeight(); y++){
        if
        ( 
            mass[y][14] > 0 &&
            Math.abs(((mass[y][0]+mass[y][15])/2)-xStart) <= 50 &&
            mass[y][16] >= (mass[yStart][17] + (y-yStart)*0.3) &&
            mass[y][18] <= (mass[yStart][19] + (y-yStart)*1.5)
        )
        {
            validLines++;
        }
    }

    if(validLines > 100){
        return true;
    }
    return false;
}

private int[] getObjectRect(MarvinImage image, int color){
    int x1=-1;
    int x2=-1;
    int y1=-1;
    int y2=-1;

    for(int y=0; y<image.getHeight(); y++){
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){

                if(x1 == -1 || x < x1){
                    x1 = x;
                }
                if(x2 == -1 || x > x2){
                    x2 = x;
                }
                if(y1 == -1 || y < y1){
                    y1 = y;
                }
                if(y2 == -1 || y > y2){
                    y2 = y;
                }
            }
        }
    }

    return new int[]{x1, y1, (x2-x1), (y2-y1)};
}

private int newColor(int color){
    int red = (color & 0x00FF0000) >> 16;
    int green = (color & 0x0000FF00) >> 8;
    int blue = (color & 0x000000FF);

    if(red <= green && red <= blue){
        red+=5;
    }
    else if(green <= red && green <= blue){
        green+=30;
    }
    else{
        blue+=30;
    }

    return 0xFF000000 + (red << 16) + (green << 8) + blue;
}

public static void main(String[] args) {
    new ChristmasTree();
}
}

这种方法的优点在于,由于它可以分析物体的形状,因此可能会与包含其他发光物体的图像一起使用。

圣诞节快乐!


编辑注2

讨论了此解决方案与其他解决方案的输出图像的相似性。实际上,它们非常相似。但是这种方法不仅可以分割对象。它还从某种意义上分析了对象的形状。它可以处理同一场景中的多个发光物体。实际上,圣诞树不必是最亮的圣诞树。我只是为了中止讨论而中止。样本中存在偏差,即仅寻找最亮的对象,便会找到树木。但是,我们真的要在这一点上停止讨论吗?在这一点上,计算机实际上能识别出类似于圣诞树的物体吗?让我们尝试缩小这一差距。

下面给出的结果只是为了阐明这一点:

输入图像

在此处输入图片说明

输出

在此处输入图片说明

EDIT NOTE: I edited this post to (i) process each tree image individually, as requested in the requirements, (ii) to consider both object brightness and shape in order to improve the quality of the result.


Below is presented an approach that takes in consideration the object brightness and shape. In other words, it seeks for objects with triangle-like shape and with significant brightness. It was implemented in Java, using Marvin image processing framework.

The first step is the color thresholding. The objective here is to focus the analysis on objects with significant brightness.

output images:

source code:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);
    }
}
public static void main(String[] args) {
    new ChristmasTree();
}
}

In the second step, the brightest points in the image are dilated in order to form shapes. The result of this process is the probable shape of the objects with significant brightness. Applying flood fill segmentation, disconnected shapes are detected.

output images:

source code:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);

        // 2. Dilate
        invert.process(tree.clone(), tree);
        tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
        dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
        dilation.process(tree.clone(), tree);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
        tree = MarvinColorModelConverter.binaryToRgb(tree);

        // 3. Segment shapes
        MarvinImage trees2 = tree.clone();
        fill(tree, trees2);
        MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");
}

private void fill(MarvinImage imageIn, MarvinImage imageOut){
    boolean found;
    int color= 0xFFFF0000;

    while(true){
        found=false;

        Outerloop:
        for(int y=0; y<imageIn.getHeight(); y++){
            for(int x=0; x<imageIn.getWidth(); x++){
                if(imageOut.getIntComponent0(x, y) == 0){
                    fill.setAttribute("x", x);
                    fill.setAttribute("y", y);
                    fill.setAttribute("color", color);
                    fill.setAttribute("threshold", 120);
                    fill.process(imageIn, imageOut);
                    color = newColor(color);

                    found = true;
                    break Outerloop;
                }
            }
        }

        if(!found){
            break;
        }
    }

}

private int newColor(int color){
    int red = (color & 0x00FF0000) >> 16;
    int green = (color & 0x0000FF00) >> 8;
    int blue = (color & 0x000000FF);

    if(red <= green && red <= blue){
        red+=5;
    }
    else if(green <= red && green <= blue){
        green+=5;
    }
    else{
        blue+=5;
    }

    return 0xFF000000 + (red << 16) + (green << 8) + blue;
}

public static void main(String[] args) {
    new ChristmasTree();
}
}

As shown in the output image, multiple shapes was detected. In this problem, there a just a few bright points in the images. However, this approach was implemented to deal with more complex scenarios.

In the next step each shape is analyzed. A simple algorithm detects shapes with a pattern similar to a triangle. The algorithm analyze the object shape line by line. If the center of the mass of each shape line is almost the same (given a threshold) and mass increase as y increase, the object has a triangle-like shape. The mass of the shape line is the number of pixels in that line that belongs to the shape. Imagine you slice the object horizontally and analyze each horizontal segment. If they are centralized to each other and the length increase from the first segment to last one in a linear pattern, you probably has an object that resembles a triangle.

source code:

private int[] detectTrees(MarvinImage image){
    HashSet<Integer> analysed = new HashSet<Integer>();
    boolean found;
    while(true){
        found = false;
        for(int y=0; y<image.getHeight(); y++){
            for(int x=0; x<image.getWidth(); x++){
                int color = image.getIntColor(x, y);

                if(!analysed.contains(color)){
                    if(isTree(image, color)){
                        return getObjectRect(image, color);
                    }

                    analysed.add(color);
                    found=true;
                }
            }
        }

        if(!found){
            break;
        }
    }
    return null;
}

private boolean isTree(MarvinImage image, int color){

    int mass[][] = new int[image.getHeight()][2];
    int yStart=-1;
    int xStart=-1;
    for(int y=0; y<image.getHeight(); y++){
        int mc = 0;
        int xs=-1;
        int xe=-1;
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){
                mc++;

                if(yStart == -1){
                    yStart=y;
                    xStart=x;
                }

                if(xs == -1){
                    xs = x;
                }
                if(x > xe){
                    xe = x;
                }
            }
        }
        mass[y][0] = xs;
        mass[y][3] = xe;
        mass[y][4] = mc;    
    }

    int validLines=0;
    for(int y=0; y<image.getHeight(); y++){
        if
        ( 
            mass[y][5] > 0 &&
            Math.abs(((mass[y][0]+mass[y][6])/2)-xStart) <= 50 &&
            mass[y][7] >= (mass[yStart][8] + (y-yStart)*0.3) &&
            mass[y][9] <= (mass[yStart][10] + (y-yStart)*1.5)
        )
        {
            validLines++;
        }
    }

    if(validLines > 100){
        return true;
    }
    return false;
}

Finally, the position of each shape similar to a triangle and with significant brightness, in this case a Christmas tree, is highlighted in the original image, as shown below.

final output images:

final source code:

public class ChristmasTree {

private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");

public ChristmasTree(){
    MarvinImage tree;

    // Iterate each image
    for(int i=1; i<=6; i++){
        tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");

        // 1. Threshold
        threshold.setAttribute("threshold", 200);
        threshold.process(tree.clone(), tree);

        // 2. Dilate
        invert.process(tree.clone(), tree);
        tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
        dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
        dilation.process(tree.clone(), tree);
        MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
        tree = MarvinColorModelConverter.binaryToRgb(tree);

        // 3. Segment shapes
        MarvinImage trees2 = tree.clone();
        fill(tree, trees2);
        MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");

        // 4. Detect tree-like shapes
        int[] rect = detectTrees(trees2);

        // 5. Draw the result
        MarvinImage original = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
        drawBoundary(trees2, original, rect);
        MarvinImageIO.saveImage(original, "./res/trees/new/tree_"+i+"_out_2.jpg");
    }
}

private void drawBoundary(MarvinImage shape, MarvinImage original, int[] rect){
    int yLines[] = new int[6];
    yLines[0] = rect[1];
    yLines[1] = rect[1]+(int)((rect[3]/5));
    yLines[2] = rect[1]+((rect[3]/5)*2);
    yLines[3] = rect[1]+((rect[3]/5)*3);
    yLines[4] = rect[1]+(int)((rect[3]/5)*4);
    yLines[5] = rect[1]+rect[3];

    List<Point> points = new ArrayList<Point>();
    for(int i=0; i<yLines.length; i++){
        boolean in=false;
        Point startPoint=null;
        Point endPoint=null;
        for(int x=rect[0]; x<rect[0]+rect[2]; x++){

            if(shape.getIntColor(x, yLines[i]) != 0xFFFFFFFF){
                if(!in){
                    if(startPoint == null){
                        startPoint = new Point(x, yLines[i]);
                    }
                }
                in = true;
            }
            else{
                if(in){
                    endPoint = new Point(x, yLines[i]);
                }
                in = false;
            }
        }

        if(endPoint == null){
            endPoint = new Point((rect[0]+rect[2])-1, yLines[i]);
        }

        points.add(startPoint);
        points.add(endPoint);
    }

    drawLine(points.get(0).x, points.get(0).y, points.get(1).x, points.get(1).y, 15, original);
    drawLine(points.get(1).x, points.get(1).y, points.get(3).x, points.get(3).y, 15, original);
    drawLine(points.get(3).x, points.get(3).y, points.get(5).x, points.get(5).y, 15, original);
    drawLine(points.get(5).x, points.get(5).y, points.get(7).x, points.get(7).y, 15, original);
    drawLine(points.get(7).x, points.get(7).y, points.get(9).x, points.get(9).y, 15, original);
    drawLine(points.get(9).x, points.get(9).y, points.get(11).x, points.get(11).y, 15, original);
    drawLine(points.get(11).x, points.get(11).y, points.get(10).x, points.get(10).y, 15, original);
    drawLine(points.get(10).x, points.get(10).y, points.get(8).x, points.get(8).y, 15, original);
    drawLine(points.get(8).x, points.get(8).y, points.get(6).x, points.get(6).y, 15, original);
    drawLine(points.get(6).x, points.get(6).y, points.get(4).x, points.get(4).y, 15, original);
    drawLine(points.get(4).x, points.get(4).y, points.get(2).x, points.get(2).y, 15, original);
    drawLine(points.get(2).x, points.get(2).y, points.get(0).x, points.get(0).y, 15, original);
}

private void drawLine(int x1, int y1, int x2, int y2, int length, MarvinImage image){
    int lx1, lx2, ly1, ly2;
    for(int i=0; i<length; i++){
        lx1 = (x1+i >= image.getWidth() ? (image.getWidth()-1)-i: x1);
        lx2 = (x2+i >= image.getWidth() ? (image.getWidth()-1)-i: x2);
        ly1 = (y1+i >= image.getHeight() ? (image.getHeight()-1)-i: y1);
        ly2 = (y2+i >= image.getHeight() ? (image.getHeight()-1)-i: y2);

        image.drawLine(lx1+i, ly1, lx2+i, ly2, Color.red);
        image.drawLine(lx1, ly1+i, lx2, ly2+i, Color.red);
    }
}

private void fillRect(MarvinImage image, int[] rect, int length){
    for(int i=0; i<length; i++){
        image.drawRect(rect[0]+i, rect[1]+i, rect[2]-(i*2), rect[3]-(i*2), Color.red);
    }
}

private void fill(MarvinImage imageIn, MarvinImage imageOut){
    boolean found;
    int color= 0xFFFF0000;

    while(true){
        found=false;

        Outerloop:
        for(int y=0; y<imageIn.getHeight(); y++){
            for(int x=0; x<imageIn.getWidth(); x++){
                if(imageOut.getIntComponent0(x, y) == 0){
                    fill.setAttribute("x", x);
                    fill.setAttribute("y", y);
                    fill.setAttribute("color", color);
                    fill.setAttribute("threshold", 120);
                    fill.process(imageIn, imageOut);
                    color = newColor(color);

                    found = true;
                    break Outerloop;
                }
            }
        }

        if(!found){
            break;
        }
    }

}

private int[] detectTrees(MarvinImage image){
    HashSet<Integer> analysed = new HashSet<Integer>();
    boolean found;
    while(true){
        found = false;
        for(int y=0; y<image.getHeight(); y++){
            for(int x=0; x<image.getWidth(); x++){
                int color = image.getIntColor(x, y);

                if(!analysed.contains(color)){
                    if(isTree(image, color)){
                        return getObjectRect(image, color);
                    }

                    analysed.add(color);
                    found=true;
                }
            }
        }

        if(!found){
            break;
        }
    }
    return null;
}

private boolean isTree(MarvinImage image, int color){

    int mass[][] = new int[image.getHeight()][11];
    int yStart=-1;
    int xStart=-1;
    for(int y=0; y<image.getHeight(); y++){
        int mc = 0;
        int xs=-1;
        int xe=-1;
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){
                mc++;

                if(yStart == -1){
                    yStart=y;
                    xStart=x;
                }

                if(xs == -1){
                    xs = x;
                }
                if(x > xe){
                    xe = x;
                }
            }
        }
        mass[y][0] = xs;
        mass[y][12] = xe;
        mass[y][13] = mc;   
    }

    int validLines=0;
    for(int y=0; y<image.getHeight(); y++){
        if
        ( 
            mass[y][14] > 0 &&
            Math.abs(((mass[y][0]+mass[y][15])/2)-xStart) <= 50 &&
            mass[y][16] >= (mass[yStart][17] + (y-yStart)*0.3) &&
            mass[y][18] <= (mass[yStart][19] + (y-yStart)*1.5)
        )
        {
            validLines++;
        }
    }

    if(validLines > 100){
        return true;
    }
    return false;
}

private int[] getObjectRect(MarvinImage image, int color){
    int x1=-1;
    int x2=-1;
    int y1=-1;
    int y2=-1;

    for(int y=0; y<image.getHeight(); y++){
        for(int x=0; x<image.getWidth(); x++){
            if(image.getIntColor(x, y) == color){

                if(x1 == -1 || x < x1){
                    x1 = x;
                }
                if(x2 == -1 || x > x2){
                    x2 = x;
                }
                if(y1 == -1 || y < y1){
                    y1 = y;
                }
                if(y2 == -1 || y > y2){
                    y2 = y;
                }
            }
        }
    }

    return new int[]{x1, y1, (x2-x1), (y2-y1)};
}

private int newColor(int color){
    int red = (color & 0x00FF0000) >> 16;
    int green = (color & 0x0000FF00) >> 8;
    int blue = (color & 0x000000FF);

    if(red <= green && red <= blue){
        red+=5;
    }
    else if(green <= red && green <= blue){
        green+=30;
    }
    else{
        blue+=30;
    }

    return 0xFF000000 + (red << 16) + (green << 8) + blue;
}

public static void main(String[] args) {
    new ChristmasTree();
}
}

The advantage of this approach is the fact it will probably work with images containing other luminous objects since it analyzes the object shape.

Merry Christmas!


EDIT NOTE 2

There is a discussion about the similarity of the output images of this solution and some other ones. In fact, they are very similar. But this approach does not just segment objects. It also analyzes the object shapes in some sense. It can handle multiple luminous objects in the same scene. In fact, the Christmas tree does not need to be the brightest one. I’m just abording it to enrich the discussion. There is a bias in the samples that just looking for the brightest object, you will find the trees. But, does we really want to stop the discussion at this point? At this point, how far the computer is really recognizing an object that resembles a Christmas tree? Let’s try to close this gap.

Below is presented a result just to elucidate this point:

input image

enter image description here

output

enter image description here


回答 2

这是我简单而又愚蠢的解决方案。它基于这样的假设,即树将是图片中最亮,最大的东西。

//g++ -Wall -pedantic -ansi -O2 -pipe -s -o christmas_tree christmas_tree.cpp `pkg-config --cflags --libs opencv`
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main(int argc,char *argv[])
{
    Mat original,tmp,tmp1;
    vector <vector<Point> > contours;
    Moments m;
    Rect boundrect;
    Point2f center;
    double radius, max_area=0,tmp_area=0;
    unsigned int j, k;
    int i;

    for(i = 1; i < argc; ++i)
    {
        original = imread(argv[i]);
        if(original.empty())
        {
            cerr << "Error"<<endl;
            return -1;
        }

        GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
        erode(tmp, tmp, Mat(), Point(-1, -1), 10);
        cvtColor(tmp, tmp, CV_BGR2HSV);
        inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);

        dilate(original, tmp1, Mat(), Point(-1, -1), 15);
        cvtColor(tmp1, tmp1, CV_BGR2HLS);
        inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
        dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);

        bitwise_and(tmp, tmp1, tmp1);

        findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        max_area = 0;
        j = 0;
        for(k = 0; k < contours.size(); k++)
        {
            tmp_area = contourArea(contours[k]);
            if(tmp_area > max_area)
            {
                max_area = tmp_area;
                j = k;
            }
        }
        tmp1 = Mat::zeros(original.size(),CV_8U);
        approxPolyDP(contours[j], contours[j], 30, true);
        drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);

        m = moments(contours[j]);
        boundrect = boundingRect(contours[j]);
        center = Point2f(m.m10/m.m00, m.m01/m.m00);
        radius = (center.y - (boundrect.tl().y))/4.0*3.0;
        Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);

        tmp = Mat::zeros(original.size(), CV_8U);
        rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
        circle(tmp, center, radius, Scalar(255, 255, 255), -1);

        bitwise_and(tmp, tmp1, tmp1);

        findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        max_area = 0;
        j = 0;
        for(k = 0; k < contours.size(); k++)
        {
            tmp_area = contourArea(contours[k]);
            if(tmp_area > max_area)
            {
                max_area = tmp_area;
                j = k;
            }
        }

        approxPolyDP(contours[j], contours[j], 30, true);
        convexHull(contours[j], contours[j]);

        drawContours(original, contours, j, Scalar(0, 0, 255), 3);

        namedWindow(argv[i], CV_WINDOW_NORMAL|CV_WINDOW_KEEPRATIO|CV_GUI_EXPANDED);
        imshow(argv[i], original);

        waitKey(0);
        destroyWindow(argv[i]);
    }

    return 0;
}

第一步是检测图片中最亮的像素,但是我们必须对树木本身和反射其光的雪进行区分。在这里,我们尝试排除对颜色代码应用非常简单的滤镜的雪:

GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
erode(tmp, tmp, Mat(), Point(-1, -1), 10);
cvtColor(tmp, tmp, CV_BGR2HSV);
inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);

然后我们找到每个“明亮”像素:

dilate(original, tmp1, Mat(), Point(-1, -1), 15);
cvtColor(tmp1, tmp1, CV_BGR2HLS);
inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);

最后,我们将两个结果结合起来:

bitwise_and(tmp, tmp1, tmp1);

现在我们寻找最大的明亮物体:

findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
    tmp_area = contourArea(contours[k]);
    if(tmp_area > max_area)
    {
        max_area = tmp_area;
        j = k;
    }
}
tmp1 = Mat::zeros(original.size(),CV_8U);
approxPolyDP(contours[j], contours[j], 30, true);
drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);

现在我们差不多完成了,但是由于下雪还有些不完善。为了将它们剪掉,我们将使用一个圆形和一个矩形构建一个遮罩,以近似于树的形状来删除不需要的片段:

m = moments(contours[j]);
boundrect = boundingRect(contours[j]);
center = Point2f(m.m10/m.m00, m.m01/m.m00);
radius = (center.y - (boundrect.tl().y))/4.0*3.0;
Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);

tmp = Mat::zeros(original.size(), CV_8U);
rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
circle(tmp, center, radius, Scalar(255, 255, 255), -1);

bitwise_and(tmp, tmp1, tmp1);

最后一步是找到我们树的轮廓并将其绘制在原始图片上。

findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
    tmp_area = contourArea(contours[k]);
    if(tmp_area > max_area)
    {
        max_area = tmp_area;
        j = k;
    }
}

approxPolyDP(contours[j], contours[j], 30, true);
convexHull(contours[j], contours[j]);

drawContours(original, contours, j, Scalar(0, 0, 255), 3);

抱歉,目前连接不好,因此无法上传图片。稍后再尝试。

圣诞节快乐。

编辑:

这里是最终输出的一些图片:

Here is my simple and dumb solution. It is based upon the assumption that the tree will be the most bright and big thing in the picture.

//g++ -Wall -pedantic -ansi -O2 -pipe -s -o christmas_tree christmas_tree.cpp `pkg-config --cflags --libs opencv`
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main(int argc,char *argv[])
{
    Mat original,tmp,tmp1;
    vector <vector<Point> > contours;
    Moments m;
    Rect boundrect;
    Point2f center;
    double radius, max_area=0,tmp_area=0;
    unsigned int j, k;
    int i;

    for(i = 1; i < argc; ++i)
    {
        original = imread(argv[i]);
        if(original.empty())
        {
            cerr << "Error"<<endl;
            return -1;
        }

        GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
        erode(tmp, tmp, Mat(), Point(-1, -1), 10);
        cvtColor(tmp, tmp, CV_BGR2HSV);
        inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);

        dilate(original, tmp1, Mat(), Point(-1, -1), 15);
        cvtColor(tmp1, tmp1, CV_BGR2HLS);
        inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
        dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);

        bitwise_and(tmp, tmp1, tmp1);

        findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        max_area = 0;
        j = 0;
        for(k = 0; k < contours.size(); k++)
        {
            tmp_area = contourArea(contours[k]);
            if(tmp_area > max_area)
            {
                max_area = tmp_area;
                j = k;
            }
        }
        tmp1 = Mat::zeros(original.size(),CV_8U);
        approxPolyDP(contours[j], contours[j], 30, true);
        drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);

        m = moments(contours[j]);
        boundrect = boundingRect(contours[j]);
        center = Point2f(m.m10/m.m00, m.m01/m.m00);
        radius = (center.y - (boundrect.tl().y))/4.0*3.0;
        Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);

        tmp = Mat::zeros(original.size(), CV_8U);
        rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
        circle(tmp, center, radius, Scalar(255, 255, 255), -1);

        bitwise_and(tmp, tmp1, tmp1);

        findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        max_area = 0;
        j = 0;
        for(k = 0; k < contours.size(); k++)
        {
            tmp_area = contourArea(contours[k]);
            if(tmp_area > max_area)
            {
                max_area = tmp_area;
                j = k;
            }
        }

        approxPolyDP(contours[j], contours[j], 30, true);
        convexHull(contours[j], contours[j]);

        drawContours(original, contours, j, Scalar(0, 0, 255), 3);

        namedWindow(argv[i], CV_WINDOW_NORMAL|CV_WINDOW_KEEPRATIO|CV_GUI_EXPANDED);
        imshow(argv[i], original);

        waitKey(0);
        destroyWindow(argv[i]);
    }

    return 0;
}

The first step is to detect the most bright pixels in the picture, but we have to do a distinction between the tree itself and the snow which reflect its light. Here we try to exclude the snow appling a really simple filter on the color codes:

GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
erode(tmp, tmp, Mat(), Point(-1, -1), 10);
cvtColor(tmp, tmp, CV_BGR2HSV);
inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);

Then we find every “bright” pixel:

dilate(original, tmp1, Mat(), Point(-1, -1), 15);
cvtColor(tmp1, tmp1, CV_BGR2HLS);
inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);

Finally we join the two results:

bitwise_and(tmp, tmp1, tmp1);

Now we look for the biggest bright object:

findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
    tmp_area = contourArea(contours[k]);
    if(tmp_area > max_area)
    {
        max_area = tmp_area;
        j = k;
    }
}
tmp1 = Mat::zeros(original.size(),CV_8U);
approxPolyDP(contours[j], contours[j], 30, true);
drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);

Now we have almost done, but there are still some imperfection due to the snow. To cut them off we’ll build a mask using a circle and a rectangle to approximate the shape of a tree to delete unwanted pieces:

m = moments(contours[j]);
boundrect = boundingRect(contours[j]);
center = Point2f(m.m10/m.m00, m.m01/m.m00);
radius = (center.y - (boundrect.tl().y))/4.0*3.0;
Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);

tmp = Mat::zeros(original.size(), CV_8U);
rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
circle(tmp, center, radius, Scalar(255, 255, 255), -1);

bitwise_and(tmp, tmp1, tmp1);

The last step is to find the contour of our tree and draw it on the original picture.

findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
    tmp_area = contourArea(contours[k]);
    if(tmp_area > max_area)
    {
        max_area = tmp_area;
        j = k;
    }
}

approxPolyDP(contours[j], contours[j], 30, true);
convexHull(contours[j], contours[j]);

drawContours(original, contours, j, Scalar(0, 0, 255), 3);

I’m sorry but at the moment I have a bad connection so it is not possible for me to upload pictures. I’ll try to do it later.

Merry Christmas.

EDIT:

Here some pictures of the final output:


回答 3

我在Matlab R2007a中编写了代码。我用k均值粗略提取了圣诞树。我将只用一张图像显示中间结果,而用全部六个图像显示最终结果。

首先,我将RGB空间映射到Lab空间,这可以增强b通道中红色的对比度:

colorTransform = makecform('srgb2lab');
I = applycform(I, colorTransform);
L = double(I(:,:,1));
a = double(I(:,:,2));
b = double(I(:,:,3));

在此处输入图片说明

除了色彩空间中的功能外,我还使用了与邻域相关的纹理功能,而不是与每个像素本身相关。在这里,我将三个原始通道(R,G,B)的强度线性组合。我采用这种格式的原因是,图片中的圣诞树上都有红色的灯光,有时还有绿色/有时是蓝色的照明。

R=double(Irgb(:,:,1));
G=double(Irgb(:,:,2));
B=double(Irgb(:,:,3));
I0 = (3*R + max(G,B)-min(G,B))/2;

在此处输入图片说明

我在其上应用了3X3局部二进制模式I0,将中心像素用作阈值,并通过计算阈值以上的平均像素强度值与阈值以下的平均值之间的差来获得对比度。

I0_copy = zeros(size(I0));
for i = 2 : size(I0,1) - 1
    for j = 2 : size(I0,2) - 1
        tmp = I0(i-1:i+1,j-1:j+1) >= I0(i,j);
        I0_copy(i,j) = mean(mean(tmp.*I0(i-1:i+1,j-1:j+1))) - ...
            mean(mean(~tmp.*I0(i-1:i+1,j-1:j+1))); % Contrast
    end
end

在此处输入图片说明

由于我总共有4个特征,因此我将在聚类方法中选择K = 5。k-means的代码如下所示(来自Andrew Ng博士的机器学习类。我之前参加过该类,我自己在程序设计中编写了代码)。

[centroids, idx] = runkMeans(X, initial_centroids, max_iters);
mask=reshape(idx,img_size(1),img_size(2));

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [centroids, idx] = runkMeans(X, initial_centroids, ...
                                  max_iters, plot_progress)
   [m n] = size(X);
   K = size(initial_centroids, 1);
   centroids = initial_centroids;
   previous_centroids = centroids;
   idx = zeros(m, 1);

   for i=1:max_iters    
      % For each example in X, assign it to the closest centroid
      idx = findClosestCentroids(X, centroids);

      % Given the memberships, compute new centroids
      centroids = computeCentroids(X, idx, K);

   end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function idx = findClosestCentroids(X, centroids)
   K = size(centroids, 1);
   idx = zeros(size(X,1), 1);
   for xi = 1:size(X,1)
      x = X(xi, :);
      % Find closest centroid for x.
      best = Inf;
      for mui = 1:K
        mu = centroids(mui, :);
        d = dot(x - mu, x - mu);
        if d < best
           best = d;
           idx(xi) = mui;
        end
      end
   end 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function centroids = computeCentroids(X, idx, K)
   [m n] = size(X);
   centroids = zeros(K, n);
   for mui = 1:K
      centroids(mui, :) = sum(X(idx == mui, :)) / sum(idx == mui);
   end

由于该程序在我的计算机上运行非常慢,因此我只运行了3次迭代。通常,停止条件是(i)迭代时间至少为10,或(ii)质心不再变化。以我的测试而言,增加迭代次数可能会更准确地区分背景(天空和树木,天空和建筑物等),但在圣诞树提取中并未显示出明显的变化。还要注意,k均值不能不受随机质心初始化的影响,因此建议多次运行该程序进行比较。

在k均值之后,I0选择具有最大强度的标记区域。并使用边界跟踪提取边界。对我来说,最后一棵圣诞树是最难提取的圣诞树,因为该图片中的对比度不如前五棵圣诞树高。我方法中的另一个问题是,我bwboundaries在Matlab中使用函数来跟踪边界,但是有时在第3、5、6个结果中也可以看到内部边界。圣诞树上的阴暗面不仅无法与发光面聚在一起,而且还导致了许多细微的内部边界追踪(imfill改善不多)。总之我的算法还有很大的改进空间。

一些出版物指出,均值平移可能比k均值更健壮,并且许多 基于图割的算法在复杂的边界分割上也很有竞争力。我自己编写了均值漂移算法,似乎可以在没有足够光线的情况下更好地提取区域。但是均值移动有点过分,需要一些合并策略。它在我的计算机上的运行速度甚至比k-means慢得多,恐怕我不得不放弃它。我热切期待看到其他人将通过上述现代算法在此处提交出色的结果。

但是我始终相信特征选择是图像分割中的关键组成部分。选择适当的特征以使对象和背景之间的余量最大化,许多分割算法肯定会起作用。不同的算法可能会将结果从1提高到10,但是功能选择可能会将结果从0提高到1。

圣诞节快乐 !

I wrote the code in Matlab R2007a. I used k-means to roughly extract the christmas tree. I will show my intermediate result only with one image, and final results with all the six.

First, I mapped the RGB space onto Lab space, which could enhance the contrast of red in its b channel:

colorTransform = makecform('srgb2lab');
I = applycform(I, colorTransform);
L = double(I(:,:,1));
a = double(I(:,:,2));
b = double(I(:,:,3));

enter image description here

Besides the feature in color space, I also used texture feature that is relevant with the neighborhood rather than each pixel itself. Here I linearly combined the intensity from the 3 original channels (R,G,B). The reason why I formatted this way is because the christmas trees in the picture all have red lights on them, and sometimes green/sometimes blue illumination as well.

R=double(Irgb(:,:,1));
G=double(Irgb(:,:,2));
B=double(Irgb(:,:,3));
I0 = (3*R + max(G,B)-min(G,B))/2;

enter image description here

I applied a 3X3 local binary pattern on I0, used the center pixel as the threshold, and obtained the contrast by calculating the difference between the mean pixel intensity value above the threshold and the mean value below it.

I0_copy = zeros(size(I0));
for i = 2 : size(I0,1) - 1
    for j = 2 : size(I0,2) - 1
        tmp = I0(i-1:i+1,j-1:j+1) >= I0(i,j);
        I0_copy(i,j) = mean(mean(tmp.*I0(i-1:i+1,j-1:j+1))) - ...
            mean(mean(~tmp.*I0(i-1:i+1,j-1:j+1))); % Contrast
    end
end

enter image description here

Since I have 4 features in total, I would choose K=5 in my clustering method. The code for k-means are shown below (it is from Dr. Andrew Ng’s machine learning course. I took the course before, and I wrote the code myself in his programming assignment).

[centroids, idx] = runkMeans(X, initial_centroids, max_iters);
mask=reshape(idx,img_size(1),img_size(2));

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [centroids, idx] = runkMeans(X, initial_centroids, ...
                                  max_iters, plot_progress)
   [m n] = size(X);
   K = size(initial_centroids, 1);
   centroids = initial_centroids;
   previous_centroids = centroids;
   idx = zeros(m, 1);

   for i=1:max_iters    
      % For each example in X, assign it to the closest centroid
      idx = findClosestCentroids(X, centroids);

      % Given the memberships, compute new centroids
      centroids = computeCentroids(X, idx, K);

   end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function idx = findClosestCentroids(X, centroids)
   K = size(centroids, 1);
   idx = zeros(size(X,1), 1);
   for xi = 1:size(X,1)
      x = X(xi, :);
      % Find closest centroid for x.
      best = Inf;
      for mui = 1:K
        mu = centroids(mui, :);
        d = dot(x - mu, x - mu);
        if d < best
           best = d;
           idx(xi) = mui;
        end
      end
   end 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function centroids = computeCentroids(X, idx, K)
   [m n] = size(X);
   centroids = zeros(K, n);
   for mui = 1:K
      centroids(mui, :) = sum(X(idx == mui, :)) / sum(idx == mui);
   end

Since the program runs very slow in my computer, I just ran 3 iterations. Normally the stop criteria is (i) iteration time at least 10, or (ii) no change on the centroids any more. To my test, increasing the iteration may differentiate the background (sky and tree, sky and building,…) more accurately, but did not show a drastic changes in christmas tree extraction. Also note k-means is not immune to the random centroid initialization, so running the program several times to make a comparison is recommended.

After the k-means, the labelled region with the maximum intensity of I0 was chosen. And boundary tracing was used to extracted the boundaries. To me, the last christmas tree is the most difficult one to extract since the contrast in that picture is not high enough as they are in the first five. Another issue in my method is that I used bwboundaries function in Matlab to trace the boundary, but sometimes the inner boundaries are also included as you can observe in 3rd, 5th, 6th results. The dark side within the christmas trees are not only failed to be clustered with the illuminated side, but they also lead to so many tiny inner boundaries tracing (imfill doesn’t improve very much). In all my algorithm still has a lot improvement space.

Some publications indicates that mean-shift may be more robust than k-means, and many graph-cut based algorithms are also very competitive on complicated boundaries segmentation. I wrote a mean-shift algorithm myself, it seems to better extract the regions without enough light. But mean-shift is a little bit over-segmented, and some strategy of merging is needed. It ran even much slower than k-means in my computer, I am afraid I have to give it up. I eagerly look forward to see others would submit excellent results here with those modern algorithms mentioned above.

Yet I always believe the feature selection is the key component in image segmentation. With a proper feature selection that can maximize the margin between object and background, many segmentation algorithms will definitely work. Different algorithms may improve the result from 1 to 10, but the feature selection may improve it from 0 to 1.

Merry Christmas !


回答 4

这是我使用传统图像处理方法的最后一篇文章。

在这里,我以某种方式结合了其他两个建议,甚至取得了更好的结果。事实上,我看不到这些结果如何更好(尤其是当您查看该方法生成的蒙版图像时)。

该方法的核心是三个关键假设的组合:

  1. 图像在树状区域中应该有很大的波动
  2. 图像在树状区域中应具有更高的强度
  3. 背景区域应具有较低的强度,并且大部分为蓝色

考虑到这些假设,该方法的工作方式如下:

  1. 将图像转换为HSV
  2. 用LoG滤波器过滤V通道
  3. 对LoG滤波图像应用硬阈值以获得“活动”蒙版A
  4. 对V通道应用硬阈值以获得强度遮罩B
  5. 应用H通道阈值以将低强度的蓝色区域捕获到背景遮罩C中
  6. 使用AND合并蒙版以获得最终蒙版
  7. 扩展遮罩以扩大区域并连接分散的像素
  8. 消除小区域并获得最终的蒙版,该蒙版最终仅代表树

这是MATLAB中的代码(同样,脚本将所有jpg图像加载到当前文件夹中,同样,这并不是一段经过优化的代码):

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
imgs={};
images={}; 
blur_images={}; 
log_image={}; 
dilated_image={};
int_image={};
back_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;

for i=1:num, 
    % load original image
    imgs{end+1}=imread(ims(i).name);

    % convert to HSV colorspace
    images{end+1}=rgb2hsv(imgs{i});

    % apply laplacian filtering and heuristic hard thresholding
    val_thres = (max(max(images{i}(:,:,3)))/thres_div);
    log_image{end+1} = imfilter( images{i}(:,:,3),fspecial('log')) > val_thres;

    % get the most bright regions of the image
    int_thres = 0.26*max(max( images{i}(:,:,3)));
    int_image{end+1} = images{i}(:,:,3) > int_thres;

    % get the most probable background regions of the image
    back_image{end+1} = images{i}(:,:,1)>(150/360) & images{i}(:,:,1)<(320/360) & images{i}(:,:,3)<0.5;

    % compute the final binary image by combining 
    % high 'activity' with high intensity
    bin_image{end+1} = logical( log_image{i}) & logical( int_image{i}) & ~logical( back_image{i});

    % apply morphological dilation to connect distonnected components
    strel_size = round(0.01*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));

    % do some measurements to eliminate small objects
    measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');

    % iterative enlargement of the structuring element for better connectivity
    while length(measurements{i})>14 && strel_size<(min(size(imgs{i}(:,:,1)))/2),
        strel_size = round( 1.5 * strel_size);
        dilated_image{i} = imdilate( bin_image{i}, strel('disk',strel_size));
        measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
    end

    for m=1:length(measurements{i})
        if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
            dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    % make sure the dilated image is the same size with the original
    dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_image{i});
    if isempty( y)
        box{end+1}=[];
    else
        box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end
end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(box{i})
        hold on;
        rr = rectangle( 'position', box{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end

结果

结果

高分辨率结果仍可在这里!
在这里可以找到更多带有其他图像的实验。

This is my final post using the traditional image processing approaches…

Here I somehow combine my two other proposals, achieving even better results. As a matter of fact I cannot see how these results could be better (especially when you look at the masked images that the method produces).

At the heart of the approach is the combination of three key assumptions:

  1. Images should have high fluctuations in the tree regions
  2. Images should have higher intensity in the tree regions
  3. Background regions should have low intensity and be mostly blue-ish

With these assumptions in mind the method works as follows:

  1. Convert the images to HSV
  2. Filter the V channel with a LoG filter
  3. Apply hard thresholding on LoG filtered image to get ‘activity’ mask A
  4. Apply hard thresholding to V channel to get intensity mask B
  5. Apply H channel thresholding to capture low intensity blue-ish regions into background mask C
  6. Combine masks using AND to get the final mask
  7. Dilate the mask to enlarge regions and connect dispersed pixels
  8. Eliminate small regions and get the final mask which will eventually represent only the tree

Here is the code in MATLAB (again, the script loads all jpg images in the current folder and, again, this is far from being an optimized piece of code):

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
imgs={};
images={}; 
blur_images={}; 
log_image={}; 
dilated_image={};
int_image={};
back_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;

for i=1:num, 
    % load original image
    imgs{end+1}=imread(ims(i).name);

    % convert to HSV colorspace
    images{end+1}=rgb2hsv(imgs{i});

    % apply laplacian filtering and heuristic hard thresholding
    val_thres = (max(max(images{i}(:,:,3)))/thres_div);
    log_image{end+1} = imfilter( images{i}(:,:,3),fspecial('log')) > val_thres;

    % get the most bright regions of the image
    int_thres = 0.26*max(max( images{i}(:,:,3)));
    int_image{end+1} = images{i}(:,:,3) > int_thres;

    % get the most probable background regions of the image
    back_image{end+1} = images{i}(:,:,1)>(150/360) & images{i}(:,:,1)<(320/360) & images{i}(:,:,3)<0.5;

    % compute the final binary image by combining 
    % high 'activity' with high intensity
    bin_image{end+1} = logical( log_image{i}) & logical( int_image{i}) & ~logical( back_image{i});

    % apply morphological dilation to connect distonnected components
    strel_size = round(0.01*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));

    % do some measurements to eliminate small objects
    measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');

    % iterative enlargement of the structuring element for better connectivity
    while length(measurements{i})>14 && strel_size<(min(size(imgs{i}(:,:,1)))/2),
        strel_size = round( 1.5 * strel_size);
        dilated_image{i} = imdilate( bin_image{i}, strel('disk',strel_size));
        measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
    end

    for m=1:length(measurements{i})
        if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
            dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    % make sure the dilated image is the same size with the original
    dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_image{i});
    if isempty( y)
        box{end+1}=[];
    else
        box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end
end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(box{i})
        hold on;
        rr = rectangle( 'position', box{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end

Results

results

High resolution results still available here!
Even more experiments with additional images can be found here.


回答 5

我的解决步骤:

  1. 获取R通道(从RGB)-我们在此通道上进行的所有操作:

  2. 创建兴趣区(ROI)

    • 最小值为149的阈值R通道(右上图)

    • 扩大结果区域(左中图)

  3. 在计算的投资回报率中检测矿石。树有很多边缘(右中图)

    • 膨胀结果

    • 半径较大的腐蚀(左下图)

  4. 选择最大的(按区域)对象-这是结果区域

  5. ConvexHull(树是凸多边形)(右下图)

  6. 边界框(右下图-grren框)

一步步: 在此处输入图片说明

第一个结果-最简单但不是开源软件-“ Adaptive Vision Studio + Adaptive Vision Library”:这不是开源的,但原型制作起来确实非常快:

完整的圣诞树检测算法(11个块): AVL解决方案

下一步。我们需要开源解决方案。将AVL滤镜更改为OpenCV滤镜:在这里,我进行了一些更改,例如,“边缘检测”使用cvCanny滤镜,以尊重roi,我确实将区域图像与边缘图像相乘,选择了我使用的最大元素findContours + outlineArea,但是想法是相同的。

https://www.youtube.com/watch?v=sfjB3MigLH0&index=1&list=UUpSRrkMHNHiLDXgylwhWNQQ

OpenCV解决方案

我现在无法显示具有中间步骤的图像,因为我只能放置2个链接。

好的,现在我们使用openSource过滤器,但它还不是全部开源。最后一步-移植到C ++代码。我在版本2.4.4中使用了OpenCV

最终的c ++代码的结果是: 在此处输入图片说明

C ++代码也很短:

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/opencv.hpp"
#include <algorithm>
using namespace cv;

int main()
{

    string images[6] = {"..\\1.png","..\\2.png","..\\3.png","..\\4.png","..\\5.png","..\\6.png"};

    for(int i = 0; i < 6; ++i)
    {
        Mat img, thresholded, tdilated, tmp, tmp1;
        vector<Mat> channels(3);

        img = imread(images[i]);
        split(img, channels);
        threshold( channels[2], thresholded, 149, 255, THRESH_BINARY);                      //prepare ROI - threshold
        dilate( thresholded, tdilated,  getStructuringElement( MORPH_RECT, Size(22,22) ) ); //prepare ROI - dilate
        Canny( channels[2], tmp, 75, 125, 3, true );    //Canny edge detection
        multiply( tmp, tdilated, tmp1 );    // set ROI

        dilate( tmp1, tmp, getStructuringElement( MORPH_RECT, Size(20,16) ) ); // dilate
        erode( tmp, tmp1, getStructuringElement( MORPH_RECT, Size(36,36) ) ); // erode

        vector<vector<Point> > contours, contours1(1);
        vector<Point> convex;
        vector<Vec4i> hierarchy;
        findContours( tmp1, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0) );

        //get element of maximum area
        //int bestID = std::max_element( contours.begin(), contours.end(), 
        //  []( const vector<Point>& A, const vector<Point>& B ) { return contourArea(A) < contourArea(B); } ) - contours.begin();

            int bestID = 0;
        int bestArea = contourArea( contours[0] );
        for( int i = 1; i < contours.size(); ++i )
        {
            int area = contourArea( contours[i] );
            if( area > bestArea )
            {
                bestArea  = area;
                bestID = i;
            }
        }

        convexHull( contours[bestID], contours1[0] ); 
        drawContours( img, contours1, 0, Scalar( 100, 100, 255 ), img.rows / 100, 8, hierarchy, 0, Point() );

        imshow("image", img );
        waitKey(0);
    }


    return 0;
}

My solution steps:

  1. Get R channel (from RGB) – all operations we make on this channel:

  2. Create Region of Interest (ROI)

    • Threshold R channel with min value 149 (top right image)

    • Dilate result region (middle left image)

  3. Detect eges in computed roi. Tree has a lot of edges (middle right image)

    • Dilate result

    • Erode with bigger radius ( bottom left image)

  4. Select the biggest (by area) object – it’s the result region

  5. ConvexHull ( tree is convex polygon ) ( bottom right image )

  6. Bounding box (bottom right image – grren box )

Step by step: enter image description here

The first result – most simple but not in open source software – “Adaptive Vision Studio + Adaptive Vision Library”: This is not open source but really fast to prototype:

Whole algorithm to detect christmas tree (11 blocks): AVL solution

Next step. We want open source solution. Change AVL filters to OpenCV filters: Here I did little changes e.g. Edge Detection use cvCanny filter, to respect roi i did multiply region image with edges image, to select the biggest element i used findContours + contourArea but idea is the same.

https://www.youtube.com/watch?v=sfjB3MigLH0&index=1&list=UUpSRrkMHNHiLDXgylwhWNQQ

OpenCV solution

I can’t show images with intermediate steps now because I can put only 2 links.

Ok now we use openSource filters but it’s not still whole open source. Last step – port to c++ code. I used OpenCV in version 2.4.4

The result of final c++ code is: enter image description here

c++ code is also quite short:

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/opencv.hpp"
#include <algorithm>
using namespace cv;

int main()
{

    string images[6] = {"..\\1.png","..\\2.png","..\\3.png","..\\4.png","..\\5.png","..\\6.png"};

    for(int i = 0; i < 6; ++i)
    {
        Mat img, thresholded, tdilated, tmp, tmp1;
        vector<Mat> channels(3);

        img = imread(images[i]);
        split(img, channels);
        threshold( channels[2], thresholded, 149, 255, THRESH_BINARY);                      //prepare ROI - threshold
        dilate( thresholded, tdilated,  getStructuringElement( MORPH_RECT, Size(22,22) ) ); //prepare ROI - dilate
        Canny( channels[2], tmp, 75, 125, 3, true );    //Canny edge detection
        multiply( tmp, tdilated, tmp1 );    // set ROI

        dilate( tmp1, tmp, getStructuringElement( MORPH_RECT, Size(20,16) ) ); // dilate
        erode( tmp, tmp1, getStructuringElement( MORPH_RECT, Size(36,36) ) ); // erode

        vector<vector<Point> > contours, contours1(1);
        vector<Point> convex;
        vector<Vec4i> hierarchy;
        findContours( tmp1, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0) );

        //get element of maximum area
        //int bestID = std::max_element( contours.begin(), contours.end(), 
        //  []( const vector<Point>& A, const vector<Point>& B ) { return contourArea(A) < contourArea(B); } ) - contours.begin();

            int bestID = 0;
        int bestArea = contourArea( contours[0] );
        for( int i = 1; i < contours.size(); ++i )
        {
            int area = contourArea( contours[i] );
            if( area > bestArea )
            {
                bestArea  = area;
                bestID = i;
            }
        }

        convexHull( contours[bestID], contours1[0] ); 
        drawContours( img, contours1, 0, Scalar( 100, 100, 255 ), img.rows / 100, 8, hierarchy, 0, Point() );

        imshow("image", img );
        waitKey(0);
    }


    return 0;
}

回答 6

…另一种老式解决方案-完全基于HSV处理

  1. 将图像转换为HSV色彩空间
  2. 根据HSV中的启发式方法创建蒙版(请参见下文)
  3. 对面罩进行形态学扩张以连接断开的区域
  4. 丢弃小区域和水平块(记住树是垂直块)
  5. 计算边界框

一个字的启发式在HSV处理:

  1. 色调(H)在210-320度之间的所有物体被丢弃为蓝紫色,应该是在背景或不相关的区域
  2. 一切与值(V)降低40%也被丢弃,因为太暗是相关

当然,可以尝试许多其他可能性来微调这种方法。

这是实现此技巧的MATLAB代码(警告:代码远未优化!!!我使用了不推荐用于MATLAB编程的技术,只是为了能够跟踪过程中的任何内容,因此可以对其进行极大地优化):

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
num=length(ims);

imgs={};
hsvs={}; 
masks={};
dilated_images={};
measurements={};
boxs={};

for i=1:num, 
    % load original image
    imgs{end+1} = imread(ims(i).name);
    flt_x_size = round(size(imgs{i},2)*0.005);
    flt_y_size = round(size(imgs{i},1)*0.005);
    flt = fspecial( 'average', max( flt_y_size, flt_x_size));
    imgs{i} = imfilter( imgs{i}, flt, 'same');
    % convert to HSV colorspace
    hsvs{end+1} = rgb2hsv(imgs{i});
    % apply a hard thresholding and binary operation to construct the mask
    masks{end+1} = medfilt2( ~(hsvs{i}(:,:,1)>(210/360) & hsvs{i}(:,:,1)<(320/360))&hsvs{i}(:,:,3)>0.4);
    % apply morphological dilation to connect distonnected components
    strel_size = round(0.03*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_images{end+1} = imdilate( masks{i}, strel('disk',strel_size));
    % do some measurements to eliminate small objects
    measurements{i} = regionprops( dilated_images{i},'Perimeter','Area','BoundingBox'); 
    for m=1:length(measurements{i})
        if (measurements{i}(m).Area < 0.02*numel( dilated_images{i})) || (measurements{i}(m).BoundingBox(3)>1.2*measurements{i}(m).BoundingBox(4))
            dilated_images{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    dilated_images{i} = dilated_images{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_images{i});
    if isempty( y)
        boxs{end+1}=[];
    else
        boxs{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end

end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(boxs{i})
        hold on;
        rr = rectangle( 'position', boxs{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_images{i},[1 1 3])));
end

结果:

在结果中,我显示了蒙版的图像和边界框。 在此处输入图片说明

…another old fashioned solution – purely based on HSV processing:

  1. Convert images to the HSV colorspace
  2. Create masks according to heuristics in the HSV (see below)
  3. Apply morphological dilation to the mask to connect disconnected areas
  4. Discard small areas and horizontal blocks (remember trees are vertical blocks)
  5. Compute the bounding box

A word on the heuristics in the HSV processing:

  1. everything with Hues (H) between 210 – 320 degrees is discarded as blue-magenta that is supposed to be in the background or in non-relevant areas
  2. everything with Values (V) lower that 40% is also discarded as being too dark to be relevant

Of course one may experiment with numerous other possibilities to fine-tune this approach…

Here is the MATLAB code to do the trick (warning: the code is far from being optimized!!! I used techniques not recommended for MATLAB programming just to be able to track anything in the process-this can be greatly optimized):

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
num=length(ims);

imgs={};
hsvs={}; 
masks={};
dilated_images={};
measurements={};
boxs={};

for i=1:num, 
    % load original image
    imgs{end+1} = imread(ims(i).name);
    flt_x_size = round(size(imgs{i},2)*0.005);
    flt_y_size = round(size(imgs{i},1)*0.005);
    flt = fspecial( 'average', max( flt_y_size, flt_x_size));
    imgs{i} = imfilter( imgs{i}, flt, 'same');
    % convert to HSV colorspace
    hsvs{end+1} = rgb2hsv(imgs{i});
    % apply a hard thresholding and binary operation to construct the mask
    masks{end+1} = medfilt2( ~(hsvs{i}(:,:,1)>(210/360) & hsvs{i}(:,:,1)<(320/360))&hsvs{i}(:,:,3)>0.4);
    % apply morphological dilation to connect distonnected components
    strel_size = round(0.03*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_images{end+1} = imdilate( masks{i}, strel('disk',strel_size));
    % do some measurements to eliminate small objects
    measurements{i} = regionprops( dilated_images{i},'Perimeter','Area','BoundingBox'); 
    for m=1:length(measurements{i})
        if (measurements{i}(m).Area < 0.02*numel( dilated_images{i})) || (measurements{i}(m).BoundingBox(3)>1.2*measurements{i}(m).BoundingBox(4))
            dilated_images{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    dilated_images{i} = dilated_images{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_images{i});
    if isempty( y)
        boxs{end+1}=[];
    else
        boxs{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end

end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(boxs{i})
        hold on;
        rr = rectangle( 'position', boxs{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_images{i},[1 1 3])));
end

Results:

In the results I show the masked image and the bounding box. enter image description here


回答 7

一些老式的图像处理方法…
这个想法是基于这样的假设,即图像在通常较暗和较平滑的背景(在某些情况下为前景)上描绘了发光的树。该点燃树面积更“有活力”,具有较高的强度
流程如下:

  1. 转换为灰度
  2. 应用LoG过滤以获取最“活跃”的区域
  3. 应用专心的阈值以获得最明亮的区域
  4. 结合之前的2个以获得初步的蒙版
  5. 应用形态学扩张来扩大区域并连接相邻的组件
  6. 根据面积缩小消除候选区域

您得到的是每个图像的二进制掩码和边界框。

这是使用这种幼稚技术的结果: 在此处输入图片说明

MATLAB上 的代码如下:该代码在包含JPG图像的文件夹上运行。加载所有图像并返回检测到的结果。

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
imgs={};
images={}; 
blur_images={}; 
log_image={}; 
dilated_image={};
int_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;

for i=1:num, 
    % load original image
    imgs{end+1}=imread(ims(i).name);

    % convert to grayscale
    images{end+1}=rgb2gray(imgs{i});

    % apply laplacian filtering and heuristic hard thresholding
    val_thres = (max(max(images{i}))/thres_div);
    log_image{end+1} = imfilter( images{i},fspecial('log')) > val_thres;

    % get the most bright regions of the image
    int_thres = 0.26*max(max( images{i}));
    int_image{end+1} = images{i} > int_thres;

    % compute the final binary image by combining 
    % high 'activity' with high intensity
    bin_image{end+1} = log_image{i} .* int_image{i};

    % apply morphological dilation to connect distonnected components
    strel_size = round(0.01*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));

    % do some measurements to eliminate small objects
    measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
    for m=1:length(measurements{i})
        if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
            dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    % make sure the dilated image is the same size with the original
    dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_image{i});
    if isempty( y)
        box{end+1}=[];
    else
        box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end
end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(box{i})
        hold on;
        rr = rectangle( 'position', box{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end

Some old-fashioned image processing approach…
The idea is based on the assumption that images depict lighted trees on typically darker and smoother backgrounds (or foregrounds in some cases). The lighted tree area is more “energetic” and has higher intensity.
The process is as follows:

  1. Convert to graylevel
  2. Apply LoG filtering to get the most “active” areas
  3. Apply an intentisy thresholding to get the most bright areas
  4. Combine the previous 2 to get a preliminary mask
  5. Apply a morphological dilation to enlarge areas and connect neighboring components
  6. Eliminate small candidate areas according to their area size

What you get is a binary mask and a bounding box for each image.

Here are the results using this naive technique: enter image description here

Code on MATLAB follows: The code runs on a folder with JPG images. Loads all images and returns detected results.

% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;

% initialization
ims=dir('./*.jpg');
imgs={};
images={}; 
blur_images={}; 
log_image={}; 
dilated_image={};
int_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;

for i=1:num, 
    % load original image
    imgs{end+1}=imread(ims(i).name);

    % convert to grayscale
    images{end+1}=rgb2gray(imgs{i});

    % apply laplacian filtering and heuristic hard thresholding
    val_thres = (max(max(images{i}))/thres_div);
    log_image{end+1} = imfilter( images{i},fspecial('log')) > val_thres;

    % get the most bright regions of the image
    int_thres = 0.26*max(max( images{i}));
    int_image{end+1} = images{i} > int_thres;

    % compute the final binary image by combining 
    % high 'activity' with high intensity
    bin_image{end+1} = log_image{i} .* int_image{i};

    % apply morphological dilation to connect distonnected components
    strel_size = round(0.01*max(size(imgs{i})));        % structuring element for morphological dilation
    dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));

    % do some measurements to eliminate small objects
    measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
    for m=1:length(measurements{i})
        if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
            dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
                round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
        end
    end
    % make sure the dilated image is the same size with the original
    dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
    % compute the bounding box
    [y,x] = find( dilated_image{i});
    if isempty( y)
        box{end+1}=[];
    else
        box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
    end
end 

%%% additional code to display things
for i=1:num,
    figure;
    subplot(121);
    colormap gray;
    imshow( imgs{i});
    if ~isempty(box{i})
        hold on;
        rr = rectangle( 'position', box{i});
        set( rr, 'EdgeColor', 'r');
        hold off;
    end
    subplot(122);
    imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end

回答 8

使用与我所见不同的方法,我创建了一个 通过它们的灯光检测圣诞树的脚本。结果始终是对称的三角形,如有必要,还可以使用数字值,例如树的角度(“脂肪度”)。

显然,此算法的最大威胁是(大量)或树前(更大的问题,直到进一步优化)旁边的灯。编辑(添加):做不到的事情:找出是否有一棵圣诞树,在一幅图像中找到多棵圣诞树,正确检测拉斯维加斯中部的圣诞前夜树,检测严重弯曲的圣诞树,倒置或切碎…;)

不同的阶段是:

  • 计算每个像素的附加亮度(R + G + B)
  • 将每个像素上方所有8个相邻像素的值相加
  • 按此值对所有像素进行排名(最亮的优先)-我知道,不是很细微…
  • 从顶部开始选择N个,跳过距离太近的
  • 计算 这前N个(给我们大约树的中心)
  • 从中位数位置开始,在一个加宽的搜索光束中,从所选的最亮光源发出的最上面的光线(人们倾向于在最上方放置至少一个光线)
  • 从那里开始,想象线条向左和向右向下倾斜60度(圣诞节树不应该那么胖)
  • 降低60度,直到20%的最亮的光不在此三角形内
  • 在三角形的最底部找到光线,为您提供树的下部水平边框
  • 完成了

标记说明:

  • 树中心的大红十字:N个最亮的灯光的中位数
  • 从上方向上的虚线:“搜索光束”为树的顶部
  • 较小的红十字:树顶
  • 很小的红叉:所有N个最亮的灯
  • 红色三角形:D!

源代码:

<?php

ini_set('memory_limit', '1024M');

header("Content-type: image/png");

$chosenImage = 6;

switch($chosenImage){
    case 1:
        $inputImage     = imagecreatefromjpeg("nmzwj.jpg");
        break;
    case 2:
        $inputImage     = imagecreatefromjpeg("2y4o5.jpg");
        break;
    case 3:
        $inputImage     = imagecreatefromjpeg("YowlH.jpg");
        break;
    case 4:
        $inputImage     = imagecreatefromjpeg("2K9Ef.jpg");
        break;
    case 5:
        $inputImage     = imagecreatefromjpeg("aVZhC.jpg");
        break;
    case 6:
        $inputImage     = imagecreatefromjpeg("FWhSP.jpg");
        break;
    case 7:
        $inputImage     = imagecreatefromjpeg("roemerberg.jpg");
        break;
    default:
        exit();
}

// Process the loaded image

$topNspots = processImage($inputImage);

imagejpeg($inputImage);
imagedestroy($inputImage);

// Here be functions

function processImage($image) {
    $orange = imagecolorallocate($image, 220, 210, 60);
    $black = imagecolorallocate($image, 0, 0, 0);
    $red = imagecolorallocate($image, 255, 0, 0);

    $maxX = imagesx($image)-1;
    $maxY = imagesy($image)-1;

    // Parameters
    $spread = 1; // Number of pixels to each direction that will be added up
    $topPositions = 80; // Number of (brightest) lights taken into account
    $minLightDistance = round(min(array($maxX, $maxY)) / 30); // Minimum number of pixels between the brigtests lights
    $searchYperX = 5; // spread of the "search beam" from the median point to the top

    $renderStage = 3; // 1 to 3; exits the process early


    // STAGE 1
    // Calculate the brightness of each pixel (R+G+B)

    $maxBrightness = 0;
    $stage1array = array();

    for($row = 0; $row <= $maxY; $row++) {

        $stage1array[$row] = array();

        for($col = 0; $col <= $maxX; $col++) {

            $rgb = imagecolorat($image, $col, $row);
            $brightness = getBrightnessFromRgb($rgb);
            $stage1array[$row][$col] = $brightness;

            if($renderStage == 1){
                $brightnessToGrey = round($brightness / 765 * 256);
                $greyRgb = imagecolorallocate($image, $brightnessToGrey, $brightnessToGrey, $brightnessToGrey);
                imagesetpixel($image, $col, $row, $greyRgb);
            }

            if($brightness > $maxBrightness) {
                $maxBrightness = $brightness;
                if($renderStage == 1){
                    imagesetpixel($image, $col, $row, $red);
                }
            }
        }
    }
    if($renderStage == 1) {
        return;
    }


    // STAGE 2
    // Add up brightness of neighbouring pixels

    $stage2array = array();
    $maxStage2 = 0;

    for($row = 0; $row <= $maxY; $row++) {
        $stage2array[$row] = array();

        for($col = 0; $col <= $maxX; $col++) {
            if(!isset($stage2array[$row][$col])) $stage2array[$row][$col] = 0;

            // Look around the current pixel, add brightness
            for($y = $row-$spread; $y <= $row+$spread; $y++) {
                for($x = $col-$spread; $x <= $col+$spread; $x++) {

                    // Don't read values from outside the image
                    if($x >= 0 && $x <= $maxX && $y >= 0 && $y <= $maxY){
                        $stage2array[$row][$col] += $stage1array[$y][$x]+10;
                    }
                }
            }

            $stage2value = $stage2array[$row][$col];
            if($stage2value > $maxStage2) {
                $maxStage2 = $stage2value;
            }
        }
    }

    if($renderStage >= 2){
        // Paint the accumulated light, dimmed by the maximum value from stage 2
        for($row = 0; $row <= $maxY; $row++) {
            for($col = 0; $col <= $maxX; $col++) {
                $brightness = round($stage2array[$row][$col] / $maxStage2 * 255);
                $greyRgb = imagecolorallocate($image, $brightness, $brightness, $brightness);
                imagesetpixel($image, $col, $row, $greyRgb);
            }
        }
    }

    if($renderStage == 2) {
        return;
    }


    // STAGE 3

    // Create a ranking of bright spots (like "Top 20")
    $topN = array();

    for($row = 0; $row <= $maxY; $row++) {
        for($col = 0; $col <= $maxX; $col++) {

            $stage2Brightness = $stage2array[$row][$col];
            $topN[$col.":".$row] = $stage2Brightness;
        }
    }
    arsort($topN);

    $topNused = array();
    $topPositionCountdown = $topPositions;

    if($renderStage == 3){
        foreach ($topN as $key => $val) {
            if($topPositionCountdown <= 0){
                break;
            }

            $position = explode(":", $key);

            foreach($topNused as $usedPosition => $usedValue) {
                $usedPosition = explode(":", $usedPosition);
                $distance = abs($usedPosition[0] - $position[0]) + abs($usedPosition[1] - $position[1]);
                if($distance < $minLightDistance) {
                    continue 2;
                }
            }

            $topNused[$key] = $val;

            paintCrosshair($image, $position[0], $position[1], $red, 2);

            $topPositionCountdown--;

        }
    }


    // STAGE 4
    // Median of all Top N lights
    $topNxValues = array();
    $topNyValues = array();

    foreach ($topNused as $key => $val) {
        $position = explode(":", $key);
        array_push($topNxValues, $position[0]);
        array_push($topNyValues, $position[1]);
    }

    $medianXvalue = round(calculate_median($topNxValues));
    $medianYvalue = round(calculate_median($topNyValues));
    paintCrosshair($image, $medianXvalue, $medianYvalue, $red, 15);


    // STAGE 5
    // Find treetop

    $filename = 'debug.log';
    $handle = fopen($filename, "w");
    fwrite($handle, "\n\n STAGE 5");

    $treetopX = $medianXvalue;
    $treetopY = $medianYvalue;

    $searchXmin = $medianXvalue;
    $searchXmax = $medianXvalue;

    $width = 0;
    for($y = $medianYvalue; $y >= 0; $y--) {
        fwrite($handle, "\nAt y = ".$y);

        if(($y % $searchYperX) == 0) { // Modulo
            $width++;
            $searchXmin = $medianXvalue - $width;
            $searchXmax = $medianXvalue + $width;
            imagesetpixel($image, $searchXmin, $y, $red);
            imagesetpixel($image, $searchXmax, $y, $red);
        }

        foreach ($topNused as $key => $val) {
            $position = explode(":", $key); // "x:y"

            if($position[1] != $y){
                continue;
            }

            if($position[0] >= $searchXmin && $position[0] <= $searchXmax){
                $treetopX = $position[0];
                $treetopY = $y;
            }
        }

    }

    paintCrosshair($image, $treetopX, $treetopY, $red, 5);


    // STAGE 6
    // Find tree sides
    fwrite($handle, "\n\n STAGE 6");

    $treesideAngle = 60; // The extremely "fat" end of a christmas tree
    $treeBottomY = $treetopY;

    $topPositionsExcluded = 0;
    $xymultiplier = 0;
    while(($topPositionsExcluded < ($topPositions / 5)) && $treesideAngle >= 1){
        fwrite($handle, "\n\nWe're at angle ".$treesideAngle);
        $xymultiplier = sin(deg2rad($treesideAngle));
        fwrite($handle, "\nMultiplier: ".$xymultiplier);

        $topPositionsExcluded = 0;
        foreach ($topNused as $key => $val) {
            $position = explode(":", $key);
            fwrite($handle, "\nAt position ".$key);

            if($position[1] > $treeBottomY) {
                $treeBottomY = $position[1];
            }

            // Lights above the tree are outside of it, but don't matter
            if($position[1] < $treetopY){
                $topPositionsExcluded++;
                fwrite($handle, "\nTOO HIGH");
                continue;
            }

            // Top light will generate division by zero
            if($treetopY-$position[1] == 0) {
                fwrite($handle, "\nDIVISION BY ZERO");
                continue;
            }

            // Lights left end right of it are also not inside
            fwrite($handle, "\nLight position factor: ".(abs($treetopX-$position[0]) / abs($treetopY-$position[1])));
            if((abs($treetopX-$position[0]) / abs($treetopY-$position[1])) > $xymultiplier){
                $topPositionsExcluded++;
                fwrite($handle, "\n --- Outside tree ---");
            }
        }

        $treesideAngle--;
    }
    fclose($handle);

    // Paint tree's outline
    $treeHeight = abs($treetopY-$treeBottomY);
    $treeBottomLeft = 0;
    $treeBottomRight = 0;
    $previousState = false; // line has not started; assumes the tree does not "leave"^^

    for($x = 0; $x <= $maxX; $x++){
        if(abs($treetopX-$x) != 0 && abs($treetopX-$x) / $treeHeight > $xymultiplier){
            if($previousState == true){
                $treeBottomRight = $x;
                $previousState = false;
            }
            continue;
        }
        imagesetpixel($image, $x, $treeBottomY, $red);
        if($previousState == false){
            $treeBottomLeft = $x;
            $previousState = true;
        }
    }
    imageline($image, $treeBottomLeft, $treeBottomY, $treetopX, $treetopY, $red);
    imageline($image, $treeBottomRight, $treeBottomY, $treetopX, $treetopY, $red);


    // Print out some parameters

    $string = "Min dist: ".$minLightDistance." | Tree angle: ".$treesideAngle." deg | Tree bottom: ".$treeBottomY;

    $px     = (imagesx($image) - 6.5 * strlen($string)) / 2;
    imagestring($image, 2, $px, 5, $string, $orange);

    return $topN;
}

/**
 * Returns values from 0 to 765
 */
function getBrightnessFromRgb($rgb) {
    $r = ($rgb >> 16) & 0xFF;
    $g = ($rgb >> 8) & 0xFF;
    $b = $rgb & 0xFF;

    return $r+$r+$b;
}

function paintCrosshair($image, $posX, $posY, $color, $size=5) {
    for($x = $posX-$size; $x <= $posX+$size; $x++) {
        if($x>=0 && $x < imagesx($image)){
            imagesetpixel($image, $x, $posY, $color);
        }
    }
    for($y = $posY-$size; $y <= $posY+$size; $y++) {
        if($y>=0 && $y < imagesy($image)){
            imagesetpixel($image, $posX, $y, $color);
        }
    }
}

// From http://www.mdj.us/web-development/php-programming/calculating-the-median-average-values-of-an-array-with-php/
function calculate_median($arr) {
    sort($arr);
    $count = count($arr); //total numbers in array
    $middleval = floor(($count-1)/2); // find the middle value, or the lowest middle value
    if($count % 2) { // odd number, middle is the median
        $median = $arr[$middleval];
    } else { // even number, calculate avg of 2 medians
        $low = $arr[$middleval];
        $high = $arr[$middleval+1];
        $median = (($low+$high)/2);
    }
    return $median;
}


?>

图片: 左上 下中心 左下 右上方 上中 右下

奖励:来自维基百科的德国人Weihnachtsbaum,网址:http://commons.wikimedia.org/wiki/File:Weihnachtsbaum_R%C3%B6merberg.jpg 罗默贝格

Using a quite different approach from what I’ve seen, I created a script that detects christmas trees by their lights. The result ist always a symmetrical triangle, and if necessary numeric values like the angle (“fatness”) of the tree.

The biggest threat to this algorithm obviously are lights next to (in great numbers) or in front of the tree (the greater problem until further optimization). Edit (added): What it can’t do: Find out if there’s a christmas tree or not, find multiple christmas trees in one image, correctly detect a cristmas tree in the middle of Las Vegas, detect christmas trees that are heavily bent, upside-down or chopped down… ;)

The different stages are:

  • Calculate the added brightness (R+G+B) for each pixel
  • Add up this value of all 8 neighbouring pixels on top of each pixel
  • Rank all pixels by this value (brightest first) – I know, not really subtle…
  • Choose N of these, starting from the top, skipping ones that are too close
  • Calculate the of these top N (gives us the approximate center of the tree)
  • Start from the median position upwards in a widening search beam for the topmost light from the selected brightest ones (people tend to put at least one light at the very top)
  • From there, imagine lines going 60 degrees left and right downwards (christmas trees shouldn’t be that fat)
  • Decrease those 60 degrees until 20% of the brightest lights are outside this triangle
  • Find the light at the very bottom of the triangle, giving you the lower horizontal border of the tree
  • Done

Explanation of the markings:

  • Big red cross in the center of the tree: Median of the top N brightest lights
  • Dotted line from there upwards: “search beam” for the top of the tree
  • Smaller red cross: top of the tree
  • Really small red crosses: All of the top N brightest lights
  • Red triangle: D’uh!

Source code:

<?php

ini_set('memory_limit', '1024M');

header("Content-type: image/png");

$chosenImage = 6;

switch($chosenImage){
    case 1:
        $inputImage     = imagecreatefromjpeg("nmzwj.jpg");
        break;
    case 2:
        $inputImage     = imagecreatefromjpeg("2y4o5.jpg");
        break;
    case 3:
        $inputImage     = imagecreatefromjpeg("YowlH.jpg");
        break;
    case 4:
        $inputImage     = imagecreatefromjpeg("2K9Ef.jpg");
        break;
    case 5:
        $inputImage     = imagecreatefromjpeg("aVZhC.jpg");
        break;
    case 6:
        $inputImage     = imagecreatefromjpeg("FWhSP.jpg");
        break;
    case 7:
        $inputImage     = imagecreatefromjpeg("roemerberg.jpg");
        break;
    default:
        exit();
}

// Process the loaded image

$topNspots = processImage($inputImage);

imagejpeg($inputImage);
imagedestroy($inputImage);

// Here be functions

function processImage($image) {
    $orange = imagecolorallocate($image, 220, 210, 60);
    $black = imagecolorallocate($image, 0, 0, 0);
    $red = imagecolorallocate($image, 255, 0, 0);

    $maxX = imagesx($image)-1;
    $maxY = imagesy($image)-1;

    // Parameters
    $spread = 1; // Number of pixels to each direction that will be added up
    $topPositions = 80; // Number of (brightest) lights taken into account
    $minLightDistance = round(min(array($maxX, $maxY)) / 30); // Minimum number of pixels between the brigtests lights
    $searchYperX = 5; // spread of the "search beam" from the median point to the top

    $renderStage = 3; // 1 to 3; exits the process early


    // STAGE 1
    // Calculate the brightness of each pixel (R+G+B)

    $maxBrightness = 0;
    $stage1array = array();

    for($row = 0; $row <= $maxY; $row++) {

        $stage1array[$row] = array();

        for($col = 0; $col <= $maxX; $col++) {

            $rgb = imagecolorat($image, $col, $row);
            $brightness = getBrightnessFromRgb($rgb);
            $stage1array[$row][$col] = $brightness;

            if($renderStage == 1){
                $brightnessToGrey = round($brightness / 765 * 256);
                $greyRgb = imagecolorallocate($image, $brightnessToGrey, $brightnessToGrey, $brightnessToGrey);
                imagesetpixel($image, $col, $row, $greyRgb);
            }

            if($brightness > $maxBrightness) {
                $maxBrightness = $brightness;
                if($renderStage == 1){
                    imagesetpixel($image, $col, $row, $red);
                }
            }
        }
    }
    if($renderStage == 1) {
        return;
    }


    // STAGE 2
    // Add up brightness of neighbouring pixels

    $stage2array = array();
    $maxStage2 = 0;

    for($row = 0; $row <= $maxY; $row++) {
        $stage2array[$row] = array();

        for($col = 0; $col <= $maxX; $col++) {
            if(!isset($stage2array[$row][$col])) $stage2array[$row][$col] = 0;

            // Look around the current pixel, add brightness
            for($y = $row-$spread; $y <= $row+$spread; $y++) {
                for($x = $col-$spread; $x <= $col+$spread; $x++) {

                    // Don't read values from outside the image
                    if($x >= 0 && $x <= $maxX && $y >= 0 && $y <= $maxY){
                        $stage2array[$row][$col] += $stage1array[$y][$x]+10;
                    }
                }
            }

            $stage2value = $stage2array[$row][$col];
            if($stage2value > $maxStage2) {
                $maxStage2 = $stage2value;
            }
        }
    }

    if($renderStage >= 2){
        // Paint the accumulated light, dimmed by the maximum value from stage 2
        for($row = 0; $row <= $maxY; $row++) {
            for($col = 0; $col <= $maxX; $col++) {
                $brightness = round($stage2array[$row][$col] / $maxStage2 * 255);
                $greyRgb = imagecolorallocate($image, $brightness, $brightness, $brightness);
                imagesetpixel($image, $col, $row, $greyRgb);
            }
        }
    }

    if($renderStage == 2) {
        return;
    }


    // STAGE 3

    // Create a ranking of bright spots (like "Top 20")
    $topN = array();

    for($row = 0; $row <= $maxY; $row++) {
        for($col = 0; $col <= $maxX; $col++) {

            $stage2Brightness = $stage2array[$row][$col];
            $topN[$col.":".$row] = $stage2Brightness;
        }
    }
    arsort($topN);

    $topNused = array();
    $topPositionCountdown = $topPositions;

    if($renderStage == 3){
        foreach ($topN as $key => $val) {
            if($topPositionCountdown <= 0){
                break;
            }

            $position = explode(":", $key);

            foreach($topNused as $usedPosition => $usedValue) {
                $usedPosition = explode(":", $usedPosition);
                $distance = abs($usedPosition[0] - $position[0]) + abs($usedPosition[1] - $position[1]);
                if($distance < $minLightDistance) {
                    continue 2;
                }
            }

            $topNused[$key] = $val;

            paintCrosshair($image, $position[0], $position[1], $red, 2);

            $topPositionCountdown--;

        }
    }


    // STAGE 4
    // Median of all Top N lights
    $topNxValues = array();
    $topNyValues = array();

    foreach ($topNused as $key => $val) {
        $position = explode(":", $key);
        array_push($topNxValues, $position[0]);
        array_push($topNyValues, $position[1]);
    }

    $medianXvalue = round(calculate_median($topNxValues));
    $medianYvalue = round(calculate_median($topNyValues));
    paintCrosshair($image, $medianXvalue, $medianYvalue, $red, 15);


    // STAGE 5
    // Find treetop

    $filename = 'debug.log';
    $handle = fopen($filename, "w");
    fwrite($handle, "\n\n STAGE 5");

    $treetopX = $medianXvalue;
    $treetopY = $medianYvalue;

    $searchXmin = $medianXvalue;
    $searchXmax = $medianXvalue;

    $width = 0;
    for($y = $medianYvalue; $y >= 0; $y--) {
        fwrite($handle, "\nAt y = ".$y);

        if(($y % $searchYperX) == 0) { // Modulo
            $width++;
            $searchXmin = $medianXvalue - $width;
            $searchXmax = $medianXvalue + $width;
            imagesetpixel($image, $searchXmin, $y, $red);
            imagesetpixel($image, $searchXmax, $y, $red);
        }

        foreach ($topNused as $key => $val) {
            $position = explode(":", $key); // "x:y"

            if($position[1] != $y){
                continue;
            }

            if($position[0] >= $searchXmin && $position[0] <= $searchXmax){
                $treetopX = $position[0];
                $treetopY = $y;
            }
        }

    }

    paintCrosshair($image, $treetopX, $treetopY, $red, 5);


    // STAGE 6
    // Find tree sides
    fwrite($handle, "\n\n STAGE 6");

    $treesideAngle = 60; // The extremely "fat" end of a christmas tree
    $treeBottomY = $treetopY;

    $topPositionsExcluded = 0;
    $xymultiplier = 0;
    while(($topPositionsExcluded < ($topPositions / 5)) && $treesideAngle >= 1){
        fwrite($handle, "\n\nWe're at angle ".$treesideAngle);
        $xymultiplier = sin(deg2rad($treesideAngle));
        fwrite($handle, "\nMultiplier: ".$xymultiplier);

        $topPositionsExcluded = 0;
        foreach ($topNused as $key => $val) {
            $position = explode(":", $key);
            fwrite($handle, "\nAt position ".$key);

            if($position[1] > $treeBottomY) {
                $treeBottomY = $position[1];
            }

            // Lights above the tree are outside of it, but don't matter
            if($position[1] < $treetopY){
                $topPositionsExcluded++;
                fwrite($handle, "\nTOO HIGH");
                continue;
            }

            // Top light will generate division by zero
            if($treetopY-$position[1] == 0) {
                fwrite($handle, "\nDIVISION BY ZERO");
                continue;
            }

            // Lights left end right of it are also not inside
            fwrite($handle, "\nLight position factor: ".(abs($treetopX-$position[0]) / abs($treetopY-$position[1])));
            if((abs($treetopX-$position[0]) / abs($treetopY-$position[1])) > $xymultiplier){
                $topPositionsExcluded++;
                fwrite($handle, "\n --- Outside tree ---");
            }
        }

        $treesideAngle--;
    }
    fclose($handle);

    // Paint tree's outline
    $treeHeight = abs($treetopY-$treeBottomY);
    $treeBottomLeft = 0;
    $treeBottomRight = 0;
    $previousState = false; // line has not started; assumes the tree does not "leave"^^

    for($x = 0; $x <= $maxX; $x++){
        if(abs($treetopX-$x) != 0 && abs($treetopX-$x) / $treeHeight > $xymultiplier){
            if($previousState == true){
                $treeBottomRight = $x;
                $previousState = false;
            }
            continue;
        }
        imagesetpixel($image, $x, $treeBottomY, $red);
        if($previousState == false){
            $treeBottomLeft = $x;
            $previousState = true;
        }
    }
    imageline($image, $treeBottomLeft, $treeBottomY, $treetopX, $treetopY, $red);
    imageline($image, $treeBottomRight, $treeBottomY, $treetopX, $treetopY, $red);


    // Print out some parameters

    $string = "Min dist: ".$minLightDistance." | Tree angle: ".$treesideAngle." deg | Tree bottom: ".$treeBottomY;

    $px     = (imagesx($image) - 6.5 * strlen($string)) / 2;
    imagestring($image, 2, $px, 5, $string, $orange);

    return $topN;
}

/**
 * Returns values from 0 to 765
 */
function getBrightnessFromRgb($rgb) {
    $r = ($rgb >> 16) & 0xFF;
    $g = ($rgb >> 8) & 0xFF;
    $b = $rgb & 0xFF;

    return $r+$r+$b;
}

function paintCrosshair($image, $posX, $posY, $color, $size=5) {
    for($x = $posX-$size; $x <= $posX+$size; $x++) {
        if($x>=0 && $x < imagesx($image)){
            imagesetpixel($image, $x, $posY, $color);
        }
    }
    for($y = $posY-$size; $y <= $posY+$size; $y++) {
        if($y>=0 && $y < imagesy($image)){
            imagesetpixel($image, $posX, $y, $color);
        }
    }
}

// From http://www.mdj.us/web-development/php-programming/calculating-the-median-average-values-of-an-array-with-php/
function calculate_median($arr) {
    sort($arr);
    $count = count($arr); //total numbers in array
    $middleval = floor(($count-1)/2); // find the middle value, or the lowest middle value
    if($count % 2) { // odd number, middle is the median
        $median = $arr[$middleval];
    } else { // even number, calculate avg of 2 medians
        $low = $arr[$middleval];
        $high = $arr[$middleval+1];
        $median = (($low+$high)/2);
    }
    return $median;
}


?>

Images: Upper left Lower center Lower left Upper right Upper center Lower right

Bonus: A german Weihnachtsbaum, from Wikipedia Römerberg http://commons.wikimedia.org/wiki/File:Weihnachtsbaum_R%C3%B6merberg.jpg


回答 9

我将python与opencv一起使用。

我的算法是这样的:

  1. 首先,它从图像中获取红色通道
  2. 将阈值(最小值200)应用于红色通道
  3. 然后应用形态学梯度,然后执行“闭合”(先扩张,然后进行侵蚀)
  4. 然后,它在平面中找到轮廓,并选择最长的轮廓。

结果:

编码:

import numpy as np
import cv2
import copy


def findTree(image,num):
    im = cv2.imread(image)
    im = cv2.resize(im, (400,250))
    gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
    imf = copy.deepcopy(im)

    b,g,r = cv2.split(im)
    minR = 200
    _,thresh = cv2.threshold(r,minR,255,0)
    kernel = np.ones((25,5))
    dst = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
    dst = cv2.morphologyEx(dst, cv2.MORPH_CLOSE, kernel)

    contours = cv2.findContours(dst,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[0]
    cv2.drawContours(im, contours,-1, (0,255,0), 1)

    maxI = 0
    for i in range(len(contours)):
        if len(contours[maxI]) < len(contours[i]):
            maxI = i

    img = copy.deepcopy(r)
    cv2.polylines(img,[contours[maxI]],True,(255,255,255),3)
    imf[:,:,2] = img

    cv2.imshow(str(num), imf)

def main():
    findTree('tree.jpg',1)
    findTree('tree2.jpg',2)
    findTree('tree3.jpg',3)
    findTree('tree4.jpg',4)
    findTree('tree5.jpg',5)
    findTree('tree6.jpg',6)

    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()

如果将内核从(25,5)更改为(10,5),则除左下角以外的所有树都可获得更好的结果, 在此处输入图片说明

我的算法假设这棵树上有灯,在左下方的树中,顶部的灯比其他树的灯少。

I used python with opencv.

My algorithm goes like this:

  1. First it takes the red channel from the image
  2. Apply a threshold (min value 200) to the Red channel
  3. Then apply Morphological Gradient and then do a ‘Closing’ (dilation followed by Erosion)
  4. Then it finds the contours in the plane and it picks the longest contour.

The outcome:

The code:

import numpy as np
import cv2
import copy


def findTree(image,num):
    im = cv2.imread(image)
    im = cv2.resize(im, (400,250))
    gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
    imf = copy.deepcopy(im)

    b,g,r = cv2.split(im)
    minR = 200
    _,thresh = cv2.threshold(r,minR,255,0)
    kernel = np.ones((25,5))
    dst = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
    dst = cv2.morphologyEx(dst, cv2.MORPH_CLOSE, kernel)

    contours = cv2.findContours(dst,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[0]
    cv2.drawContours(im, contours,-1, (0,255,0), 1)

    maxI = 0
    for i in range(len(contours)):
        if len(contours[maxI]) < len(contours[i]):
            maxI = i

    img = copy.deepcopy(r)
    cv2.polylines(img,[contours[maxI]],True,(255,255,255),3)
    imf[:,:,2] = img

    cv2.imshow(str(num), imf)

def main():
    findTree('tree.jpg',1)
    findTree('tree2.jpg',2)
    findTree('tree3.jpg',3)
    findTree('tree4.jpg',4)
    findTree('tree5.jpg',5)
    findTree('tree6.jpg',6)

    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()

If I change the kernel from (25,5) to (10,5) I get nicer results on all trees but the bottom left, enter image description here

my algorithm assumes that the tree has lights on it, and in the bottom left tree, the top has less light then the others.


json和simplejson Python模块之间有什么区别?

问题:json和simplejson Python模块之间有什么区别?

我已经看到许多项目使用simplejson模块而不是json标准库中的模块。另外,有许多不同的simplejson模块。为什么要使用这些替代方法而不是标准库中的替代方法?

I have seen many projects using simplejson module instead of json module from the Standard Library. Also, there are many different simplejson modules. Why would use these alternatives, instead of the one in the Standard Library?


回答 0

json simplejson,已添加到stdlib中。但是自从json2.6中添加以来,它simplejson具有在更多Python版本(2.4+)上工作的优势。

simplejson的更新频率也比Python高,因此,如果您需要(或想要)最新版本simplejson,则尽可能使用它自己。

我认为,一种好的做法是将其中一个作为后备。

try:
    import simplejson as json
except ImportError:
    import json

json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+).

simplejson is also updated more frequently than Python, so if you need (or want) the latest version, it’s best to use simplejson itself, if possible.

A good practice, in my opinion, is to use one or the other as a fallback.

try:
    import simplejson as json
except ImportError:
    import json

回答 1

我必须不同意其他答案:内置json库(在Python 2.7中)不一定比慢simplejson。它也没有这个烦人的unicode错误

这是一个简单的基准:

import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10

def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json dumps {} seconds".format(result)
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson dumps {} seconds".format(result)
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json loads {} seconds".format(result)
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson loads {} seconds".format(result)


print "Complex real world data:" 
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print "\nSimple data:"
SIMPLE_DATA = [1, 2, 3, "asasd", {'a':'b'}]
compare_json_and_simplejson(SIMPLE_DATA)

以及在我的系统(Python 2.7.4,Linux 64位)上的结果:

复杂的现实世界数据:
json转储1.56666707993秒
simplejson转储2.25638604164秒
json加载2.71256899834秒
simplejson加载1.29233884811秒

简单数据:
json转储0.370109081268秒
simplejson转储0.574181079865秒
json加载0.422876119614秒
simplejson加载0.270955085754秒

对于转储,json比快simplejson。对于加载,simplejson速度更快。

由于我当前正在构建Web服务,dumps()因此更为重要-并且始终首选使用标准库。

另外,cjson在过去的4年中没有更新,因此我不会去碰它。

I have to disagree with the other answers: the built in json library (in Python 2.7) is not necessarily slower than simplejson. It also doesn’t have this annoying unicode bug.

Here is a simple benchmark:

import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10

def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json dumps {} seconds".format(result)
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson dumps {} seconds".format(result)
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json loads {} seconds".format(result)
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson loads {} seconds".format(result)


print "Complex real world data:" 
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print "\nSimple data:"
SIMPLE_DATA = [1, 2, 3, "asasd", {'a':'b'}]
compare_json_and_simplejson(SIMPLE_DATA)

And the results on my system (Python 2.7.4, Linux 64-bit):

Complex real world data:
json dumps 1.56666707993 seconds
simplejson dumps 2.25638604164 seconds
json loads 2.71256899834 seconds
simplejson loads 1.29233884811 seconds

Simple data:
json dumps 0.370109081268 seconds
simplejson dumps 0.574181079865 seconds
json loads 0.422876119614 seconds
simplejson loads 0.270955085754 seconds

For dumping, json is faster than simplejson. For loading, simplejson is faster.

Since I am currently building a web service, dumps() is more important—and using a standard library is always preferred.

Also, cjson was not updated in the past 4 years, so I wouldn’t touch it.


回答 2

所有这些答案都不太有用,因为它们对时间敏感

对我自己进行一些研究后,我发现simplejson如果将其更新为最新版本,确实比内置速度更快。

pip/easy_install我想在ubuntu 12.04上安装2.3.2,但是在发现最新simplejson版本实际上是3.3.0之后,我更新了它并重新进行了时间测试。

  • simplejsonjson负载下的内置速度快3倍
  • simplejsonjson转储时的内置速度快30%

免责声明:

上面的语句在python-2.7.3和simplejson 3.3.0中(使用c speedups),并且要确保我的回答也不是时间敏感的,您应该运行自己的测试以进行检查,因为版本之间的差异很大;没有时间敏感的简单答案。

如何判断是否在simplejson中启用了C加速:

import simplejson
# If this is True, then c speedups are enabled.
print bool(getattr(simplejson, '_speedups', False))

更新:我最近遇到了一个名为ujson的库,它比simplejson某些基本测试的执行速度快约3倍。

All of these answers aren’t very helpful because they are time sensitive.

After doing some research of my own I found that simplejson is indeed faster than the builtin, if you keep it updated to the latest version.

pip/easy_install wanted to install 2.3.2 on ubuntu 12.04, but after finding out the latest simplejson version is actually 3.3.0, so I updated it and reran the time tests.

  • simplejson is about 3x faster than the builtin json at loads
  • simplejson is about 30% faster than the builtin json at dumps

Disclaimer:

The above statements are in python-2.7.3 and simplejson 3.3.0 (with c speedups) And to make sure my answer also isn’t time sensitive, you should run your own tests to check since it varies so much between versions; there’s no easy answer that isn’t time sensitive.

How to tell if C speedups are enabled in simplejson:

import simplejson
# If this is True, then c speedups are enabled.
print bool(getattr(simplejson, '_speedups', False))

UPDATE: I recently came across a library called ujson that is performing ~3x faster than simplejson with some basic tests.


回答 3

我一直在对json,simplejson和cjson进行基准测试。

  • cjson最快
  • simplejson与cjson差不多
  • json比simplejson慢10倍

http://pastie.org/1507411

$ python test_serialization_speed.py 
--------------------
   Encoding Tests
--------------------
Encoding: 100000 x {'m': 'asdsasdqwqw', 't': 3}
[      json] 1.12385 seconds for 100000 runs. avg: 0.011239ms
[simplejson] 0.44356 seconds for 100000 runs. avg: 0.004436ms
[     cjson] 0.09593 seconds for 100000 runs. avg: 0.000959ms

Encoding: 10000 x {'m': [['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19]], 't': 3}
[      json] 7.76628 seconds for 10000 runs. avg: 0.776628ms
[simplejson] 0.51179 seconds for 10000 runs. avg: 0.051179ms
[     cjson] 0.44362 seconds for 10000 runs. avg: 0.044362ms

--------------------
   Decoding Tests
--------------------
Decoding: 100000 x {"m": "asdsasdqwqw", "t": 3}
[      json] 3.32861 seconds for 100000 runs. avg: 0.033286ms
[simplejson] 0.37164 seconds for 100000 runs. avg: 0.003716ms
[     cjson] 0.03893 seconds for 100000 runs. avg: 0.000389ms

Decoding: 10000 x {"m": [["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19]], "t": 3}
[      json] 37.26270 seconds for 10000 runs. avg: 3.726270ms
[simplejson] 0.56643 seconds for 10000 runs. avg: 0.056643ms
[     cjson] 0.33007 seconds for 10000 runs. avg: 0.033007ms

I’ve been benchmarking json, simplejson and cjson.

  • cjson is fastest
  • simplejson is almost on par with cjson
  • json is about 10x slower than simplejson

http://pastie.org/1507411:

$ python test_serialization_speed.py 
--------------------
   Encoding Tests
--------------------
Encoding: 100000 x {'m': 'asdsasdqwqw', 't': 3}
[      json] 1.12385 seconds for 100000 runs. avg: 0.011239ms
[simplejson] 0.44356 seconds for 100000 runs. avg: 0.004436ms
[     cjson] 0.09593 seconds for 100000 runs. avg: 0.000959ms

Encoding: 10000 x {'m': [['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19], ['0', 1, '2', 3, '4', 5, '6', 7, '8', 9, '10', 11, '12', 13, '14', 15, '16', 17, '18', 19]], 't': 3}
[      json] 7.76628 seconds for 10000 runs. avg: 0.776628ms
[simplejson] 0.51179 seconds for 10000 runs. avg: 0.051179ms
[     cjson] 0.44362 seconds for 10000 runs. avg: 0.044362ms

--------------------
   Decoding Tests
--------------------
Decoding: 100000 x {"m": "asdsasdqwqw", "t": 3}
[      json] 3.32861 seconds for 100000 runs. avg: 0.033286ms
[simplejson] 0.37164 seconds for 100000 runs. avg: 0.003716ms
[     cjson] 0.03893 seconds for 100000 runs. avg: 0.000389ms

Decoding: 10000 x {"m": [["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19], ["0", 1, "2", 3, "4", 5, "6", 7, "8", 9, "10", 11, "12", 13, "14", 15, "16", 17, "18", 19]], "t": 3}
[      json] 37.26270 seconds for 10000 runs. avg: 3.726270ms
[simplejson] 0.56643 seconds for 10000 runs. avg: 0.056643ms
[     cjson] 0.33007 seconds for 10000 runs. avg: 0.033007ms

回答 4

一些值在simplejson和json之间序列化的方式有所不同。

值得注意的是,的实例由collections.namedtuple序列化为数组,json但由序列化为对象simplejson。您可以通过传递namedtuple_as_object=False给来覆盖此行为simplejson.dump,但是默认情况下行为不匹配。

>>> import collections, simplejson, json
>>> TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
>>> value = TupleClass(1, 2)
>>> json.dumps(value)
'[1, 2]'
>>> simplejson.dumps(value)
'{"a": 1, "b": 2}'
>>> simplejson.dumps(value, namedtuple_as_object=False)
'[1, 2]'

Some values are serialized differently between simplejson and json.

Notably, instances of collections.namedtuple are serialized as arrays by json but as objects by simplejson. You can override this behaviour by passing namedtuple_as_object=False to simplejson.dump, but by default the behaviours do not match.

>>> import collections, simplejson, json
>>> TupleClass = collections.namedtuple("TupleClass", ("a", "b"))
>>> value = TupleClass(1, 2)
>>> json.dumps(value)
'[1, 2]'
>>> simplejson.dumps(value)
'{"a": 1, "b": 2}'
>>> simplejson.dumps(value, namedtuple_as_object=False)
'[1, 2]'

回答 5

我发现与Python 2.7和simplejson 3.3.1的API不兼容之处在于输出是生成str对象还是unicode对象。例如

>>> from json import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{u'a': u'b'}

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{'a': 'b'}

如果首选项是使用simplejson,则可以通过将参数字符串强制为unicode来解决此问题,如下所示:

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode(unicode("""{ "a":"b" }""", "utf-8"))
{u'a': u'b'}

强制确实需要知道原始字符集,例如:

>>> jd.decode(unicode("""{ "a": "ξηθννββωφρες" }"""))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 8: ordinal not in range(128)

这是无法解决的问题40

An API incompatibility I found, with Python 2.7 vs simplejson 3.3.1 is in whether output produces str or unicode objects. e.g.

>>> from json import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{u'a': u'b'}

vs

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode("""{ "a":"b" }""")
{'a': 'b'}

If the preference is to use simplejson, then this can be addressed by coercing the argument string to unicode, as in:

>>> from simplejson import JSONDecoder
>>> jd = JSONDecoder()
>>> jd.decode(unicode("""{ "a":"b" }""", "utf-8"))
{u'a': u'b'}

The coercion does require knowing the original charset, for example:

>>> jd.decode(unicode("""{ "a": "ξηθννββωφρες" }"""))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 8: ordinal not in range(128)

This is the won’t fix issue 40


回答 6

项目使用simplejson的另一个原因是内置的json最初不包括其C加速,因此性能差异非常明显。

Another reason projects use simplejson is that the builtin json did not originally include its C speedups, so the performance difference was noticeable.


回答 7

内置json模块已包含在Python 2.6中。任何支持Python <2.6以下版本的项目都必须具有后备功能。在许多情况下,这种后备状态为simplejson

The builtin json module got included in Python 2.6. Any projects that support versions of Python < 2.6 need to have a fallback. In many cases, that fallback is simplejson.


回答 8

这是Python json库的(现在已经过时)比较:

比较Python的JSON模块存档链接

无论比较结果如何,如果您使用的是Python 2.6,都应使用标准库json。并且..不然也可以使用simplejson。

Here’s (a now outdated) comparison of Python json libraries:

Comparing JSON modules for Python (archive link)

Regardless of the results in this comparison you should use the standard library json if you are on Python 2.6. And.. might as well just use simplejson otherwise.


回答 9

simplejson模块仅比json快1.5倍(在我的计算机上,使用simplejson 2.1.1和Python 2.7 x86)。

如果需要,可以尝试进行基准测试:http : //abral.altervista.org/jsonpickle-bench.zip 在我的PC上,simplejson比cPickle更快。我也想知道您的基准!

就像Coady所说的,simplejson和json之间的区别可能是simplejson包含_speedups.c。那么,为什么python开发人员不使用simplejson?

simplejson module is simply 1,5 times faster than json (On my computer, with simplejson 2.1.1 and Python 2.7 x86).

If you want, you can try the benchmark: http://abral.altervista.org/jsonpickle-bench.zip On my PC simplejson is faster than cPickle. I would like to know also your benchmarks!

Probably, as said Coady, the difference between simplejson and json is that simplejson includes _speedups.c. So, why don’t python developers use simplejson?


回答 10

在python3,如果你的字符串b'bytes',与json你有.decode()内容,然后才能加载它。 simplejson照顾好这个,所以你可以做simplejson.loads(byte_string)

In python3, if you a string of b'bytes', with json you have to .decode() the content before you can load it. simplejson takes care of this so you can just do simplejson.loads(byte_string).


回答 11

json 似乎比 simplejson最新版本的加载和转储

测试版本:

  • 的Python:3.6.8
  • json:2.0.9
  • simplejson:3.16.0

结果:

>>> def test(obj, call, data, times):
...   s = datetime.now()
...   print("calling: ", call, " in ", obj, " ", times, " times") 
...   for _ in range(times):
...     r = getattr(obj, call)(data)
...   e = datetime.now()
...   print("total time: ", str(e-s))
...   return r

>>> test(json, "dumps", data, 10000)
calling:  dumps  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   10000  times
total time:  0:00:00.054857

>>> test(simplejson, "dumps", data, 10000)
calling:  dumps  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   10000  times
total time:  0:00:00.419895
'{"1": 100, "2": "acs", "3.5": 3.5567, "d": [1, "23"], "e": {"a": "A"}}'

>>> test(json, "loads", strdata, 1000)
calling:  loads  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   1000  times
total time:  0:00:00.004985
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

>>> test(simplejson, "loads", strdata, 1000)
calling:  loads  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   1000  times
total time:  0:00:00.040890
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

对于版本:

  • 的Python:3.7.4
  • json:2.0.9
  • simplejson:3.17.0

json在转储操作期间比simplejson快,但在加载操作期间两者保持相同的速度

json seems faster than simplejson in both cases of loads and dumps in latest version

Tested versions:

  • python: 3.6.8
  • json: 2.0.9
  • simplejson: 3.16.0

Results:

>>> def test(obj, call, data, times):
...   s = datetime.now()
...   print("calling: ", call, " in ", obj, " ", times, " times") 
...   for _ in range(times):
...     r = getattr(obj, call)(data)
...   e = datetime.now()
...   print("total time: ", str(e-s))
...   return r

>>> test(json, "dumps", data, 10000)
calling:  dumps  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   10000  times
total time:  0:00:00.054857

>>> test(simplejson, "dumps", data, 10000)
calling:  dumps  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   10000  times
total time:  0:00:00.419895
'{"1": 100, "2": "acs", "3.5": 3.5567, "d": [1, "23"], "e": {"a": "A"}}'

>>> test(json, "loads", strdata, 1000)
calling:  loads  in  <module 'json' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\json\\__init__.py'>   1000  times
total time:  0:00:00.004985
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

>>> test(simplejson, "loads", strdata, 1000)
calling:  loads  in  <module 'simplejson' from 'C:\\Users\\jophine.antony\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\site-packages\\simplejson\\__init__.py'>   1000  times
total time:  0:00:00.040890
{'1': 100, '2': 'acs', '3.5': 3.5567, 'd': [1, '23'], 'e': {'a': 'A'}}

For versions:

  • python: 3.7.4
  • json: 2.0.9
  • simplejson: 3.17.0

json was faster than simplejson during dumps operation but both maintained the same speed during loads operations


回答 12

我在寻找要为Python 2.6安装simplejson时遇到了这个问题。我需要使用json.load()的’object_pairs_hook’来将json文件加载为OrderedDict。熟悉最新版本的Python时,我没有意识到Python 2.6的json模块不包含’object_pairs_hook’,因此我为此目的必须安装simplejson。从个人经验来看,这就是为什么我使用simplejson而不是标准json模块的原因。

I came across this question as I was looking to install simplejson for Python 2.6. I needed to use the ‘object_pairs_hook’ of json.load() in order to load a json file as an OrderedDict. Being familiar with more recent versions of Python I didn’t realize that the json module for Python 2.6 doesn’t include the ‘object_pairs_hook’ so I had to install simplejson for this purpose. From personal experience this is why i use simplejson as opposed to the standard json module.


交互时在python中重新导入模块

问题:交互时在python中重新导入模块

我知道可以做到,但是我不记得怎么做。

如何在python中重新导入模块?场景如下:我以交互方式导入模块并对其进行修改,但随后遇到错误。我修复了.py文件中的错误,然后想重新导入固定模块而不退出python。我该怎么做 ?

I know it can be done, but I never remember how.

How can you reimport a module in python? The scenario is as follows: I import a module interactively and tinker with it, but then I face an error. I fix the error in the .py file and then I want to reimport the fixed module without quitting python. How can I do it ?


回答 0

这应该工作:

reload(my.module)

来自Python文档

重新加载先前导入的模块。参数必须是模块对象,因此它必须已经成功导入。如果您已使用外部编辑器编辑了模块源文件,并且想在不离开Python解释器的情况下尝试新版本,则这将非常有用。

如果运行Python 3.4及更高版本,请执行import importlib,然后执行importlib.reload(nameOfModule)

不要忘记使用此方法的注意事项:

  • 重新加载模块时,将保留其字典(包含模块的全局变量)。名称的重新定义将覆盖旧的定义,因此通常这不是问题,但是如果模块的新版本未定义旧版本定义的名称,则不会删除旧定义。

  • 如果一个模块使用导入了另一个模块的对象from ... import ...,则调用reload()另一个模块的操作并未重新定义从该模块导入的对象-解决该问题的一种方法是重新执行该from语句,另一种方法是使用import和限定名称(module.*name*)。

  • 如果模块实例化一个类的实例,则重新加载定义该类的模块不会影响实例的方法定义-它们将继续使用旧的类定义。派生类也是如此。

This should work:

reload(my.module)

From the Python docs

Reload a previously imported module. The argument must be a module object, so it must have been successfully imported before. This is useful if you have edited the module source file using an external editor and want to try out the new version without leaving the Python interpreter.

If running Python 3.4 and up, do import importlib, then do importlib.reload(nameOfModule).

Don’t forget the caveats of using this method:

  • When a module is reloaded, its dictionary (containing the module’s global variables) is retained. Redefinitions of names will override the old definitions, so this is generally not a problem, but if the new version of a module does not define a name that was defined by the old version, the old definition is not removed.

  • If a module imports objects from another module using from ... import ..., calling reload() for the other module does not redefine the objects imported from it — one way around this is to re-execute the from statement, another is to use import and qualified names (module.*name*) instead.

  • If a module instantiates instances of a class, reloading the module that defines the class does not affect the method definitions of the instances — they continue to use the old class definition. The same is true for derived classes.


回答 1

在python 3中,reload不再是内置函数。

如果您使用的是python 3.4+,则应改reloadimportlib库中使用:

import importlib
importlib.reload(some_module)

如果您使用的是Python 3.2或3.3,则应:

import imp  
imp.reload(module)  

代替。参见http://docs.python.org/3.0/library/imp.html#imp.reload

如果您使用ipython,请绝对考虑使用autoreload扩展名:

%load_ext autoreload
%autoreload 2

In python 3, reload is no longer a built in function.

If you are using python 3.4+ you should use reload from the importlib library instead:

import importlib
importlib.reload(some_module)

If you are using python 3.2 or 3.3 you should:

import imp  
imp.reload(module)  

instead. See http://docs.python.org/3.0/library/imp.html#imp.reload

If you are using ipython, definitely consider using the autoreload extension:

%load_ext autoreload
%autoreload 2

回答 2

实际上,在Python 3中,该模块imp被标记为DEPRECATED。好吧,至少对于3.4是正确的。

而是应使用模块中的reload功能importlib

https://docs.python.org/3/library/importlib.html#importlib.reload

但是请注意,该库在最后两个次要版本中进行了一些API更改。

Actually, in Python 3 the module imp is marked as DEPRECATED. Well, at least that’s true for 3.4.

Instead the reload function from the importlib module should be used:

https://docs.python.org/3/library/importlib.html#importlib.reload

But be aware that this library had some API-changes with the last two minor versions.


回答 3

如果要从模块导入特定的函数或类,可以执行以下操作:

import importlib
import sys
importlib.reload(sys.modules['my_module'])
from my_module import my_function

If you want to import a specific function or class from a module, you can do this:

import importlib
import sys
importlib.reload(sys.modules['my_module'])
from my_module import my_function

回答 4

尽管提供的答案确实适用于特定模块,但它们不会重新加载子模块,如本答案所述

如果一个模块使用导入了另一个模块的对象from ... import ...,则调用reload()另一个模块的操作并未重新定义从该模块导入的对象-解决该问题的一种方法是重新执行from语句,另一种方法是使用import和限定名称(module.*name*)。

但是,如果使用__all__变量定义公共API,则可以自动重新加载所有公共可用模块:

# Python >= 3.5
import importlib
import types


def walk_reload(module: types.ModuleType) -> None:
    if hasattr(module, "__all__"):
        for submodule_name in module.__all__:
            walk_reload(getattr(module, submodule_name))
    importlib.reload(module)


walk_reload(my_module)

先前答案中指出的警告仍然有效。值得注意的是,如__all__变量所描述的那样,修改不属于公共API的子模块将不会受到使用此函数进行的重新加载的影响。同样,删除子模块的元素也不会反映在重新加载中。

Although the provided answers do work for a specific module, they won’t reload submodules, as noted in This answer:

If a module imports objects from another module using from ... import ..., calling reload() for the other module does not redefine the objects imported from it — one way around this is to re-execute the from statement, another is to use import and qualified names (module.*name*) instead.

However, if using the __all__ variable to define the public API, it is possible to automatically reload all publicly available modules:

# Python >= 3.5
import importlib
import types


def walk_reload(module: types.ModuleType) -> None:
    if hasattr(module, "__all__"):
        for submodule_name in module.__all__:
            walk_reload(getattr(module, submodule_name))
    importlib.reload(module)


walk_reload(my_module)

The caveats noted in the previous answer are still valid though. Notably, modifying a submodule that is not part of the public API as described by the __all__ variable won’t be affected by a reload using this function. Similarly, removing an element of a submodule won’t be reflected by a reload.


如何在Python中从字符串末尾删除子字符串?

问题:如何在Python中从字符串末尾删除子字符串?

我有以下代码:

url = 'abcdc.com'
print(url.strip('.com'))

我期望: abcdc

我有: abcd

现在我做

url.rsplit('.com', 1)

有没有更好的办法?

I have the following code:

url = 'abcdc.com'
print(url.strip('.com'))

I expected: abcdc

I got: abcd

Now I do

url.rsplit('.com', 1)

Is there a better way?


回答 0

strip并不意味着“删除此子字符串”。x.strip(y)视为y一组字符,并从的末尾剥离该组中的所有字符x

相反,您可以使用endswith和切片:

url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]

或使用正则表达式

import re
url = 'abcdc.com'
url = re.sub('\.com$', '', url)

strip doesn’t mean “remove this substring”. x.strip(y) treats y as a set of characters and strips any characters in that set from the ends of x.

Instead, you could use endswith and slicing:

url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]

Or using regular expressions:

import re
url = 'abcdc.com'
url = re.sub('\.com$', '', url)

回答 1

如果您确定字符串仅出现在末尾,则最简单的方法是使用“替换”:

url = 'abcdc.com'
print(url.replace('.com',''))

If you are sure that the string only appears at the end, then the simplest way would be to use ‘replace’:

url = 'abcdc.com'
print(url.replace('.com',''))

回答 2

def strip_end(text, suffix):
    if not text.endswith(suffix):
        return text
    return text[:len(text)-len(suffix)]
def strip_end(text, suffix):
    if not text.endswith(suffix):
        return text
    return text[:len(text)-len(suffix)]

回答 3

由于似乎没有人指出这一点:

url = "www.example.com"
new_url = url[:url.rfind(".")]

这应该比split()不使用任何新列表对象的方法更有效,并且此解决方案适用于带有多个点的字符串。

Since it seems like nobody has pointed this on out yet:

url = "www.example.com"
new_url = url[:url.rfind(".")]

This should be more efficient than the methods using split() as no new list object is created, and this solution works for strings with several dots.


回答 4

取决于您对网址的了解以及您要尝试的内容。如果您知道它将始终以“ .com”(或“ .net”或“ .org”)结尾,则

 url=url[:-4]

是最快的解决方案。如果它是更通用的URL,那么最好研究一下python随附的urlparse库。

另一方面,如果您只是想删除最后一个“。”之后的所有内容。然后是一个字符串

url.rsplit('.',1)[0]

将工作。或者,如果您只想让所有内容都达到第一个“。”。然后尝试

url.split('.',1)[0]

Depends on what you know about your url and exactly what you’re tryinh to do. If you know that it will always end in ‘.com’ (or ‘.net’ or ‘.org’) then

 url=url[:-4]

is the quickest solution. If it’s a more general URLs then you’re probably better of looking into the urlparse library that comes with python.

If you on the other hand you simply want to remove everything after the final ‘.’ in a string then

url.rsplit('.',1)[0]

will work. Or if you want just want everything up to the first ‘.’ then try

url.split('.',1)[0]

回答 5

如果您知道这是一个扩展,那么

url = 'abcdc.com'
...
url.rsplit('.', 1)[0]  # split at '.', starting from the right, maximum 1 split

这与abcdc.comor www.abcdc.com或or 同样有效,abcdc.[anything]并且可扩展性更高。

If you know it’s an extension, then

url = 'abcdc.com'
...
url.rsplit('.', 1)[0]  # split at '.', starting from the right, maximum 1 split

This works equally well with abcdc.com or www.abcdc.com or abcdc.[anything] and is more extensible.


回答 6

一行:

text if not text.endswith(suffix) or len(suffix) == 0 else text[:-len(suffix)]

In one line:

text if not text.endswith(suffix) or len(suffix) == 0 else text[:-len(suffix)]

回答 7

怎么url[:-4]

How about url[:-4]?


回答 8

对于url(在给定的示例中,它似乎是主题的一部分),可以执行以下操作:

import os
url = 'http://www.stackoverflow.com'
name,ext = os.path.splitext(url)
print (name, ext)

#Or:
ext = '.'+url.split('.')[-1]
name = url[:-len(ext)]
print (name, ext)

两者都将输出: ('http://www.stackoverflow', '.com')

str.endswith(suffix)如果您只需要分割“ .com”或其他特定内容,也可以将其结合使用。

For urls (as it seems to be a part of the topic by the given example), one can do something like this:

import os
url = 'http://www.stackoverflow.com'
name,ext = os.path.splitext(url)
print (name, ext)

#Or:
ext = '.'+url.split('.')[-1]
name = url[:-len(ext)]
print (name, ext)

Both will output: ('http://www.stackoverflow', '.com')

This can also be combined with str.endswith(suffix) if you need to just split “.com”, or anything specific.


回答 9

url.rsplit(’。com’,1)

不太正确。

您实际需要写的是

url.rsplit('.com', 1)[0]

,而且恕我直言。

但是,我个人偏爱此选项,因为它仅使用一个参数:

url.rpartition('.com')[0]

url.rsplit(‘.com’, 1)

is not quite right.

What you actually would need to write is

url.rsplit('.com', 1)[0]

, and it looks pretty succinct IMHO.

However, my personal preference is this option because it uses only one parameter:

url.rpartition('.com')[0]

回答 10

从开始Python 3.9,您可以removesuffix改用:

'abcdc.com'.removesuffix('.com')
# 'abcdc'

Starting in Python 3.9, you can use removesuffix instead:

'abcdc.com'.removesuffix('.com')
# 'abcdc'

回答 11

如果需要剥离某个字符串的某个末端(如果存在),否则什么也不做。我最好的解决方案。您可能会想使用前两个实现之一,但是为了完整起见,我包括了第三个实现。

对于恒定的后缀:

def remove_suffix(v, s):
    return v[:-len(s) if v.endswith(s) else v
remove_suffix("abc.com", ".com") == 'abc'
remove_suffix("abc", ".com") == 'abc'

对于正则表达式:

def remove_suffix_compile(suffix_pattern):
    r = re.compile(f"(.*?)({suffix_pattern})?$")
    return lambda v: r.match(v)[1]
remove_domain = remove_suffix_compile(r"\.[a-zA-Z0-9]{3,}")
remove_domain("abc.com") == "abc"
remove_domain("sub.abc.net") == "sub.abc"
remove_domain("abc.") == "abc."
remove_domain("abc") == "abc"

对于常量后缀的集合,用于大量调用的渐近最快方法:

def remove_suffix_preprocess(*suffixes):
    suffixes = set(suffixes)
    try:
        suffixes.remove('')
    except KeyError:
        pass

    def helper(suffixes, pos):
        if len(suffixes) == 1:
            suf = suffixes[0]
            l = -len(suf)
            ls = slice(0, l)
            return lambda v: v[ls] if v.endswith(suf) else v
        si = iter(suffixes)
        ml = len(next(si))
        exact = False
        for suf in si:
            l = len(suf)
            if -l == pos:
                exact = True
            else:
                ml = min(len(suf), ml)
        ml = -ml
        suffix_dict = {}
        for suf in suffixes:
            sub = suf[ml:pos]
            if sub in suffix_dict:
                suffix_dict[sub].append(suf)
            else:
                suffix_dict[sub] = [suf]
        if exact:
            del suffix_dict['']
            for key in suffix_dict:
                suffix_dict[key] = helper([s[:pos] for s in suffix_dict[key]], None)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v[:pos])
        else:
            for key in suffix_dict:
                suffix_dict[key] = helper(suffix_dict[key], ml)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v)
    return helper(tuple(suffixes), None)
domain_remove = remove_suffix_preprocess(".com", ".net", ".edu", ".uk", '.tv', '.co.uk', '.org.uk')

最后一个在pypy中可能要比cpython快得多。对于几乎所有不涉及潜在后缀的巨大词典的情况,regex变体可能比此方法更快,至少在cPython中这些潜在后缀无法轻易地表示为regex。

在PyPy中,即使re模​​块使用DFA编译正则表达式引擎,对于大量调用或长字符串来说,正则表达式变体几乎肯定会变慢,因为JIT会优化lambda的大部分开销。

但是,在cPython中,您几乎可以肯定地比较了正在运行的regex的c代码这一事实,这几乎可以证明后缀集合版本在算法上的优势。

If you need to strip some end of a string if it exists otherwise do nothing. My best solutions. You probably will want to use one of first 2 implementations however I have included the 3rd for completeness.

For a constant suffix:

def remove_suffix(v, s):
    return v[:-len(s) if v.endswith(s) else v
remove_suffix("abc.com", ".com") == 'abc'
remove_suffix("abc", ".com") == 'abc'

For a regex:

def remove_suffix_compile(suffix_pattern):
    r = re.compile(f"(.*?)({suffix_pattern})?$")
    return lambda v: r.match(v)[1]
remove_domain = remove_suffix_compile(r"\.[a-zA-Z0-9]{3,}")
remove_domain("abc.com") == "abc"
remove_domain("sub.abc.net") == "sub.abc"
remove_domain("abc.") == "abc."
remove_domain("abc") == "abc"

For a collection of constant suffixes the asymptotically fastest way for a large number of calls:

def remove_suffix_preprocess(*suffixes):
    suffixes = set(suffixes)
    try:
        suffixes.remove('')
    except KeyError:
        pass

    def helper(suffixes, pos):
        if len(suffixes) == 1:
            suf = suffixes[0]
            l = -len(suf)
            ls = slice(0, l)
            return lambda v: v[ls] if v.endswith(suf) else v
        si = iter(suffixes)
        ml = len(next(si))
        exact = False
        for suf in si:
            l = len(suf)
            if -l == pos:
                exact = True
            else:
                ml = min(len(suf), ml)
        ml = -ml
        suffix_dict = {}
        for suf in suffixes:
            sub = suf[ml:pos]
            if sub in suffix_dict:
                suffix_dict[sub].append(suf)
            else:
                suffix_dict[sub] = [suf]
        if exact:
            del suffix_dict['']
            for key in suffix_dict:
                suffix_dict[key] = helper([s[:pos] for s in suffix_dict[key]], None)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v[:pos])
        else:
            for key in suffix_dict:
                suffix_dict[key] = helper(suffix_dict[key], ml)
            return lambda v: suffix_dict.get(v[ml:pos], lambda v: v)(v)
    return helper(tuple(suffixes), None)
domain_remove = remove_suffix_preprocess(".com", ".net", ".edu", ".uk", '.tv', '.co.uk', '.org.uk')

the final one is probably significantly faster in pypy then cpython. The regex variant is likely faster than this for virtually all cases that to not involve huge dictionaries of potential suffixes that cannot be easily represented as a regex at least in cPython.

In PyPy the regex variant is almost certainly slower for large number of calls or long strings even if the re module uses a DFA compiling regex engine as the vast majority of the overhead of the lambda’s will be optimized out by the JIT.

In cPython however the fact that your running c code for the regex compare almost certainly out ways the algorithmic advantages of the suffix collection version in almost all cases.


回答 12

如果您只打算去除扩展名:

'.'.join('abcdc.com'.split('.')[:-1])
# 'abcdc'

它适用于任何扩展名,文件名中也可能存在其他点。它只是将字符串拆分为点列表,并在没有最后一个元素的情况下将其加入。

If you mean to only strip the extension:

'.'.join('abcdc.com'.split('.')[:-1])
# 'abcdc'

It works with any extension, with potential other dots existing in filename as well. It simply splits the string as a list on dots and joins it without the last element.


回答 13

import re

def rm_suffix(url = 'abcdc.com', suffix='\.com'):
    return(re.sub(suffix+'$', '', url))

我想重复这个答案,以此作为最有表现力的方式。当然,以下操作会减少CPU时间:

def rm_dotcom(url = 'abcdc.com'):
    return(url[:-4] if url.endswith('.com') else url)

但是,如果CPU是瓶颈,为什么要用Python编写?

无论如何,CPU何时会成为瓶颈?在司机中,也许。

使用正则表达式的优点是代码可重用性。如果下一个要删除只有三个字符的’.me’怎么办?

相同的代码可以解决问题:

>>> rm_sub('abcdc.me','.me')
'abcdc'
import re

def rm_suffix(url = 'abcdc.com', suffix='\.com'):
    return(re.sub(suffix+'$', '', url))

I want to repeat this answer as the most expressive way to do it. Of course, the following would take less CPU time:

def rm_dotcom(url = 'abcdc.com'):
    return(url[:-4] if url.endswith('.com') else url)

However, if CPU is the bottle neck why write in Python?

When is CPU a bottle neck anyway? In drivers, maybe.

The advantages of using regular expression is code reusability. What if you next want to remove ‘.me’, which only has three characters?

Same code would do the trick:

>>> rm_sub('abcdc.me','.me')
'abcdc'

回答 14

就我而言,我需要提出一个exceptions,所以我做到了:

class UnableToStripEnd(Exception):
    """A Exception type to indicate that the suffix cannot be removed from the text."""

    @staticmethod
    def get_exception(text, suffix):
        return UnableToStripEnd("Could not find suffix ({0}) on text: {1}."
                                .format(suffix, text))


def strip_end(text, suffix):
    """Removes the end of a string. Otherwise fails."""
    if not text.endswith(suffix):
        raise UnableToStripEnd.get_exception(text, suffix)
    return text[:len(text)-len(suffix)]

In my case I needed to raise an exception so I did:

class UnableToStripEnd(Exception):
    """A Exception type to indicate that the suffix cannot be removed from the text."""

    @staticmethod
    def get_exception(text, suffix):
        return UnableToStripEnd("Could not find suffix ({0}) on text: {1}."
                                .format(suffix, text))


def strip_end(text, suffix):
    """Removes the end of a string. Otherwise fails."""
    if not text.endswith(suffix):
        raise UnableToStripEnd.get_exception(text, suffix)
    return text[:len(text)-len(suffix)]

回答 15

在这里,我有一个最简单的代码。

url=url.split(".")[0]

Here,i have a simplest code.

url=url.split(".")[0]

回答 16

假定您要删除域,无论它是什么(.com,.net等)。我建议找到,.然后从此删除所有内容。

url = 'abcdc.com'
dot_index = url.rfind('.')
url = url[:dot_index]

在这里,我rfind用来解决url之类的问题abcdc.com.net,应该将其简化为name abcdc.com

如果您还担心www.s,则应明确检查它们:

if url.startswith("www."):
   url = url.replace("www.","", 1)

替换中的1用于奇怪的边缘情况,例如 www.net.www.com

如果您的网址比该网址更野,请查看人们响应的正则表达式答案。

Assuming you want to remove the domain, no matter what it is (.com, .net, etc). I recommend finding the . and removing everything from that point on.

url = 'abcdc.com'
dot_index = url.rfind('.')
url = url[:dot_index]

Here I’m using rfind to solve the problem of urls like abcdc.com.net which should be reduced to the name abcdc.com.

If you’re also concerned about www.s, you should explicitly check for them:

if url.startswith("www."):
   url = url.replace("www.","", 1)

The 1 in replace is for strange edgecases like www.net.www.com

If your url gets any wilder than that look at the regex answers people have responded with.


回答 17

我使用内置的rstrip函数来执行此操作,如下所示:

string = "test.com"
suffix = ".com"
newstring = string.rstrip(suffix)
print(newstring)
test

I used the built-in rstrip function to do it like follow:

string = "test.com"
suffix = ".com"
newstring = string.rstrip(suffix)
print(newstring)
test

回答 18

您可以使用split:

'abccomputer.com'.split('.com',1)[0]
# 'abccomputer'

You can use split:

'abccomputer.com'.split('.com',1)[0]
# 'abccomputer'

回答 19

这是正则表达式的完美用法:

>>> import re
>>> re.match(r"(.*)\.com", "hello.com").group(1)
'hello'

This is a perfect use for regular expressions:

>>> import re
>>> re.match(r"(.*)\.com", "hello.com").group(1)
'hello'

回答 20

Python> = 3.9:

'abcdc.com'.removesuffix('.com')

Python <3.9:

def remove_suffix(text, suffix):
    if text.endswith(suffix):
        text = text[:-len(suffix)]
    return text

remove_suffix('abcdc.com', '.com')

Python >= 3.9:

'abcdc.com'.removesuffix('.com')

Python < 3.9:

def remove_suffix(text, suffix):
    if text.endswith(suffix):
        text = text[:-len(suffix)]
    return text

remove_suffix('abcdc.com', '.com')

每n个字符分割一个字符串?

问题:每n个字符分割一个字符串?

是否可以每n个字符分割一个字符串?

例如,假设我有一个包含以下内容的字符串:

'1234567890'

我怎样才能使它看起来像这样:

['12','34','56','78','90']

Is it possible to split a string every nth character?

For example, suppose I have a string containing the following:

'1234567890'

How can I get it to look like this:

['12','34','56','78','90']

回答 0

>>> line = '1234567890'
>>> n = 2
>>> [line[i:i+n] for i in range(0, len(line), n)]
['12', '34', '56', '78', '90']
>>> line = '1234567890'
>>> n = 2
>>> [line[i:i+n] for i in range(0, len(line), n)]
['12', '34', '56', '78', '90']

回答 1

为了完整起见,您可以使用正则表达式执行此操作:

>>> import re
>>> re.findall('..','1234567890')
['12', '34', '56', '78', '90']

对于字符的奇数,您可以执行以下操作:

>>> import re
>>> re.findall('..?', '123456789')
['12', '34', '56', '78', '9']

您还可以执行以下操作,以简化较长块的正则表达式:

>>> import re
>>> re.findall('.{1,2}', '123456789')
['12', '34', '56', '78', '9']

re.finditer如果字符串很长,则可以使用它逐块生成。

Just to be complete, you can do this with a regex:

>>> import re
>>> re.findall('..','1234567890')
['12', '34', '56', '78', '90']

For odd number of chars you can do this:

>>> import re
>>> re.findall('..?', '123456789')
['12', '34', '56', '78', '9']

You can also do the following, to simplify the regex for longer chunks:

>>> import re
>>> re.findall('.{1,2}', '123456789')
['12', '34', '56', '78', '9']

And you can use re.finditer if the string is long to generate chunk by chunk.


回答 2

在python中已经有一个内置函数。

>>> from textwrap import wrap
>>> s = '1234567890'
>>> wrap(s, 2)
['12', '34', '56', '78', '90']

这是wrap的文档字符串所说的:

>>> help(wrap)
'''
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
'''

There is already an inbuilt function in python for this.

>>> from textwrap import wrap
>>> s = '1234567890'
>>> wrap(s, 2)
['12', '34', '56', '78', '90']

This is what the docstring for wrap says:

>>> help(wrap)
'''
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
'''

回答 3

将元素分组为n个长度的组的另一种常见方式:

>>> s = '1234567890'
>>> map(''.join, zip(*[iter(s)]*2))
['12', '34', '56', '78', '90']

此方法直接来自的文档zip()

Another common way of grouping elements into n-length groups:

>>> s = '1234567890'
>>> map(''.join, zip(*[iter(s)]*2))
['12', '34', '56', '78', '90']

This method comes straight from the docs for zip().


回答 4

我认为这比itertools版本更短,更易读:

def split_by_n(seq, n):
    '''A generator to divide a sequence into chunks of n units.'''
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n('1234567890', 2)))

I think this is shorter and more readable than the itertools version:

def split_by_n(seq, n):
    '''A generator to divide a sequence into chunks of n units.'''
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n('1234567890', 2)))

回答 5

我喜欢这个解决方案:

s = '1234567890'
o = []
while s:
    o.append(s[:2])
    s = s[2:]

I like this solution:

s = '1234567890'
o = []
while s:
    o.append(s[:2])
    s = s[2:]

回答 6

使用PyPI的more-itertools

>>> from more_itertools import sliced
>>> list(sliced('1234567890', 2))
['12', '34', '56', '78', '90']

Using more-itertools from PyPI:

>>> from more_itertools import sliced
>>> list(sliced('1234567890', 2))
['12', '34', '56', '78', '90']

回答 7

您可以使用以下grouper()配方itertools

Python 2.x:

from itertools import izip_longest    

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

Python 3.x:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

这些函数可节省内存,并且可与任何可迭代对象一起使用。

You could use the grouper() recipe from itertools:

Python 2.x:

from itertools import izip_longest    

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

Python 3.x:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

These functions are memory-efficient and work with any iterables.


回答 8

尝试以下代码:

from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

s = '1234567890'
print list(split_every(2, list(s)))

Try the following code:

from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

s = '1234567890'
print list(split_every(2, list(s)))

回答 9

>>> from functools import reduce
>>> from operator import add
>>> from itertools import izip
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in izip(x, x)]
['12', '34', '56', '78', '90']
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in izip(x, x, x)]
['123', '456', '789']
>>> from functools import reduce
>>> from operator import add
>>> from itertools import izip
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in izip(x, x)]
['12', '34', '56', '78', '90']
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in izip(x, x, x)]
['123', '456', '789']

回答 10

尝试这个:

s='1234567890'
print([s[idx:idx+2] for idx,val in enumerate(s) if idx%2 == 0])

输出:

['12', '34', '56', '78', '90']

Try this:

s='1234567890'
print([s[idx:idx+2] for idx,val in enumerate(s) if idx%2 == 0])

Output:

['12', '34', '56', '78', '90']

回答 11

一如既往,对于那些喜欢一只班轮的人

n = 2  
line = "this is a line split into n characters"  
line = [line[i * n:i * n+n] for i,blah in enumerate(line[::n])]

As always, for those who love one liners

n = 2  
line = "this is a line split into n characters"  
line = [line[i * n:i * n+n] for i,blah in enumerate(line[::n])]

回答 12

一个短字符串的简单递归解决方案:

def split(s, n):
    if len(s) < n:
        return []
    else:
        return [s[:n]] + split(s[n:], n)

print(split('1234567890', 2))

或以这种形式:

def split(s, n):
    if len(s) < n:
        return []
    elif len(s) == n:
        return [s]
    else:
        return split(s[:n], n) + split(s[n:], n)

,它更明确地说明了递归方法中的典型分而治之模式(尽管实际上没有必要这样做)

A simple recursive solution for short string:

def split(s, n):
    if len(s) < n:
        return []
    else:
        return [s[:n]] + split(s[n:], n)

print(split('1234567890', 2))

Or in such a form:

def split(s, n):
    if len(s) < n:
        return []
    elif len(s) == n:
        return [s]
    else:
        return split(s[:n], n) + split(s[n:], n)

, which illustrates the typical divide and conquer pattern in recursive approach more explicitly (though practically it is not necessary to do it this way)


回答 13

我陷入了同一个场景。

这对我有用

x="1234567890"
n=2
list=[]
for i in range(0,len(x),n):
    list.append(x[i:i+n])
print(list)

输出量

['12', '34', '56', '78', '90']

I was stucked in the same scenrio.

This worked for me

x="1234567890"
n=2
list=[]
for i in range(0,len(x),n):
    list.append(x[i:i+n])
print(list)

Output

['12', '34', '56', '78', '90']

回答 14

more_itertools.sliced之前已经提到过。这是more_itertools库中的另外四个选项:

s = "1234567890"

["".join(c) for c in mit.grouper(2, s)]

["".join(c) for c in mit.chunked(s, 2)]

["".join(c) for c in mit.windowed(s, 2, step=2)]

["".join(c) for c in  mit.split_after(s, lambda x: int(x) % 2 == 0)]

后面的每个选项均产生以下输出:

['12', '34', '56', '78', '90']

所讨论的选项的说明文档:grouperchunkedwindowedsplit_after

more_itertools.sliced has been mentioned before. Here are four more options from the more_itertools library:

s = "1234567890"

["".join(c) for c in mit.grouper(2, s)]

["".join(c) for c in mit.chunked(s, 2)]

["".join(c) for c in mit.windowed(s, 2, step=2)]

["".join(c) for c in  mit.split_after(s, lambda x: int(x) % 2 == 0)]

Each of the latter options produce the following output:

['12', '34', '56', '78', '90']

Documentation for discussed options: grouper, chunked, windowed, split_after


回答 15

这可以通过简单的for循环来实现。

a = '1234567890a'
result = []

for i in range(0, len(a), 2):
    result.append(a[i : i + 2])
print(result)

输出看起来像[’12’,’34’,’56’,’78’,’90’,’a’]

This can be achieved by a simple for loop.

a = '1234567890a'
result = []

for i in range(0, len(a), 2):
    result.append(a[i : i + 2])
print(result)

The output looks like [’12’, ’34’, ’56’, ’78’, ’90’, ‘a’]