Question: How do I handle a "list index out of range" exception?

I am using BeautifulSoup to parse some HTML pages.

I'm extracting a particular piece of data from each page (in a for loop) and appending it to a list.

The problem is that some of the pages have a different format (and don't contain the data I want).

So I was trying to use exception handling and add the value null to the list (I need to do this because the order of the data is important).

For instance, I have code like this:

soup = BeautifulSoup(links)
dlist = soup.findAll('dd', 'title')
# I'm trying to find content between <dd class='title'> and </dd>
gotdata = dlist[1]
# and what i want is the 2nd content of those
newlist.append(gotdata)
# and I add that to a newlist

Some of the links don't have any <dd class='title'>, so what I want to do is add the string null to the list instead.

The error appears:

list index out of range.

What I have tried is adding some lines like this:

if not dlist[1]:  
   newlist.append('null')
   continue

But it doesn't work. It still shows the error:

list index out of range.

What should I do about this? Should I use exception handling, or is there an easier way?

Any suggestions? Any help would be really great!


Answer 0

Handling the exception is the way to go:

try:
    gotdata = dlist[1]
except IndexError:
    gotdata = 'null'

Of course you could also check the len() of dlist; but handling the exception is more intuitive.
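
As a minimal sketch of how this fits the question's loop: the attempted check `if not dlist[1]` fails because the indexing runs (and raises) before `not` is ever evaluated, so the guard has to wrap the indexing itself. The helper name `second_or_null` and the plain lists standing in for the results of `soup.findAll('dd', 'title')` are assumptions made for illustration only.

```python
def second_or_null(dlist):
    """Return the second element of dlist, or 'null' when it is missing."""
    try:
        return dlist[1]
    except IndexError:
        # The indexing itself raises, which is why `if not dlist[1]`
        # in the question fails before `not` is evaluated.
        return 'null'


# Hypothetical stand-ins for the per-page results of soup.findAll('dd', 'title')
pages = [['a0', 'a1'], [], ['c0'], ['d0', 'd1', 'd2']]

# One entry per page, so the order of the data is preserved
newlist = [second_or_null(dlist) for dlist in pages]
print(newlist)  # ['a1', 'null', 'null', 'd1']
```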


Answer 1

You have two options; either handle the exception or test the length:

if len(dlist) > 1:
    newlist.append(dlist[1])
    continue

or

try:
    newlist.append(dlist[1])
except IndexError:
    pass
continue

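Use the first (the length test) if the second item is often missing, and the second (the exception handler) if it is only occasionally missing. In CPython, entering a try block is cheap, but raising and catching an exception costs more than a length check.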
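
Note that both fragments, as written, skip pages that have no second item. Since the question needs the order of the data preserved, a sketch of the two options with a placeholder appended might look like the following. The function names, the `placeholder` parameter, and the sample `pages` data are hypothetical and not part of the original answer.

```python
def collect_with_length_test(pages, placeholder='null'):
    """Option 1: test the length before indexing."""
    newlist = []
    for dlist in pages:
        if len(dlist) > 1:
            newlist.append(dlist[1])
        else:
            newlist.append(placeholder)
    return newlist


def collect_with_try(pages, placeholder='null'):
    """Option 2: index first and handle the IndexError."""
    newlist = []
    for dlist in pages:
        try:
            newlist.append(dlist[1])
        except IndexError:
            newlist.append(placeholder)
    return newlist


# Hypothetical per-page results; the second and third pages lack a second item
pages = [['x0', 'x1'], ['y0'], []]

# Both options produce the same list, with one entry per page
print(collect_with_length_test(pages))  # ['x1', 'null', 'null']
print(collect_with_try(pages))          # ['x1', 'null', 'null']
```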


Answer 2

A ternary will suffice. Change:

gotdata = dlist[1]

to

gotdata = dlist[1] if len(dlist) > 1 else 'null'

This is a shorter way of expressing:

if len(dlist) > 1:
    gotdata = dlist[1]
else: 
    gotdata = 'null'

Answer 3

Building on ThiefMaster♦'s answer: sometimes the value is given as '\n' or null and causes an error, in which case ValueError needs to be handled as well:

Handling the exception is the way to go:

try:
    gotdata = dlist[1]
except (IndexError, ValueError):
    gotdata = 'null'
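
One caveat: indexing a Python list raises only IndexError, so a value such as '\n' is returned normally and never reaches the except clause. If the intent is to treat such values as missing too, an explicit check is one way to do it. The sketch below assumes plain strings rather than BeautifulSoup tags, and the helper name `clean_second` is hypothetical.

```python
def clean_second(dlist, placeholder='null'):
    """Return dlist[1], or the placeholder if it is missing or blank."""
    try:
        value = dlist[1]
    except IndexError:
        return placeholder
    # Assumption: None and whitespace-only strings (such as '\n') count as missing
    if value is None or (isinstance(value, str) and not value.strip()):
        return placeholder
    return value


print(clean_second(['a', '\n']))  # null
print(clean_second(['a']))        # null
print(clean_second(['a', 'b']))   # b
```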

Answer 4

# dlist is the list from the question (avoids shadowing the built-in list)
for i in range(1, len(dlist)):
    try:
        print(dlist[i])
    except ValueError:
        print("Error Value.")
    except IndexError:
        print("Error index")
    except Exception:
        # Any other error
        print('error')

Answer 5

For anyone interested in a shorter way:

gotdata = len(dlist)>1 and dlist[1] or 'null'

But for best performance, I suggest using False instead of 'null', then a one line test will suffice:

gotdata = len(dlist)>1 and dlist[1]
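
One thing to keep in mind with the `and ... or` idiom: it falls back to the default whenever the second item is falsy (for example an empty string), not only when it is missing. A small sketch with made-up data showing how it differs from the ternary form:

```python
dlist = ['first', '']  # the second item exists but is an empty string

via_and_or = len(dlist) > 1 and dlist[1] or 'null'
via_ternary = dlist[1] if len(dlist) > 1 else 'null'

print(repr(via_and_or))   # 'null'  (the empty string was replaced)
print(repr(via_ternary))  # ''      (the empty string was kept)

# The one-line form without `or` yields False when the item is missing
short = []
print(len(short) > 1 and short[1])  # False
```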
