Python 实用宝典

Question 2

I’m a C coder developing something in python. I know how to do the following in C (and hence in C-like logic applied to python), but I’m wondering what the ‘Python’ way of doing it is.

I have a dictionary d, and I’d like to operate on a subset of the items: only those whose key (a string) contains a specific substring.

i.e. the C logic would be:

for key in d:
    if filter_string in key:
        pass  # do something
    else:
        continue  # do nothing

I’m imagining the python version would be something like

filtered_dict = crazy_python_syntax(d, substring)
for key,value in filtered_dict.iteritems():
    # do something

I’ve found a lot of posts on here regarding filtering dictionaries, but couldn’t find one which involved exactly this.

My dictionary is not nested and I’m using Python 2.7.

Question 4

How about a dict comprehension:

filtered_dict = {k:v for k,v in d.iteritems() if filter_string in k}

Once you see it, it should be self-explanatory, as it reads pretty much like English.

This syntax requires Python 2.7 or greater.

In Python 3, there is only dict.items(), not iteritems() so you would use:

filtered_dict = {k:v for (k,v) in d.items() if filter_string in k}
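As a minimal, self-contained sketch of the Python 3 form (the dictionary contents and the value of filter_string are made up for illustration):

```python
# Hypothetical sample data; only the keys matter for the filter.
d = {"user_name": "alice", "user_id": 7, "host": "example.org"}
filter_string = "user"

# Keep only the items whose key contains the substring.
filtered_dict = {k: v for k, v in d.items() if filter_string in k}

for key, value in filtered_dict.items():
    print(key, value)
```

The comprehension builds a new dict and leaves d untouched.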

Question 6

Go for whatever is most readable and easily maintainable. Just because you can write it out in a single line doesn’t mean that you should. Your existing solution is close to what I would use, except that I would use iteritems to skip the value lookup, and I hate nested ifs if I can avoid them:

for key, val in d.iteritems():
    if filter_string not in key:
        continue
    # do something

However, if you really want something that lets you iterate through a filtered dict, then I would not do the two-step process of building the filtered dict and then iterating through it, but instead use a generator, because what is more pythonic (and awesome) than a generator?

First we create our generator, and good design dictates that we make it abstract enough to be reusable:

# The implementation of my generator may look vaguely familiar, no?
def filter_dict(d, filter_string):
    for key, val in d.iteritems():
        if filter_string not in key:
            continue
        yield key, val

And then we can use the generator to solve your problem nicely and cleanly with simple, understandable code:

for key, val in filter_dict(d, some_string):
    # do something

In short: generators are awesome.
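For readers on Python 3, here is a rough equivalent of the generator above. It assumes only that items() stands in for iteritems(); the sample dictionary is hypothetical.

```python
def filter_dict(d, filter_string):
    """Yield (key, value) pairs whose key contains filter_string."""
    for key, val in d.items():  # items() replaces iteritems() in Python 3
        if filter_string in key:
            yield key, val


# Hypothetical sample data for illustration.
d = {"foo_a": 1, "bar": 2, "foo_b": 3}

# The generator is lazy; list() is used here only to show everything it yields.
matches = list(filter_dict(d, "foo"))
print(matches)
```

Because nothing is materialized until iteration, the filtered pairs are produced one at a time and no intermediate dict is built.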

Question 8

You can use the built-in filter function to filter dictionaries, lists, etc. based on specific conditions.

filtered_dict = dict(filter(lambda item: filter_str in item[0], d.items()))

The advantage is that you can use it for different data structures.
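A small sketch of that point, assuming Python 3 (where filter() returns a lazy iterator) and made-up sample data: the same kind of predicate is applied to a dict's items and to a plain list.

```python
filter_str = "foo"
d = {"foo_a": 1, "bar": 2}        # hypothetical dictionary
names = ["foo_x", "baz", "xfoo"]  # hypothetical list

# filter() is lazy in Python 3, so wrap it in dict() or list() to materialize.
filtered_dict = dict(filter(lambda item: filter_str in item[0], d.items()))
filtered_names = list(filter(lambda name: filter_str in name, names))

print(filtered_dict)
print(filtered_names)
```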

Question 9

input = {"A":"a", "B":"b", "C":"c"}
output = {k:v for (k,v) in input.items() if key_satisfies_condition(k)}

Question 12

Jonathon gave you an approach using dict comprehensions in his answer. Here is an approach that deals with your do something part.

If you want to do something with the values of the dictionary, you don’t need a dictionary comprehension at all:

I’m using iteritems() since you tagged your question with python-2.7

results = map(some_function, [(k,v) for k,v in a_dict.iteritems() if 'foo' in k])

Now the result will be in a list with some_function applied to each key/value pair of the dictionary, that has foo in its key.

If you just want to deal with the values and ignore the keys, just change the list comprehension:

results = map(some_function, [v for k,v in a_dict.iteritems() if 'foo' in k])

some_function can be any callable, so a lambda would work as well:

results = map(lambda x: x*2, [v for k,v in a_dict.iteritems() if 'foo' in k])

The inner list is actually not required, as you can pass a generator expression to map as well:

>>> map(lambda a: a[0]*a[1], ((k,v) for k,v in {2:2, 3:2}.iteritems() if k == 2))
[4]
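One caveat when carrying this over to Python 3: map() returns a lazy iterator there rather than a list. The sketch below assumes Python 3 and uses a placeholder some_function with invented data.

```python
a_dict = {"foo_1": 10, "bar": 20, "foo_2": 30}  # hypothetical data


def some_function(value):
    """Placeholder for whatever 'do something' means in your code."""
    return value * 2


# map() is lazy in Python 3; list() forces evaluation.
results = list(map(some_function, (v for k, v in a_dict.items() if "foo" in k)))
print(results)
```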

Question 13

I’m trying to take a file that looks like this:

AAA x 111
AAB x 111
AAA x 112
AAC x 123
...

And use a dictionary so that the output looks like this:

{AAA: ['111', '112'], AAB: ['111'], AAC: [123], ...}

This is what I’ve tried

file = open("filename.txt", "r") 
readline = file.readline().rstrip()
while readline!= "":
    list = []
    list = readline.split(" ")
    j = list.index("x")
    k = list[0:j]
    v = list[j + 1:]
    d = {}
    if k not in d == False:
        d[k] = []
    d[k].append(v)
    readline = file.readline().rstrip()

I keep getting a TypeError: unhashable type: 'list'. I know that keys in a dictionary can’t be lists but I’m trying to make my value into a list not the key. I’m wondering if I made a mistake somewhere.

Question 14

As indicated by the other answers, the error is due to k = list[0:j], where your key is converted to a list. One thing you could try is reworking your code to take advantage of the split function:

# Using with ensures that the file is properly closed when you're done
with open('filename.txt', 'rb') as f:
  d = {}
  # Here we use readlines() to split the file into a list where each element is a line
  for line in f.readlines():
    # Now we split the file on `x`, since the part before the x will be
    # the key and the part after the value
    line = line.split('x')
    # Take the line parts and strip out the spaces, assigning them to the variables
    # Once you get a bit more comfortable, this works as well:
    # key, value = [x.strip() for x in line] 
    key = line[0].strip()
    value = line[1].strip()
    # Now we check if the dictionary contains the key; if so, append the new value,
    # and if not, make a new list that contains the current value
    # (For future reference, this is a great place for a defaultdict :)
    if key in d:
      d[key].append(value)
    else:
      d[key] = [value]

print d
# {'AAA': ['111', '112'], 'AAC': ['123'], 'AAB': ['111']}

Note that if you are using Python 3.x, you’ll have to make a few minor adjustments to get it to work properly. If you open the file with rb, you’ll need to use line = line.split(b'x') (so that the separator is a bytes object, matching the bytes read from the file), and print d becomes print(d). You can also open the file in text mode using with open('filename.txt', 'r') as f: and it should work fine. The 'rU' mode that Python 2 code often used was removed in Python 3.11.
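The defaultdict mentioned in the code comment could look roughly like this. It is a sketch for Python 3 in which an in-memory io.StringIO stands in for filename.txt, and the line is split with partition(" x ") instead of split('x'), which is a substitution on my part.

```python
from collections import defaultdict
import io

# Stand-in for the real file so the sketch runs on its own.
sample = io.StringIO("AAA x 111\nAAB x 111\nAAA x 112\nAAC x 123\n")

d = defaultdict(list)  # a missing key starts out as an empty list
for line in sample:
    if not line.strip():
        continue  # skip blank lines
    # partition() splits on the first " x " only: (before, separator, after)
    key, _, value = line.partition(" x ")
    d[key.strip()].append(value.strip())

print(dict(d))
```

With defaultdict(list) the "is the key already there?" branch disappears, because the empty list is created on first access.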

Question 15

Note: This answer does not explicitly answer the question asked; the other answers do that. Since the question is specific to one scenario and the exception raised is a general one, this answer addresses the general case.

Hash values are just integers that are used to compare dictionary keys quickly during a dictionary lookup.

Internally, the hash() function calls the __hash__() method of an object, which is defined by default for most objects. Mutable built-in containers such as list deliberately disable it.

Converting a nested list to a set

>>> a = [1,2,3,4,[5,6,7],8,9]
>>> set(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

This happens because the outer list contains another list, and lists cannot be hashed. It can be solved by converting the inner nested lists to tuples:

>>> set([1, 2, 3, 4, (5, 6, 7), 8, 9])
set([1, 2, 3, 4, 8, 9, (5, 6, 7)])

Explicitly hashing a nested list

>>> hash([1, 2, 3, [4, 5,], 6, 7])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'


>>> hash(tuple([1, 2, 3, [4, 5,], 6, 7]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> hash(tuple([1, 2, 3, tuple([4, 5,]), 6, 7]))
-7943504827826258506

The solution to avoid this error is to restructure the list to have nested tuples instead of lists.
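A minimal sketch of that restructuring follows. to_hashable is a hypothetical helper name, not something from the standard library, and it only handles lists and tuples (other unhashable types such as dict or set would need their own treatment).

```python
def to_hashable(obj):
    """Recursively convert lists (and tuples that may contain lists) to tuples."""
    if isinstance(obj, (list, tuple)):
        return tuple(to_hashable(item) for item in obj)
    return obj


nested = [1, 2, 3, [4, 5], 6, 7]
frozen = to_hashable(nested)

print(frozen)
print(hash(frozen) == hash(to_hashable(nested)))  # equal tuples hash equally
```

The actual hash value is not shown because it can differ between Python versions and platforms.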

Question 16

You’re trying to use k (which is a list) as a key for d. Lists are mutable and can’t be used as dict keys.

Also, you’re never initializing the lists in the dictionary, because of this line:

if k not in d == False:

Which was presumably meant to be (note the parentheses; without them Python chains the two comparisons):

if (k not in d) == True:

Which should actually be:

if k not in d:
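To see why the comparison with False misbehaves, recall that Python chains comparison operators, and that not in and == are both comparison operators. The following sketch uses a made-up key to evaluate both forms against an empty dict:

```python
d = {}
k = "AAA"

# Chained comparison: this is evaluated as (k not in d) and (d == False).
chained = k not in d == False

# The intended test needs no comparison with a boolean at all.
intended = k not in d

print(chained, intended)
```

Since an empty dict is not equal to False, the chained form is False even though the key is missing, so the list is never initialized.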

Question 17

The reason you’re getting the unhashable type: 'list' exception is because k = list[0:j] sets k to be a “slice” of the list, which is logically another, often shorter, list. What you need is to get just the first item in list, written like so k = list[0]. The same for v = list[j + 1:] which should just be v = list[2] for the third element of the list returned from the call to readline.split(" ").

I noticed several other likely problems with the code, of which I’ll mention a few. A big one is that you don’t want to (re)initialize d with d = {} for each line read in the loop. Another is that it’s generally not a good idea to give variables the same names as built-in types, because doing so prevents you from accessing the built-in if you need it, and it’s confusing to others who are used to those names designating the standard items. For that reason, you ought to rename your list variable to something different.

Here’s a working version of your code with these changes in it. I also replaced the if statement you used to check whether the key was already in the dictionary with the dictionary’s setdefault() method, which accomplishes the same thing a little more succinctly.

d = {}
with open("nameerror.txt", "r") as file:
    line = file.readline().rstrip()
    while line:
        lst = line.split() # Split into sequence like ['AAA', 'x', '111'].
        k, _, v = lst[:3]  # Get first and third items.
        d.setdefault(k, []).append(v)
        line = file.readline().rstrip()

print('d: {}'.format(d))

Output:

d: {'AAA': ['111', '112'], 'AAC': ['123'], 'AAB': ['111']}

Question 18

The TypeError is happening because k is a list, since it is created using a slice from another list with the line k = list[0:j]. This should probably be something like k = ' '.join(list[0:j]), so you have a string instead.

In addition to this, your if statement is incorrect as noted by Jesse’s answer, which should read if k not in d or if not k in d (I prefer the latter).

You are also clearing your dictionary on each iteration since you have d = {} inside your while loop.

Note that you should also not be using list or file as variable names, since you will be masking builtins.

Here is how I would rewrite your code:

d = {}
with open("filename.txt", "r") as input_file:
    for line in input_file:
        fields = line.split()
        j = fields.index("x")
        k = " ".join(fields[:j])
        d.setdefault(k, []).append(" ".join(fields[j+1:]))

The dict.setdefault() method above replaces the if k not in d logic from your code.

Question 19

    # Python 3.2
    with open("d://test.txt") as f:
        # Split each line into whitespace-separated fields, e.g. ['AAA', 'x', '111'].
        rows = (line.split() for line in f)
        d = {}
        for key, _, value in rows:
            d.setdefault(key, []).append(value)

Question 20

I have a two-column dataframe and intend to convert it to a Python dictionary: the first column will be the key and the second will be the value. Thank you in advance.

Dataframe:

    id    value
0    0     10.2
1    1      5.7
2    2      7.4

Question 21

See the docs for to_dict. You can use it like this:

df.set_index('id').to_dict()

And if you only need one column, you can select it first to avoid having the column name as an extra level in the dict (in this case you are actually using Series.to_dict()):

df.set_index('id')['value'].to_dict()
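To make the difference between the two calls concrete, here is a short sketch. It assumes pandas is installed and rebuilds the dataframe from the question by hand.

```python
import pandas as pd

# The dataframe from the question, reconstructed for illustration.
df = pd.DataFrame({"id": [0, 1, 2], "value": [10.2, 5.7, 7.4]})

# DataFrame.to_dict(): the column name becomes the outer key.
nested = df.set_index("id").to_dict()

# Series.to_dict(): a flat id -> value mapping.
flat = df.set_index("id")["value"].to_dict()

print(nested)
print(flat)
```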

Question 22

mydict = dict(zip(df.id, df.value))

Question 23

If you want a simple way to preserve duplicates, you could use groupby:

>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id', 'value']) 
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3
>>> {k: g["value"].tolist() for k,g in ptest.groupby("id")}
{'a': [1, 2], 'b': [3]}

Question 24

The answers by joris in this thread and by punchagan in the duplicated thread are very elegant, however they will not give correct results if the column used for the keys contains any duplicated value.

For example:

>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id', 'value']) 
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3

# note that in both cases the association a->1 is lost:
>>> ptest.set_index('id')['value'].to_dict()
{'a': 2, 'b': 3}
>>> dict(zip(ptest.id, ptest.value))
{'a': 2, 'b': 3}

If you have duplicated entries and do not want to lose them, you can use this ugly but working code:

>>> mydict = {}
>>> for x in range(len(ptest)):
...     currentid = ptest.iloc[x,0]
...     currentvalue = ptest.iloc[x,1]
...     mydict.setdefault(currentid, [])
...     mydict[currentid].append(currentvalue)
>>> mydict
{'a': [1, 2], 'b': [3]}

Question 25

Simplest solution:

df.set_index('id').T.to_dict('records')

Example:

df= pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id','value'])
df.set_index('id').T.to_dict('records')

If you have multiple values, like val1, val2, val3, etc., and you want them as lists, then use the code below:

df.set_index('id').T.to_dict('list')

Question 26

In some versions the code below might not work:

mydict = dict(zip(df.id, df.value))

so make it explicit:

id_=df.id.values
value=df.value.values
mydict=dict(zip(id_,value))

Note that I used id_ because id is the name of a built-in function (it is not a reserved word, but shadowing it is best avoided).

Question 27

You can use ‘dict comprehension’

my_dict = {row[0]: row[1] for row in df.values}

Question 28

Another (slightly shorter) solution for not losing duplicate entries:

>>> ptest = pd.DataFrame([['a',1],['a',2],['b',3]], columns=['id','value'])
>>> ptest
  id  value
0  a      1
1  a      2
2  b      3

>>> pdict = dict()
>>> for i in ptest['id'].unique().tolist():
...     ptest_slice = ptest[ptest['id'] == i]
...     pdict[i] = ptest_slice['value'].tolist()
...

>>> pdict
{'b': [3], 'a': [1, 2]}

Question 29

You need a list as a dictionary value. This code will do the trick.

from collections import defaultdict
mydict = defaultdict(list)
for k, v in zip(df.id.values,df.value.values):
    mydict[k].append(v)

Question 30

I found this question while trying to make a dictionary out of three columns of a pandas dataframe. In my case the dataframe has columns A, B and C (let’s say A and B are the geographical coordinates of longitude and latitude and C the country region/state/etc, which is more or less the case).

I wanted a dictionary with each pair of A,B values (dictionary key) matching the value of C (dictionary value) in the corresponding row (each pair of A,B values is guaranteed to be unique due to previous filtering, but it is possible to have the same value of C for different pairs of A,B values in this context), so I did:

mydict = dict(zip(zip(df['A'],df['B']), df['C']))

Using pandas to_dict() also works:

mydict = df.set_index(['A','B']).to_dict(orient='dict')['C']

(none of the columns A or B were used as index before executing the line creating the dictionary)

Both approaches are fast (less than one second on a dataframe with 85k rows, 5-year-old fast dual-core laptop).

The reasons I’m posting this:

for those who need this kind of solution
if someone knows a faster executing solution (e.g., for millions of rows), I’d appreciate a reply.

Question 31

def get_dict_from_pd(df, key_col, row_col):
    result = dict()
    for i in set(df[key_col].values):
        is_i = df[key_col] == i
        result[i] = list(df[is_i][row_col].values)
    return result

This is my solution, a basic loop.

Question 32

This is my solution:

import pandas as pd
df = pd.read_excel('dic.xlsx')
df_T = df.set_index('id').T
dic = df_T.to_dict('records')
print(dic)

Question 33

Python 3.2.3. There were some ideas listed here, which work on regular var’s, but it seems **kwargs play by different rules… so why doesn’t this work and how can I check to see if a key in **kwargs exists?

if kwargs['errormessage']:
    print("It exists")

I also think this should work, but it doesn’t —

if errormessage in kwargs:
    print("yeah it's here")

I’m guessing because kwargs is iterable? Do I have to iterate through it just to check if a particular key is there?

Question 34

You want

if 'errormessage' in kwargs:
    print("found it")

To get the value of errormessage

if 'errormessage' in kwargs:
    print("errormessage equals " + kwargs.get("errormessage"))

In this way, kwargs is just another dict. Your first example, if kwargs['errormessage'], means “get the value associated with the key 'errormessage' in kwargs, and then check its bool value”. So if there’s no such key, you’ll get a KeyError.

Your second example, if errormessage in kwargs:, means “if kwargs contains the element named by errormessage”, and unless errormessage is the name of a variable, you’ll get a NameError.

I should mention that dictionaries also have a method .get() which accepts a default parameter (itself defaulting to None), so that kwargs.get("errormessage") returns the value if that key exists and None otherwise (similarly kwargs.get("errormessage", 17) does what you might think it does). When you don’t care about the difference between the key existing and having None as a value or the key not existing, this can be handy.
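The membership test and .get() can be seen side by side in the sketch below. report is a hypothetical function written only for this illustration.

```python
def report(**kwargs):
    """Hypothetical function used only to illustrate the checks."""
    if "errormessage" in kwargs:  # membership test, never raises KeyError
        return "errormessage equals " + str(kwargs["errormessage"])
    # .get() returns the supplied default when the key is absent
    return "no errormessage, fallback is " + str(kwargs.get("errormessage", 17))


print(report(errormessage="disk full"))
print(report(other=1))
```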

Question 35

DSM’s and Tadeck’s answers answer your question directly.

In my scripts I often use the convenient dict.pop() to deal with optional, and additional arguments. Here’s an example of a simple print() wrapper:

def my_print(*args, **kwargs):
    prefix = kwargs.pop('prefix', '')
    print(prefix, *args, **kwargs)

Then:

>>> my_print('eggs')
 eggs
>>> my_print('eggs', prefix='spam')
spam eggs

As you can see, if prefix is not contained in kwargs, then the default '' (empty string) is being stored in the local prefix variable. If it is given, then its value is being used.

This is generally a compact and readable recipe for writing wrappers for any kind of function: always just pass through arguments you don’t understand, without even needing to know whether they exist. Always passing *args and **kwargs through makes your code slightly slower and requires a bit more typing, but if the interface of the called function (in this case print) changes, you don’t need to change your code. This approach reduces development time while supporting all interface changes.
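In Python 3 a keyword-only parameter offers an alternative to kwargs.pop(). This is a different technique from the one in the answer, sketched here with output captured in an io.StringIO buffer so the result can be inspected.

```python
import io


def my_print(*args, prefix="", **kwargs):
    """Like print(), but with an optional keyword-only prefix argument."""
    print(prefix, *args, **kwargs)


buffer = io.StringIO()
# file= is not known to my_print; it is simply passed through to print().
my_print("eggs", prefix="spam", file=buffer)
output = buffer.getvalue()
print(repr(output))
```

The default lives in the signature, so it shows up in help() and no manual pop is needed, while unknown keyword arguments are still forwarded unchanged.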

Question 36

It is just this:

if 'errormessage' in kwargs:
    print("yeah it's here")

You need to check, if the key is in the dictionary. The syntax for that is some_key in some_dict (where some_key is something hashable, not necessarily a string).

The ideas you linked contained examples for checking whether a specific key exists in the dictionaries returned by locals() and globals(). Your example is similar, because you are checking for the existence of a specific key in the kwargs dictionary (the dictionary containing the keyword arguments).

Question 37

One way is to add it by yourself! How? By merging kwargs with a bunch of defaults. This won’t be appropriate on all occasions, for example, if the keys are not known to you in advance. However, if they are, here is a simple example:

import sys

def myfunc(**kwargs):
    args = {'country':'England','town':'London',
            'currency':'Pound', 'language':'English'}

    diff = set(kwargs.keys()) - set(args.keys())
    if diff:
        print("Invalid args:",tuple(diff),file=sys.stderr)
        return

    args.update(kwargs)            
    print(args)

The defaults are set in the dictionary args, which includes all the keys we are expecting. We first check to see if there are any unexpected keys in kwargs. Then we update args with kwargs, which overwrites the defaults with any values the user has set. We don’t need to test whether a key exists; we now use args as our argument dictionary and have no further need of kwargs.

Question 38

You can discover those things easily by yourself:

def hello(*args, **kwargs):
    # print() as a function, since the question is about Python 3
    print(kwargs)
    print(type(kwargs))
    print(dir(kwargs))

hello(what="world")

Question 39

if kwargs:  # true when at least one keyword argument was passed
    print(kwargs)

Question 40

I am trying to iterate through a JSON object to import data, i.e. title and link. I can’t seem to get to the content that is past the :.

JSON:

[
    {
        "title": "Baby (Feat. Ludacris) - Justin Bieber",
        "description": "Baby (Feat. Ludacris) by Justin Bieber on Grooveshark",
        "link": "http://listen.grooveshark.com/s/Baby+Feat+Ludacris+/2Bqvdq",
        "pubDate": "Wed, 28 Apr 2010 02:37:53 -0400",
        "pubTime": 1272436673,
        "TinyLink": "http://tinysong.com/d3wI",
        "SongID": "24447862",
        "SongName": "Baby (Feat. Ludacris)",
        "ArtistID": "1118876",
        "ArtistName": "Justin Bieber",
        "AlbumID": "4104002",
        "AlbumName": "My World (Part II);\nhttp://tinysong.com/gQsw",
        "LongLink": "11578982",
        "GroovesharkLink": "11578982",
        "Link": "http://tinysong.com/d3wI"
    },
    {
        "title": "Feel Good Inc - Gorillaz",
        "description": "Feel Good Inc by Gorillaz on Grooveshark",
        "link": "http://listen.grooveshark.com/s/Feel+Good+Inc/1UksmI",
        "pubDate": "Wed, 28 Apr 2010 02:25:30 -0400",
        "pubTime": 1272435930
    }
]

I tried using a dictionary:

def getLastSong(user,limit):
    base_url = 'http://gsuser.com/lastSong/'
    user_url = base_url + str(user) + '/' + str(limit) + "/"
    raw = urllib.urlopen(user_url)
    json_raw= raw.readlines()
    json_object = json.loads(json_raw[0])

    #filtering and making it look good.
    gsongs = []
    print json_object
    for song in json_object[0]:   
        print song

This code only prints the information before :. (ignore the Justin Bieber track :))

Question 41

Your loading of the JSON data is a little fragile. Instead of:

json_raw= raw.readlines()
json_object = json.loads(json_raw[0])

you should really just do:

json_object = json.load(raw)

You shouldn’t think of what you get as a “JSON object”. What you have is a list. The list contains two dicts. The dicts contain various key/value pairs, mostly strings. When you do json_object[0], you’re asking for the first dict in the list. When you iterate over that, with for song in json_object[0]:, you iterate over the keys of the dict, because that’s what you get when you iterate over a dict. If you want to access the value associated with a key in that dict, you would use, for example, json_object[0][song].

None of this is specific to JSON. It’s just basic Python types, with their basic operations as covered in any tutorial.
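A compact illustration of the keys-versus-values point, using a shortened, made-up version of the feed rather than the real service:

```python
import json

# Shortened, hypothetical version of the feed from the question.
raw = '[{"title": "Song A", "pubTime": 1}, {"title": "Song B", "pubTime": 2}]'
songs = json.loads(raw)  # a list of dicts

keys = [key for key in songs[0]]            # iterating a dict yields its keys
titles = [song["title"] for song in songs]  # index by key to reach the values

print(keys)
print(titles)
```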

Question 42

I believe you probably meant:

from __future__ import print_function

for song in json_object:
    # now song is a dictionary
    for attribute, value in song.items():
        print(attribute, value) # example usage

NB: You could use song.iteritems instead of song.items if in Python 2.

Question 43

This question has been out here a long time, but I wanted to contribute how I usually iterate through a JSON object. In the example below, I’ve shown a hard-coded string that contains the JSON, but the JSON string could just as easily have come from a web service or a file.

import json

def main():

    # create a simple JSON object (as a string)
    jsonString = '{"key1":"value1","key2":"value2","key3":"value3"}'

    # parse the JSON string into a Python dict
    jsonObject = json.loads(jsonString)

    # print the keys and values
    for key in jsonObject:
        value = jsonObject[key]
        print("The key and value are ({}) = ({})".format(key, value))

if __name__ == '__main__':
    main()

Question 44

After deserializing the JSON, you have a python object. Use the regular object methods.

In this case you have a list made of dictionaries:

json_object[0].items()

json_object[0]["title"]

etc.

Question 45

I would solve this problem more like this

import json
import urllib2

def last_song(user, limit):
    # Assembling strings with "foo" + str(bar) + "baz" + ... generally isn't 
    # as nice as using real string formatting. It can seem simpler at first, 
    # but leaves you less happy in the long run.
    url = 'http://gsuser.com/lastSong/%s/%d/' % (user, limit)

    # urllib.urlopen is deprecated in favour of urllib2.urlopen
    site = urllib2.urlopen(url)

    # The json module has a function load for loading from file-like objects, 
    # like the one you get from `urllib2.urlopen`. You don't need to turn 
    # your data into a string and use loads and you definitely don't need to 
    # use readlines or readline (there is seldom if ever reason to use a 
    # file-like object's readline(s) methods.)
    songs = json.load(site)

    # I don't know why "lastSong" stuff returns something like this, but 
    # your json thing was a JSON array of two JSON objects. This will 
    # deserialise as a list of two dicts, with each item representing 
    # each of those two songs.
    #
    # Since each of the songs is represented by a dict, it will iterate 
    # over its keys (like any other Python dict). 
    baby, feel_good = songs

    # Rather than printing in a function, it's usually better to 
    # return the string then let the caller do whatever with it. 
    # You said you wanted to make the output pretty but you didn't 
    # mention *how*, so here's an example of a prettyish representation
    # from the song information given.
    return "%(SongName)s by %(ArtistName)s - listen at %(link)s" % baby

Question 46

For iterating through JSON you can use this:

json_object = json.loads(json_file)
for element in json_object: 
    for value in json_object['Name_OF_YOUR_KEY/ELEMENT']:
        print(json_object['Name_OF_YOUR_KEY/ELEMENT']['INDEX_OF_VALUE']['VALUE'])

Question 47

For Python 3, you have to decode the data you get back from the web server. For instance I decode the data as utf8 then deal with it:

import json
import urllib.error
import urllib.request

# example of JSON data: a "group" array holding two objects with an id each
# (assumed shape; a JSON object cannot usefully repeat the key "id")
jsonstufftest = '{"group": [{"id": "2"}, {"id": "3"}]}'
# always set your headers
headers = {'User-Agent': 'Moz & Woz'}
# the url you are trying to load and get json from
url = 'http://www.cooljson.com/cooljson.json'
# in python 3 you can build the request using request.Request
req = urllib.request.Request(url, None, headers)
# try to connect or fail gracefully
try:
    response = urllib.request.urlopen(req)  # new python 3 code -jc
except urllib.error.URLError:
    exit('could not load page, check connection')
# read the response and DECODE
html = response.read().decode('utf8')  # new python3 code
# now convert the decoded string into Python objects
loadedjson = json.loads(html)
# print to make sure it worked
print(loadedjson)  # works like a charm
# iterate through each entry of the group
for testdata in loadedjson['group']:
    print(testdata['id'])  # should print 2 then 3 if using test json

If you don’t decode you will get bytes vs string errors in Python 3.

Question 48

Consider the following dictionary, d:

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

I want to return the first N key:value pairs from d (N <= 4 in this case). What is the most efficient method of doing this?

Question 49

There’s no such thing as the “first n” keys because a dict doesn’t remember which keys were inserted first (this was true when the answer was written; see the update below for Python 3.6 and later).

You can get any n key-value pairs though:

n_items = take(n, d.iteritems())

This uses the implementation of take from the itertools recipes:

from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

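Update for Python 3.6 and later, where dicts preserve insertion order (a CPython implementation detail in 3.6, guaranteed by the language since 3.7) and items() replaces iteritems():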

n_items = take(n, d.items())
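Putting the pieces together for Python 3.7 or later, where insertion order is guaranteed, a runnable sketch with the dictionary from the question looks like this:

```python
from itertools import islice


def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))


d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
n_items = take(2, d.items())
print(n_items)  # [('a', 3), ('b', 2)] on Python 3.7+
```

islice stops after n items, so the rest of the dictionary is never visited. Asking for more items than exist simply returns everything.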

Question 50

An efficient way to retrieve a subset is to combine list or dictionary comprehensions with slicing. If you don’t need to order the items (you just want any n pairs), you can use a dictionary comprehension like this:

# Python 2
first2pairs = {k: mydict[k] for k in mydict.keys()[:2]}
# Python 3
first2pairs = {k: mydict[k] for k in list(mydict)[:2]}

Generally a comprehension like this is faster to run than the equivalent “for x in y” loop. Also, by making a list of the dictionary keys and slicing that list, you avoid ‘touching’ any unnecessary keys when you build the new dictionary.

If you don’t need the keys (only the values) you can use a list comprehension:

first2vals = list(mydict.values())[:2]  # list() is required in Python 3 and harmless in Python 2

If you need the values sorted based on their keys, it’s not much more trouble:

first2vals = [mydict[k] for k in sorted(mydict.keys())[:2]]

or if you need the keys as well:

first2pairs = {k: mydict[k] for k in sorted(mydict.keys())[:2]}

Question 51

Before Python 3.7, Python’s dicts are not ordered, so it’s meaningless to ask for the “first N” keys.

The collections.OrderedDict class is available if that’s what you need. You could efficiently get its first four elements as

import itertools
import collections

d = collections.OrderedDict((('foo', 'bar'), (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')))
x = itertools.islice(d.items(), 0, 4)

for key, value in x:
    print key, value

itertools.islice allows you to lazily take a slice of elements from any iterator. If you want the result to be reusable you’d need to convert it to a list or something, like so:

x = list(itertools.islice(d.items(), 0, 4))

Question 52

foo = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
iterator = iter(foo.items())
for i in range(3):
    print(next(iterator))

Basically, turn the view (dict_items) into an iterator, and then iterate it with next().

Question 53

I did not see this one here. It will not be ordered, but it is syntactically the simplest if you just need to take some elements from a dictionary.

n = 2
{key:value for key,value in list(d.items())[0:n]}

Question 54

To get the top N elements from your python dictionary one can use the following line of code:

list(dictionaryName.items())[:N]

In your case you can change it to:

list(d.items())[:4]

Question 55

See PEP 0265 on sorting dictionaries. Then use the aforementioned iterable code.

If you need more efficiency for the sorted key-value pairs, use a different data structure, that is, one that maintains sorted order and the key-value associations.

E.g.

import bisect

kvlist = [('a', 1), ('b', 2), ('c', 3), ('e', 5)]
bisect.insort_left(kvlist, ('d', 4))

print kvlist # [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
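Once the list is kept sorted, the first N pairs are just a slice, and a key can be found by binary search. The sketch below is written for Python 3 and assumes unique keys. The helper name lookup is hypothetical, not part of bisect.

```python
import bisect

kvlist = [('a', 1), ('b', 2), ('c', 3), ('e', 5)]
bisect.insort_left(kvlist, ('d', 4))  # insert while keeping the list sorted

# The list is already sorted by key, so "first N" is a plain slice.
first_three = kvlist[:3]


def lookup(sorted_pairs, key):
    """Binary-search a sorted list of (key, value) tuples (hypothetical helper)."""
    # The 1-tuple (key,) sorts before any (key, value), so bisect_left
    # lands on the entry for key when it is present.
    i = bisect.bisect_left(sorted_pairs, (key,))
    if i < len(sorted_pairs) and sorted_pairs[i][0] == key:
        return sorted_pairs[i][1]
    raise KeyError(key)


print(first_three)
print(lookup(kvlist, 'd'))
```

Insertion into a list is still linear in the worst case, so this trades cheaper ordered reads for more expensive writes.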

Question 56

In Python 3, this will do the trick:

{A:N for (A,N) in [x for x in d.items()][:4]}

{'a': 3, 'b': 2, 'c': 3, 'd': 4}

Question 57

Just adding an answer that uses zip:

{k: d[k] for k, _ in zip(d, range(n))}
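The zip approach works because zip stops as soon as the shorter input runs out, so range(n) caps the number of keys taken. A small sketch, with first_n_by_zip as a made-up name for illustration:

```python
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}


def first_n_by_zip(mapping, n):
    """Return a dict with at most the first n items (hypothetical helper)."""
    # zip pairs each key with a counter and stops when either side is exhausted.
    return {k: mapping[k] for k, _ in zip(mapping, range(n))}


print(first_n_by_zip(d, 2))
print(first_n_by_zip(d, 10))  # n larger than len(d) simply returns everything
```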

Question 58

This depends on what is ‘most efficient’ in your case.

If you just want a semi-random sample of a huge dictionary foo, use foo.iteritems() and take as many values from it as you need. It’s a lazy operation that avoids creating an explicit list of keys or items.

If you need to sort keys first, there’s no way around using something like keys = foo.keys(); keys.sort() or sorted(foo.iterkeys()), you’ll have to build an explicit list of keys. Then slice or iterate through first N keys.

BTW why do you care about the ‘efficient’ way? Did you profile your program? If you did not, use the obvious and easy to understand way first. Chances are it will do pretty well without becoming a bottleneck.
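If you do want numbers before deciding, a rough measurement is easy to set up. The following is a sketch only, written for Python 3 (so it uses items() rather than iteritems()). The dictionary size, the repeat count, and the function names are arbitrary assumptions, and the timings will vary by machine.

```python
import timeit
from itertools import islice

foo = {i: str(i) for i in range(100000)}  # assumed "huge" dictionary
n = 4


def with_islice():
    # Lazy: only the first n items are ever produced.
    return list(islice(foo.items(), n))


def with_full_list():
    # Eager: builds a list of every item, then slices it.
    return list(foo.items())[:n]


# Both approaches must agree before their timings are worth comparing.
assert with_islice() == with_full_list()

t_islice = timeit.timeit(with_islice, number=100)
t_full = timeit.timeit(with_full_list, number=100)
print("islice: %.6f s, full list: %.6f s" % (t_islice, t_full))
```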

Question 59

You can approach this a number of ways. Note that both snippets below remove the items from d. If order is important you can do this:

for key in sorted(d.keys())[:4]:  # limit to the first 4 keys, matching the popitem() loop
  item = d.pop(key)

If order isn’t a concern you can do this:

for i in range(4):
  item = d.popitem()

Question 60

A dictionary maintains no sorted order, so before picking the top N key-value pairs, let’s sort it.

import operator
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}
d=dict(sorted(d.items(),key=operator.itemgetter(1),reverse=True))
#itemgetter(0)=sort by keys, itemgetter(1)=sort by values

Now we can retrieve the top ‘N’ elements using a function like this:

def return_top(elements,dictionary_element):
    '''Takes the dictionary and the 'N' elements needed in return
    '''
    topers={}
    for h,i in enumerate(dictionary_element):
        if h<elements:
            topers.update({i:dictionary_element[i]})
    return topers

To get the top 2 elements, simply use it like this:

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}
d=dict(sorted(d.items(),key=operator.itemgetter(1),reverse=True))
d=return_top(2,d)
print(d)
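As an alternative to sorting the whole dictionary, the standard library's heapq.nlargest can pick the N largest items directly. This is a different technique from the one above, shown only as a sketch on the same sample data. The variable name top2 is arbitrary.

```python
import heapq
import operator

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}

# nlargest returns the n items with the largest key() result, largest first,
# without fully sorting the input.
top2 = dict(heapq.nlargest(2, d.items(), key=operator.itemgetter(1)))
print(top2)
```

For small dictionaries the difference is negligible. nlargest mainly pays off when N is much smaller than the number of items.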

Question 61

For Python 3 and above, to select the first n pairs:

n=4
firstNpairs = {k: Diction[k] for k in list(Diction.keys())[:n]}

Question 62

Consider a dict:

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

from itertools import islice
n = 3
list(islice(d.items(),n))

islice will do the trick :) hope it helps !

Question 63

This might not be very elegant, but works for me:

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

x= 0
for key, val in d.items():
    if x == 2:
        break
    else:
        x += 1
        # Do something with the first two key-value pairs

Question 64

I have tried a few of the answers above and note that some of them are version dependent and do not work in version 3.7.

I also note that since 3.6 all dictionaries are ordered by the sequence in which items are inserted (an implementation detail in CPython 3.6, guaranteed by the language from 3.7).

Despite dictionaries being ordered since 3.6, some of the statements you expect to work with ordered structures don’t seem to work.

This is the answer to the OP’s question that worked best for me:

itr = iter(dic.items())
lst = [next(itr) for i in range(3)]
