Tag Archives: python-2.7

AttributeError("'str' object has no attribute 'read'")

Question: AttributeError("'str' object has no attribute 'read'")

In Python I’m getting an error:

Exception:  (<type 'exceptions.AttributeError'>,
AttributeError("'str' object has no attribute 'read'",), <traceback object at 0x1543ab8>)

Given python code:

def getEntries (self, sub):
    url = 'http://www.reddit.com/'
    if (sub != ''):
        url += 'r/' + sub

    request = urllib2.Request (url + 
        '.json', None, {'User-Agent' : 'Reddit desktop client by /user/RobinJ1995/'})
    response = urllib2.urlopen (request)
    jsonofabitch = response.read ()

    return json.load (jsonofabitch)['data']['children']

What does this error mean and what did I do to cause it?


Answer 0

The problem is that for json.load you should pass a file-like object with a read function defined. So either use json.load(response) or json.loads(response.read()).
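To illustrate the two options, here is a small self-contained sketch that uses io.StringIO as a stand-in for the HTTP response (so it runs without any network access):

```python
import io
import json

# A file-like object with a .read() method, standing in for the
# response returned by urllib2.urlopen.
response = io.StringIO('{"data": {"children": ["a", "b"]}}')

# Option 1: pass the file-like object and let json.load call .read()
parsed = json.load(response)
print(parsed['data']['children'])  # ['a', 'b']

# Option 2: read the body yourself, then parse the resulting string
response.seek(0)
parsed = json.loads(response.read())
print(parsed['data']['children'])  # ['a', 'b']
```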


Answer 1

AttributeError("'str' object has no attribute 'read'",)

This means exactly what it says: something tried to find a .read attribute on the object that you gave it, and you gave it an object of type str (i.e., you gave it a string).

The error occurred here:

json.load (jsonofabitch)['data']['children']

Well, you aren’t looking for read anywhere, so it must happen in the json.load function that you called (as indicated by the full traceback). That is because json.load is trying to .read the thing that you gave it, but you gave it jsonofabitch, which currently names a string (which you created by calling .read on the response).

Solution: don’t call .read yourself; the function will do this, and is expecting you to give it the response directly so that it can do so.

You could also have figured this out by reading the built-in Python documentation for the function (try help(json.load)), or for the entire module (try help(json)), or by checking the documentation for those functions on http://docs.python.org .


Answer 2

If you get a python error like this:

AttributeError: 'str' object has no attribute 'some_method'

You probably poisoned your object accidentally by overwriting your object with a string.

How to reproduce this error in python with a few lines of code:

#!/usr/bin/env python
import json
def foobar(json):
    msg = json.loads(json)

foobar('{"batman": "yes"}')

Run it, which prints:

AttributeError: 'str' object has no attribute 'loads'

But change the variable name, and it works fine:

#!/usr/bin/env python
import json
def foobar(jsonstring):
    msg = json.loads(jsonstring)

foobar('{"batman": "yes"}')

This error occurs when you try to call a method on a string. Strings have a number of methods, but not the one you are invoking. So stop trying to invoke a method that str does not define, and start looking for where you poisoned your object.


Answer 3

OK, this is an old thread, but I had the same issue: my problem was that I used json.load instead of json.loads.

This way, json has no problem with loading any kind of dictionary.

Official documentation

json.load – Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using this conversion table.

json.loads – Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.


Answer 4

You need to open the file first. This doesn’t work:

json_file = json.load('test.json')

But this works:

f = open('test.json')
json_file = json.load(f)
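When working with real files, it is also worth using a with statement so the file is closed automatically; a minimal sketch (the file is created first so the example is self-contained):

```python
import json

# Create a small JSON file so the example can run on its own
with open('test.json', 'w') as f:
    json.dump({"ok": True}, f)

# Preferred: the with statement closes the file automatically
with open('test.json') as f:
    json_file = json.load(f)

print(json_file)  # {'ok': True}
```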

Python - Extract and save video frames

Question: Python - Extract and save video frames

So I’ve followed this tutorial but it doesn’t seem to do anything. Simply nothing. It waits a few seconds and closes the program. What is wrong with this code?

import cv2
vidcap = cv2.VideoCapture('Compton.mp4')
success,image = vidcap.read()
count = 0
success = True
while success:
  success,image = vidcap.read()
  cv2.imwrite("frame%d.jpg" % count, image)     # save frame as JPEG file
  if cv2.waitKey(10) == 27:                     # exit if Escape is hit
      break
  count += 1

Also, in the comments it says that this limits the frames to 1000? Why?

EDIT: I tried doing success = True first but that didn’t help. It only created one image that was 0 bytes.


Answer 0

From here download this video so we have the same video file for the test. Make sure to have that mp4 file in the same directory of your python code. Then also make sure to run the python interpreter from the same directory.

Then modify the code: ditch waitKey, which wastes time and, without a window, cannot capture keyboard events anyway. We also print the success value to make sure the frames are being read successfully.

import cv2
vidcap = cv2.VideoCapture('big_buck_bunny_720p_5mb.mp4')
success,image = vidcap.read()
count = 0
while success:
  cv2.imwrite("frame%d.jpg" % count, image)     # save frame as JPEG file      
  success,image = vidcap.read()
  print('Read a new frame: ', success)
  count += 1

How does that go?


Answer 1

To extend this question (and @user2700065's answer) to a slightly different case: if you do not want to extract every frame, but want to extract one frame per second. A 1-minute video will then yield 60 frames (images).

import sys
import argparse

import cv2
print(cv2.__version__)

def extractImages(pathIn, pathOut):
    count = 0
    vidcap = cv2.VideoCapture(pathIn)
    success,image = vidcap.read()
    success = True
    while success:
        vidcap.set(cv2.CAP_PROP_POS_MSEC,(count*1000))    # added this line 
        success,image = vidcap.read()
        print ('Read a new frame: ', success)
        cv2.imwrite( pathOut + "\\frame%d.jpg" % count, image)     # save frame as JPEG file
        count = count + 1

if __name__=="__main__":
    a = argparse.ArgumentParser()
    a.add_argument("--pathIn", help="path to video")
    a.add_argument("--pathOut", help="path to images")
    args = a.parse_args()
    print(args)
    extractImages(args.pathIn, args.pathOut)

Answer 2

This is a tweak of the previous answer for Python 3.x from @GShocked. I would post it as a comment, but I don't have enough reputation.

import sys
import argparse

import cv2
print(cv2.__version__)

def extractImages(pathIn, pathOut):
    vidcap = cv2.VideoCapture(pathIn)
    success,image = vidcap.read()
    count = 0
    success = True
    while success:
      success,image = vidcap.read()
      print ('Read a new frame: ', success)
      cv2.imwrite( pathOut + "\\frame%d.jpg" % count, image)     # save frame as JPEG file
      count += 1

if __name__=="__main__":
    print("aba")
    a = argparse.ArgumentParser()
    a.add_argument("--pathIn", help="path to video")
    a.add_argument("--pathOut", help="path to images")
    args = a.parse_args()
    print(args)
    extractImages(args.pathIn, args.pathOut)

Answer 3

This function converts most video formats into the individual frames they contain. It works on Python 3 with OpenCV 3+.

import cv2
import time
import os

def video_to_frames(input_loc, output_loc):
    """Function to extract frames from input video file
    and save them as separate frames in an output directory.
    Args:
        input_loc: Input video file.
        output_loc: Output directory to save the frames.
    Returns:
        None
    """
    try:
        os.mkdir(output_loc)
    except OSError:
        pass
    # Log the time
    time_start = time.time()
    # Start capturing the feed
    cap = cv2.VideoCapture(input_loc)
    # Find the number of frames
    video_length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) - 1
    print ("Number of frames: ", video_length)
    count = 0
    print ("Converting video..\n")
    # Start converting the video
    while cap.isOpened():
        # Extract the frame
        ret, frame = cap.read()
        # Write the results back to output location.
        cv2.imwrite(output_loc + "/%#05d.jpg" % (count+1), frame)
        count = count + 1
        # If there are no more frames left
        if (count > (video_length-1)):
            # Log the time again
            time_end = time.time()
            # Release the feed
            cap.release()
            # Print stats
            print ("Done extracting frames.\n%d frames extracted" % count)
            print ("It took %d seconds for conversion." % (time_end-time_start))
            break

if __name__=="__main__":

    input_loc = '/path/to/video/00009.MTS'
    output_loc = '/path/to/output/frames/'
    video_to_frames(input_loc, output_loc)

It supports .mts as well as common formats like .mp4 and .avi. Tried and tested on .mts files; works like a charm.


Answer 4

After a lot of research on how to convert frames to video, I created this function; I hope it helps. We require OpenCV for this:

import cv2
import numpy as np
import os
from os.path import isfile, join

def frames_to_video(inputpath,outputpath,fps):
   image_array = []
   files = [f for f in os.listdir(inputpath) if isfile(join(inputpath, f))]
   files.sort(key = lambda x: int(x[5:-4]))
   for i in range(len(files)):
       img = cv2.imread(inputpath + files[i])
       size =  (img.shape[1],img.shape[0])
       img = cv2.resize(img,size)
       image_array.append(img)
   fourcc = cv2.VideoWriter_fourcc('D', 'I', 'V', 'X')
   out = cv2.VideoWriter(outputpath,fourcc, fps, size)
   for i in range(len(image_array)):
       out.write(image_array[i])
   out.release()


inputpath = 'folder path'
outpath =  'video file path/video.mp4'
fps = 29
frames_to_video(inputpath,outpath,fps)

Change the values of fps (frames per second), the input folder path, and the output folder path according to your own local setup.


Answer 5

The previous answers have lost the first frame. And it will be nice to store the images in a folder.

# create a folder to store extracted images
import os
folder = 'test'  
os.mkdir(folder)
# use opencv to do the job
import cv2
print(cv2.__version__)  # my version is 3.1.0
vidcap = cv2.VideoCapture('test_video.mp4')
count = 0
while True:
    success,image = vidcap.read()
    if not success:
        break
    cv2.imwrite(os.path.join(folder,"frame{:d}.jpg".format(count)), image)     # save frame as JPEG file
    count += 1
print("{} images are extracted in {}.".format(count,folder))

By the way, you can check the frame rate by VLC. Go to windows -> media information -> codec details


Answer 6

This code extracts frames from the video and saves them in .jpg format:

import cv2
import numpy as np
import os

# set video file path of input video with name and extension
vid = cv2.VideoCapture('VideoPath')


if not os.path.exists('images'):
    os.makedirs('images')

#for frame identity
index = 0
while(True):
    # Extract images
    ret, frame = vid.read()
    # end of frames
    if not ret: 
        break
    # Saves images
    name = './images/frame' + str(index) + '.jpg'
    print ('Creating...' + name)
    cv2.imwrite(name, frame)

    # next frame
    index += 1

Answer 7

I am using Python via Anaconda’s Spyder software. Using the original code listed in the question of this thread by @Gshocked, the code does not work (the python won’t read the mp4 file). So I downloaded OpenCV 3.2 and copied “opencv_ffmpeg320.dll” and “opencv_ffmpeg320_64.dll” from the “bin” folder. I pasted both of these dll files to Anaconda’s “Dlls” folder.

Anaconda also has a “pckgs” folder…I copied and pasted the entire “OpenCV 3.2” folder that I downloaded to the Anaconda “pckgs” folder.

Finally, Anaconda has a “Library” folder which has a “bin” subfolder. I pasted the “opencv_ffmpeg320.dll” and “opencv_ffmpeg320_64.dll” files to that folder.

After closing and restarting Spyder, the code worked. I'm not sure which of the three methods worked, and I'm too lazy to go back and figure it out. But it works, so cheers!


Answer 8

This function extracts images from the video at 1 fps; in addition, it identifies the last frame and stops reading:

import cv2
import numpy as np

def extract_image_one_fps(video_source_path):

    vidcap = cv2.VideoCapture(video_source_path)
    count = 0
    success = True
    while success:
      vidcap.set(cv2.CAP_PROP_POS_MSEC,(count*1000))      
      success,image = vidcap.read()

      ## Stop when last frame is identified
      image_last = cv2.imread("frame{}.png".format(count-1))
      if np.array_equal(image,image_last):
          break

      cv2.imwrite("frame%d.png" % count, image)     # save frame as PNG file
      print('{}.sec reading a new frame: {} '.format(count,success))
      count += 1

Answer 9

The following script extracts a frame every half second from all videos in a folder. (Works on Python 3.7)

import cv2
import os
listing = os.listdir(r'D:/Images/AllVideos')
count=1
for vid in listing:
    vid = r"D:/Images/AllVideos/"+vid
    vidcap = cv2.VideoCapture(vid)
    def getFrame(sec):
        vidcap.set(cv2.CAP_PROP_POS_MSEC,sec*1000)
        hasFrames,image = vidcap.read()
        if hasFrames:
            cv2.imwrite("D:/Images/Frames/image"+str(count)+".jpg", image) # Save frame as JPG file
        return hasFrames
    sec = 0
    frameRate = 0.5 # Change this number to 1 for each 1 second
    
    success = getFrame(sec)
    while success:
        count = count + 1
        sec = sec + frameRate
        sec = round(sec, 2)
        success = getFrame(sec)

What is the difference between json.dump() and json.dumps() in Python?

Question: What is the difference between json.dump() and json.dumps() in Python?

I searched the official documentation to find the difference between json.dump() and json.dumps() in Python. It is clear that they are related to writing to files.
But what is the detailed difference between them, and in what situations does one have an advantage over the other?


Answer 0

There isn’t much else to add other than what the docs say. If you want to dump the JSON into a file/socket or whatever, then you should go with dump(). If you only need it as a string (for printing, parsing or whatever) then use dumps() (dump string)

As mentioned by Antti Haapala in this answer, there are some minor differences on the ensure_ascii behaviour. This is mostly due to how the underlying write() function works, being that it operates on chunks rather than the whole string. Check his answer for more details on that.

json.dump()

Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object)

If ensure_ascii is False, some chunks written to fp may be unicode instances

json.dumps()

Serialize obj to a JSON formatted str

If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance
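A side-by-side sketch of the two calls, writing to an io.StringIO buffer here instead of a real file:

```python
import io
import json

data = {"name": "json", "versions": [1, 2]}

# json.dumps: serialize to a str and return it
text = json.dumps(data)
print(text)  # {"name": "json", "versions": [1, 2]}

# json.dump: serialize directly into a .write()-supporting object
buf = io.StringIO()
json.dump(data, buf)
print(buf.getvalue() == text)  # True: same serialized output
```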


Answer 1

The functions with an s take string parameters. The others take file streams.


Answer 2

They differ in memory usage and speed.

When you call jsonstr = json.dumps(mydata), it first creates the complete JSON string for your data in memory, and only then do you write it to disk with file.write(jsonstr). So this is a faster method, but it can be a problem if you have a big piece of data to save.

When you call json.dump(mydata, file) — without ‘s’, new memory is not used, as the data is dumped by chunks. But the whole process is about 2 times slower.

Source: I checked the source code of json.dump() and json.dumps() and also tested both the variants measuring the time with time.time() and watching the memory usage in htop.


Answer 3

One notable difference in Python 2 is that if you’re using ensure_ascii=False, dump will properly write UTF-8 encoded data into the file (unless you used 8-bit strings with extended characters that are not UTF-8):

dumps, on the other hand, with ensure_ascii=False can produce a str or unicode depending on what types you used for your strings:

Serialize obj to a JSON formatted str using this conversion table. If ensure_ascii is False, the result may contain non-ASCII characters and the return value may be a unicode instance.

(emphasis mine). Note that it may still be a str instance as well.

Thus you cannot use its return value to save the structure into file without checking which format was returned and possibly playing with unicode.encode.

This of course is no longer a valid concern in Python 3, since the 8-bit/Unicode confusion is gone.


As for load vs loads: load considers the whole file to be one JSON document, so you cannot use it to read multiple newline-delimited JSON documents from a single file.
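For example, newline-delimited JSON can be handled by applying json.loads to each line; a minimal sketch:

```python
import json

# Two JSON documents separated by newlines: json.load would reject
# this as a whole, but per-line json.loads handles it fine.
ndjson = '{"id": 1}\n{"id": 2}\n'

records = [json.loads(line) for line in ndjson.splitlines() if line]
print(records)  # [{'id': 1}, {'id': 2}]
```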


What is good practice for checking whether an environment variable exists?

Question: What is good practice for checking whether an environment variable exists?

I want to check my environment for the existence of a variable, say "FOO", in Python. For this purpose, I am using the os standard library. After reading the library’s documentation, I have figured out 2 ways to achieve my goal:

Method 1:

if "FOO" in os.environ:
    pass

Method 2:

if os.getenv("FOO") is not None:
    pass

I would like to know which method, if either, is a good/preferred conditional and why.


Answer 0

Use the first; it directly tries to check if something is defined in environ. Though the second form works equally well, it’s lacking semantically since you get a value back if it exists and only use it for a comparison.

You're trying to see if something is present in environ; why would you get its value just to compare it and then toss it away?

That’s exactly what getenv does:

Get an environment variable, return None if it doesn’t exist. The optional second argument can specify an alternate default.

(this also means your check could just be if getenv("FOO"))

You don't want to get it; you want to check for its existence.

Either way, getenv is just a wrapper around environ.get but you don’t see people checking for membership in mappings with:

from os import environ
if environ.get('Foo') is not None:

To summarize, use:

if "FOO" in os.environ:
    pass

if you just want to check for existence; use getenv("FOO") if you actually want to do something with the value you might get.
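A quick sketch of both patterns side by side (the variable is set in-process purely for illustration):

```python
import os

os.environ["FOO"] = "1"  # set in-process for the demo

# Existence check: the membership test states the intent directly
if "FOO" in os.environ:
    print("FOO exists")

# Fetch the value (with a default) when you actually need it
value = os.getenv("FOO", "fallback")
print(value)  # 1
```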


Answer 1

There is a case for either solution, depending on what you want to do conditional on the existence of the environment variable.

Case 1

When you want to take different actions purely based on the existence of the environment variable, without caring for its value, the first solution is the best practice. It succinctly describes what you test for: is ‘FOO’ in the list of environment variables.

if 'KITTEN_ALLERGY' in os.environ:
    buy_puppy()
else:
    buy_kitten()

Case 2

When you want to set a default value if the value is not defined in the environment variables the second solution is actually useful, though not in the form you wrote it:

server = os.getenv('MY_CAT_STREAMS', 'youtube.com')

or perhaps

server = os.environ.get('MY_CAT_STREAMS', 'youtube.com')

Note that if you have several configuration sources for your application, you might want to look into ChainMap, which lets you merge multiple dicts based on keys. There is an example of this in the ChainMap documentation:

[...]
combined = ChainMap(command_line_args, os.environ, defaults)
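A runnable sketch of that lookup order (the command_line_args dict here is a hypothetical stand-in for parsed CLI options):

```python
from collections import ChainMap
import os

defaults = {"MY_CAT_STREAMS": "youtube.com", "DEBUG": "0"}
command_line_args = {"DEBUG": "1"}  # hypothetical parsed CLI options

# Lookup order: CLI args first, then the environment, then defaults
combined = ChainMap(command_line_args, os.environ, defaults)

print(combined["DEBUG"])  # "1": the command-line value wins
# MY_CAT_STREAMS falls through to defaults, unless it happens to be
# set in the environment:
print(combined["MY_CAT_STREAMS"])
```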

Answer 2

To be on the safe side use

os.getenv('FOO') or 'bar'

A corner case with the above answers is when the environment variable is set but is empty

For this special case you get

print(os.getenv('FOO', 'bar'))
# prints new line - though you expected `bar`

or

if "FOO" in os.environ:
    print("FOO is here")
# prints FOO is here - however it's not

To avoid this just use or

os.getenv('FOO') or 'bar'

Then you get

print(os.getenv('FOO') or 'bar')
# bar

When do we have empty environment variables?

You forgot to set the value in the .env file

# .env
FOO=

or exported as

$ export FOO=

or forgot to set it in settings.py

# settings.py
os.environ['FOO'] = ''

Update: if in doubt, check out these one-liners

>>> import os; os.environ['FOO'] = ''; print(os.getenv('FOO', 'bar'))

$ FOO= python -c "import os; print(os.getenv('FOO', 'bar'))"
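The whole corner case can be demonstrated in a few self-contained lines (`DEMO_FOO` is a throwaway variable name used only for this sketch):

```python
import os

os.environ['DEMO_FOO'] = ''                  # set, but empty

with_default = os.getenv('DEMO_FOO', 'bar')  # default NOT used: the key exists
with_or = os.getenv('DEMO_FOO') or 'bar'     # `or` treats '' as falsy and falls back
present = 'DEMO_FOO' in os.environ           # True, even though the value is empty

del os.environ['DEMO_FOO']                   # clean up
```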

回答 3

如果您要检查是否未设置多个环境变量,可以执行以下操作:

import os

MANDATORY_ENV_VARS = ["FOO", "BAR"]

for var in MANDATORY_ENV_VARS:
    if var not in os.environ:
        raise EnvironmentError("Failed because {} is not set.".format(var))

In case you want to check if multiple env variables are not set, you can do the following:

import os

MANDATORY_ENV_VARS = ["FOO", "BAR"]

for var in MANDATORY_ENV_VARS:
    if var not in os.environ:
        raise EnvironmentError("Failed because {} is not set.".format(var))

回答 4

我的评论可能与给定的标签无关。但是,我是从搜索中来到此页面的。我一直在寻找 R 中的类似检查方法,并在 @hugovdbeg 帖子的帮助下写出了以下代码。希望这能帮助在 R 中寻找类似解决方案的人

'USERNAME' %in% names(Sys.getenv())

My comment might not be relevant to the tags given. However, I was lead to this page from my search. I was looking for similar check in R and I came up the following with the help of @hugovdbeg post. I hope it would be helpful for someone who is looking for similar solution in R

'USERNAME' %in% names(Sys.getenv())

类方法生成“ TypeError:…为关键字参数获得了多个值……”

问题:类方法生成“ TypeError:…为关键字参数获得了多个值……”

如果我用关键字参数定义一个类方法,则:

class foo(object):
  def foodo(thing=None, thong='not underwear'):
    print thing if thing else "nothing" 
    print 'a thong is',thong

调用该方法将生成TypeError

myfoo = foo()
myfoo.foodo(thing="something")

...
TypeError: foodo() got multiple values for keyword argument 'thing'

这是怎么回事?

If I define a class method with a keyword argument thus:

class foo(object):
  def foodo(thing=None, thong='not underwear'):
    print thing if thing else "nothing" 
    print 'a thong is',thong

calling the method generates a TypeError:

myfoo = foo()
myfoo.foodo(thing="something")

...
TypeError: foodo() got multiple values for keyword argument 'thing'

What’s going on?


回答 0

问题在于,传递给 python 中类方法的第一个参数始终是对调用该方法的那个类实例的引用,通常命名为 self。如果这样声明该类:

class foo(object):
  def foodo(self, thing=None, thong='not underwear'):
    print thing if thing else "nothing" 
    print 'a thong is',thong

它的行为符合预期。

说明:

如果没有 self 作为第一个参数,那么在执行 myfoo.foodo(thing="something") 时,foodo 方法会以参数 (myfoo, thing="something") 被调用。实例 myfoo 随后被赋给 thing(因为 thing 是第一个声明的参数),但 python 还会尝试把 "something" 也赋给 thing,因此引发了 Exception。

为了演示,请尝试使用原始代码运行它:

myfoo.foodo("something")
print
print myfoo

您将输出如下:

<__main__.foo object at 0x321c290>
a thong is something

<__main__.foo object at 0x321c290>

您可以看到“事物”已被分配对类“ foo”的实例“ myfoo”的引用。文档的此部分说明了函数参数的工作原理。

The problem is that the first argument passed to class methods in python is always a reference to the class instance on which the method is called, typically labelled self. If the class is declared thus:

class foo(object):
  def foodo(self, thing=None, thong='not underwear'):
    print thing if thing else "nothing" 
    print 'a thong is',thong

it behaves as expected.

Explanation:

Without self as the first parameter, when myfoo.foodo(thing="something") is executed, the foodo method is called with arguments (myfoo, thing="something"). The instance myfoo is then assigned to thing (since thing is the first declared parameter), but python also attempts to assign "something" to thing, hence the Exception.

To demonstrate, try running this with the original code:

myfoo.foodo("something")
print
print myfoo

You’ll get output like:

<__main__.foo object at 0x321c290>
a thong is something

<__main__.foo object at 0x321c290>

You can see that ‘thing’ has been assigned a reference to the instance ‘myfoo’ of the class ‘foo’. This section of the docs explains how function arguments work a bit more.
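A minimal Python 3 sketch of the corrected class (returning values instead of printing, so the behaviour is easy to check):

```python
class Foo(object):
    def foodo(self, thing=None, thong='not underwear'):
        # `self` absorbs the instance, so `thing` only receives the keyword value
        return (thing if thing else 'nothing', thong)

result = Foo().foodo(thing='something')  # no TypeError anymore
```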


回答 1

感谢您的指导性帖子。我只想说明一下,如果您收到 “TypeError: foodo() 为关键字参数 ’thing’ 获得了多个值”,也可能是您在调用该函数时错误地把 ’self’ 作为参数传入了(可能是因为您从类声明中复制了该行——匆忙时这是一个常见错误)。

Thanks for the instructive posts. I’d just like to keep a note that if you’re getting “TypeError: foodo() got multiple values for keyword argument ‘thing'”, it may also be that you’re mistakenly passing the ‘self’ as a parameter when calling the function (probably because you copied the line from the class declaration – it’s a common error when one’s in a hurry).


回答 2

这可能很明显,但可能会对从未见过的人有所帮助。如果您错误地通过位置和名称显式地分配了参数,则对于常规函数也会发生这种情况。

>>> def foodo(thing=None, thong='not underwear'):
...     print thing if thing else "nothing"
...     print 'a thong is',thong
...
>>> foodo('something', thing='everything')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foodo() got multiple values for keyword argument 'thing'

This might be obvious, but it might help someone who has never seen it before. This also happens for regular functions if you mistakenly assign a parameter by position and explicitly by name.

>>> def foodo(thing=None, thong='not underwear'):
...     print thing if thing else "nothing"
...     print 'a thong is',thong
...
>>> foodo('something', thing='everything')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foodo() got multiple values for keyword argument 'thing'
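A small sketch of that failure mode in isolation — the call supplies `thing` twice, once by position and once by name:

```python
def foodo(thing=None, thong='not underwear'):
    return thing, thong

got_type_error = False
try:
    # 'something' lands in `thing` positionally, then thing='everything'
    # tries to assign it a second time
    foodo('something', thing='everything')
except TypeError:
    got_type_error = True
```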

回答 3

只需向功能添加“ staticmethod”装饰器即可解决问题

class foo(object):
    @staticmethod
    def foodo(thing=None, thong='not underwear'):
        print thing if thing else "nothing" 
        print 'a thong is',thong

just add ‘staticmethod’ decorator to function and problem is fixed

class foo(object):
    @staticmethod
    def foodo(thing=None, thong='not underwear'):
        print thing if thing else "nothing" 
        print 'a thong is',thong
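A quick check that the decorator really removes the implicit instance argument (a sketch returning values rather than printing, for testability):

```python
class Foo(object):
    @staticmethod
    def foodo(thing=None, thong='not underwear'):
        return (thing if thing else 'nothing', thong)

# Works both through an instance and through the class itself,
# because no `self` is passed in either case.
via_instance = Foo().foodo(thing='something')
via_class = Foo.foodo(thing='something')
```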

回答 4

我想再添加一个答案:

当您尝试在调用函数中尝试传递位置顺序错误的位置参数以及关键字参数时,就会发生这种情况。

参数(parameter)和实参(argument)之间是有区别的,您可以在此处详细了解 python 中的实参和参数

def hello(a,b=1, *args):
   print(a, b, *args)


hello(1, 2, 3, 4,a=12)

因为我们有三个参数:

a是位置参数

b = 1是关键字和默认参数

* args是可变长度参数

因此我们首先把 a 声明为位置参数,这意味着必须按位置顺序为位置参数提供值,这里顺序很重要。但我们在调用函数时把实参 1 传到了 a 的位置上,随后又以关键字参数的形式再次给 a 赋值。现在 a 有两个值:

一个是位置值:a = 1

第二个是关键字值,a = 12

解决方案

我们必须把 hello(1, 2, 3, 4, a=12) 改为 hello(1, 2, 3, 4, 12),这样 a 将只获得一个位置值 1,b 获得值 2,其余的值则归 *args(可变长度参数)所有

附加信息

如果我们希望* args应该得到2,3,4而a应该得到1和b应该得到12

那么我们可以这样做:

def hello(a, *args, b=1):
    pass

hello(1, 2, 3, 4, b=12)

还有更多:

def hello(a,*c,b=1,**kwargs):
    print(b)
    print(c)
    print(a)
    print(kwargs)

hello(1,2,1,2,8,9,c=12)

输出:

1

(2, 1, 2, 8, 9)

1

{'c': 12}

I want to add one more answer :

It happens when you try to pass positional parameter with wrong position order along with keyword argument in calling function.

There is a difference between a parameter and an argument; you can read about it in detail here: Arguments and Parameters in python

def hello(a,b=1, *args):
   print(a, b, *args)


hello(1, 2, 3, 4,a=12)

since we have three parameters :

a is positional parameter

b=1 is keyword and default parameter

*args is variable length parameter

So we first declare a as a positional parameter, which means we have to provide a value for it by position; order matters here. But we pass the argument 1 in a's place in the calling function, and then we also provide a value for a, treating it as a keyword argument. Now a has two values:

one is positional value: a=1

second is keyworded value which is a=12

Solution

We have to change hello(1, 2, 3, 4, a=12) to hello(1, 2, 3, 4, 12), so now a will get only one positional value, which is 1, b will get the value 2, and the rest of the values will go to *args (the variable-length parameter)

additional information

if we want that *args should get 2,3,4 and a should get 1 and b should get 12

then we can do it like this:

def hello(a, *args, b=1):
    pass

hello(1, 2, 3, 4, b=12)

Something more :

def hello(a,*c,b=1,**kwargs):
    print(b)
    print(c)
    print(a)
    print(kwargs)

hello(1,2,1,2,8,9,c=12)

output :

1

(2, 1, 2, 8, 9)

1

{'c': 12}
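The `def hello(a, *args, b=1)` variant described above can be sketched and checked like this (Python 3 keyword-only syntax):

```python
def hello(a, *args, b=1):
    # `a` is positional, `*args` swallows the remaining positionals,
    # and `b` can only be supplied by keyword
    return a, args, b

result = hello(1, 2, 3, 4, b=12)
```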

回答 5

如果您传递的关键字自变量的键之一与位置自变量相似(具有相同的字符串名称),则也会发生此错误。

>>> class Foo():
...     def bar(self, bar, **kwargs):
...             print(bar)
... 
>>> kwgs = {"bar":"Barred", "jokes":"Another key word argument"}
>>> myfoo = Foo()
>>> myfoo.bar("fire", **kwgs)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: bar() got multiple values for argument 'bar'
>>> 

"fire" 已被接受为 ’bar’ 参数的值。然而 kwargs 中还存在另一个 ’bar’ 参数。

您必须先将关键字参数从kwargs中删除,然后再将其传递给方法。

This error can also happen if you pass a key word argument for which one of the keys is similar (has same string name) to a positional argument.

>>> class Foo():
...     def bar(self, bar, **kwargs):
...             print(bar)
... 
>>> kwgs = {"bar":"Barred", "jokes":"Another key word argument"}
>>> myfoo = Foo()
>>> myfoo.bar("fire", **kwgs)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: bar() got multiple values for argument 'bar'
>>> 

“fire” has been accepted into the ‘bar’ argument. And yet there is another ‘bar’ argument present in kwargs.

You would have to remove the keyword argument from the kwargs before passing it to the method.
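One way to "remove the keyword argument from the kwargs" as suggested — a sketch using `dict.pop`:

```python
class Foo(object):
    def bar(self, bar, **kwargs):
        return bar, kwargs

kwgs = {"bar": "Barred", "jokes": "Another key word argument"}

# Pull the clashing key out of the dict before unpacking the rest
bar_value = kwgs.pop("bar")
result = Foo().bar(bar_value, **kwgs)  # no "multiple values" error now
```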


回答 6

如果您使用jquery ajax的URL反向到不包含’request’参数的函数,则这也可能在Django中发生

$.ajax({
  url: '{{ url_to_myfunc }}',
});


def myfunc(foo, bar):
    ...

Also this can happen in Django if you are using jquery ajax to url that reverses to a function that doesn’t contain ‘request’ parameter

$.ajax({
  url: '{{ url_to_myfunc }}',
});


def myfunc(foo, bar):
    ...

UnicodeDecodeError:’ascii’编解码器无法解码位置13的字节0xe2:序数不在范围内(128)

问题:UnicodeDecodeError:’ascii’编解码器无法解码位置13的字节0xe2:序数不在范围内(128)

我正在使用NLTK在我的文本文件中执行kmeans聚类,其中每一行都被视为文档。例如,我的文本文件是这样的:

belong finger death punch <br>
hasty <br>
mike hasty walls jericho <br>
jägermeister rules <br>
rules bands follow performing jägermeister stage <br>
approach 

现在我要运行的演示代码是这样的:

import sys

import numpy
from nltk.cluster import KMeansClusterer, GAAClusterer, euclidean_distance
import nltk.corpus
from nltk import decorators
import nltk.stem

stemmer_func = nltk.stem.EnglishStemmer().stem
stopwords = set(nltk.corpus.stopwords.words('english'))

@decorators.memoize
def normalize_word(word):
    return stemmer_func(word.lower())

def get_words(titles):
    words = set()
    for title in job_titles:
        for word in title.split():
            words.add(normalize_word(word))
    return list(words)

@decorators.memoize
def vectorspaced(title):
    title_components = [normalize_word(word) for word in title.split()]
    return numpy.array([
        word in title_components and not word in stopwords
        for word in words], numpy.short)

if __name__ == '__main__':

    filename = 'example.txt'
    if len(sys.argv) == 2:
        filename = sys.argv[1]

    with open(filename) as title_file:

        job_titles = [line.strip() for line in title_file.readlines()]

        words = get_words(job_titles)

        # cluster = KMeansClusterer(5, euclidean_distance)
        cluster = GAAClusterer(5)
        cluster.cluster([vectorspaced(title) for title in job_titles if title])

        # NOTE: This is inefficient, cluster.classify should really just be
        # called when you are classifying previously unseen examples!
        classified_examples = [
                cluster.classify(vectorspaced(title)) for title in job_titles
            ]

        for cluster_id, title in sorted(zip(classified_examples, job_titles)):
            print cluster_id, title

(也可以在这里找到)

我收到的错误是这样的:

Traceback (most recent call last):
File "cluster_example.py", line 40, in <module>
words = get_words(job_titles)
File "cluster_example.py", line 20, in get_words
words.add(normalize_word(word))
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/nltk/decorators.py", line 183, in memoize
result = func(*args)
File "cluster_example.py", line 14, in normalize_word
return stemmer_func(word.lower())
File "/usr/local/lib/python2.7/dist-packages/nltk/stem/snowball.py", line 694, in stem
word = (word.replace(u"\u2019", u"\x27")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 13: ordinal not in range(128)

这是怎么回事

I’m using NLTK to perform kmeans clustering on my text file in which each line is considered as a document. So for example, my text file is something like this:

belong finger death punch <br>
hasty <br>
mike hasty walls jericho <br>
jägermeister rules <br>
rules bands follow performing jägermeister stage <br>
approach 

Now the demo code I’m trying to run is this:

import sys

import numpy
from nltk.cluster import KMeansClusterer, GAAClusterer, euclidean_distance
import nltk.corpus
from nltk import decorators
import nltk.stem

stemmer_func = nltk.stem.EnglishStemmer().stem
stopwords = set(nltk.corpus.stopwords.words('english'))

@decorators.memoize
def normalize_word(word):
    return stemmer_func(word.lower())

def get_words(titles):
    words = set()
    for title in job_titles:
        for word in title.split():
            words.add(normalize_word(word))
    return list(words)

@decorators.memoize
def vectorspaced(title):
    title_components = [normalize_word(word) for word in title.split()]
    return numpy.array([
        word in title_components and not word in stopwords
        for word in words], numpy.short)

if __name__ == '__main__':

    filename = 'example.txt'
    if len(sys.argv) == 2:
        filename = sys.argv[1]

    with open(filename) as title_file:

        job_titles = [line.strip() for line in title_file.readlines()]

        words = get_words(job_titles)

        # cluster = KMeansClusterer(5, euclidean_distance)
        cluster = GAAClusterer(5)
        cluster.cluster([vectorspaced(title) for title in job_titles if title])

        # NOTE: This is inefficient, cluster.classify should really just be
        # called when you are classifying previously unseen examples!
        classified_examples = [
                cluster.classify(vectorspaced(title)) for title in job_titles
            ]

        for cluster_id, title in sorted(zip(classified_examples, job_titles)):
            print cluster_id, title

(which can also be found here)

The error I receive is this:

Traceback (most recent call last):
File "cluster_example.py", line 40, in <module>
words = get_words(job_titles)
File "cluster_example.py", line 20, in get_words
words.add(normalize_word(word))
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/nltk/decorators.py", line 183, in memoize
result = func(*args)
File "cluster_example.py", line 14, in normalize_word
return stemmer_func(word.lower())
File "/usr/local/lib/python2.7/dist-packages/nltk/stem/snowball.py", line 694, in stem
word = (word.replace(u"\u2019", u"\x27")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 13: ordinal not in range(128)

What is happening here?


回答 0

该文件被读为一堆str,但应该为unicode。Python尝试隐式转换,但失败。更改:

job_titles = [line.strip() for line in title_file.readlines()]

将 strs 显式解码为 unicode(此处假定为 UTF-8):

job_titles = [line.decode('utf-8').strip() for line in title_file.readlines()]

也可以通过导入 codecs 模块并使用 codecs.open(而不是内置的 open)来解决。

The file is being read as a bunch of strs, but it should be unicodes. Python tries to implicitly convert, but fails. Change:

job_titles = [line.strip() for line in title_file.readlines()]

to explicitly decode the strs to unicode (here assuming UTF-8):

job_titles = [line.decode('utf-8').strip() for line in title_file.readlines()]

It could also be solved by importing the codecs module and using codecs.open rather than the built-in open.
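The same fix in isolation — decoding bytes to text explicitly instead of letting Python fall back to ASCII (Python 3 syntax used for the sketch):

```python
# Simulate a raw line read from a UTF-8 file in binary mode
raw = 'jägermeister rules\n'.encode('utf-8')

# Explicit decode: no implicit ASCII guessing, so no UnicodeDecodeError
line = raw.decode('utf-8').strip()
```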


回答 1

这对我来说很好。

f = open(file_path, 'r+', encoding="utf-8")

您可以添加第三个参数编码,以确保编码类型为’utf-8′

注意:此方法在Python3中工作正常,我没有在Python2.7中尝试过。

This works fine for me.

f = open(file_path, 'r+', encoding="utf-8")

You can add a third parameter encoding to ensure the encoding type is ‘utf-8’

Note: this method works fine in Python3, I did not try it in Python2.7.


回答 2

对我来说,终端编码有问题。将UTF-8添加到.bashrc解决了该问题:

export LC_CTYPE=en_US.UTF-8

不要忘了之后重新加载.bashrc:

source ~/.bashrc

For me there was a problem with the terminal encoding. Adding UTF-8 to .bashrc solved the problem:

export LC_CTYPE=en_US.UTF-8

Don’t forget to reload .bashrc afterwards:

source ~/.bashrc

回答 3

您也可以尝试以下操作:

import sys
reload(sys)
sys.setdefaultencoding('utf8')

You can try this also:

import sys
reload(sys)
sys.setdefaultencoding('utf8')

回答 4

在使用 Python3.6 的 Ubuntu 18.04 上,我通过同时执行以下两项操作解决了该问题:

with open(filename, encoding="utf-8") as lines:

并且如果您以命令行方式运行该工具:

export LC_ALL=C.UTF-8

请注意,如果您使用的是Python2.7,则必须以不同的方式进行处理。首先,您必须设置默认编码:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

然后要加载文件,您必须使用 io.open 来设置编码:

import io
with io.open(filename, 'r', encoding='utf-8') as lines:

您仍然需要导出环境

export LC_ALL=C.UTF-8

When on Ubuntu 18.04 using Python3.6 I have solved the problem doing both:

with open(filename, encoding="utf-8") as lines:

and if you are running the tool as command line:

export LC_ALL=C.UTF-8

Note that if you are in Python2.7 you have do to handle this differently. First you have to set the default encoding:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

and then to load the file you must use io.open to set the encoding:

import io
with io.open(filename, 'r', encoding='utf-8') as lines:

You still need to export the env

export LC_ALL=C.UTF-8

回答 5

尝试在Docker容器中安装python软件包时出现此错误。对我来说,问题是Docker映像没有locale配置。将以下代码添加到Dockerfile中为我解决了这个问题。

# Avoid ascii errors when reading files in Python
RUN apt-get install -y \
  locales && \
  locale-gen en_US.UTF-8
ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'

I got this error when trying to install a python package in a Docker container. For me, the issue was that the docker image did not have a locale configured. Adding the following code to the Dockerfile solved the problem for me.

# Avoid ascii errors when reading files in Python
RUN apt-get install -y locales && locale-gen en_US.UTF-8
ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'

回答 6

要查找与ANY和ALL unicode错误相关的信息,请使用以下命令:

grep -r -P '[^\x00-\x7f]' /etc/apache2 /etc/letsencrypt /etc/nginx

我在这里发现了我的问题:

/etc/letsencrypt/options-ssl-nginx.conf:        # The following CSP directives don't use default-src as 

使用 shed,我找到了这段有问题的字节序列。原来是编辑器造成的错误。

00008099:     C2  194 302 11000010
00008100:     A0  160 240 10100000
00008101:  d  64  100 144 01100100
00008102:  e  65  101 145 01100101
00008103:  f  66  102 146 01100110
00008104:  a  61  097 141 01100001
00008105:  u  75  117 165 01110101
00008106:  l  6C  108 154 01101100
00008107:  t  74  116 164 01110100
00008108:  -  2D  045 055 00101101
00008109:  s  73  115 163 01110011
00008110:  r  72  114 162 01110010
00008111:  c  63  099 143 01100011
00008112:     C2  194 302 11000010
00008113:     A0  160 240 10100000

To find ANY and ALL unicode error related… Using the following command:

grep -r -P '[^\x00-\x7f]' /etc/apache2 /etc/letsencrypt /etc/nginx

Found mine in

/etc/letsencrypt/options-ssl-nginx.conf:        # The following CSP directives don't use default-src as 

Using shed, I found the offending sequence. It turned out to be an editor mistake.

00008099:     C2  194 302 11000010
00008100:     A0  160 240 10100000
00008101:  d  64  100 144 01100100
00008102:  e  65  101 145 01100101
00008103:  f  66  102 146 01100110
00008104:  a  61  097 141 01100001
00008105:  u  75  117 165 01110101
00008106:  l  6C  108 154 01101100
00008107:  t  74  116 164 01110100
00008108:  -  2D  045 055 00101101
00008109:  s  73  115 163 01110011
00008110:  r  72  114 162 01110010
00008111:  c  63  099 143 01100011
00008112:     C2  194 302 11000010
00008113:     A0  160 240 10100000

回答 7

您可以在使用job_titles字符串之前尝试以下操作:

source = unicode(job_titles, 'utf-8')

You can try this before using job_titles string:

source = unicode(job_titles, 'utf-8')

回答 8

对于 python 3,默认编码为 “utf-8”。如遇到问题,基础文档 https://docs.python.org/2/library/csv.html#csv-examples 中建议采取以下步骤:

  1. 创建一个功能

    def utf_8_encoder(unicode_csv_data):
        for line in unicode_csv_data:
            yield line.encode('utf-8')
  2. 然后使用阅读器内部的功能,例如

    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data))

For python 3, the default encoding would be “utf-8”. In case of any problem, the following steps are suggested in the base documentation: https://docs.python.org/2/library/csv.html#csv-examples

  1. Create a function

    def utf_8_encoder(unicode_csv_data):
        for line in unicode_csv_data:
            yield line.encode('utf-8')
    
  2. Then use the function inside the reader, for e.g.

    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data))
    

回答 9

python3x或更高版本

  1. 以字节流加载文件:

    body = ''
    for lines in open('website/index.html', 'rb'):
        decodedLine = lines.decode('utf-8')
        body = body + decodedLine.strip()
    return body

  2. 使用全局设置:

    import io
    import sys
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

python3x or higher

  1. load file in byte stream:

    body = ''
    for lines in open('website/index.html', 'rb'):
        decodedLine = lines.decode('utf-8')
        body = body + decodedLine.strip()
    return body

  2. use global setting:

    import io
    import sys
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')


回答 10

使用open(fn, 'rb').read().decode('utf-8')而不是仅仅open(fn).read()

Use open(fn, 'rb').read().decode('utf-8') instead of just open(fn).read()
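A self-contained sketch of this pattern using a temporary file (Python 3):

```python
import os
import tempfile

# Write some UTF-8 text, then read it back in binary mode and decode explicitly
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, 'wb') as f:
    f.write('jägermeister'.encode('utf-8'))

text = open(path, 'rb').read().decode('utf-8')
os.remove(path)
```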


在python中将stdout重定向为“ nothing”

问题:在python中将stdout重定向为“ nothing”

我有一个大型项目,其中包含相当多的模块,每个模块都会向标准输出打印一些内容。随着项目规模的扩大,大量的 print 语句向标准输出打印了很多内容,这使得程序明显变慢。

因此,我现在想在运行时决定是否将任何内容打印到标准输出。我无法对模块进行更改,因为其中有很多更改。(我知道我可以将标准输出重定向到文件,但即使这样也很慢。)

所以我的问题是如何将stdout重定向为空,即如何使print语句不执行任何操作?

# I want to do something like this.
sys.stdout = None         # this obviously will give an error as Nonetype object does not have any write method.

目前,我唯一的想法是制作一个具有write方法的类(不执行任何操作),然后将stdout重定向到该类的实例。

class DontPrint(object):
    def write(*args): pass

dp = DontPrint()
sys.stdout = dp

在python中有内置的机制吗?还是有比这更好的东西?

I have a large project consisting of sufficiently large number of modules, each printing something to the standard output. Now as the project has grown in size, there are large no. of print statements printing a lot on the std out which has made the program considerably slower.

So, I now want to decide at runtime whether or not to print anything to the stdout. I cannot make changes in the modules as there are plenty of them. (I know I can redirect the stdout to a file but even this is considerably slow.)

So my question is how do I redirect the stdout to nothing ie how do I make the print statement do nothing?

# I want to do something like this.
sys.stdout = None         # this obviously will give an error as Nonetype object does not have any write method.

Currently the only idea I have is to make a class which has a write method (which does nothing) and redirect the stdout to an instance of this class.

class DontPrint(object):
    def write(*args): pass

dp = DontPrint()
sys.stdout = dp

Is there an inbuilt mechanism in python for this? Or is there something better than this?


回答 0

跨平台:

import os
import sys
f = open(os.devnull, 'w')
sys.stdout = f

在Windows上:

f = open('nul', 'w')
sys.stdout = f

在Linux上:

f = open('/dev/null', 'w')
sys.stdout = f

Cross-platform:

import os
import sys
f = open(os.devnull, 'w')
sys.stdout = f

On Windows:

f = open('nul', 'w')
sys.stdout = f

On Linux:

f = open('/dev/null', 'w')
sys.stdout = f

回答 1

这样做的一种好方法是创建一个用于包装打印内容的小型上下文处理器。然后,您可以使用with-statement来使所有输出静音。

Python 2:

import os
import sys
from contextlib import contextmanager

@contextmanager
def silence_stdout():
    old_target = sys.stdout
    try:
        with open(os.devnull, "w") as new_target:
            sys.stdout = new_target
            yield new_target
    finally:
        sys.stdout = old_target

with silence_stdout():
    print("will not print")

print("this will print")

Python 3.4+:

Python 3.4具有内置的上下文处理器,因此您可以像这样简单地使用contextlib:

import contextlib

with contextlib.redirect_stdout(None):
    print("will not print")

print("this will print")

运行此代码仅显示输出的第二行,而不输出第一行:

$ python test.py
this will print

这可以跨平台(Windows + Linux + Mac OSX)运行,并且比其他解决方案更干净。

A nice way to do this is to create a small context processor that you wrap your prints in. You then just use is in a with-statement to silence all output.

Python 2:

import os
import sys
from contextlib import contextmanager

@contextmanager
def silence_stdout():
    old_target = sys.stdout
    try:
        with open(os.devnull, "w") as new_target:
            sys.stdout = new_target
            yield new_target
    finally:
        sys.stdout = old_target

with silence_stdout():
    print("will not print")

print("this will print")

Python 3.4+:

Python 3.4 has a context processor like this built-in, so you can simply use contextlib like this:

import contextlib

with contextlib.redirect_stdout(None):
    print("will not print")

print("this will print")

Running this code only prints the second line of output, not the first:

$ python test.py
this will print

This works cross-platform (Windows + Linux + Mac OSX), and is cleaner than the other answers imho.


回答 2

如果您使用的是python 3.4或更高版本,则可以使用标准库提供一种简单安全的解决方案:

import contextlib

with contextlib.redirect_stdout(None):
  print("This won't print!")

If you’re in python 3.4 or higher, there’s a simple and safe solution using the standard library:

import contextlib

with contextlib.redirect_stdout(None):
  print("This won't print!")
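Note that redirect_stdout accepts any file-like target, not just None — for example an io.StringIO, if you want to capture the output instead of discarding it:

```python
import contextlib
import io

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    print("hello")            # goes into buf, not the terminal

captured = buf.getvalue()     # retrieve what was printed
```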

回答 3

(至少在我的系统上)似乎写os.devnull比写DontPrint类快大约5倍,即

#!/usr/bin/python
import os
import sys
import datetime

ITER = 10000000
def printlots(out, it, st="abcdefghijklmnopqrstuvwxyz1234567890"):
   temp = sys.stdout
   sys.stdout = out
   i = 0
   start_t = datetime.datetime.now()
   while i < it:
      print st
      i = i+1
   end_t = datetime.datetime.now()
   sys.stdout = temp
   print out, "\n   took", end_t - start_t, "for", it, "iterations"

class devnull():
   def write(*args):
      pass


printlots(open(os.devnull, 'wb'), ITER)
printlots(devnull(), ITER)

给出以下输出:

<open file '/dev/null', mode 'wb' at 0x7f2b747044b0> 
   took 0:00:02.074853 for 10000000 iterations
<__main__.devnull instance at 0x7f2b746bae18> 
   took 0:00:09.933056 for 10000000 iterations

(at least on my system) it appears that writing to os.devnull is about 5x faster than writing to a DontPrint class, i.e.

#!/usr/bin/python
import os
import sys
import datetime

ITER = 10000000
def printlots(out, it, st="abcdefghijklmnopqrstuvwxyz1234567890"):
   temp = sys.stdout
   sys.stdout = out
   i = 0
   start_t = datetime.datetime.now()
   while i < it:
      print st
      i = i+1
   end_t = datetime.datetime.now()
   sys.stdout = temp
   print out, "\n   took", end_t - start_t, "for", it, "iterations"

class devnull():
   def write(*args):
      pass


printlots(open(os.devnull, 'wb'), ITER)
printlots(devnull(), ITER)

gave the following output:

<open file '/dev/null', mode 'wb' at 0x7f2b747044b0> 
   took 0:00:02.074853 for 10000000 iterations
<__main__.devnull instance at 0x7f2b746bae18> 
   took 0:00:09.933056 for 10000000 iterations

回答 4

如果您在Unix环境(包括Linux)中,则可以将输出重定向到/dev/null

python myprogram.py > /dev/null

对于Windows:

python myprogram.py > nul

If you’re in a Unix environment (Linux included), you can redirect output to /dev/null:

python myprogram.py > /dev/null

And for Windows:

python myprogram.py > nul

回答 5

这个怎么样:

from contextlib import ExitStack, redirect_stdout
import os

with ExitStack() as stack:
    if should_hide_output():
        null_stream = open(os.devnull, "w")
        stack.enter_context(null_stream)
        stack.enter_context(redirect_stdout(null_stream))
    noisy_function()

这使用了 contextlib 模块中的功能,根据 should_hide_output() 的结果来隐藏您要运行的任何命令的输出,并在该函数运行完毕后恢复输出行为。

如果您想隐藏标准错误输出,请从 contextlib 导入 redirect_stderr,并添加一行 stack.enter_context(redirect_stderr(null_stream))。

主要缺点是,这仅适用于Python 3.4和更高版本。

How about this:

from contextlib import ExitStack, redirect_stdout
import os

with ExitStack() as stack:
    if should_hide_output():
        null_stream = open(os.devnull, "w")
        stack.enter_context(null_stream)
        stack.enter_context(redirect_stdout(null_stream))
    noisy_function()

This uses the features in the contextlib module to hide the output of whatever command you are trying to run, depending on the result of should_hide_output(), and then restores the output behavior after that function is done running.

If you want to hide standard error output, then import redirect_stderr from contextlib and add a line saying stack.enter_context(redirect_stderr(null_stream)).

The main downside is that this only works in Python 3.4 and later versions.


回答 6

您的类可以正常工作(只是 write() 方法的名称需要是小写的 write())。只要确保把 sys.stdout 的副本保存在另一个变量中即可。

如果您使用的是 *NIX,则可以执行 sys.stdout = open('/dev/null'),但这不如自己编写一个类可移植。

Your class will work just fine (with the exception of the write() method name — it needs to be called write(), lowercase). Just make sure you save a copy of sys.stdout in another variable.

If you’re on a *NIX, you can do sys.stdout = open('/dev/null'), but this is less portable than rolling your own class.


回答 7

您可以直接用 mock 模拟它。

import mock

sys.stdout = mock.MagicMock()

You can just mock it.

import mock

sys.stdout = mock.MagicMock()

回答 8

你为什么不试试这个?

sys.stdout.close()
sys.stderr.close()

Why don’t you try this?

sys.stdout.close()
sys.stderr.close()

回答 9

sys.stdout = None

对 print() 的情况来说这是可以的。但如果您调用 sys.stdout 的任何方法(例如 sys.stdout.write()),则可能会导致错误。

在文档中有一个注释

在某些情况下,stdin、stdout 和 stderr 以及原始值 __stdin__、__stdout__ 和 __stderr__ 可以为 None。对于未连接到控制台的 Windows GUI 应用程序以及以 pythonw 启动的 Python 应用程序,通常是这种情况。

sys.stdout = None

It is OK for print() case. But it can cause an error if you call any method of sys.stdout, e.g. sys.stdout.write().

There is a note in docs:

Under some conditions stdin, stdout and stderr as well as the original values __stdin__, __stdout__ and __stderr__ can be None. It is usually the case for Windows GUI apps that aren’t connected to a console and Python apps started with pythonw.
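This asymmetry can be seen directly — print() tolerates a None stdout, while calling a method on it does not (a sketch that restores stdout afterwards):

```python
import sys

saved = sys.stdout
sys.stdout = None

print("silently dropped")     # no error: print() is a no-op when stdout is None

method_failed = False
try:
    sys.stdout.write("boom")  # AttributeError: None has no write()
except AttributeError:
    method_failed = True

sys.stdout = saved            # restore the real stdout
```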


回答 10

补充iFreilicht的答案 -适用于python 2和3。

import sys

class NonWritable:
    def write(self, *args, **kwargs):
        pass

class StdoutIgnore:
    def __enter__(self):
        self.stdout_saved = sys.stdout
        sys.stdout = NonWritable()
        return self

    def __exit__(self, *args):
        sys.stdout = self.stdout_saved

with StdoutIgnore():
    print("This won't print!")

Supplement to iFreilicht’s answer – it works for both python 2 & 3.

import sys

class NonWritable:
    def write(self, *args, **kwargs):
        pass

class StdoutIgnore:
    def __enter__(self):
        self.stdout_saved = sys.stdout
        sys.stdout = NonWritable()
        return self

    def __exit__(self, *args):
        sys.stdout = self.stdout_saved

with StdoutIgnore():
    print("This won't print!")

如果存在列表索引,请执行X

问题:如果存在列表索引,请执行X

在我的程序中,用户输入number n,然后输入n字符串数,这些字符串存储在列表中。

我需要进行编码,以便如果存在某个列表索引,然后运行一个函数。

我已经嵌套了if语句,这使情况变得更加复杂len(my_list)

这是我现在所拥有的简化版本,无法使用:

n = input ("Define number of actors: ")

count = 0

nams = []

while count < n:
    count = count + 1
    print "Define name for actor ", count, ":"
    name = raw_input ()
    nams.append(name)

if nams[2]: #I am trying to say 'if nams[2] exists, do something depending on len(nams)
    if len(nams) > 3:
        do_something
    if len(nams) > 4
        do_something_else

if nams[3]: #etc.

In my program, user inputs number n, and then inputs n number of strings, which get stored in a list.

I need to code such that if a certain list index exists, then run a function.

This is made more complicated by the fact that I have nested if statements about len(my_list).

Here’s a simplified version of what I have now, which isn’t working:

n = input ("Define number of actors: ")

count = 0

nams = []

while count < n:
    count = count + 1
    print "Define name for actor ", count, ":"
    name = raw_input ()
    nams.append(name)

if nams[2]: #I am trying to say 'if nams[2] exists, do something depending on len(nams)
    if len(nams) > 3:
        do_something
    if len(nams) > 4
        do_something_else

if nams[3]: #etc.

回答 0

使用列表的长度 len(n) 来辅助您的决策,而不是针对每个可能的长度去检查 n[i],这样是否对您更有用?

Could it be more useful for you to use the length of the list len(n) to inform your decision rather than checking n[i] for each possible length?


回答 1

我需要进行编码,以便如果存在某个列表索引,然后运行一个函数。

这是try块的完美用法:

ar=[1,2,3]

try:
    t=ar[5]
except IndexError:
    print('sorry, no 5')   

# Note: this only is a valid test in this context 
# with absolute (ie, positive) index
# a relative index is only showing you that a value can be returned
# from that relative index from the end of the list...

但是,根据定义,Python列表中0到len(the_list)-1之间的所有项都存在(即,只要已知0 <= index < len(the_list),就无需try)。

如果您想要0到最后一个元素之间的索引,可以使用enumerate:

names=['barney','fred','dino']

for i, name in enumerate(names):
    print(str(i) + ' ' + name)
    if i in (3,4):
        pass  # do your thing with the index 'i' or value 'name' for each item...

如果您正在寻找某种明确的“索引”概念,那么我认为您问错了问题。也许您应该考虑使用映射容器(例如dict)而不是序列容器(例如列表)。您可以这样重写代码:

def do_something(name):
    print('some thing 1 done with ' + name)

def do_something_else(name):
    print('something 2 done with ' + name)        

def default(name):
    print('nothing done with ' + name)     

something_to_do={  
    3: do_something,        
    4: do_something_else
    }        

n = input ("Define number of actors: ")
count = 0
names = []

for count in range(n):
    print("Define name for actor {}:".format(count+1))
    name = raw_input ()
    names.append(name)

for name in names:
    try:
        something_to_do[len(name)](name)
    except KeyError:
        default(name)

像这样运行:

Define number of actors: 3
Define name for actor 1: bob
Define name for actor 2: tony
Define name for actor 3: alice
some thing 1 done with bob
something 2 done with tony
nothing done with alice

您也可以使用.get方法而不是try / except来获得较短的版本:

>>> something_to_do.get(3, default)('bob')
some thing 1 done with bob
>>> something_to_do.get(22, default)('alice')
nothing done with alice

I need to code such that if a certain list index exists, then run a function.

This is the perfect use for a try block:

ar=[1,2,3]

try:
    t=ar[5]
except IndexError:
    print('sorry, no 5')   

# Note: this only is a valid test in this context 
# with absolute (ie, positive) index
# a relative index is only showing you that a value can be returned
# from that relative index from the end of the list...

However, by definition, all items in a Python list between 0 and len(the_list)-1 exist (i.e., there is no need for a try, except if you know 0 <= index < len(the_list)).

You can use enumerate if you want the indexes between 0 and the last element:

names=['barney','fred','dino']

for i, name in enumerate(names):
    print(str(i) + ' ' + name)
    if i in (3,4):
        pass  # do your thing with the index 'i' or value 'name' for each item...

If you are looking for some defined ‘index’ thought, I think you are asking the wrong question. Perhaps you should consider using a mapping container (such as a dict) versus a sequence container (such as a list). You could rewrite your code like this:

def do_something(name):
    print('some thing 1 done with ' + name)

def do_something_else(name):
    print('something 2 done with ' + name)        

def default(name):
    print('nothing done with ' + name)     

something_to_do={  
    3: do_something,        
    4: do_something_else
    }        

n = input ("Define number of actors: ")
count = 0
names = []

for count in range(n):
    print("Define name for actor {}:".format(count+1))
    name = raw_input ()
    names.append(name)

for name in names:
    try:
        something_to_do[len(name)](name)
    except KeyError:
        default(name)

Runs like this:

Define number of actors: 3
Define name for actor 1: bob
Define name for actor 2: tony
Define name for actor 3: alice
some thing 1 done with bob
something 2 done with tony
nothing done with alice

You can also use .get method rather than try/except for a shorter version:

>>> something_to_do.get(3, default)('bob')
some thing 1 done with bob
>>> something_to_do.get(22, default)('alice')
nothing done with alice

回答 2

在您的代码中,len(nams)应该等于n。所有满足0 <= i < n的索引都“存在”。

len(nams) should be equal to n in your code. All indexes 0 <= i < n “exist”.


回答 3

只需使用以下代码即可完成:

if index < len(my_list):
    print(index, 'exists in the list')
else:
    print(index, "doesn't exist in the list")

It can be done simply using the following code:

if index < len(my_list):
    print(index, 'exists in the list')
else:
    print(index, "doesn't exist in the list")

回答 4

我需要编写代码,使得如果某个列表索引存在,就运行一个函数。

您已经知道如何对此进行测试,并且实际上已经在代码中执行了这种测试

对于长度为n的列表,有效索引为0到n-1(含两端)。

因此,当且仅当列表的长度至少为i + 1时,列表才具有索引i。

I need to code such that if a certain list index exists, then run a function.

You already know how to test for this and in fact are already performing such tests in your code.

The valid indices for a list of length n are 0 through n-1 inclusive.

Thus, a list has an index i if and only if the length of the list is at least i + 1.


回答 5

使用列表的长度是检查索引是否存在的最快解决方案:

def index_exists(ls, i):
    return (0 <= i < len(ls)) or (-len(ls) <= i < 0)

这还会测试负索引,并且适用于大多数具有长度的序列类型(如range和str)。

如果您之后无论如何都需要访问该索引处的项目,那么“请求宽恕比请求许可更容易”(EAFP),而且这样做也更快、更符合Python风格。使用try: except:

try:
    item = ls[i]
    # Do something with item
except IndexError:
    # Do something without the item

这与以下情况相反:

if index_exists(ls, i):
    item = ls[i]
    # Do something with item
else:
    # Do something without the item

Using the length of the list would be the fastest solution to check if an index exists:

def index_exists(ls, i):
    return (0 <= i < len(ls)) or (-len(ls) <= i < 0)

This also tests for negative indices, and most sequence types (Like ranges and strs) that have a length.

If you need to access the item at that index afterwards anyways, it is easier to ask forgiveness than permission, and it is also faster and more Pythonic. Use try: except:.

try:
    item = ls[i]
    # Do something with item
except IndexError:
    # Do something without the item

This would be as opposed to:

if index_exists(ls, i):
    item = ls[i]
    # Do something with item
else:
    # Do something without the item

回答 6

如果要迭代插入的actor数据:

for i in range(n):
    if len(nams[i]) > 3:
        do_something
    if len(nams[i]) > 4:
        do_something_else

If you want to iterate the inserted actors data:

for i in range(n):
    if len(nams[i]) > 3:
        do_something
    if len(nams[i]) > 4:
        do_something_else

回答 7

好的,所以我认为这实际上是可能的(仅作论证之用):

>>> your_list = [5,6,7]
>>> 2 in zip(*enumerate(your_list))[0]
True
>>> 3 in zip(*enumerate(your_list))[0]
False

ok, so I think it’s actually possible (for the sake of argument):

>>> your_list = [5,6,7]
>>> 2 in zip(*enumerate(your_list))[0]
True
>>> 3 in zip(*enumerate(your_list))[0]
False
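
As a side note, the subscripted zip(...) above only works in Python 2, where zip returns a list; a rough Python 3 equivalent (still just for the sake of argument) is:

```python
your_list = [5, 6, 7]

# enumerate yields (index, value) pairs; dict() keeps the indexes as keys,
# so a membership test on the dict checks whether the index exists.
print(2 in dict(enumerate(your_list)))  # True
print(3 in dict(enumerate(your_list)))  # False
```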

回答 8

您可以尝试这样的事情

list = ["a", "b", "C", "d", "e", "f", "r"]

for i in range(0, len(list), 2):
    print list[i]
    if len(list) % 2 == 1 and  i == len(list)-1:
        break
    print list[i+1];

You can try something like this

list = ["a", "b", "C", "d", "e", "f", "r"]

for i in range(0, len(list), 2):
    print list[i]
    if len(list) % 2 == 1 and  i == len(list)-1:
        break
    print list[i+1];

回答 9

Oneliner:

do_X() if len(your_list) > your_index else do_something_else()  

完整示例:

In [10]: def do_X(): 
    ...:     print(1) 
    ...:                                                                                                                                                                                                                                      

In [11]: def do_something_else(): 
    ...:     print(2) 
    ...:                                                                                                                                                                                                                                      

In [12]: your_index = 2                                                                                                                                                                                                                       

In [13]: your_list = [1,2,3]                                                                                                                                                                                                                  

In [14]: do_X() if len(your_list) > your_index else do_something_else()                                                                                                                                                                      
1

仅用于信息。恕我直言,try ... except IndexError是更好的解决方案。

Oneliner:

do_X() if len(your_list) > your_index else do_something_else()  

Full example:

In [10]: def do_X(): 
    ...:     print(1) 
    ...:                                                                                                                                                                                                                                      

In [11]: def do_something_else(): 
    ...:     print(2) 
    ...:                                                                                                                                                                                                                                      

In [12]: your_index = 2                                                                                                                                                                                                                       

In [13]: your_list = [1,2,3]                                                                                                                                                                                                                  

In [14]: do_X() if len(your_list) > your_index else do_something_else()                                                                                                                                                                      
1

Just for info. Imho, try ... except IndexError is better solution.


回答 10

不要在括号前留任何空格。

例:

n = input ()
         ^

提示:您应该在代码上方和/或下方添加注释,而不是写在代码后面。


祝你今天愉快。

Do not let any space in front of your brackets.

Example:

n = input ()
         ^

Tip: You should add comments over and/or under your code. Not behind your code.


Have a nice day.


回答 11

很多答案,而不是简单的答案。

要检查字典dict是否存在索引“ id”:

dic = {}
dic['name'] = "joao"
dic['age']  = "39"

'age' in dic  # -> True

如果存在“年龄”,则返回true。

A lot of answers, not the simple one.

To check if a index ‘id’ exists at dictionary dict:

dic = {}
dic['name'] = "joao"
dic['age']  = "39"

'age' in dic  # -> True

returns true if ‘age’ exists.


sklearn错误ValueError:输入包含NaN,无穷大或对于dtype(’float64’)而言太大的值

问题:sklearn错误ValueError:输入包含NaN,无穷大或对于dtype(’float64’)而言太大的值

我正在使用sklearn,并且亲和力传播存在问题。我建立了一个输入矩阵,并且不断收到以下错误。

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

我跑了

np.isnan(mat.any()) #and gets False
np.isfinite(mat.all()) #and gets True

我尝试使用

mat[np.isfinite(mat) == True] = 0

删除无限值,但这也不起作用。我该怎么做才能摆脱矩阵中的无限值,以便可以使用亲和力传播算法?

我正在使用anaconda和python 2.7.9。

I am using sklearn and having a problem with the affinity propagation. I have built an input matrix and I keep getting the following error.

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

I have run

np.isnan(mat.any()) #and gets False
np.isfinite(mat.all()) #and gets True

I tried using

mat[np.isfinite(mat) == True] = 0

to remove the infinite values but this did not work either. What can I do to get rid of the infinite values in my matrix, so that I can use the affinity propagation algorithm?

I am using anaconda and python 2.7.9.


回答 0

这可能发生在scikit内部,具体取决于您在做什么。我建议您阅读所用函数的文档。例如,您可能正在使用一个要求矩阵为正定的方法,而您的矩阵不满足该条件。

编辑:我怎么会错过:

np.isnan(mat.any()) #and gets False
np.isfinite(mat.all()) #and gets True

显然是错误的。正确的是:

np.any(np.isnan(mat))

np.all(np.isfinite(mat))

您想检查的是是否有任何元素是NaN,而不是any函数的返回值是否为数字…

This might happen inside scikit, and it depends on what you’re doing. I recommend reading the documentation for the functions you’re using. You might be using one which depends e.g. on your matrix being positive definite and not fulfilling that criteria.

EDIT: How could I miss that:

np.isnan(mat.any()) #and gets False
np.isfinite(mat.all()) #and gets True

is obviously wrong. Right would be:

np.any(np.isnan(mat))

and

np.all(np.isfinite(mat))

You want to check whether any of the elements is NaN, and not whether the return value of the any function is a number…
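
The difference can be seen on a small array containing a NaN (a minimal sketch):

```python
import numpy as np

mat = np.array([1.0, np.nan, 3.0])

# The original order reduces the array to a single bool first,
# so isnan/isfinite only ever see that bool, not the elements:
print(np.isnan(mat.any()))       # False
print(np.isfinite(mat.all()))    # True

# The corrected order inspects every element:
print(np.any(np.isnan(mat)))     # True
print(np.all(np.isfinite(mat)))  # False
```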


回答 1

将sklearn与pandas一起使用时,我得到了相同的错误消息。我的解决方案是在运行任何sklearn代码之前重置数据帧df的索引:

df = df.reset_index()

当我从df中删除某些条目后,我多次遇到此问题,例如

df = df[df.label=='desired_one']

I got the same error message when using sklearn with pandas. My solution is to reset the index of my dataframe df before running any sklearn code:

df = df.reset_index()

I encountered this issue many times when I removed some entries in my df, such as

df = df[df.label=='desired_one']
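
A minimal sketch of why this helps (column names are made up): after filtering, the frame keeps its old, gapped index, and mixing it with freshly indexed data introduces NaN through alignment:

```python
import pandas as pd

df = pd.DataFrame({"label": ["a", "b", "a"], "x": [1.0, 2.0, 3.0]})
df = df[df.label == "a"]           # index is now [0, 2], with a gap

other = pd.Series([10.0, 20.0])    # fresh index [0, 1]
print((df["x"] + other).isna().any())   # True -- misalignment creates NaN

# drop=True is a variant of the answer's reset_index(): it discards
# the old index instead of keeping it as a column.
df = df.reset_index(drop=True)
print((df["x"] + other).isna().any())   # False
```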

回答 2

这是我的函数(基于此),用于清理数据集中的nan、Inf和缺失单元格(适用于偏斜数据集):

import pandas as pd

def clean_dataset(df):
    assert isinstance(df, pd.DataFrame), "df needs to be a pd.DataFrame"
    df.dropna(inplace=True)
    indices_to_keep = ~df.isin([np.nan, np.inf, -np.inf]).any(1)
    return df[indices_to_keep].astype(np.float64)

This is my function (based on this) to clean the dataset of nan, Inf, and missing cells (for skewed datasets):

import pandas as pd

def clean_dataset(df):
    assert isinstance(df, pd.DataFrame), "df needs to be a pd.DataFrame"
    df.dropna(inplace=True)
    indices_to_keep = ~df.isin([np.nan, np.inf, -np.inf]).any(1)
    return df[indices_to_keep].astype(np.float64)

回答 3

输入数组的维度倾斜,因为我的输入csv有空白。

The Dimensions of my input array were skewed, as my input csv had empty spaces.


回答 4

这是失败的检查:

其内容为:

def _assert_all_finite(X):
    """Like assert_all_finite, but only for ndarray."""
    X = np.asanyarray(X)
    # First try an O(n) time, O(1) space solution for the common case that
    # everything is finite; fall back to O(n) space np.isfinite to prevent
    # false positives from overflow in sum method.
    if (X.dtype.char in np.typecodes['AllFloat'] and not np.isfinite(X.sum())
            and not np.isfinite(X).all()):
        raise ValueError("Input contains NaN, infinity"
                         " or a value too large for %r." % X.dtype)

因此,请确保输入中没有NaN值,并且所有这些值实际上都是浮点值。任何值也都不应是Inf。

This is the check on which it fails:

Which says

def _assert_all_finite(X):
    """Like assert_all_finite, but only for ndarray."""
    X = np.asanyarray(X)
    # First try an O(n) time, O(1) space solution for the common case that
    # everything is finite; fall back to O(n) space np.isfinite to prevent
    # false positives from overflow in sum method.
    if (X.dtype.char in np.typecodes['AllFloat'] and not np.isfinite(X.sum())
            and not np.isfinite(X).all()):
        raise ValueError("Input contains NaN, infinity"
                         " or a value too large for %r." % X.dtype)

So make sure that you have no NaN values in your input, and that all those values are actually float values. None of the values should be Inf either.


回答 5

使用此版本的python 3:

/opt/anaconda3/bin/python --version
Python 3.6.0 :: Anaconda 4.3.0 (64-bit)

查看错误的详细信息,我发现导致失败的代码行:

/opt/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in _assert_all_finite(X)
     56             and not np.isfinite(X).all()):
     57         raise ValueError("Input contains NaN, infinity"
---> 58                          " or a value too large for %r." % X.dtype)
     59 
     60 

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

由此,我得以用错误消息中给出的同一个失败检查来测试我的数据到底出了什么问题:np.isfinite(X)

然后通过快速而肮脏的循环,我发现我的数据确实包含nans

print(p[:,0].shape)
index = 0
for i in p[:,0]:
    if not np.isfinite(i):
        print(index, i)
    index +=1

(367340,)
4454 nan
6940 nan
10868 nan
12753 nan
14855 nan
15678 nan
24954 nan
30251 nan
31108 nan
51455 nan
59055 nan
...

现在,我要做的就是删除这些索引中的值。

With this version of python 3:

/opt/anaconda3/bin/python --version
Python 3.6.0 :: Anaconda 4.3.0 (64-bit)

Looking at the details of the error, I found the lines of codes causing the failure:

/opt/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in _assert_all_finite(X)
     56             and not np.isfinite(X).all()):
     57         raise ValueError("Input contains NaN, infinity"
---> 58                          " or a value too large for %r." % X.dtype)
     59 
     60 

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

From this, I was able to extract the correct way to test what was going on with my data using the same test which fails given by the error message: np.isfinite(X)

Then with a quick and dirty loop, I was able to find that my data indeed contains nans:

print(p[:,0].shape)
index = 0
for i in p[:,0]:
    if not np.isfinite(i):
        print(index, i)
    index +=1

(367340,)
4454 nan
6940 nan
10868 nan
12753 nan
14855 nan
15678 nan
24954 nan
30251 nan
31108 nan
51455 nan
59055 nan
...

Now all I have to do is remove the values at these indexes.


回答 6

尝试选择行的子集后出现错误:

df = df.reindex(index=my_index)

原来my_index包含了一些不在df.index中的值,因此reindex函数插入了一些新行并用nan填充了它们。

I had the error after trying to select a subset of rows:

df = df.reindex(index=my_index)

Turns out that my_index contained values that were not contained in df.index, so the reindex function inserted some new rows and filled them with nan.


回答 7

在大多数情况下,消除无穷和空值可以解决此问题。

摆脱无限的价值。

df.replace([np.inf, -np.inf], np.nan, inplace=True)

以您喜欢的方式消除空值,特定值(例如999),均值,或创建自己的函数来估算缺失值

df.fillna(999, inplace=True)

In most cases getting rid of infinite and null values solve this problem.

get rid of infinite values.

df.replace([np.inf, -np.inf], np.nan, inplace=True)

get rid of null values the way you like, specific value such as 999, mean, or create your own function to impute missing values

df.fillna(999, inplace=True)
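
Putting the two steps together on a toy column (with 999 as the sentinel, as above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.inf, -np.inf, np.nan, 5.0]})

df.replace([np.inf, -np.inf], np.nan, inplace=True)  # infinities -> NaN
df.fillna(999, inplace=True)                          # NaN -> sentinel

print(df["x"].tolist())  # [1.0, 999.0, 999.0, 999.0, 5.0]
```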

回答 8

我有同样的错误,在我的案例中,X和y是数据帧,因此我必须先将它们转换为矩阵:

X = X.values.astype(np.float)
y = y.values.astype(np.float)

编辑:最初建议的X.as_matrix()已被弃用

I had the same error, and in my case X and y were dataframes so I had to convert them to matrices first:

X = X.values.astype(np.float)
y = y.values.astype(np.float)

Edit: The originally suggested X.as_matrix() is Deprecated


回答 9

我遇到了同样的错误。在进行任何替换、置换等操作之前先执行df.fillna(-99999, inplace=True),问题就解决了。

i got the same error. it worked with df.fillna(-99999, inplace=True) before doing any replacement, substitution etc


回答 10

就我而言,问题是许多scikit函数返回的numpy数组没有熊猫索引。因此,当我使用这些numpy数组构建新的DataFrames,然后尝试将它们与原始数据混合时,索引不匹配。

In my case the problem was that many scikit functions return numpy arrays, which are devoid of pandas index. So there was an index mismatch when I used those numpy arrays to build new DataFrames and then I tried to mix them with the original data.
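
A minimal illustration with hypothetical data: passing index=df.index when rebuilding a DataFrame from a plain numpy array keeps the rows aligned with the original frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=[10, 11, 12])
arr = df[["x"]].to_numpy() * 2      # plain ndarray: the pandas index is lost

bad = pd.DataFrame(arr, columns=["x2"])                   # index [0, 1, 2]
print((df["x"] + bad["x2"]).isna().all())                 # True -- nothing lines up

good = pd.DataFrame(arr, columns=["x2"], index=df.index)  # index preserved
print((df["x"] + good["x2"]).tolist())                    # [3.0, 6.0, 9.0]
```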


回答 11

删除所有无限值:

(并用该列的min或max代替)

# find min and max values for each column, ignoring nan, -inf, and inf
mins = [np.nanmin(matrix[:, i][matrix[:, i] != -np.inf]) for i in range(matrix.shape[1])]
maxs = [np.nanmax(matrix[:, i][matrix[:, i] != np.inf]) for i in range(matrix.shape[1])]

# go through matrix one column at a time and replace  + and -infinity 
# with the max or min for that column
for i in range(matrix.shape[1]):
    matrix[:, i][matrix[:, i] == -np.inf] = mins[i]
    matrix[:, i][matrix[:, i] == np.inf] = maxs[i]

Remove all infinite values:

(and replace with min or max for that column)

# find min and max values for each column, ignoring nan, -inf, and inf
mins = [np.nanmin(matrix[:, i][matrix[:, i] != -np.inf]) for i in range(matrix.shape[1])]
maxs = [np.nanmax(matrix[:, i][matrix[:, i] != np.inf]) for i in range(matrix.shape[1])]

# go through matrix one column at a time and replace  + and -infinity 
# with the max or min for that column
for i in range(matrix.shape[1]):
    matrix[:, i][matrix[:, i] == -np.inf] = mins[i]
    matrix[:, i][matrix[:, i] == np.inf] = maxs[i]

回答 12

尝试

mat.sum()

如果您的数据总和为无穷大(即大于最大浮点值3.402823e+38),则会收到该错误。

请参阅scikit源代码中validation.py中的_assert_all_finite函数:

if is_float and np.isfinite(X.sum()):
    pass
elif is_float:
    msg_err = "Input contains {} or a value too large for {!r}."
    if (allow_nan and np.isinf(X).any() or
            not allow_nan and not np.isfinite(X).all()):
        type_err = 'infinity' if allow_nan else 'NaN, infinity'
        # print(X.sum())
        raise ValueError(msg_err.format(type_err, X.dtype))

try

mat.sum()

If the sum of your data is infinity (greater than the max float value, which is 3.402823e+38) you will get that error.

see the _assert_all_finite function in validation.py from the scikit source code:

if is_float and np.isfinite(X.sum()):
    pass
elif is_float:
    msg_err = "Input contains {} or a value too large for {!r}."
    if (allow_nan and np.isinf(X).any() or
            not allow_nan and not np.isfinite(X).all()):
        type_err = 'infinity' if allow_nan else 'NaN, infinity'
        # print(X.sum())
        raise ValueError(msg_err.format(type_err, X.dtype))
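
A sketch of why the sum check alone can mislead: every element can be finite while the sum overflows, which is exactly why the validation code falls back to the elementwise np.isfinite check:

```python
import numpy as np

big = np.full(3, np.finfo(np.float64).max)  # three finite elements

print(np.isfinite(big).all())   # True  -- elementwise check passes
print(np.isfinite(big.sum()))   # False -- the sum overflows to inf
```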

如何使用python获取文件夹中的最新文件

问题:如何使用python获取文件夹中的最新文件

我需要使用python获取文件夹的最新文件。使用代码时:

max(files, key = os.path.getctime)

我收到以下错误:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'

I need to get the latest file of a folder using python. While using the code:

max(files, key = os.path.getctime)

I am getting the below error:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'


回答 0

分配给files变量的任何内容均不正确。使用以下代码。

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print latest_file

Whatever is assigned to the files variable is incorrect. Use the following code.

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print latest_file

回答 1

max(files, key = os.path.getctime)

是一段很不完整的代码。files是什么?它可能是一个来自os.listdir()的文件名列表。

但是该列表仅包含文件名部分(也称为“基本名称”),因为它们的路径是相同的。为了正确使用它,您必须将其与对应的路径(即用来获得该列表的路径)结合起来。

如(未测试):

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)
max(files, key = os.path.getctime)

is quite incomplete code. What is files? It probably is a list of file names, coming out of os.listdir().

But this list lists only the filename parts (a. k. a. “basenames”), because their path is common. In order to use it correctly, you have to combine it with the path leading to it (and used to obtain it).

Such as (untested):

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)
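
The untested helper above can be exercised in a throwaway directory like this:

```python
import os
import tempfile

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)

# Quick self-check: create two files and ask for the newest one.
d = tempfile.mkdtemp()
for name in ("first.txt", "second.txt"):
    open(os.path.join(d, name), "w").close()

# Which one wins can depend on ctime resolution; both are valid candidates.
print(os.path.basename(newest(d)))
```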

回答 2

我建议使用glob.iglob()代替glob.glob(),因为它效率更高。

glob.iglob()返回一个迭代器,该迭代器产生的值与glob()相同,而实际上并没有同时存储所有值。

意思是 glob.iglob()效率更高。

我主要使用以下代码查找与我的模式匹配的最新文件:

LatestFile = max(glob.iglob(fileNamePattern),key=os.path.getctime)


注意:max函数有多种变体。在查找最新文件时,我们使用以下变体:max(iterable, *[, key, default])

它需要一个可迭代对象,因此第一个参数应该是可迭代的。在查找若干数字的最大值时,可以使用以下变体:max(num1, num2, num3, *args[, key])

I would suggest using glob.iglob() instead of the glob.glob(), as it is more efficient.

glob.iglob() Return an iterator which yields the same values as glob() without actually storing them all simultaneously.

Which means glob.iglob() will be more efficient.

I mostly use below code to find the latest file matching to my pattern:

LatestFile = max(glob.iglob(fileNamePattern),key=os.path.getctime)


NOTE: There are variants of max function, In case of finding the latest file we will be using below variant: max(iterable, *[, key, default])

which needs an iterable, so your first parameter should be iterable. In case of finding the max of numbers we can use the below variant: max(num1, num2, num3, *args[, key])
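
One detail worth hedging: if the pattern matches nothing, max() raises ValueError on an empty iterator. On Python 3.4+ the default keyword avoids that (the path here is a placeholder):

```python
import glob
import os

# default=None is returned when the iterator is empty, instead of raising.
latest = max(glob.iglob("/path/to/folder/*.csv"),
             key=os.path.getctime, default=None)
if latest is None:
    print("no matching files")
```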


回答 3

尝试按创建时间对条目排序。以下示例对文件夹中的文件进行排序,并获取第一个元素,即最新的文件。

import glob
import os

files_path = os.path.join(folder, '*')
files = sorted(
    glob.iglob(files_path), key=os.path.getctime, reverse=True) 
print files[0]

Try to sort items by creation time. Example below sorts files in a folder and gets first element which is latest.

import glob
import os

files_path = os.path.join(folder, '*')
files = sorted(
    glob.iglob(files_path), key=os.path.getctime, reverse=True) 
print files[0]

回答 4

我没有足够的声誉来发表评论,但是Marlon Abeykoons回答中的ctime并没有给我正确的结果。改用mtime就可以了(key=os.path.getmtime)。

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getmtime)
print latest_file

对于该问题,我找到了两个相关的回答:

- python os.path.getctime max不返回最新文件
- python中getmtime()和getctime()在unix系统上的区别

I lack the reputation to comment but ctime from Marlon Abeykoons response did not give the correct result for me. Using mtime does the trick though. (key=os.path.getmtime))

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getmtime)
print latest_file

I found two answers for that problem:

- python os.path.getctime max does not return latest
- Difference between python getmtime() and getctime() in unix system


回答 5

(编辑以改善答案)

首先定义一个函数get_latest_file

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, paths)
    ...
get_latest_file('example', 'files','randomtext011.*.txt')

您也可以使用文档字符串!

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

如果使用Python 3,则可以改用iglob

完成代码以返回最新文件的名称:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # You may use iglob in Python3
    if not files:                # I prefer using the negation
        return None                      # because it behaves like a shortcut
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename

(Edited to improve answer)

First define a function get_latest_file

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...
get_latest_file('example', 'files','randomtext011.*.txt')

You may also use a docstring !

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

If you use Python 3, you can use iglob instead.

Complete code to return the name of latest file:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # You may use iglob in Python3
    if not files:                # I prefer using the negation
        return None                      # because it behaves like a shortcut
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename

回答 6

我尝试使用以上建议,但程序崩溃了。后来我发现,我想识别的那个文件正在被使用,在对它调用os.path.getctime时程序就崩溃了。最终对我有用的是:

    files_before = glob.glob(os.path.join(my_path,'*'))
    **code where new file is created**
    new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path,'*'))))

此代码获取两个文件列表集合之间互不相同的对象。这不是最优雅的方法,而且如果同时创建了多个文件,它可能不稳定。

I tried to use the above suggestions and my program crashed. Then I figured out that the file I was trying to identify was in use, and it crashed when trying to call os.path.getctime on it. What finally worked for me was:

    files_before = glob.glob(os.path.join(my_path,'*'))
    **code where new file is created**
    new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path,'*'))))

This code gets the uncommon object between the two sets of file lists. It is not the most elegant, and if multiple files are created at the same time it probably won't be stable.


回答 7

在Windows(0.05s)上更快的方法是,调用执行此操作的bat脚本:

get_latest.bat

@echo off
for /f %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
%LAST%

其中\\directory\in\question是您要检查的目录。

get_latest.py

from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)

如果找到了文件,stdout就是其路径,stderr则为None。

使用stdout.decode("utf-8").rstrip()来获取文件名的可用字符串表示形式。

A much faster method on windows (0.05s), call a bat script that does this:

get_latest.bat

@echo off
for /f %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
%LAST%

where \\directory\in\question is the directory you want to investigate.

get_latest.py

from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)

if it finds a file stdout is the path and stderr is None.

Use stdout.decode("utf-8").rstrip() to get the usable string representation of the file name.


回答 8

我在Python 3中一直在使用它,包括在文件名上进行模式匹配。

from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)

I’ve been using this in Python 3, including pattern matching on the filename.

from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)