Python 实用宝典

Question 1

What is the difference between:

some_list1 = []
some_list1.append("something")

and

some_list2 = []
some_list2 += ["something"]

Question 2

In your case the only difference is performance: append is about twice as fast.

Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.Timer('s.append("something")', 's = []').timeit()
0.20177424499999999
>>> timeit.Timer('s += ["something"]', 's = []').timeit()
0.41192320500000079

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.Timer('s.append("something")', 's = []').timeit()
0.23079359499999999
>>> timeit.Timer('s += ["something"]', 's = []').timeit()
0.44208112500000141

In the general case, append adds one item to the list, while += copies all elements of the right-hand-side list into the left-hand-side list.

Update: perf analysis

Comparing the bytecode, we can assume that the append version spends its cycles in LOAD_ATTR + CALL_FUNCTION, and the += version in BUILD_LIST. Apparently BUILD_LIST outweighs LOAD_ATTR + CALL_FUNCTION.

>>> import dis
>>> dis.dis(compile("s = []; s.append('spam')", '', 'exec'))
  1           0 BUILD_LIST               0
              3 STORE_NAME               0 (s)
              6 LOAD_NAME                0 (s)
              9 LOAD_ATTR                1 (append)
             12 LOAD_CONST               0 ('spam')
             15 CALL_FUNCTION            1
             18 POP_TOP
             19 LOAD_CONST               1 (None)
             22 RETURN_VALUE
>>> dis.dis(compile("s = []; s += ['spam']", '', 'exec'))
  1           0 BUILD_LIST               0
              3 STORE_NAME               0 (s)
              6 LOAD_NAME                0 (s)
              9 LOAD_CONST               0 ('spam')
             12 BUILD_LIST               1
             15 INPLACE_ADD
             16 STORE_NAME               0 (s)
             19 LOAD_CONST               1 (None)
             22 RETURN_VALUE

We can improve performance even more by removing LOAD_ATTR overhead:

>>> timeit.Timer('a("something")', 's = []; a = s.append').timeit()
0.15924410999923566

Question 3

In the example you gave, there is no difference, in terms of output, between append and +=. But there is a difference between append and + (which the question originally asked about).

>>> a = []
>>> id(a)
11814312
>>> a.append("hello")
>>> id(a)
11814312

>>> b = []
>>> id(b)
11828720
>>> c = b + ["hello"]
>>> id(c)
11833752
>>> b += ["hello"]
>>> id(b)
11828720

As you can see, append and += have the same result; they add the item to the list, without producing a new list. Using + adds the two lists and produces a new list.

Question 4

>>> a=[]
>>> a.append([1,2])
>>> a
[[1, 2]]
>>> a=[]
>>> a+=[1,2]
>>> a
[1, 2]

See that append adds a single element to the list, which may be anything. +=[] joins the lists.
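
A related difference, sketched below under the assumption of Python 3: += accepts any iterable on its right-hand side, much like extend(), whereas plain + only works with another list. The variable names are illustrative.

```python
# += behaves like list.extend(): it accepts any iterable.
letters = []
letters += "ab"          # a string is iterated character by character
print(letters)           # ['a', 'b']

# append() always adds exactly one element, whatever its type.
nested = []
nested.append("ab")
print(nested)            # ['ab']

# Plain + only concatenates a list with another list.
try:
    combined = [] + "ab"
except TypeError as exc:
    combined = None
    print("TypeError:", exc)
```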

Question 5

+= is an assignment. When you use it you’re really saying 'some_list2 = some_list2 + ["something"]'. Assignments involve rebinding, so:

l= []

def a1(x):
    l.append(x) # works

def a2(x):
    l= l+[x] # assign to l, makes l local
             # so attempt to read l for addition gives UnboundLocalError

def a3(x):
    l+= [x]  # fails for the same reason
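
The failure in a3, and one way around it, can be sketched as follows. This assumes Python 3 and uses made-up names (demo, items, broken_add, fixed_add); a nested function is used so the sketch is self-contained. For a module-level list such as l above, a global declaration plays the same role as nonlocal here.

```python
def demo():
    items = []

    def broken_add(x):
        # += assigns to `items`, so Python treats it as local to broken_add
        # and the read that precedes the assignment fails.
        items += [x]

    def fixed_add(x):
        nonlocal items      # the rebinding now targets the enclosing variable
        items += [x]

    try:
        broken_add(1)
        error_name = None
    except UnboundLocalError as exc:
        error_name = type(exc).__name__

    fixed_add(1)
    return error_name, items

error_name, items = demo()
print(error_name)   # UnboundLocalError
print(items)        # [1]
```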

The += operator should also normally create a new list object like list+list normally does:

>>> l1= []
>>> l2= l1

>>> l1.append('x')
>>> l1 is l2
True

>>> l1= l1+['x']
>>> l1 is l2
False

However in reality:

>>> l2= l1
>>> l1+= ['x']
>>> l1 is l2
True

This is because Python lists implement __iadd__() to make a += augmented assignment short-circuit and call list.extend() instead. (It’s a bit of a strange wart this: it usually does what you meant, but for confusing reasons.)

In general, if you’re appending to or extending an existing list, and you want to keep the reference to the same list (instead of making a new one), it’s best to be explicit and stick with the append()/extend() methods.

Question 6

 some_list2 += ["something"]

is actually

 some_list2.extend(["something"])

For a single value there is no difference. The documentation states that:

s.append(x) same as s[len(s):len(s)] = [x]
s.extend(x) same as s[len(s):len(s)] = x

Thus obviously s.append(x) is same as s.extend([x])
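
A quick check of that equivalence, as a minimal sketch with illustrative variable names:

```python
x = "something"

by_append = []
by_append.append(x)

by_extend = []
by_extend.extend([x])

# The slice-assignment form quoted from the documentation.
by_slice = []
by_slice[len(by_slice):len(by_slice)] = [x]

by_iadd = []
by_iadd += [x]

print(by_append == by_extend == by_slice == by_iadd)   # True
```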

Question 7

The difference is that concatenate will flatten the resulting list, whereas append will keep the levels intact:

So for example with:

myList = [ ]
listA = [1,2,3]
listB = ["a","b","c"]

Using append, you end up with a list of lists:

>>> myList.append(listA)
>>> myList.append(listB)
>>> myList
[[1, 2, 3], ['a', 'b', 'c']]

Using concatenate instead, you end up with a flat list:

>>> myList += listA + listB
>>> myList
[1, 2, 3, 'a', 'b', 'c']

Question 8

The performance tests here are not correct:

- You shouldn’t run the profile only once.
- If you compare append with += [] over a number of iterations, you should bind append to a local name first.
- Timing results differ between Python versions and between 64-bit and 32-bit builds.

e.g.

timeit.Timer('for i in xrange(100): app(i)', 's = [] ; app = s.append').timeit()

good tests can be found here: http://markandclick.com/1/post/2012/01/python-list-append-vs.html
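
A minimal sketch of a fairer comparison, assuming Python 3 (so no xrange): each statement is timed several times with timeit.repeat and the fastest run is kept. The helper name best_time and the repeat/number values are arbitrary choices for illustration, and the absolute numbers will vary by machine and interpreter.

```python
import timeit

def best_time(stmt, setup, repeat=5, number=100_000):
    """Return the fastest of several timing runs, in seconds."""
    return min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))

timings = {
    "s.append(1)": best_time("s.append(1)", "s = []"),
    "app(1), bound method cached": best_time("app(1)", "s = []; app = s.append"),
    "s += [1]": best_time("s += [1]", "s = []"),
}

for label, seconds in timings.items():
    print(f"{label:30s} {seconds:.4f} s")
```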

Question 9

In addition to the aspects described in the other answers, append and +[] have very different behaviors when you’re trying to build a list of lists.

>>> list1=[[1,2],[3,4]]
>>> list2=[5,6]
>>> list3=list1+list2
>>> list3
[[1, 2], [3, 4], 5, 6]
>>> list1.append(list2)
>>> list1
[[1, 2], [3, 4], [5, 6]]

list1 + [5, 6] produces a new list in which 5 and 6 are individual elements. list1.append([5, 6]) adds the list [5, 6] to list1 as a single element.

Question 10

The rebinding behaviour mentioned in other answers does matter in certain circumstances:

>>> a = ([],[])
>>> a[0].append(1)
>>> a
([1], [])
>>> a[1] += [1]
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

That’s because augmented assignment always rebinds, even if the object was mutated in-place. The rebinding here happens to be a[1] = *mutated list*, which doesn’t work for tuples.
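
Note that the in-place mutation has already happened by the time the rebinding fails. A short sketch (the name pair is illustrative):

```python
pair = ([], [])

try:
    pair[1] += [1]      # step 1 mutates the list, step 2 assigns into the tuple
except TypeError as exc:
    print("TypeError:", exc)

print(pair)             # ([], [1])

# extend() performs only the mutation, so it works without an error.
pair[1].extend([2])
print(pair)             # ([], [1, 2])
```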

Question 11

Let’s take an example first:

list1 = [1, 2, 3, 4]
list2 = list1        # both names point to the same object

If we use +:

list1 = list1 + [5]  # creates a new list object
print(list1)         # output: [1, 2, 3, 4, 5]
print(list2)         # output: [1, 2, 3, 4]

But if we start again from the same two names and append instead:

list1.append(5)      # no new list object is created
print(list1)         # output: [1, 2, 3, 4, 5]
print(list2)         # output: [1, 2, 3, 4, 5]

extend(iterable) also modifies the list in place, like append, but it adds each element of the iterable instead of a single item.

Question 12

The append() method adds a single item to the existing list

some_list1 = []
some_list1.append("something")

So here the some_list1 will get modified.

Updated:

Whereas += combines the elements of another list (possibly more than one element) into the existing list, similar to extend() (as corrected by Flux).

some_list2 = []
some_list2 += ["something"]

So here some_list2 and ["something"] are the two lists that are combined.

Question 13

“+” does not mutate the list

.append() mutates the old list

Question 14

I can’t seem to get the nose testing framework to recognize modules beneath my test script in the file structure. I’ve set up the simplest example that demonstrates the problem. I’ll explain it below.

Here’s the package file structure:

./__init__.py
./foo.py
./tests
   ./__init__.py
   ./test_foo.py

foo.py contains:

def dumb_true():
    return True

tests/test_foo.py contains:

import foo

def test_foo():
    assert foo.dumb_true()

Both __init__.py files are empty.

If I run nosetests -vv in the main directory (where foo.py is), I get:

Failure: ImportError (No module named foo) ... ERROR

======================================================================
ERROR: Failure: ImportError (No module named foo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python/site-packages/nose-0.11.1-py2.6.egg/nose/loader.py", line 379, in loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/python/site-packages/nose-0.11.1-py2.6.egg/nose/importer.py", line 39, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/python/site-packages/nose-0.11.1-py2.6.egg/nose/importer.py", line 86, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/home/user/nose_testing/tests/test_foo.py", line 1, in <module>
    import foo
ImportError: No module named foo

----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (errors=1)

I get the same error when I run from inside the tests/ directory. According to the documentation and an example I found, nose is supposed to add all parent packages to the path as well as the directory from which it is called, but this doesn’t seem to be happening in my case.

I’m running Ubuntu 8.04 with Python 2.6.2. I’ve built and installed nose manually (not with setuptools), if that matters.

Question 15

You’ve got an __init__.py in your top level directory. That makes it a package. If you remove it, your nosetests should work.

If you don’t remove it, you’ll have to change your import to import dir.foo, where dir is the name of your directory.

Question 16

Are you in a virtualenv? In my case, nosetests was the one in /usr/bin/nosetests, which was using /usr/bin/python. The packages in the virtualenv definitely won’t be in the system path. The following fixed this:

source myvirtualenv/bin/activate
pip install nose
which nosetests
/home/me/myvirtualenv/bin/nosetests

Question 17

To those of you finding this question later on: I get the import error if I don’t have an __init__.py file in my tests directory.

My directory structure was like this:

./tests/
  ./test_some_random_stuff.py

If I ran nosetests:

nosetests -w tests

It would give the ImportError that everyone else is seeing. If I add a blank __init__.py file it works just fine:

./tests/
  ./__init__.py
  ./test_some_random_stuff.py

Question 18

Another potential problem appears to be hyphens/dashes in the directory tree. I recently fixed a nose ImportError issue by renaming a directory from sub-dir to sub_dir.

Question 19

Of course, a syntax error in the module being imported will also cause this. For me, the problem reared its head when I had a backup of a tests file with a path like module/tests.bak.py in the same directory as tests.py. Also, to deal with the __init__ package/module problem in a Django app, you can run the following (in a bash/OS X shell) to make sure you don’t have any stale .pyc files (such as __init__.pyc) lying around:

find . -name '*.pyc' -delete

Question 20

I got this error message because I ran the nosetests command from the wrong directory.

Silly, but happens.

Question 21

I just ran into one more thing that might cause this issue: naming of tests in the form testname.test.py. That extra . confounds nose and leads to it importing things it should not. I suppose it may be obvious that using unconventional test naming conventions will break things, but I thought it might be worth noting.

Question 22

For example, with the following directory structure, if you want to run nosetests in m1, m2 or m3 to test some functions in n.py, you should use from m2.m3 import n in test.py.

m1
└── m2
    ├── __init__.py
    └── m3
        ├── __init__.py
        ├── n.py
        └── test
            └── test.py

Question 23

Just to complete the question: If you’re struggling with structure like this:

project
├── m1
│   ├── __init__.py
│   ├── foo1.py
│   └── m2
│       ├── __init__.py
│       └── foo2.py
│
└── test
    ├── __init__.py
    └── test.py

and you want to run the tests from a path outside the project, include the project path in your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:$HOME/path/to/project

Paste it into your .profile. If you’re using a virtual environment, paste it into the activate script in your venv root.
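
If changing the environment is not an option, a common alternative is to extend sys.path from the test file itself. The following is only a sketch under the assumption that test/ sits directly below the project root, as in the tree above; the helper add_project_root is hypothetical, not part of nose.

```python
import os
import sys

def add_project_root(test_file, levels_up=1):
    """Prepend the directory `levels_up` levels above test_file's folder to sys.path."""
    root = os.path.dirname(os.path.abspath(test_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# Inside project/test/test.py this would be called as add_project_root(__file__),
# after which `import m1.foo1` resolves against project/.
root = add_project_root(os.path.join("project", "test", "test.py"))
print(root)
```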

Question 24

I can’t get my head around Python’s logging module. My needs are very simple: I just want to log everything to syslog. After reading documentation I came up with this simple test script:

import logging
import logging.handlers

my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)

handler = logging.handlers.SysLogHandler()

my_logger.addHandler(handler)

my_logger.debug('this is debug')
my_logger.critical('this is critical')

But this script does not produce any log records in syslog. What’s wrong?

Question 25

Change the line to this:

handler = logging.handlers.SysLogHandler(address='/dev/log')

This works for me

import logging
import logging.handlers

my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)

handler = logging.handlers.SysLogHandler(address = '/dev/log')

my_logger.addHandler(handler)

my_logger.debug('this is debug')
my_logger.critical('this is critical')
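
For context: SysLogHandler() with no arguments sends over UDP to localhost port 514, which many syslog daemons do not listen on by default, so the messages are silently dropped. The sketch below picks a local socket when one is available and only then falls back to UDP. It assumes Python 3; the helper name build_syslog_logger and the macOS socket path /var/run/syslog are assumptions, not something taken from the answers here.

```python
import logging
import logging.handlers
import os

def build_syslog_logger(name):
    """Return a logger wired to the first syslog endpoint that can be opened."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)

    # Candidate local sockets: Linux first, then macOS (assumed typical paths).
    for address in ("/dev/log", "/var/run/syslog"):
        if not os.path.exists(address):
            continue
        try:
            handler = logging.handlers.SysLogHandler(address=address)
        except OSError:
            continue    # the socket file exists but could not be opened
        break
    else:
        # No usable local socket: fall back to UDP on localhost:514.
        handler = logging.handlers.SysLogHandler(address=("localhost", 514))

    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

my_logger = build_syslog_logger("MyLogger")
my_logger.debug("this is debug")
```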

Question 26

You should always use the local host for logging, whether to /dev/log or to localhost through the TCP stack. This lets the fully RFC-compliant and featureful system logging daemon handle syslog. It removes the need for the remote daemon to be functional, and it gives you the enhanced capabilities of syslog daemons such as rsyslog and syslog-ng. The same philosophy goes for SMTP: just hand the message to the local SMTP software (in that case use ‘program mode’ rather than the daemon, but it’s the same idea). Let the more capable software handle it. Retrying, queuing, local spooling, using TCP instead of UDP for syslog and so forth become possible. You can also [re-]configure those daemons separately from your code, as it should be.

Save your coding for your application, and let other software do its job in concert.

Question 27

I found the syslog module to make it quite easy to get the basic logging behavior you describe:

import syslog
syslog.syslog("This is a test message")
syslog.syslog(syslog.LOG_INFO, "Test message at INFO priority")

There are other things you could do, too, but even just the first two lines of that will get you what you’ve asked for as I understand it.

Question 28

Piecing things together from here and other places, this is what I came up with. It works on Ubuntu 12.04 and CentOS 6.

Create a file in /etc/rsyslog.d/ whose name ends in .conf and add the following text:

local6.*        /var/log/my-logfile

Restart rsyslog; reloading did NOT seem to work for the new log files. Maybe it only reloads existing conf files?

sudo restart rsyslog

Then you can use this test program to make sure it actually works.

import logging, sys
from logging import config

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(module)s P%(process)d T%(thread)d %(message)s'
            },
        },
    'handlers': {
        'stdout': {
            'class': 'logging.StreamHandler',
            'stream': sys.stdout,
            'formatter': 'verbose',
            },
        'sys-logger6': {
            'class': 'logging.handlers.SysLogHandler',
            'address': '/dev/log',
            'facility': "local6",
            'formatter': 'verbose',
            },
        },
    'loggers': {
        'my-logger': {
            'handlers': ['sys-logger6','stdout'],
            'level': logging.DEBUG,
            'propagate': True,
            },
        }
    }

config.dictConfig(LOGGING)


logger = logging.getLogger("my-logger")

logger.debug("Debug")
logger.info("Info")
logger.warning("Warn")
logger.error("Error")
logger.critical("Critical")

Question 29

I add a little extra comment just in case it helps anyone because I found this exchange useful but needed this little extra bit of info to get it all working.

To log to a specific facility using SysLogHandler you need to specify the facility value. Say for example that you have defined:

local3.* /var/log/mylog

in syslog, then you’ll want to use:

handler = logging.handlers.SysLogHandler(address = ('localhost',514), facility=19)

and you also need to have syslog listening on UDP to use localhost instead of /dev/log.
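
The magic number 19 is the value of the handler's LOG_LOCAL3 constant, so the same handler can be written more readably. A minimal sketch, assuming Python 3:

```python
from logging.handlers import SysLogHandler

# The named constant and the string alias both resolve to the number used above.
print(SysLogHandler.LOG_LOCAL3)                  # 19
print(SysLogHandler.facility_names["local3"])    # 19

# Equivalent handler construction (UDP to localhost:514).
handler = SysLogHandler(address=("localhost", 514),
                        facility=SysLogHandler.LOG_LOCAL3)
print(handler.facility)                          # 19
handler.close()
```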

Question 30

Is your syslog.conf set up to handle facility=user?

You can set the facility used by the python logger with the facility argument, something like this:

handler = logging.handlers.SysLogHandler(facility=logging.handlers.SysLogHandler.LOG_DAEMON)

Question 31

import syslog
syslog.openlog(ident="LOG_IDENTIFIER",logoption=syslog.LOG_PID, facility=syslog.LOG_LOCAL0)
syslog.syslog('Log processing initiated...')

the above script will log to LOCAL0 facility with our custom “LOG_IDENTIFIER”… you can use LOCAL[0-7] for local purpose.

Question 32

From https://github.com/luismartingil/per.scripts/tree/master/python_syslog

#!/usr/bin/python
# -*- coding: utf-8 -*-

'''
Implements a new handler for the logging module which uses the pure syslog python module.

@author:  Luis Martin Gil
@year: 2013
'''
import logging
import syslog

class SysLogLibHandler(logging.Handler):
    """A logging handler that emits messages to syslog.syslog."""
    FACILITY = [syslog.LOG_LOCAL0,
                syslog.LOG_LOCAL1,
                syslog.LOG_LOCAL2,
                syslog.LOG_LOCAL3,
                syslog.LOG_LOCAL4,
                syslog.LOG_LOCAL5,
                syslog.LOG_LOCAL6,
                syslog.LOG_LOCAL7]
    def __init__(self, n):
        """ Pre. (0 <= n <= 7) """
        try:
            syslog.openlog(logoption=syslog.LOG_PID, facility=self.FACILITY[n])
        except Exception:
            # Fall back to positional arguments.
            try:
                syslog.openlog(syslog.LOG_PID, self.FACILITY[n])
            except Exception:
                syslog.openlog('my_ident', syslog.LOG_PID, self.FACILITY[n])
        # We got it
        logging.Handler.__init__(self)

    def emit(self, record):
        syslog.syslog(self.format(record))

if __name__ == '__main__':
    """ Lets play with the log class. """
    # Some variables we need
    _id = 'myproj_v2.0'
    logStr = 'debug'
    logFacilityLocalN = 1

    # Defines a logging level and logging format based on a given string key.
    LOG_ATTR = {'debug': (logging.DEBUG,
                          _id + ' %(levelname)-9s %(name)-15s %(threadName)-14s +%(lineno)-4d %(message)s'),
                'info': (logging.INFO,
                         _id + ' %(levelname)-9s %(message)s'),
                'warning': (logging.WARNING,
                            _id + ' %(levelname)-9s %(message)s'),
                'error': (logging.ERROR,
                          _id + ' %(levelname)-9s %(message)s'),
                'critical': (logging.CRITICAL,
                             _id + ' %(levelname)-9s %(message)s')}
    loglevel, logformat = LOG_ATTR[logStr]

    # Configuring the logger
    logger = logging.getLogger()
    logger.setLevel(loglevel)

    # Clearing previous logs
    logger.handlers = []

    # Setting formaters and adding handlers.
    formatter = logging.Formatter(logformat)
    handlers = []
    handlers.append(SysLogLibHandler(logFacilityLocalN))
    for h in handlers:
        h.setFormatter(formatter)
        logger.addHandler(h)

    # Yep!
    logging.debug('test debug')
    logging.info('test info')
    logging.warning('test warning')
    logging.error('test error')
    logging.critical('test critical')

Question 33

Here’s the YAML dictConfig approach, recommended for Python 3.2 and later.

In the log config file cfg.yml:

version: 1
disable_existing_loggers: true

formatters:
    default:
        format: "[%(process)d] %(name)s(%(funcName)s:%(lineno)s) - %(levelname)s: %(message)s"

handlers:
    syslog:
        class: logging.handlers.SysLogHandler
        level: DEBUG
        formatter: default
        address: /dev/log
        facility: local0

    rotating_file:
        class: logging.handlers.RotatingFileHandler
        level: DEBUG
        formatter: default
        filename: rotating.log
        maxBytes: 10485760 # 10MB
        backupCount: 20
        encoding: utf8

root:
    level: DEBUG
    handlers: [syslog, rotating_file]
    propagate: yes

loggers:
    main:
        level: DEBUG
        handlers: [syslog, rotating_file]
        propagate: yes

Load the config using:

import logging.config, yaml  # yaml is provided by PyYAML
with open('cfg.yml') as f:
    log_config = yaml.safe_load(f)
logging.config.dictConfig(log_config)

This configures both syslog and a direct log file. Note that /dev/log is OS-specific.

Question 34

I fixed this on my notebook. The rsyslog service was not listening on the socket.

I added the line below to /etc/rsyslog.conf, and that solved the problem:

$SystemLogSocketName /dev/log

Question 35

You can also add a file handler or rotating file handler to send your logs to a local file: http://docs.python.org/2/library/logging.handlers.html

Question 36

Situation:

- There is a module in my project_folder called calendar.
- I would like to use the built-in Calendar class from the Python libraries.
- When I use from calendar import Calendar, it complains because it’s trying to load from my module.

I’ve done a few searches and I can’t seem to find a solution to my problem.

Any ideas without having to rename my module?

Question 37

The accepted solution contains a now-deprecated approach.

The importlib documentation here gives a good example of the more appropriate way to load a module directly from a file path for python >= 3.5:

import importlib.util
import sys

# For illustrative purposes.
import tokenize
file_path = tokenize.__file__  # returns "/path/to/tokenize.py"
module_name = tokenize.__name__  # returns "tokenize"

spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)

So, you can load any .py file from a path and set the module name to be whatever you want. So just adjust the module_name to be whatever custom name you’d like the module to have upon importing.

To load a package instead of a single file, file_path should be the path to the package’s root __init__.py
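
Applied to the calendar question, the same recipe can load the standard-library calendar under a different name, so it cannot collide with a local calendar module. This is a sketch only: it assumes Python 3, assumes the stdlib module is a single .py file in the directory reported by sysconfig, and the helper name load_stdlib_module and the alias std_calendar are made up for illustration.

```python
import importlib.util
import os
import sys
import sysconfig

def load_stdlib_module(name, alias):
    """Load the standard-library module `name` from its file, registered as `alias`."""
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    file_path = os.path.join(stdlib_dir, name + ".py")   # assumes a single-file module
    spec = importlib.util.spec_from_file_location(alias, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[alias] = module      # register before executing, as in the example above
    spec.loader.exec_module(module)
    return module

std_calendar = load_stdlib_module("calendar", "std_calendar")
print(std_calendar.Calendar)
print(std_calendar.__name__)
```

Because the module is registered under the alias only, a later plain import calendar still follows the normal search path and picks up the local module.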

Question 38

Changing the name of your module is not necessary. Rather, you can use absolute_import to change the importing behavior. For example with stem/socket.py I import the socket module as follows:

from __future__ import absolute_import
import socket

This only works with Python 2.5 and above; it’s enabling behavior that is the default in Python 3.0 and higher. Pylint will complain about the code but it’s perfectly valid.

Question 39

Actually, solving this is rather easy, but the implementation will always be a bit fragile, because it depends on the internals of Python’s import mechanism, and they are subject to change in future versions.

(the following code shows how to load both local and non-local modules and how they may coexist)

def import_non_local(name, custom_name=None):
    import imp, sys

    custom_name = custom_name or name

    f, pathname, desc = imp.find_module(name, sys.path[1:])
    module = imp.load_module(custom_name, f, pathname, desc)
    f.close()

    return module

# Import non-local module, use a custom name to differentiate it from local
# This name is only used internally for identifying the module. We decide
# the name in the local scope by assigning it to the variable calendar.
calendar = import_non_local('calendar','std_calendar')

# import local module normally, as calendar_local
import calendar as calendar_local

print(calendar.Calendar)
print(calendar_local)

The best solution, if possible, is to avoid naming your modules with the same name as standard-library or built-in module names.

Question 40

The only way to solve this problem is to hijack the internal import machinery yourself. This is not easy, and fraught with peril. You should avoid the grail shaped beacon at all costs because the peril is too perilous.

Rename your module instead.

If you want to learn how to hijack the internal import machinery, here is where you would go about finding out how to do this:

There are sometimes good reasons to get into this peril. The reason you give is not among them. Rename your module.

If you take the perilous path, one problem you will encounter is that when you load a module it ends up with an ‘official name’ so that Python can avoid ever having to parse the contents of that module ever again. A mapping of the ‘official name’ of a module to the module object itself can be found in sys.modules.

This means that if you import calendar in one place, whatever module is imported will be thought of as the module with the official name calendar and all other attempts to import calendar anywhere else, including in other code that’s part of the main Python library, will get that calendar.

It might be possible to design a custom importer using the imputil module in Python 2.x that caused modules loaded from certain paths to look up the modules they were importing in something other than sys.modules first, or something like that. But that’s an extremely hairy thing to be doing, and it won’t work in Python 3.x anyway.

There is an extremely ugly and horrible thing you can do that does not involve hooking the import mechanism. This is something you should probably not do, but it will likely work. It turns your calendar module into a hybrid of the system calendar module and your calendar module. Thanks to Boaz Yaniv for the skeleton of the function I use. Put this at the beginning of your calendar.py file:

import sys

def copy_in_standard_module_symbols(name, local_module):
    import imp

    for i in range(0, 100):
        random_name = 'random_name_%d' % (i,)
        if random_name not in sys.modules:
            break
        else:
            random_name = None
    if random_name is None:
        raise RuntimeError("Couldn't manufacture an unused module name.")
    f, pathname, desc = imp.find_module(name, sys.path[1:])
    module = imp.load_module(random_name, f, pathname, desc)
    f.close()
    del sys.modules[random_name]
    for key in module.__dict__:
        if not hasattr(local_module, key):
            setattr(local_module, key, getattr(module, key))

copy_in_standard_module_symbols('calendar', sys.modules[copy_in_standard_module_symbols.__module__])

Question 41

I’d like to offer my version, which is a combination of Boaz Yaniv’s and Omnifarious’s solution. It will import the system version of a module, with two main differences from the previous answers:

- Supports the ‘dot’ notation, e.g. package.module
- Is a drop-in replacement for the import statement on system modules, meaning you just have to replace that one line, and if there are already calls being made to the module they will work as-is

Put this somewhere accessible so you can call it (I have mine in my __init__.py file):

class SysModule(object):
    pass

def import_non_local(name, local_module=None, path=None, full_name=None, accessor=SysModule()):
    import imp, sys, os

    path = path or sys.path[1:]
    if isinstance(path, basestring):
        path = [path]

    if '.' in name:
        package_name = name.split('.')[0]
        f, pathname, desc = imp.find_module(package_name, path)
        if pathname not in __path__:
            __path__.insert(0, pathname)
        imp.load_module(package_name, f, pathname, desc)
        v = import_non_local('.'.join(name.split('.')[1:]), None, pathname, name, SysModule())
        setattr(accessor, package_name, v)
        if local_module:
            for key in accessor.__dict__.keys():
                setattr(local_module, key, getattr(accessor, key))
        return accessor
    try:
        f, pathname, desc = imp.find_module(name, path)
        if pathname not in __path__:
            __path__.insert(0, pathname)
        module = imp.load_module(name, f, pathname, desc)
        setattr(accessor, name, module)
        if local_module:
            for key in accessor.__dict__.keys():
                setattr(local_module, key, getattr(accessor, key))
            return module
        return accessor
    finally:
        try:
            if f:
                f.close()
        except:
            pass

Example

I wanted to import mysql.connector, but I had a local package already called mysql (the official mysql utilities). So to get the connector from the system mysql package, I replaced this:

import mysql.connector

With this:

import sys
from mysql.utilities import import_non_local         # where I put the above function (mysql/utilities/__init__.py)
import_non_local('mysql.connector', sys.modules[__name__])

Result

# This unmodified line further down in the file now works just fine because mysql.connector has actually become part of the namespace
self.db_conn = mysql.connector.connect(**parameters)

Question 42

Change the import path:

import sys
save_path = sys.path[:]
sys.path.remove('')
import calendar
sys.path = save_path

Question 43

the following code worked until today when I imported from a Windows machine and got this error:

new-line character seen in unquoted field – do you need to open the file in universal-newline mode?

import csv

class CSV:


    def __init__(self, file=None):
        self.file = file

    def read_file(self):
        data = []
        file_read = csv.reader(self.file)
        for row in file_read:
            data.append(row)
        return data

    def get_row_count(self):
        return len(self.read_file())

    def get_column_count(self):
        new_data = self.read_file()
        return len(new_data[0])

    def get_data(self, rows=1):
        data = self.read_file()

        return data[:rows]

How can I fix this issue?

def upload_configurator(request, id=None):
    """
    A view that allows the user to configure the uploaded CSV.
    """
    upload = Upload.objects.get(id=id)
    csvobject = CSV(upload.filepath)

    upload.num_records = csvobject.get_row_count()
    upload.num_columns = csvobject.get_column_count()
    upload.save()

    form = ConfiguratorForm()

    row_count = csvobject.get_row_count()
    colum_count = csvobject.get_column_count()
    first_row = csvobject.get_data(rows=1)
    first_two_rows = csvobject.get_data(rows=5)

Question 44

It would help to see the CSV file itself, but this might work for you. Try replacing:

file_read = csv.reader(self.file)

with:

file_read = csv.reader(self.file, dialect=csv.excel_tab)

Or, open a file with universal newline mode and pass it to csv.reader, like:

reader = csv.reader(open(self.file, 'rU'), dialect=csv.excel_tab)

Or, use splitlines(), like this:

def read_file(self):
    with open(self.file, 'r') as f:
        data = [row for row in csv.reader(f.read().splitlines())]
    return data
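
The 'rU' mode above is a Python 2 idiom. In Python 3 the csv documentation recommends opening the file with newline='' instead, which lets the reader deal with CR, LF and CRLF endings itself. A minimal sketch, assuming Python 3; the temporary file and its contents are made up for the demonstration:

```python
import csv
import os
import tempfile

# Build a small file with classic Mac (CR-only) line endings for the demo.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,qty\rapple,3\rpear,5\r")
    path = tmp.name

# newline='' leaves the line endings untranslated for the csv module to handle.
with open(path, newline="") as f:
    rows = list(csv.reader(f))

os.remove(path)
print(rows)   # [['name', 'qty'], ['apple', '3'], ['pear', '5']]
```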

Question 45

I realize this is an old post, but I ran into the same problem and don’t see the correct answer so I will give it a try

Python Error:

_csv.Error: new-line character seen in unquoted field

Caused by trying to read Macintosh (pre OS X formatted) CSV files. These are text files that use CR for end of line. If using MS Office make sure you select either plain CSV format or CSV (MS-DOS). Do not use CSV (Macintosh) as save-as type.

My preferred EOL version would be LF (Unix/Linux/Apple), but I don’t think MS Office provides the option to save in this format.

Question 46

For Mac OS X, save your CSV file in “Windows Comma Separated (.csv)” format.

Question 47

If this happens to you on mac (as it did to me):

Save the file as CSV (MS-DOS Comma-Separated)

Run the following script

with open(csv_filename, 'rU') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(', '.join(row))

Question 48

Try to run dos2unix on your windows imported files first

Question 49

This is an error that I faced. I had saved .csv file in MAC OSX.

While saving, save it as “Windows Comma Separated Values (.csv)” which resolved the issue.

Question 50

This worked for me on OSX.

import csv

# allow a string to be read like a file
from io import StringIO

# library to transliterate accented and other non-ASCII characters to plain ASCII
from unidecode import unidecode

# cleanse an input file with Windows formatting into a plain text string
with open(filename, 'rb') as fID:
    uncleansedBytes = fID.read()
    # decode the file using the correct encoding scheme
    # (probably this old Windows one)
    uncleansedText = uncleansedBytes.decode('Windows-1252')

    # replace carriage returns with newlines
    cleansedText = uncleansedText.replace('\r', '\n')

    # ASCII-only transliteration of the text (not used by the reader below)
    asciiText = unidecode(cleansedText)

# read each line of the csv file and store as an array of dicts,
# use first line as field names for each dict.
reader = csv.DictReader(StringIO(cleansedText))
for line_entry in reader:
    # do something with your read data
    pass

Question 51

I know this was answered quite some time ago, but the existing answers did not solve my problem. I am using DictReader and StringIO for my CSV reading due to some other complications. I was able to solve the problem more simply by replacing the line delimiters explicitly:

import urllib.request

# q is the URL of the CSV resource
with urllib.request.urlopen(q) as response:
    raw_data = response.read()
    encoding = response.info().get_content_charset('utf8') 
    data = raw_data.decode(encoding)
    if '\r\n' not in data:
        # no CRLF pairs, so any '\r' is a bare carriage return; convert it to CRLF
        data = data.replace('\r', '\r\n')

Might not be reasonable for enormous CSV files, but worked well for my use case.
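
A variant of the same idea, sketched with an in-memory string instead of a network response. The helper name normalize_newlines and the sample payload are hypothetical; the point is only that CRLF is handled before bare CR so that each pair becomes a single newline.

```python
import csv
from io import StringIO


def normalize_newlines(text):
    """Convert CRLF and bare CR line endings to LF."""
    # Handle CRLF first so each pair becomes one LF rather than two.
    return text.replace('\r\n', '\n').replace('\r', '\n')


raw = 'name,qty\rapple,3\rpear,5\r'   # hypothetical CR-delimited payload
records = [dict(r) for r in csv.DictReader(StringIO(normalize_newlines(raw)))]
print(records)
```

Note that this also rewrites any carriage returns inside quoted fields, which may or may not matter for your data.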

Question 52

An alternative, fast solution: I faced the same error. I reopened the “weird” CSV file in Gnumeric on my Lubuntu machine and exported it as a CSV file again. This corrected the issue.

Question 53

Can anyone amend namedtuple or provide an alternative class so that it works for mutable objects?

Primarily for readability, I would like something similar to namedtuple that does this:

from Camelot import namedgroup

Point = namedgroup('Point', ['x', 'y'])
p = Point(0, 0)
p.x = 10

>>> p
Point(x=10, y=0)

>>> p.x *= 10
Point(x=100, y=0)

It must be possible to pickle the resulting object. And per the characteristics of named tuple, the ordering of the output when represented must match the order of the parameter list when constructing the object.

Question 54

There is a mutable alternative to collections.namedtuple – recordclass.

It has the same API and memory footprint as namedtuple and it supports assignments (It should be faster as well). For example:

from recordclass import recordclass

Point = recordclass('Point', 'x y')

>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)

For Python 3.6 and higher, recordclass (since 0.5) supports type hints:

from recordclass import recordclass, RecordClass

class Point(RecordClass):
   x: int
   y: int

>>> Point.__annotations__
{'x': <class 'int'>, 'y': <class 'int'>}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)

There is a more complete example (it also includes performance comparisons).

Since 0.9, the recordclass library provides another variant: the recordclass.structclass factory function. It can produce classes whose instances occupy less memory than __slots__-based instances. This can be important for instances whose attribute values are not intended to form reference cycles. It may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.

Question 55

types.SimpleNamespace was introduced in Python 3.3 and supports the requested requirements.

from types import SimpleNamespace
import pickle

t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)      # namespace(foo='bar', ham='spam')
print(t.foo)  # bar

with open('/tmp/pickle', 'wb') as f:
    pickle.dump(t, f)
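
Applied to the Point example from the question, a minimal sketch could look like this. It round-trips through pickle in memory rather than through a file. One caveat: SimpleNamespace offers attribute access only, with no indexing or iteration.

```python
import pickle
from types import SimpleNamespace

p = SimpleNamespace(x=0, y=0)
p.x = 10
p.x *= 10

# Round-trip through pickle without touching the filesystem.
restored = pickle.loads(pickle.dumps(p))
print(restored == p)            # True
print(restored.x, restored.y)   # 100 0
```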

Question 56

As a very Pythonic alternative for this task, since Python 3.7 you can use the dataclasses module. Data classes not only behave like a mutable NamedTuple but, because they use normal class definitions, also support other class features.

From PEP-0557:

Although they use a very different mechanism, Data Classes can be thought of as “mutable namedtuples with defaults”. Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.

A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, “Syntax for Variable Annotations”. In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there’s really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.

This feature is introduced in PEP-0557, which you can read in more detail via the provided documentation link.

Example:

In [20]: from dataclasses import dataclass

In [21]: @dataclass
    ...: class InventoryItem:
    ...:     '''Class for keeping track of an item in inventory.'''
    ...:     name: str
    ...:     unit_price: float
    ...:     quantity_on_hand: int = 0
    ...: 
    ...:     def total_cost(self) -> float:
    ...:         return self.unit_price * self.quantity_on_hand
    ...:

Demo:

In [23]: II = InventoryItem('bisc', 2000)

In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)

In [25]: II.name = 'choco'

In [26]: II.name
Out[26]: 'choco'

In [27]: II.unit_price *= 3

In [28]: II.unit_price
Out[28]: 6000

In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
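
For the Point case from the question, a dataclass sketch might look like the following. The Point class here is illustrative; asdict and astuple are helpers from the standard dataclasses module, and both follow the field order of the class definition.

```python
from dataclasses import asdict, astuple, dataclass


@dataclass
class Point:
    x: int = 0
    y: int = 0


p = Point(0, 0)
p.x = 10
p.x *= 10

print(p)            # Point(x=100, y=0)
print(astuple(p))   # (100, 0)
print(asdict(p))    # {'x': 100, 'y': 0}
```

Unlike a namedtuple, a dataclass instance is not itself a sequence, so unpacking goes through astuple(p) rather than working on p directly.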

Question 57

The latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation whereas the recordclass is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.

Your tests (but also see the note below):

from __future__ import print_function
import pickle
import sys
from namedlist import namedlist

Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)

print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))

print('2. String')
print('p: {}\n'.format(p))

print('3. Representation')
print(repr(p), '\n')

print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')

print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))

print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))

print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))

print('8. Iteration')
print('p: {}\n'.format([v for v in p]))

print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))

print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))

print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')

print('12. Fields')
print('p: {}\n'.format(p._fields))

print('13. Slots')
print('p: {}\n'.format(p.__slots__))

Output on Python 2.7

1. Mutation of field values  
p: 10, 12

2. String  
p: Point(x=10, y=12)

3. Representation  
Point(x=10, y=12) 

4. Sizeof  
size of p: 64 

5. Access by name of field  
p: 10, 12

6. Access by index  
p: 10, 12

7. Iterative unpacking  
p: 10, 12

8. Iteration  
p: [10, 12]

9. Ordered Dict  
p: OrderedDict([('x', 10), ('y', 12)])

10. Inplace replacement (update?)  
p: Point(x=100, y=200)

11. Pickle and Unpickle  
Pickled successfully

12. Fields  
p: ('x', 'y')

13. Slots  
p: ('x', 'y')

The only difference with Python 3.5 is that the namedlist has become smaller, the size is 56 (Python 2.7 reports 64).

Note that I have changed your test 10 for in-place replacement. The namedlist has a _replace() method which does a shallow copy, and that makes perfect sense to me because the namedtuple in the standard library behaves the same way. Changing the semantics of the _replace() method would be confusing. In my opinion the _update() method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?

Question 58

It seems like the answer to this question is no.

Below is pretty close, but it’s not technically mutable. This is creating a new namedtuple() instance with an updated x value:

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)

On the other hand, you can create a simple class using __slots__ that should work well for frequently updating class instance attributes:

class Point:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

To add to this answer, I think __slots__ is a good fit here because it’s memory efficient when you create lots of class instances. The only downside is that you can’t add new attributes to an instance beyond those listed in __slots__.

Here’s one relevant thread that illustrates the memory efficiency – Dictionary vs Object – which is more efficient and why?

The quoted content in the answer of this thread is a very succinct explanation why __slots__ is more memory efficient – Python slots
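
A short sketch of what that trade-off looks like in practice, reusing the Point class above. The variable names slot_error and has_dict are only for illustration.

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(0, 0)
p.x = 10                 # declared slots can be reassigned freely

slot_error = None
try:
    p.z = 1              # 'z' is not listed in __slots__
except AttributeError as exc:
    slot_error = exc

has_dict = hasattr(p, '__dict__')
print(type(slot_error).__name__)   # AttributeError
print(has_dict)                    # False: no per-instance dict is allocated
```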

Question 59

The following is a good solution for Python 3: A minimal class using __slots__ and Sequence abstract base class; does not do fancy error detection or such, but it works, and behaves mostly like a mutable tuple (except for typecheck).

from collections.abc import Sequence

class NamedMutableSequence(Sequence):
    __slots__ = ()

    def __init__(self, *a, **kw):
        slots = self.__slots__
        for k in slots:
            setattr(self, k, kw.get(k))

        if a:
            for k, v in zip(slots, a):
                setattr(self, k, v)

    def __str__(self):
        clsname = self.__class__.__name__
        values = ', '.join('%s=%r' % (k, getattr(self, k))
                           for k in self.__slots__)
        return '%s(%s)' % (clsname, values)

    __repr__ = __str__

    def __getitem__(self, item):
        return getattr(self, self.__slots__[item])

    def __setitem__(self, item, value):
        return setattr(self, self.__slots__[item], value)

    def __len__(self):
        return len(self.__slots__)

class Point(NamedMutableSequence):
    __slots__ = ('x', 'y')

Example:

>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)

If you want, you can have a method to create the class too (though using an explicit class is more transparent):

def namedgroup(name, members):
    if isinstance(members, str):
        members = members.split()
    members = tuple(members)
    return type(name, (NamedMutableSequence,), {'__slots__': members})

Example:

>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)

In Python 2 you need to adjust it slightly: if you inherit from Sequence, the class will have a __dict__ and __slots__ will stop working.

The solution in Python 2 is to inherit from object rather than Sequence. If isinstance(p, Sequence) == True is desired for a Point instance p, you need to register NamedMutableSequence as a virtual subclass of Sequence:

Sequence.register(NamedMutableSequence)
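
To see what inheriting from Sequence buys on Python 3, here is a stripped-down sketch. The Pair class is hypothetical and much smaller than NamedMutableSequence; only __getitem__ and __len__ are written by hand, and the abstract base class supplies iteration, membership tests, index() and reversed().

```python
from collections.abc import Sequence


class Pair(Sequence):
    """Hypothetical two-field record built on __slots__."""
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getitem__(self, index):
        # Tuple indexing raises IndexError past the end, which ends iteration.
        return getattr(self, self.__slots__[index])

    def __len__(self):
        return len(self.__slots__)


p = Pair(6, 42)
x, y = p                    # unpacking uses the inherited __iter__
print(list(p))              # [6, 42]
print(42 in p)              # True
print(p.index(42))          # 1
print(list(reversed(p)))    # [42, 6]
```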

Question 60

Let’s implement this with dynamic type creation:

import copy
def namedgroup(typename, fieldnames):

    def init(self, **kwargs): 
        attrs = {k: None for k in self._attrs_}
        for k in kwargs:
            if k in self._attrs_:
                attrs[k] = kwargs[k]
            else:
                raise AttributeError('Invalid Field')
        self.__dict__.update(attrs)

    def getattribute(self, attr):
        if attr.startswith("_") or attr in self._attrs_:
            return object.__getattribute__(self, attr)
        else:
            raise AttributeError('Invalid Field')

    def setattr(self, attr, value):
        if attr in self._attrs_:
            object.__setattr__(self, attr, value)
        else:
            raise AttributeError('Invalid Field')

    def rep(self):
         d = ["{}={}".format(v,self.__dict__[v]) for v in self._attrs_]
         return self._typename_ + '(' + ', '.join(d) + ')'

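    def iterate(self):
        # Returning from a generator ends iteration; an explicit
        # `raise StopIteration` becomes a RuntimeError on Python 3.7+ (PEP 479).
        for x in self._attrs_:
            yield self.__dict__[x]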

    def setitem(self, *args, **kwargs):
        return self.__dict__.__setitem__(*args, **kwargs)

    def getitem(self, *args, **kwargs):
        return self.__dict__.__getitem__(*args, **kwargs)

    attrs = {"__init__": init,
                "__setattr__": setattr,
                "__getattribute__": getattribute,
                "_attrs_": copy.deepcopy(fieldnames),
                "_typename_": str(typename),
                "__str__": rep,
                "__repr__": rep,
                "__len__": lambda self: len(fieldnames),
                "__iter__": iterate,
                "__setitem__": setitem,
                "__getitem__": getitem,
                }

    return type(typename, (object,), attrs)

This checks the attributes to see if they are valid before allowing the operation to continue.

So is this pickleable? Yes if (and only if) you do the following:

>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True

The definition has to be in your namespace, and must exist long enough for pickle to find it. So if you define this to be in your package, it should work.

Point = namedgroup("Point", ["x", "y"])

Pickle will fail if you do the following, or make the definition temporary (goes out of scope when the function ends, say):

some_point = namedgroup("Point", ["x", "y"])

And yes, it does preserve the order of the fields listed in the type creation.

Question 61

Tuples are by definition immutable.

You can, however, make a dictionary subclass whose keys are accessible as attributes with dot notation:

In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
:    def __getattr__(self, name):
:        return self[name]
:
:    def __setattr__(self, name, value):
:        self[name] = value
:--

In [2]: test = AttrDict()

In [3]: test.a = 1

In [4]: test.b = True

In [5]: test
Out[5]: {'a': 1, 'b': True}
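
One caveat with the class above: a missing key raises KeyError from __getattr__, so hasattr() and getattr() with a default do not behave as usual. A slightly more defensive sketch, offered as a variation rather than as the original author's code, converts the error:

```python
class AttrDict(dict):
    """dict subclass whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            # hasattr() and getattr(obj, name, default) expect AttributeError.
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


test = AttrDict()
test.a = 1
test.b = True
print(test)                         # {'a': 1, 'b': True}
print(hasattr(test, 'missing'))     # False
print(getattr(test, 'missing', 0))  # 0
```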

Question 62

If you want behavior similar to namedtuple but mutable, try namedlist.

Note that in order to be mutable it cannot be a tuple.

Question 63

Provided performance is of little importance, one could use a silly hack like:

from collections import namedtuple

Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
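
To spell out what the hack does and does not allow, here is a short sketch: the list stored in z can be changed in place, while the tuple's own fields remain read-only.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
mutable_z = Point(1, 2, [3])

mutable_z.z[0] = 30          # mutate the list object held in the z field
print(mutable_z)             # Point(x=1, y=2, z=[30])

try:
    mutable_z.x = 10         # rebinding a field is still not allowed
except AttributeError:
    print('x cannot be reassigned')
```

A side effect worth knowing is that an instance holding a list can no longer be hashed, so it cannot be used as a dict key or set member.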

Question 64

I want to remove null=True from a TextField:

-    footer=models.TextField(null=True, blank=True)
+    footer=models.TextField(blank=True, default='')

I created a schema migration:

manage.py schemamigration fooapp --auto

Since some footer columns contain NULL I get this error if I run the migration:

django.db.utils.IntegrityError: column "footer" contains null values

I added this to the schema migration:

    for sender in orm['fooapp.EmailSender'].objects.filter(footer=None):
        sender.footer=''
        sender.save()

Now I get:

django.db.utils.DatabaseError: cannot ALTER TABLE "fooapp_emailsender" because it has pending trigger events

What is wrong?
