Python 实用宝典

Question 1

I want to use the method of “findall” to locate some elements of the source xml file in the ElementTree module.

However, the source xml file (test.xml) has namespace. I truncate part of xml file as sample:

<?xml version="1.0" encoding="iso-8859-1"?>
<XML_HEADER xmlns="http://www.test.com">
    <TYPE>Updates</TYPE>
    <DATE>9/26/2012 10:30:34 AM</DATE>
    <COPYRIGHT_NOTICE>All Rights Reserved.</COPYRIGHT_NOTICE>
    <LICENSE>newlicense.htm</LICENSE>
    <DEAL_LEVEL>
        <PAID_OFF>N</PAID_OFF>
        </DEAL_LEVEL>
</XML_HEADER>

The sample python code is below:

from xml.etree import ElementTree as ET
tree = ET.parse(r"test.xml")
el1 = tree.findall("DEAL_LEVEL/PAID_OFF") # Return None
el2 = tree.findall("{http://www.test.com}DEAL_LEVEL/{http://www.test.com}PAID_OFF") # Return <Element '{http://www.test.com}DEAL_LEVEL/PAID_OFF' at 0xb78b90>

Although it can works, because there is a namespace “{http://www.test.com}”, it’s very inconvenient to add a namespace in front of each tag.

How can I ignore the namespace when using the method of “find”, “findall” and so on?

Question 2

Instead of modifying the XML document itself, it’s best to parse it and then modify the tags in the result. This way you can handle multiple namespaces and namespace aliases:

from io import StringIO  # for Python 2 import from StringIO instead
import xml.etree.ElementTree as ET

# instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    prefix, has_namespace, postfix = el.tag.partition('}')
    if has_namespace:
        el.tag = postfix  # strip all namespaces
root = it.root

This is based on the discussion here: http://bugs.python.org/issue18304

Update: rpartition instead of partition makes sure you get the tag name in postfix even if there is no namespace. Thus you could condense it:

for _, el in it:
    _, _, el.tag = el.tag.rpartition('}') # strip ns

Question 3

If you remove the xmlns attribute from the xml before parsing it then there won’t be a namespace prepended to each tag in the tree.

import re

xmlstring = re.sub(' xmlns="[^"]+"', '', xmlstring, count=1)

Question 4

The answers so far explicitely put the namespace value in the script. For a more generic solution, I would rather extract the namespace from the xml:

import re
def get_namespace(element):
  m = re.match('\{.*\}', element.tag)
  return m.group(0) if m else ''

And use it in find method:

namespace = get_namespace(tree.getroot())
print tree.find('./{0}parent/{0}version'.format(namespace)).text

Question 5

Here’s an extension to nonagon’s answer, which also strips namespaces off attributes:

from StringIO import StringIO
import xml.etree.ElementTree as ET

# instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    if '}' in el.tag:
        el.tag = el.tag.split('}', 1)[1]  # strip all namespaces
    for at in list(el.attrib.keys()): # strip namespaces of attributes too
        if '}' in at:
            newat = at.split('}', 1)[1]
            el.attrib[newat] = el.attrib[at]
            del el.attrib[at]
root = it.root

UPDATE: added list() so the iterator works (needed for Python 3)

Question 6

Improving on the answer by ericspod:

Instead of changing the parse mode globally we can wrap this in an object supporting the with construct.

from xml.parsers import expat

class DisableXmlNamespaces:
    def __enter__(self):
            self.oldcreate = expat.ParserCreate
            expat.ParserCreate = lambda encoding, sep: self.oldcreate(encoding, None)
    def __exit__(self, type, value, traceback):
            expat.ParserCreate = self.oldcreate

This can then be used as follows

import xml.etree.ElementTree as ET
with DisableXmlNamespaces():
     tree = ET.parse("test.xml")

The beauty of this way is that it does not change any behaviour for unrelated code outside the with block. I ended up creating this after getting errors in unrelated libraries after using the version by ericspod which also happened to use expat.

Question 7

You can use the elegant string formatting construct as well:

ns='http://www.test.com'
el2 = tree.findall("{%s}DEAL_LEVEL/{%s}PAID_OFF" %(ns,ns))

or, if you’re sure that PAID_OFF only appears in one level in tree:

el2 = tree.findall(".//{%s}PAID_OFF" % ns)

Question 8

If you’re using ElementTree and not cElementTree you can force Expat to ignore namespace processing by replacing ParserCreate():

from xml.parsers import expat
oldcreate = expat.ParserCreate
expat.ParserCreate = lambda encoding, sep: oldcreate(encoding, None)

ElementTree tries to use Expat by calling ParserCreate() but provides no option to not provide a namespace separator string, the above code will cause it to be ignore but be warned this could break other things.

Question 9

I might be late for this but I dont think re.sub is a good solution.

However the rewrite xml.parsers.expat does not work for Python 3.x versions,

The main culprit is the xml/etree/ElementTree.py see bottom of the source code

# Import the C accelerators
try:
    # Element is going to be shadowed by the C implementation. We need to keep
    # the Python version of it accessible for some "creative" by external code
    # (see tests)
    _Element_Py = Element

    # Element, SubElement, ParseError, TreeBuilder, XMLParser
    from _elementtree import *
except ImportError:
    pass

Which is kinda sad.

The solution is to get rid of it first.

import _elementtree
try:
    del _elementtree.XMLParser
except AttributeError:
    # in case deleted twice
    pass
else:
    from xml.parsers import expat  # NOQA: F811
    oldcreate = expat.ParserCreate
    expat.ParserCreate = lambda encoding, sep: oldcreate(encoding, None)

Tested on Python 3.6.

Try try statement is useful in case somewhere in your code you reload or import a module twice you get some strange errors like

maximum recursion depth exceeded
AttributeError: XMLParser

btw damn the etree source code looks really messy.

Question 10

Let’s combine nonagon’s answer with mzjn’s answer to a related question:

def parse_xml(xml_path: Path) -> Tuple[ET.Element, Dict[str, str]]:
    xml_iter = ET.iterparse(xml_path, events=["start-ns"])
    xml_namespaces = dict(prefix_namespace_pair for _, prefix_namespace_pair in xml_iter)
    return xml_iter.root, xml_namespaces

Using this function we:

Create an iterator to get both namespaces and a parsed tree object.
Iterate over the created iterator to get the namespaces dict that we can later pass in each find() or findall() call as sugested by iMom0.
Return the parsed tree’s root element object and namespaces.

I think this is the best approach all around as there’s no manipulation either of a source XML or resulting parsed xml.etree.ElementTree output whatsoever involved.

I’d like also to credit barny’s answer with providing an essential piece of this puzzle (that you can get the parsed root from the iterator). Until that I actually traversed XML tree twice in my application (once to get namespaces, second for a root).

Question 11

What is the proper way to compare 2 times in Python in order to speed test a section of code? I tried reading the API docs. I’m not sure I understand the timedelta thing.

So far I have this code:

from datetime import datetime

tstart = datetime.now()
print t1

# code to speed test

tend = datetime.now()
print t2
# what am I missing?
# I'd like to print the time diff here

Question 12

datetime.timedelta is just the difference between two datetimes … so it’s like a period of time, in days / seconds / microseconds

>>> import datetime
>>> a = datetime.datetime.now()
>>> b = datetime.datetime.now()
>>> c = b - a

>>> c
datetime.timedelta(0, 4, 316543)
>>> c.days
0
>>> c.seconds
4
>>> c.microseconds
316543

Be aware that c.microseconds only returns the microseconds portion of the timedelta! For timing purposes always use c.total_seconds().

You can do all sorts of maths with datetime.timedelta, eg:

>>> c / 10
datetime.timedelta(0, 0, 431654)

It might be more useful to look at CPU time instead of wallclock time though … that’s operating system dependant though … under Unix-like systems, check out the ‘time’ command.

Question 13

Since Python 2.7 there’s the timedelta.total_seconds() method. So, to get the elapsed milliseconds:

>>> import datetime
>>> a = datetime.datetime.now()
>>> b = datetime.datetime.now()
>>> delta = b - a
>>> print delta
0:00:05.077263
>>> int(delta.total_seconds() * 1000) # milliseconds
5077

Question 14

You might want to use the timeit module instead.

Question 15

You could also use:

import time

start = time.clock()
do_something()
end = time.clock()
print "%.2gs" % (end-start)

Or you could use the python profilers.

Question 16

I know this is late, but I actually really like using:

import time
start = time.time()

##### your timed code here ... #####

print "Process time: " + (time.time() - start)

time.time() gives you seconds since the epoch. Because this is a standardized time in seconds, you can simply subtract the start time from the end time to get the process time (in seconds). time.clock() is good for benchmarking, but I have found it kind of useless if you want to know how long your process took. For example, it’s much more intuitive to say “my process takes 10 seconds” than it is to say “my process takes 10 processor clock units”

>>> start = time.time(); sum([each**8.3 for each in range(1,100000)]) ; print (time.time() - start)
3.4001404476250935e+45
0.0637760162354
>>> start = time.clock(); sum([each**8.3 for each in range(1,100000)]) ; print (time.clock() - start)
3.4001404476250935e+45
0.05

In the first example above, you are shown a time of 0.05 for time.clock() vs 0.06377 for time.time()

>>> start = time.clock(); time.sleep(1) ; print "process time: " + (time.clock() - start)
process time: 0.0
>>> start = time.time(); time.sleep(1) ; print "process time: " + (time.time() - start)
process time: 1.00111794472

In the second example, somehow the processor time shows “0” even though the process slept for a second. time.time() correctly shows a little more than 1 second.

Question 17

The following code should display the time detla…

from datetime import datetime

tstart = datetime.now()

# code to speed test

tend = datetime.now()
print tend - tstart

Question 18

You could simply print the difference:

print tend - tstart

Question 19

I am not a Python programmer, but I do know how to use Google and here’s what I found: you use the “-” operator. To complete your code:

from datetime import datetime

tstart = datetime.now()

# code to speed test

tend = datetime.now()
print tend - tstart

Additionally, it looks like you can use the strftime() function to format the timespan calculation in order to render the time however makes you happy.

Question 20

time.time() / datetime is good for quick use, but is not always 100% precise. For that reason, I like to use one of the std lib profilers (especially hotshot) to find out what’s what.

Question 21

You may want to look into the profile modules. You’ll get a better read out of where your slowdowns are, and much of your work will be full-on automated.

Question 22

Here is a custom function that mimic’s Matlab’s/Octave’s tic toc functions.

Example of use:

time_var = time_me(); # get a variable with the current timestamp

... run operation ...

time_me(time_var); # print the time difference (e.g. '5 seconds 821.12314 ms')

Function :

def time_me(*arg):
    if len(arg) != 0: 
        elapsedTime = time.time() - arg[0];
        #print(elapsedTime);
        hours = math.floor(elapsedTime / (60*60))
        elapsedTime = elapsedTime - hours * (60*60);
        minutes = math.floor(elapsedTime / 60)
        elapsedTime = elapsedTime - minutes * (60);
        seconds = math.floor(elapsedTime);
        elapsedTime = elapsedTime - seconds;
        ms = elapsedTime * 1000;
        if(hours != 0):
            print ("%d hours %d minutes %d seconds" % (hours, minutes, seconds)) 
        elif(minutes != 0):
            print ("%d minutes %d seconds" % (minutes, seconds))
        else :
            print ("%d seconds %f ms" % (seconds, ms))
    else:
        #print ('does not exist. here you go.');
        return time.time()

Question 23

You could use timeit like this to test a script named module.py

$ python -mtimeit -s 'import module'

Question 24

Arrow: Better dates & times for Python

import arrow
start_time = arrow.utcnow()
end_time = arrow.utcnow()
(end_time - start_time).total_seconds()  # senconds
(end_time - start_time).total_seconds() * 1000  # milliseconds

Question 25

After doing some processing on an audio or image array, it needs to be normalized within a range before it can be written back to a file. This can be done like so:

# Normalize audio channels to between -1.0 and +1.0
audio[:,0] = audio[:,0]/abs(audio[:,0]).max()
audio[:,1] = audio[:,1]/abs(audio[:,1]).max()

# Normalize image to between 0 and 255
image = image/(image.max()/255.0)

Is there a less verbose, convenience function way to do this? matplotlib.colors.Normalize() doesn’t seem to be related.

Question 26

audio /= np.max(np.abs(audio),axis=0)
image *= (255.0/image.max())

Using /= and *= allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so

image *= 255.0/image.max()    # Uses 1 division and image.size multiplications

is marginally faster than

image /= image.max()/255.0    # Uses 1+image.size divisions

Since we are using basic numpy methods here, I think this is about as efficient a solution in numpy as can be.

In-place operations do not change the dtype of the container array. Since the desired normalized values are floats, the audio and image arrays need to have floating-point point dtype before the in-place operations are performed. If they are not already of floating-point dtype, you’ll need to convert them using astype. For example,

image = image.astype('float64')

Question 27

If the array contains both positive and negative data, I’d go with:

import numpy as np

a = np.random.rand(3,2)

# Normalised [0,1]
b = (a - np.min(a))/np.ptp(a)

# Normalised [0,255] as integer: don't forget the parenthesis before astype(int)
c = (255*(a - np.min(a))/np.ptp(a)).astype(int)        

# Normalised [-1,1]
d = 2.*(a - np.min(a))/np.ptp(a)-1

If the array contains nan, one solution could be to just remove them as:

def nan_ptp(a):
    return np.ptp(a[np.isfinite(a)])

b = (a - np.nanmin(a))/nan_ptp(a)

However, depending on the context you might want to treat nan differently. E.g. interpolate the value, replacing in with e.g. 0, or raise an error.

Finally, worth mentioning even if it’s not OP’s question, standardization:

e = (a - np.mean(a)) / np.std(a)

Question 28

You can also rescale using sklearn. The advantages are that you can adjust normalize the standard deviation, in addition to mean-centering the data, and that you can do this on either axis, by features, or by records.

from sklearn.preprocessing import scale
X = scale( X, axis=0, with_mean=True, with_std=True, copy=True )

The keyword arguments axis, with_mean, with_std are self explanatory, and are shown in their default state. The argument copy performs the operation in-place if it is set to False. Documentation here.

Question 29

You can use the “i” (as in idiv, imul..) version, and it doesn’t look half bad:

image /= (image.max()/255.0)

For the other case you can write a function to normalize an n-dimensional array by colums:

def normalize_columns(arr):
    rows, cols = arr.shape
    for col in xrange(cols):
        arr[:,col] /= abs(arr[:,col]).max()

Question 30

You are trying to min-max scale the values of audio between -1 and +1 and image between 0 and 255.

Using sklearn.preprocessing.minmax_scale, should easily solve your problem.

e.g.:

audio_scaled = minmax_scale(audio, feature_range=(-1,1))

and

shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0,255)).reshape(shape)

note: Not to be confused with the operation that scales the norm (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.

Question 31

A simple solution is using the scalers offered by the sklearn.preprocessing library.

scaler = sk.MinMaxScaler(feature_range=(0, 250))
scaler = scaler.fit(X)
X_scaled = scaler.transform(X)
# Checking reconstruction
X_rec = scaler.inverse_transform(X_scaled)

The error X_rec-X will be zero. You can adjust the feature_range for your needs, or even use a standart scaler sk.StandardScaler()

Question 32

I tried following this, and got the error

TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

The numpy array I was trying to normalize was an integer array. It seems they deprecated type casting in versions > 1.10, and you have to use numpy.true_divide() to resolve that.

arr = np.array(img)
arr = np.true_divide(arr,[255.0],out=None)

img was an PIL.Image object.

Question 33

I have a function taking float arguments (generally integers or decimals with one significant digit), and I need to output the values in a string with two decimal places (5 -> 5.00, 5.5 -> 5.50, etc). How can I do this in Python?

Question 34

You could use the string formatting operator for that:

>>> '%.2f' % 1.234
'1.23'
>>> '%.2f' % 5.0
'5.00'

The result of the operator is a string, so you can store it in a variable, print etc.

Question 35

Since this post might be here for a while, lets also point out python 3 syntax:

"{:.2f}".format(5)

Question 36

f-string formatting:

This was new in Python 3.6 – the string is placed in quotation marks as usual, prepended with f'... in the same way you would r'... for a raw string. Then you place whatever you want to put within your string, variables, numbers, inside braces f'some string text with a {variable} or {number} within that text' – and Python evaluates as with previous string formatting methods, except that this method is much more readable.

>>> foobar = 3.141592
>>> print(f'My number is {foobar:.2f} - look at the nice rounding!')

My number is 3.14 - look at the nice rounding!

You can see in this example we format with decimal places in similar fashion to previous string formatting methods.

NB foobar can be an number, variable, or even an expression eg f'{3*my_func(3.14):02f}'.

Going forward, with new code I prefer f-strings over common %s or str.format() methods as f-strings can be far more readable, and are often much faster.

Question 37

String formatting:

print "%.2f" % 5

Question 38

Using python string formatting.

>>> "%0.2f" % 3
'3.00'

Question 39

String Formatting:

a = 6.789809823
print('%.2f' %a)

OR

print ("{0:.2f}".format(a))

Round Function can be used:

print(round(a, 2))

Good thing about round() is that, we can store this result to another variable, and then use it for other purposes.

b = round(a, 2)
print(b)

Question 40

Shortest Python 3 syntax:

n = 5
print(f'{n:.2f}')

Question 41

If you actually want to change the number itself instead of only displaying it differently use format()

Format it to 2 decimal places:

format(value, '.2f')

example:

>>> format(5.00000, '.2f')
'5.00'

Question 42

I know it is an old question, but I was struggling finding the answer myself. Here is what I have come up with:

Python 3:

>>> num_dict = {'num': 0.123, 'num2': 0.127}
>>> "{0[num]:.2f}_{0[num2]:.2f}".format(num_dict) 
0.12_0.13

Question 43

Using Python 3 syntax:

print('%.2f' % number)

Question 44

If you want to get a floating point value with two decimal places limited at the time of calling input,

Check this out ~

a = eval(format(float(input()), '.2f'))   # if u feed 3.1415 for 'a'.
print(a)                                  # output 3.14 will be printed.

Question 45

I am trying to figure out the differences between the datetime and time modules, and what each should be used for.

I know that datetime provides both dates and time. What is the use of the time module?

Examples would be appreciated and differences concerning timezones would especially be of interest.

Question 46

the time module is principally for working with unix time stamps; expressed as a floating point number taken to be seconds since the unix epoch. the datetime module can support many of the same operations, but provides a more object oriented set of types, and also has some limited support for time zones.

Question 47

Stick to `time` to prevent DST ambiguity.

Use exclusively the system time module instead of the datetime module to prevent ambiguity issues with daylight savings time (DST).

Conversion to any time format, including local time, is pretty easy:

import time
t = time.time()

time.strftime('%Y-%m-%d %H:%M %Z', time.localtime(t))
'2019-05-27 12:03 CEST'

time.strftime('%Y-%m-%d %H:%M %Z', time.gmtime(t))
'2019-05-27 10:03 GMT'

time.time() is a floating point number representing the time in seconds since the system epoch. time.time() is ideal for unambiguous time stamping.

If the system additionally runs the network time protocol (NTP) dæmon, one ends up with a pretty solid time base.

Here is the documentation of the time module.

Question 48

The time module can be used when you just need the time of a particular record – like lets say you have a seperate table/file for the transactions for each day, then you would just need the time. However the time datatype is usually used to store the time difference between 2 points of time.

This can also be done using datetime, but if we are only dealing with time for a particular day, then time module can be used.

Datetime is used to store a particular data and time for a record. Like in a rental agency. The due date would be a datetime datatype.

Question 49

If you are interested in timezones, you should consider the use of pytz.

Question 50

I am trying to use a Python package called bidi. In a module in this package (algorithm.py) there are some lines that give me error, although it is part of the package.

Here are the lines:

# utf-8 ? we need unicode
if isinstance(unicode_or_str, unicode):
    text = unicode_or_str
    decoded = False
else:
    text = unicode_or_str.decode(encoding)
    decoded = True

and here is the error message:

Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    bidi_text = get_display(reshaped_text)
  File "C:\Python33\lib\site-packages\python_bidi-0.3.4-py3.3.egg\bidi\algorithm.py",   line 602, in get_display
    if isinstance(unicode_or_str, unicode):
NameError: global name 'unicode' is not defined

How should I re-write this part of the code so it works in Python3? Also if anyone have used bidi package with Python 3 please let me know if they have found similar problems or not. I appreciate your help.

Question 51

Python 3 renamed the unicode type to str, the old str type has been replaced by bytes.

if isinstance(unicode_or_str, str):
    text = unicode_or_str
    decoded = False
else:
    text = unicode_or_str.decode(encoding)
    decoded = True

You may want to read the Python 3 porting HOWTO for more such details. There is also Lennart Regebro’s Porting to Python 3: An in-depth guide, free online.

Last but not least, you could just try to use the 2to3 tool to see how that translates the code for you.

Question 52

If you need to have the script keep working on python2 and 3 as I did, this might help someone

import sys
if sys.version_info[0] >= 3:
    unicode = str

and can then just do for example

foo = unicode.lower(foo)

Question 53

You can use the six library to support both Python 2 and 3:

import six
if isinstance(value, six.string_types):
    handle_string(value)

Question 54

Hope you are using Python 3 , Str are unicode by default, so please Replace Unicode function with String Str function.

if isinstance(unicode_or_str, str):    ##Replaces with str
    text = unicode_or_str
    decoded = False

Question 55

Assume table has three columns: username, password and no_of_logins.

When user tries to login, it’s checked for an entry with a query like

user = User.query.filter_by(username=form.username.data).first()

If password matches, he proceeds further. What I would like to do is count how many times the user logged in. Thus whenever he successfully logs in, I would like to increment the no_of_logins field and store it back to the user table. I’m not sure how to run update query with SqlAlchemy.

Question 56

user.no_of_logins += 1
session.commit()

Question 57

There are several ways to UPDATE using sqlalchemy

1) user.no_of_logins += 1
   session.commit()

2) session.query().\
       filter(User.username == form.username.data).\
       update({"no_of_logins": (User.no_of_logins +1)})
   session.commit()

3) conn = engine.connect()
   stmt = User.update().\
       values(no_of_logins=(User.no_of_logins + 1)).\
       where(User.username == form.username.data)
   conn.execute(stmt)

4) setattr(user, 'no_of_logins', user.no_of_logins+1)
   session.commit()

Question 58

Examples to clarify the important issue in accepted answer’s comments

I didn’t understand it until I played around with it myself, so I figured there would be others who were confused as well. Say you are working on the user whose id == 6 and whose no_of_logins == 30 when you start.

# 1 (bad)
user.no_of_logins += 1
# result: UPDATE user SET no_of_logins = 31 WHERE user.id = 6

# 2 (bad)
user.no_of_logins = user.no_of_logins + 1
# result: UPDATE user SET no_of_logins = 31 WHERE user.id = 6

# 3 (bad)
setattr(user, 'no_of_logins', user.no_of_logins + 1)
# result: UPDATE user SET no_of_logins = 31 WHERE user.id = 6

# 4 (ok)
user.no_of_logins = User.no_of_logins + 1
# result: UPDATE user SET no_of_logins = no_of_logins + 1 WHERE user.id = 6

# 5 (ok)
setattr(user, 'no_of_logins', User.no_of_logins + 1)
# result: UPDATE user SET no_of_logins = no_of_logins + 1 WHERE user.id = 6

The point

By referencing the class instead of the instance, you can get SQLAlchemy to be smarter about incrementing, getting it to happen on the database side instead of the Python side. Doing it within the database is better since it’s less vulnerable to data corruption (e.g. two clients attempt to increment at the same time with a net result of only one increment instead of two). I assume it’s possible to do the incrementing in Python if you set locks or bump up the isolation level, but why bother if you don’t have to?

A caveat

If you are going to increment twice via code that produces SQL like SET no_of_logins = no_of_logins + 1, then you will need to commit or at least flush in between increments, or else you will only get one increment in total:

# 6 (bad)
user.no_of_logins = User.no_of_logins + 1
user.no_of_logins = User.no_of_logins + 1
session.commit()
# result: UPDATE user SET no_of_logins = no_of_logins + 1 WHERE user.id = 6

# 7 (ok)
user.no_of_logins = User.no_of_logins + 1
session.flush()
# result: UPDATE user SET no_of_logins = no_of_logins + 1 WHERE user.id = 6
user.no_of_logins = User.no_of_logins + 1
session.commit()
# result: UPDATE user SET no_of_logins = no_of_logins + 1 WHERE user.id = 6

Question 59

With the help of user=User.query.filter_by(username=form.username.data).first() statement you will get the specified user in user variable.

Now you can change the value of the new object variable like user.no_of_logins += 1 and save the changes with the session‘s commit method.

Question 60

I wrote telegram bot, and have some problem with update rows. Use this example, if you have Model

def update_state(chat_id, state):
    try:
        value = Users.query.filter(Users.chat_id == str(chat_id)).first()
        value.state = str(state)
        db.session.flush()
        db.session.commit()
        #db.session.close()
    except:
        print('Error in def update_state')

Why use db.session.flush()? That’s why >>> SQLAlchemy: What’s the difference between flush() and commit()?

Question 61

What is the most efficient way to organise the following pandas Dataframe:

data =

Position    Letter
1           a
2           b
3           c
4           d
5           e

into a dictionary like alphabet[1 : 'a', 2 : 'b', 3 : 'c', 4 : 'd', 5 : 'e']?

Question 62

In [9]: pd.Series(df.Letter.values,index=df.Position).to_dict()
Out[9]: {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

Speed comparion (using Wouter’s method)

In [6]: df = pd.DataFrame(randint(0,10,10000).reshape(5000,2),columns=list('AB'))

In [7]: %timeit dict(zip(df.A,df.B))
1000 loops, best of 3: 1.27 ms per loop

In [8]: %timeit pd.Series(df.A.values,index=df.B).to_dict()
1000 loops, best of 3: 987 us per loop

Question 63

I found a faster way to solve the problem, at least on realistically large datasets using: df.set_index(KEY).to_dict()[VALUE]

Proof on 50,000 rows:

df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)

%timeit dict(zip(df.A,df.B))
%timeit pd.Series(df.A.values,index=df.B).to_dict()
%timeit df.set_index('A').to_dict()['B']

Output:

100 loops, best of 3: 7.04 ms per loop  # WouterOvermeire
100 loops, best of 3: 9.83 ms per loop  # Jeff
100 loops, best of 3: 4.28 ms per loop  # Kikohs (me)

Question 64

In Python 3.6 the fastest way is still the WouterOvermeire one. Kikohs’ proposal is slower than the other two options.

import timeit

setup = '''
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(32, 120, 100000).reshape(50000,2),columns=list('AB'))
df['A'] = df['A'].apply(chr)
'''

timeit.Timer('dict(zip(df.A,df.B))', setup=setup).repeat(7,500)
timeit.Timer('pd.Series(df.A.values,index=df.B).to_dict()', setup=setup).repeat(7,500)
timeit.Timer('df.set_index("A").to_dict()["B"]', setup=setup).repeat(7,500)

Results:

1.1214002349999777 s  # WouterOvermeire
1.1922008498571748 s  # Jeff
1.7034366211428602 s  # Kikohs

问题：Python ElementTree模块：使用方法“ find”，“ findall”时，如何忽略XML文件的命名空间以找到匹配的元素

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

问题：Python速度测试-时差-毫秒

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

问题：如何将NumPy数组标准化到一定范围内？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

问题：在Python中显示带有两位小数的浮点数

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

问题：Python日期时间与时间模块之间的差异

回答 0

回答 1

坚决time防止DST歧义。

Stick to time to prevent DST ambiguity.

回答 2

回答 3

问题：NameError：全局名称“ unicode”未定义-在Python 3中

回答 0

回答 1

回答 2

回答 3

问题：如何更新SQLAlchemy行条目？

回答 0

回答 1

回答 2

举例说明澄清已接受答案的重要问题

重点

注意事项

Examples to clarify the important issue in accepted answer’s comments

The point

A caveat

回答 3

回答 4

问题：创建两个熊猫数据框列的字典的最有效方法是什么？

回答 0

回答 1

回答 2

回答 3

TL; DR

在长

有关

TL;DR

In Long

Related

问题：熊猫加入问题：列重叠但未指定后缀

坚决`time`防止DST歧义。

Stick to `time` to prevent DST ambiguity.