Python 实用宝典

Question 1

I’ve got:

words = ['hello', 'world', 'you', 'look', 'nice']

I want to have:

'"hello", "world", "you", "look", "nice"'

What’s the easiest way to do this with Python?

Question 2

>>> words = ['hello', 'world', 'you', 'look', 'nice']
>>> ', '.join('"{0}"'.format(w) for w in words)
'"hello", "world", "you", "look", "nice"'

Question 3

you may also perform a single format call

>>> words = ['hello', 'world', 'you', 'look', 'nice']
>>> '"{0}"'.format('", "'.join(words))
'"hello", "world", "you", "look", "nice"'

Update: Some benchmarking (performed on a 2009 mbp):

>>> timeit.Timer("""words = ['hello', 'world', 'you', 'look', 'nice'] * 100; ', '.join('"{0}"'.format(w) for w in words)""").timeit(1000)
0.32559704780578613

>>> timeit.Timer("""words = ['hello', 'world', 'you', 'look', 'nice'] * 100; '"{}"'.format('", "'.join(words))""").timeit(1000)
0.018904924392700195

So it seems that format is actually quite expensive

Update 2: following @JCode’s comment, adding a map to ensure that join will work, Python 2.7.12

>>> timeit.Timer("""words = ['hello', 'world', 'you', 'look', 'nice'] * 100; ', '.join('"{0}"'.format(w) for w in words)""").timeit(1000)
0.08646488189697266

>>> timeit.Timer("""words = ['hello', 'world', 'you', 'look', 'nice'] * 100; '"{}"'.format('", "'.join(map(str, words)))""").timeit(1000)
0.04855608940124512

>>> timeit.Timer("""words = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] * 100; ', '.join('"{0}"'.format(w) for w in words)""").timeit(1000)
0.17348504066467285

>>> timeit.Timer("""words = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] * 100; '"{}"'.format('", "'.join(map(str, words)))""").timeit(1000)
0.06372308731079102

Question 4

You can try this :

str(words)[1:-1]

Question 5

>>> ', '.join(['"%s"' % w for w in words])

Question 6

An updated version of @jamylak answer with F Strings (for python 3.6+), I’ve used backticks for a string used for a SQL script.

keys = ['foo', 'bar' , 'omg']
', '.join(f'`{k}`' for k in keys)
# result: '`foo`, `bar`, `omg`'

Question 7

I would like to know if there is any built in function in python to break the string in to 2 parts, based on the last occurrence of a separator.

for eg: consider the string “a b c,d,e,f” , after the split over separator “,”, i want the output as

“a b c,d,e” and “f”.

I know how to manipulate the string to get the desired output, but i want to know if there is any in built function in python.

Question 8

Use rpartition(s). It does exactly that.

You can also use rsplit(s, 1).

Question 9

>>> "a b c,d,e,f".rsplit(',',1)
['a b c,d,e', 'f']

Question 10

You can split a string by the last occurrence of a separator with rsplit:

Returns a list of the words in the string, separated by the delimiter string (starting from right).

To split by the last comma:

>>> "a b c,d,e,f".rsplit(',', 1)
['a b c,d,e', 'f']

Question 11

Suppose I have a string which is a backslash-escaped version of another string. Is there an easy way, in Python, to unescape the string? I could, for example, do:

>>> escaped_str = '"Hello,\\nworld!"'
>>> raw_str = eval(escaped_str)
>>> print raw_str
Hello,
world!
>>>

However that involves passing a (possibly untrusted) string to eval() which is a security risk. Is there a function in the standard lib which takes a string and produces a string with no security implications?

Question 12

>>> print '"Hello,\\nworld!"'.decode('string_escape')
"Hello,
world!"

Question 13

You can use ast.literal_eval which is safe:

Safely evaluate an expression node or a string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None. (END)

Like this:

>>> import ast
>>> escaped_str = '"Hello,\\nworld!"'
>>> print ast.literal_eval(escaped_str)
Hello,
world!

Question 14

All given answers will break on general Unicode strings. The following works for Python3 in all cases, as far as I can tell:

from codecs import encode, decode
sample = u'mon€y\\nröcks'
result = decode(encode(sample, 'latin-1', 'backslashreplace'), 'unicode-escape')
print(result)

As outlined in the comments, you can also use the literal_eval method from the ast module like so:

import ast
sample = u'mon€y\\nröcks'
print(ast.literal_eval(F'"{sample}"'))

Or like this when your string really contains a string literal (including the quotes):

import ast
sample = u'"mon€y\\nröcks"'
print(ast.literal_eval(sample))

However, if you are uncertain whether the input string uses double or single quotes as delimiters, or when you cannot assume it to be properly escaped at all, then literal_eval may raise a SyntaxError while the encode/decode method will still work.

Question 15

In python 3, str objects don’t have a decode method and you have to use a bytes object. ChristopheD’s answer covers python 2.

# create a `bytes` object from a `str`
my_str = "Hello,\\nworld"
# (pick an encoding suitable for your str, e.g. 'latin1')
my_bytes = my_str.encode("utf-8")

# or directly
my_bytes = b"Hello,\\nworld"

print(my_bytes.decode("unicode_escape"))
# "Hello,
# world"

Question 16

I have data which is being accessed via http request and is sent back by the server in a comma separated format, I have the following code :

site= 'www.example.com'
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(site,headers=hdr)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)
soup = soup.get_text()
text=str(soup)

The content of text is as follows:

april,2,5,7
may,3,5,8
june,4,7,3
july,5,6,9

How can I save this data into a CSV file. I know I can do something along the lines of the following to iterate line by line:

import StringIO
s = StringIO.StringIO(text)
for line in s:

But i’m unsure how to now properly write each line to CSV

EDIT—> Thanks for the feedback as suggested the solution was rather simple and can be seen below.

Solution:

import StringIO
s = StringIO.StringIO(text)
with open('fileName.csv', 'w') as f:
    for line in s:
        f.write(line)

Question 17

General way:

##text=List of strings to be written to file
with open('csvfile.csv','wb') as file:
    for line in text:
        file.write(line)
        file.write('\n')

OR

Using CSV writer :

import csv
with open(<path to output_csv>, "wb") as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        for line in data:
            writer.writerow(line)

OR

Simplest way:

f = open('csvfile.csv','w')
f.write('hi there\n') #Give your csv text here.
## Python will convert \n to os.linesep
f.close()

Question 18

You could just write to the file as you would write any normal file.

with open('csvfile.csv','wb') as file:
    for l in text:
        file.write(l)
        file.write('\n')

If just in case, it is a list of lists, you could directly use built-in csv module

import csv

with open("csvfile.csv", "wb") as file:
    writer = csv.writer(file)
    writer.writerows(text)

Question 19

I would simply write each line to a file, since it’s already in a CSV format:

write_file = "output.csv"
with open(write_file, "w") as output:
    for line in text:
        output.write(line + '\n')

I can’t recall how to write lines with line-breaks at the moment, though :p

Also, you might like to take a look at this answer about write(), writelines(), and '\n'.

Question 20

To complement the previous answers, I whipped up a quick class to write to CSV files. It makes it easier to manage and close open files and achieve consistency and cleaner code if you have to deal with multiple files.

class CSVWriter():

    filename = None
    fp = None
    writer = None

    def __init__(self, filename):
        self.filename = filename
        self.fp = open(self.filename, 'w', encoding='utf8')
        self.writer = csv.writer(self.fp, delimiter=';', quotechar='"', quoting=csv.QUOTE_ALL, lineterminator='\n')

    def close(self):
        self.fp.close()

    def write(self, elems):
        self.writer.writerow(elems)

    def size(self):
        return os.path.getsize(self.filename)

    def fname(self):
        return self.filename

Example usage:

mycsv = CSVWriter('/tmp/test.csv')
mycsv.write((12,'green','apples'))
mycsv.write((7,'yellow','bananas'))
mycsv.close()
print("Written %d bytes to %s" % (mycsv.size(), mycsv.fname()))

Have fun

Question 21

What about this:

with open("your_csv_file.csv", "w") as f:
    f.write("\n".join(text))

str.join() Return a string which is the concatenation of the strings in iterable. The separator between elements is the string providing this method.

Question 22

In Python, the where and when of using string concatenation versus string substitution eludes me. As the string concatenation has seen large boosts in performance, is this (becoming more) a stylistic decision rather than a practical one?

For a concrete example, how should one handle construction of flexible URIs:

DOMAIN = 'http://stackoverflow.com'
QUESTIONS = '/questions'

def so_question_uri_sub(q_num):
    return "%s%s/%d" % (DOMAIN, QUESTIONS, q_num)

def so_question_uri_cat(q_num):
    return DOMAIN + QUESTIONS + '/' + str(q_num)

Edit: There have also been suggestions about joining a list of strings and for using named substitution. These are variants on the central theme, which is, which way is the Right Way to do it at which time? Thanks for the responses!

Question 23

Concatenation is (significantly) faster according to my machine. But stylistically, I’m willing to pay the price of substitution if performance is not critical. Well, and if I need formatting, there’s no need to even ask the question… there’s no option but to use interpolation/templating.

>>> import timeit
>>> def so_q_sub(n):
...  return "%s%s/%d" % (DOMAIN, QUESTIONS, n)
...
>>> so_q_sub(1000)
'http://stackoverflow.com/questions/1000'
>>> def so_q_cat(n):
...  return DOMAIN + QUESTIONS + '/' + str(n)
...
>>> so_q_cat(1000)
'http://stackoverflow.com/questions/1000'
>>> t1 = timeit.Timer('so_q_sub(1000)','from __main__ import so_q_sub')
>>> t2 = timeit.Timer('so_q_cat(1000)','from __main__ import so_q_cat')
>>> t1.timeit(number=10000000)
12.166618871951641
>>> t2.timeit(number=10000000)
5.7813972166853773
>>> t1.timeit(number=1)
1.103492206766532e-05
>>> t2.timeit(number=1)
8.5206360154188587e-06

>>> def so_q_tmp(n):
...  return "{d}{q}/{n}".format(d=DOMAIN,q=QUESTIONS,n=n)
...
>>> so_q_tmp(1000)
'http://stackoverflow.com/questions/1000'
>>> t3= timeit.Timer('so_q_tmp(1000)','from __main__ import so_q_tmp')
>>> t3.timeit(number=10000000)
14.564135316080637

>>> def so_q_join(n):
...  return ''.join([DOMAIN,QUESTIONS,'/',str(n)])
...
>>> so_q_join(1000)
'http://stackoverflow.com/questions/1000'
>>> t4= timeit.Timer('so_q_join(1000)','from __main__ import so_q_join')
>>> t4.timeit(number=10000000)
9.4431309007150048

Question 24

Don’t forget about named substitution:

def so_question_uri_namedsub(q_num):
    return "%(domain)s%(questions)s/%(q_num)d" % locals()

Question 25

Be wary of concatenating strings in a loop! The cost of string concatenation is proportional to the length of the result. Looping leads you straight to the land of N-squared. Some languages will optimize concatenation to the most recently allocated string, but it’s risky to count on the compiler to optimize your quadratic algorithm down to linear. Best to use the primitive (join?) that takes an entire list of strings, does a single allocation, and concatenates them all in one go.

Question 26

“As the string concatenation has seen large boosts in performance…”

If performance matters, this is good to know.

However, performance problems I’ve seen have never come down to string operations. I’ve generally gotten in trouble with I/O, sorting and O(n²) operations being the bottlenecks.

Until string operations are the performance limiters, I’ll stick with things that are obvious. Mostly, that’s substitution when it’s one line or less, concatenation when it makes sense, and a template tool (like Mako) when it’s large.

Question 27

What you want to concatenate/interpolate and how you want to format the result should drive your decision.

String interpolation allows you to easily add formatting. In fact, your string interpolation version doesn’t do the same thing as your concatenation version; it actually adds an extra forward slash before the q_num parameter. To do the same thing, you would have to write return DOMAIN + QUESTIONS + "/" + str(q_num) in that example.
Interpolation makes it easier to format numerics; "%d of %d (%2.2f%%)" % (current, total, total/current) would be much less readable in concatenation form.
Concatenation is useful when you don’t have a fixed number of items to string-ize.

Also, know that Python 2.6 introduces a new version of string interpolation, called string templating:

def so_question_uri_template(q_num):
    return "{domain}/{questions}/{num}".format(domain=DOMAIN,
                                               questions=QUESTIONS,
                                               num=q_num)

String templating is slated to eventually replace %-interpolation, but that won’t happen for quite a while, I think.

Question 28

I was just testing the speed of different string concatenation/substitution methods out of curiosity. A google search on the subject brought me here. I thought I would post my test results in the hope that it might help someone decide.

    import timeit
    def percent_():
            return "test %s, with number %s" % (1,2)

    def format_():
            return "test {}, with number {}".format(1,2)

    def format2_():
            return "test {1}, with number {0}".format(2,1)

    def concat_():
            return "test " + str(1) + ", with number " + str(2)

    def dotimers(func_list):
            # runs a single test for all functions in the list
            for func in func_list:
                    tmr = timeit.Timer(func)
                    res = tmr.timeit()
                    print "test " + func.func_name + ": " + str(res)

    def runtests(func_list, runs=5):
            # runs multiple tests for all functions in the list
            for i in range(runs):
                    print "----------- TEST #" + str(i + 1)
                    dotimers(func_list)

…After running runtests((percent_, format_, format2_, concat_), runs=5), I found that the % method was about twice as fast as the others on these small strings. The concat method was always the slowest (barely). There were very tiny differences when switching the positions in the format() method, but switching positions was always at least .01 slower than the regular format method.

Sample of test results:

    test concat_()  : 0.62  (0.61 to 0.63)
    test format_()  : 0.56  (consistently 0.56)
    test format2_() : 0.58  (0.57 to 0.59)
    test percent_() : 0.34  (0.33 to 0.35)

I ran these because I do use string concatenation in my scripts, and I was wondering what the cost was. I ran them in different orders to make sure nothing was interfering, or getting better performance being first or last. On a side note, I threw in some longer string generators into those functions like "%s" + ("a" * 1024) and regular concat was almost 3 times as fast (1.1 vs 2.8) as using the format and % methods. I guess it depends on the strings, and what you are trying to achieve. If performance really matters, it might be better to try different things and test them. I tend to choose readability over speed, unless speed becomes a problem, but thats just me. SO didn’t like my copy/paste, i had to put 8 spaces on everything to make it look right. I usually use 4.

Question 29

Remember, stylistic decisions are practical decisions, if you ever plan on maintaining or debugging your code :-) There’s a famous quote from Knuth (possibly quoting Hoare?): “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

As long as you’re careful not to (say) turn a O(n) task into an O(n²) task, I would go with whichever you find easiest to understand..

Question 30

I use substitution wherever I can. I only use concatenation if I’m building a string up in say a for-loop.

Question 31

Actually the correct thing to do, in this case (building paths) is to use os.path.join. Not string concatenation or interpolation

Question 32

I would like to remove the first character of a string.

For example, my string starts with a : and I want to remove that only. There are several occurrences of : in the string that shouldn’t be removed.

I am writing my code in Python.

Question 33

python 2.x

s = ":dfa:sif:e"
print s[1:]

python 3.x

s = ":dfa:sif:e"
print(s[1:])

both prints

dfa:sif:e

Question 34

Your problem seems unclear. You say you want to remove “a character from a certain position” then go on to say you want to remove a particular character.

If you only need to remove the first character you would do:

s = ":dfa:sif:e"
fixed = s[1:]

If you want to remove a character at a particular position, you would do:

s = ":dfa:sif:e"
fixed = s[0:pos]+s[pos+1:]

If you need to remove a particular character, say ‘:’, the first time it is encountered in a string then you would do:

s = ":dfa:sif:e"
fixed = ''.join(s.split(':', 1))

Question 35

Depending on the structure of the string, you can use lstrip:

str = str.lstrip(':')

But this would remove all colons at the beginning, i.e. if you have ::foo, the result would be foo. But this function is helpful if you also have strings that do not start with a colon and you don’t want to remove the first character then.

Question 36

deleting a char:

def del_char(string, indexes):

    'deletes all the indexes from the string and returns the new one'

    return ''.join((char for idx, char in enumerate(string) if idx not in indexes))

it deletes all the chars that are in indexes; you can use it in your case with del_char(your_string, [0])

Question 37

I have the above-mentioned error in s1="some very long string............"

Does anyone know what I am doing wrong?

Question 38

You are not putting a " before the end of the line.

Use """ if you want to do this:

""" a very long string ...... 
....that can span multiple lines
"""

Question 39

I had this problem – I eventually worked out that the reason was that I’d included \ characters in the string. If you have any of these, “escape” them with \\ and it should work fine.

Question 40

(Assuming you don’t have/want line breaks in your string…)

How long is this string really?

I suspect there is a limit to how long a line read from a file or from the commandline can be, and because the end of the line gets choped off the parser sees something like s1="some very long string.......... (without an ending ") and thus throws a parsing error?

You can split long lines up in multiple lines by escaping linebreaks in your source like this:

s1="some very long string.....\
...\
...."

Question 41

In my situation, I had \r\n in my single-quoted dictionary strings. I replaced all instances of \r with \\r and \n with \\n and it fixed my issue, properly returning escaped line breaks in the eval’ed dict.

ast.literal_eval(my_str.replace('\r','\\r').replace('\n','\\n'))
  .....

Question 42

I faced a similar problem. I had a string which contained path to a folder in Windows e.g. C:\Users\ The problem is that \ is an escape character and so in order to use it in strings you need to add one more \.

Incorrect: C:\Users\

Correct: C:\\\Users\\\

Question 43

I too had this problem, though there were answers here I want to an important point to this after / there should not be empty spaces.Be Aware of it

Question 44

I also had this exact error message, for me the problem was fixed by adding an ” \”

It turns out that my long string, broken into about eight lines with ” \” at the very end, was missing a ” \” on one line.

Python IDLE didn’t specify a line number that this error was on, but it red-highlighted a totally correct variable assignment statement, throwing me off. The actual misshapen string statement (multiple lines long with ” \”) was adjacent to the statement being highlighted. Maybe this will help someone else.

Question 45

In my case, I use Windows so I have to use double quotes instead of single.

C:\Users\Dr. Printer>python -mtimeit -s"a = 0"
100000000 loops, best of 3: 0.011 usec per loop

Question 46

I was getting this error in postgresql function. I had a long SQL which I broke into multiple lines with \ for better readability. However, that was the problem. I removed all and made them in one line to fix the issue. I was using pgadmin III.

Question 47

In my case with Mac OS X, I had the following statement:

model.export_srcpkg(platform, toolchain, 'mymodel_pkg.zip', 'mymodel.dylib’)

I was getting the error:

  File "<stdin>", line 1
model.export_srcpkg(platform, toolchain, 'mymodel_pkg.zip', 'mymodel.dylib’)
                                                                             ^
SyntaxError: EOL while scanning string literal

After I change to:

model.export_srcpkg(platform, toolchain, "mymodel_pkg.zip", "mymodel.dylib")

It worked…

David

Question 48

You can try this:

s = r'long\annoying\path'

Question 49

Your variable(s1) spans multiple lines. In order to do this (i.e you want your string to span multiple lines), you have to use triple quotes(“””).

s1="""some very long 
string............"""

Question 50

In this case, three single quotations or three double quotations both will work! For example:

    """Parameters:
    ...Type something.....
    .....finishing statement"""

OR

    '''Parameters:
    ...Type something.....
    .....finishing statement'''

Question 51

Most previous answers are correct and my answer is very similar to aaronasterling, you could also do 3 single quotations s1=”’some very long string…………”’

Question 52

I had faced the same problem while accessing any hard drive directory. Then I solved it in this way.

 import os
 os.startfile("D:\folder_name\file_name") #running shortcut
 os.startfile("F:") #accessing directory

The picture above shows an error and resolved output.

Question 53

How can I convert a string of bytes into an int in python?

Say like this: 'y\xcc\xa6\xbb'

I came up with a clever/stupid way of doing it:

sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))

I know there has to be something builtin or in the standard library that does this more simply…

This is different from converting a string of hex digits for which you can use int(xxx, 16), but instead I want to convert a string of actual byte values.

UPDATE:

I kind of like James’ answer a little better because it doesn’t require importing another module, but Greg’s method is faster:

>>> from timeit import Timer
>>> Timer('struct.unpack("<L", "y\xcc\xa6\xbb")[0]', 'import struct').timeit()
0.36242198944091797
>>> Timer("int('y\xcc\xa6\xbb'.encode('hex'), 16)").timeit()
1.1432669162750244

My hacky method:

>>> Timer("sum(ord(c) << (i * 8) for i, c in enumerate('y\xcc\xa6\xbb'[::-1]))").timeit()
2.8819329738616943

FURTHER UPDATE:

Someone asked in comments what’s the problem with importing another module. Well, importing a module isn’t necessarily cheap, take a look:

>>> Timer("""import struct\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""").timeit()
0.98822188377380371

Including the cost of importing the module negates almost all of the advantage that this method has. I believe that this will only include the expense of importing it once for the entire benchmark run; look what happens when I force it to reload every time:

>>> Timer("""reload(struct)\nstruct.unpack(">L", "y\xcc\xa6\xbb")[0]""", 'import struct').timeit()
68.474128007888794

Needless to say, if you’re doing a lot of executions of this method per one import than this becomes proportionally less of an issue. It’s also probably i/o cost rather than cpu so it may depend on the capacity and load characteristics of the particular machine.

Question 54

You can also use the struct module to do this:

>>> struct.unpack("<L", "y\xcc\xa6\xbb")[0]
3148270713L

Question 55

In Python 3.2 and later, use

>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='big')
2043455163

or

>>> int.from_bytes(b'y\xcc\xa6\xbb', byteorder='little')
3148270713

according to the endianness of your byte-string.

This also works for bytestring-integers of arbitrary length, and for two’s-complement signed integers by specifying signed=True. See the docs for from_bytes.

Question 56

As Greg said, you can use struct if you are dealing with binary values, but if you just have a “hex number” but in byte format you might want to just convert it like:

s = 'y\xcc\xa6\xbb'
num = int(s.encode('hex'), 16)

…this is the same as:

num = struct.unpack(">L", s)[0]

…except it’ll work for any number of bytes.

Question 57

I use the following function to convert data between int, hex and bytes.

def bytes2int(str):
 return int(str.encode('hex'), 16)

def bytes2hex(str):
 return '0x'+str.encode('hex')

def int2bytes(i):
 h = int2hex(i)
 return hex2bytes(h)

def int2hex(i):
 return hex(i)

def hex2int(h):
 if len(h) > 1 and h[0:2] == '0x':
  h = h[2:]

 if len(h) % 2:
  h = "0" + h

 return int(h, 16)

def hex2bytes(h):
 if len(h) > 1 and h[0:2] == '0x':
  h = h[2:]

 if len(h) % 2:
  h = "0" + h

 return h.decode('hex')

Source: http://opentechnotes.blogspot.com.au/2014/04/convert-values-to-from-integer-hex.html

Question 58

import array
integerValue = array.array("I", 'y\xcc\xa6\xbb')[0]

Warning: the above is strongly platform-specific. Both the “I” specifier and the endianness of the string->int conversion are dependent on your particular Python implementation. But if you want to convert many integers/strings at once, then the array module does it quickly.

Question 59

In Python 2.x, you could use the format specifiers <B for unsigned bytes, and <b for signed bytes with struct.unpack/struct.pack.

E.g:

Let x = '\xff\x10\x11'

data_ints = struct.unpack('<' + 'B'*len(x), x) # [255, 16, 17]

And:

data_bytes = struct.pack('<' + 'B'*len(data_ints), *data_ints) # '\xff\x10\x11'

That * is required!

See https://docs.python.org/2/library/struct.html#format-characters for a list of the format specifiers.

Question 60

>>> reduce(lambda s, x: s*256 + x, bytearray("y\xcc\xa6\xbb"))
2043455163

Test 1: inverse:

>>> hex(2043455163)
'0x79cca6bb'

Test 2: Number of bytes > 8:

>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAA"))
338822822454978555838225329091068225L

Test 3: Increment by one:

>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAAB"))
338822822454978555838225329091068226L

Test 4: Append one byte, say ‘A’:

>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))
86738642548474510294585684247313465921L

Test 5: Divide by 256:

>>> reduce(lambda s, x: s*256 + x, bytearray("AAAAAAAAAAAAAABA"))/256
338822822454978555838225329091068226L

Result equals the result of Test 4, as expected.

Question 61

I was struggling to find a solution for arbitrary length byte sequences that would work under Python 2.x. Finally I wrote this one, it’s a bit hacky because it performs a string conversion, but it works.

Function for Python 2.x, arbitrary length

def signedbytes(data):
    """Convert a bytearray into an integer, considering the first bit as
    sign. The data must be big-endian."""
    negative = data[0] & 0x80 > 0

    if negative:
        inverted = bytearray(~d % 256 for d in data)
        return -signedbytes(inverted) - 1

    encoded = str(data).encode('hex')
    return int(encoded, 16)

This function has two requirements:

The input data needs to be a bytearray. You may call the function like this:
```
s = 'y\xcc\xa6\xbb'
n = signedbytes(s)
```
The data needs to be big-endian. In case you have a little-endian value, you should reverse it first:
```
n = signedbytes(s[::-1])
```

Of course, this should be used only if arbitrary length is needed. Otherwise, stick with more standard ways (e.g. struct).

Question 62

int.from_bytes is the best solution if you are at version >=3.2. The “struct.unpack” solution requires a string so it will not apply to arrays of bytes. Here is another solution:

def bytes2int( tb, order='big'):
    if order == 'big': seq=[0,1,2,3]
    elif order == 'little': seq=[3,2,1,0]
    i = 0
    for j in seq: i = (i<<8)+tb[j]
    return i

hex( bytes2int( [0x87, 0x65, 0x43, 0x21])) returns ‘0x87654321’.

It handles big and little endianness and is easily modifiable for 8 bytes

Question 63

As mentioned above using unpack function of struct is a good way. If you want to implement your own function there is an another solution:

def bytes_to_int(bytes):
    result = 0
    for b in bytes:
        result = result * 256 + int(b)
return result

Question 64

In python 3 you can easily convert a byte string into a list of integers (0..255) by

>>> list(b'y\xcc\xa6\xbb')
[121, 204, 166, 187]

问题：在python中加入字符串列表，并将每个字符串都用引号引起来

回答 0

回答 1

回答 2

回答 3

回答 4

问题：根据最后出现的分隔符将字符串分成2个

回答 0

回答 1

回答 2

问题：如何取消转义的反斜杠字符串？

回答 0

回答 1

回答 2

回答 3

问题：Python逐行写入CSV

回答 0

回答 1

回答 2

回答 3

回答 4

问题：Python中的字符串串联与字符串替换

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

问题：删除字符串的第一个字符

回答 0

回答 1

回答 2

回答 3

问题：python：SyntaxError：扫描字符串文字时停产

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

问题：如何将字节字符串转换为int？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

Python 2.x的函数，任意长度

Function for Python 2.x, arbitrary length

回答 8

回答 9

回答 10

回答 11

问题：如何在Python中获取字符串的大小？

回答 0

回答 1

Python 3：

Python 2：

Python 3:

Python 2:

回答 2

回答 3

回答 4

问题：在Python中的特定位置添加字符串

回答 0

回答 1