Python 实用宝典

Question 1

How might one remove the first x characters from a string? For example, if one had a string lipsum, how would they remove the first 3 characters and get a result of sum?

Question 2

>>> text = 'lipsum'
>>> text[3:]
'sum'

See the official documentation on strings for more information and this SO answer for a concise summary of the notation.

Question 3

Another way (depending on your actual needs): If you want to pop the first n characters and save both the popped characters and the modified string:

s = 'lipsum'
n = 3
a, s = s[:n], s[n:]
print(a)
# lip
print(s)
# sum

Question 4

>>> x = 'lipsum'
>>> x.replace(x[:3], '')
'sum'

Question 5

Use del.

Example:

>>> text = 'lipsum'
>>> l = list(text)
>>> del l[3:]
>>> ''.join(l)
'sum'

Question 6

Example to show last 3 digits of account number.

x = '1234567890'   
x.replace(x[:7], '')

o/p: '890'

Question 7

I’m working on a project in Django and I’ve just started trying to extend the User model in order to make user profiles.

Unfortunately, I’ve run into a problem: Every time I try to get the user’s profile inside of a template (user.get_template.lastIP, for example), I get the following error:

Environment:

Request Method: GET
Request URL: http://localhost:8000/
Django Version: 1.1
Python Version: 2.6.1

Template error:
In template /path/to/base.tpl, error at line 19
   Caught an exception while rendering: too many values to unpack

19 :                Hello, {{user.username}} ({{ user.get_profile.rep}}). How's it goin? Logout


Exception Type: TemplateSyntaxError at /
Exception Value: Caught an exception while rendering: too many values to unpack

Any ideas as to what’s going on or what I’m doing wrong?

Question 8

That exception means that you are trying to unpack a tuple, but the tuple has too many values with respect to the number of target variables. For example: this work, and prints 1, then 2, then 3

def returnATupleWithThreeValues():
    return (1,2,3)
a,b,c = returnATupleWithThreeValues()
print a
print b
print c

But this raises your error

def returnATupleWithThreeValues():
    return (1,2,3)
a,b = returnATupleWithThreeValues()
print a
print b

raises

Traceback (most recent call last):
  File "c.py", line 3, in ?
    a,b = returnATupleWithThreeValues()
ValueError: too many values to unpack

Now, the reason why this happens in your case, I don’t know, but maybe this answer will point you in the right direction.

Question 9

try unpacking in one variable,

python will handle it as a list,

then unpack from the list

def returnATupleWithThreeValues():
    return (1,2,3)
a = returnATupleWithThreeValues() # a is a list (1,2,3)
print a[0] # list[0] = 1
print a[1] # list[1] = 2
print a[2] # list[2] = 3

Question 10

This problem looked familiar so I thought I’d see if I could replicate from the limited amount of information.

A quick search turned up an entry in James Bennett’s blog here which mentions that when working with the UserProfile to extend the User model a common mistake in settings.py can cause Django to throw this error.

To quote the blog entry:

The value of the setting is not “appname.models.modelname”, it’s just “appname.modelname”. The reason is that Django is not using this to do a direct import; instead, it’s using an internal model-loading function which only wants the name of the app and the name of the model. Trying to do things like “appname.models.modelname” or “projectname.appname.models.modelname” in the AUTH_PROFILE_MODULE setting will cause Django to blow up with the dreaded “too many values to unpack” error, so make sure you’ve put “appname.modelname”, and nothing else, in the value of AUTH_PROFILE_MODULE.

If the OP had copied more of the traceback I would expect to see something like the one below which I was able to duplicate by adding “models” to my AUTH_PROFILE_MODULE setting.

TemplateSyntaxError at /

Caught an exception while rendering: too many values to unpack

Original Traceback (most recent call last):
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/template/debug.py", line 71, in render_node
    result = node.render(context)
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/template/debug.py", line 87, in render
    output = force_unicode(self.filter_expression.resolve(context))
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/template/__init__.py", line 535, in resolve
    obj = self.var.resolve(context)
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/template/__init__.py", line 676, in resolve
    value = self._resolve_lookup(context)
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/template/__init__.py", line 711, in _resolve_lookup
    current = current()
  File "/home/brandon/Development/DJANGO_VERSIONS/Django-1.0/django/contrib/auth/models.py", line 291, in get_profile
    app_label, model_name = settings.AUTH_PROFILE_MODULE.split('.')
ValueError: too many values to unpack

This I think is one of the few cases where Django still has a bit of import magic that tends to cause confusion when a small error doesn’t throw the expected exception.

You can see at the end of the traceback that I posted how using anything other than the form “appname.modelname” for the AUTH_PROFILE_MODULE would cause the line “app_label, model_name = settings.AUTH_PROFILE_MODULE.split(‘.’)” to throw the “too many values to unpack” error.

I’m 99% sure that this was the original problem encountered here.

Question 11

Most likely there is an error somewhere in the get_profile() call. In your view, before you return the request object, put this line:

request.user.get_profile()

It should raise the error, and give you a more detailed traceback, which you can then use to further debug.

Question 12

This happens to me when I’m using Jinja2 for templates. The problem can be solved by running the development server using the runserver_plus command from django_extensions.

It uses the werkzeug debugger which also happens to be a lot better and has a very nice interactive debugging console. It does some ajax magic to launch a python shell at any frame (in the call stack) so you can debug.

Question 13

Can a Python function be an argument of another function?

Say:

def myfunc(anotherfunc, extraArgs):
    # run anotherfunc and also pass the values from extraArgs to it
    pass

So this is basically two questions:

Is it allowed at all?
And if it is, how do I use the function inside the other function? Would I need to use exec(), eval() or something like that? Never needed to mess with them.

BTW, extraArgs is a list/tuple of anotherfunc’s arguments.

Question 14

Can a Python function be an argument of another function?

Yes.

def myfunc(anotherfunc, extraArgs):
    anotherfunc(*extraArgs)

To be more specific … with various arguments …

>>> def x(a,b):
...     print "param 1 %s param 2 %s"%(a,b)
...
>>> def y(z,t):
...     z(*t)
...
>>> y(x,("hello","manuel"))
param 1 hello param 2 manuel
>>>

Question 15

Here’s another way using *args (and also optionally), **kwargs:

def a(x, y):
  print x, y

def b(other, function, *args, **kwargs):
  function(*args, **kwargs)
  print other

b('world', a, 'hello', 'dude')

Output

hello dude
world

Note that function, *args, **kwargs have to be in that order and have to be the last arguments to the function calling the function.

Question 16

Functions in Python are first-class objects. But your function definition is a bit off.

def myfunc(anotherfunc, extraArgs, extraKwArgs):
  return anotherfunc(*extraArgs, **extraKwArgs)

Question 17

Sure, that is why python implements the following methods where the first parameter is a function:

map(function, iterable, …) – Apply function to every item of iterable and return a list of the results.
filter(function, iterable) – Construct a list from those elements of iterable for which function returns true.
reduce(function, iterable[,initializer]) – Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value.
lambdas

Question 18

Yes, it’s allowed.
You use the function as you would any other: anotherfunc(*extraArgs)

Question 19

Yes. By including the function call in your input argument/s, you can call two (or more) functions at once.

For example:

def anotherfunc(inputarg1, inputarg2):
    pass
def myfunc(func = anotherfunc):
    print func

When you call myfunc, you do this:

myfunc(anotherfunc(inputarg1, inputarg2))

This will print the return value of anotherfunc.

Hope this helps!

Question 20

Function inside function: we can use the function as parameter too..

In other words, we can say an output of a function is also a reference for an object, see below how the output of inner function is referencing to the outside function like below..

def out_func(a):

  def in_func(b):
       print(a + b + b + 3)
  return in_func

obj = out_func(1)
print(obj(5))

the result will be.. 14

Hope this helps.

Question 21

def x(a):
    print(a)
    return a

def y(a):
    return a

y(x(1))

Question 22

def x(a):
    print(a)
    return a

def y(func_to_run, a):
    return func_to_run(a)

y(x, 1)

That I think would be a more proper sample. Now what I wonder is if there is a way to code the function to use within the argument submission to another function. I believe there is in C++, but in Python I am not sure.

Question 23

Decorators are very powerful in Python since it allows programmers to pass function as argument and can also define function inside another function.

def decorator(func):
      def insideFunction():
        print("This is inside function before execution")
        func()
      return insideFunction

def func():
    print("I am argument function")

func_obj = decorator(func) 
func_obj()

Output

This is inside function before execution
I am argument function

Question 24

I’ve been using cProfile to profile my code, and it’s been working great. I also use gprof2dot.py to visualize the results (makes it a little clearer).

However, cProfile (and most other Python profilers I’ve seen so far) seem to only profile at the function-call level. This causes confusion when certain functions are called from different places – I have no idea if call #1 or call #2 is taking up the majority of the time. This gets even worse when the function in question is six levels deep, called from seven other places.

How do I get a line-by-line profiling?

Instead of this:

function #12, total time: 2.0s

I’d like to see something like this:

function #12 (called from somefile.py:102) 0.5s
function #12 (called from main.py:12) 1.5s

cProfile does show how much of the total time “transfers” to the parent, but again this connection is lost when you have a bunch of layers and interconnected calls.

Ideally, I’d love to have a GUI that would parse through the data, then show me my source file with a total time given to each line. Something like this:

main.py:

a = 1 # 0.0s
result = func(a) # 0.4s
c = 1000 # 0.0s
result = func(c) # 5.0s

Then I’d be able to click on the second “func(c)” call to see what’s taking up time in that call, separate from the “func(a)” call.

Does that make sense? Is there any profiling library that collects this type of information? Is there some awesome tool I’ve missed?

Question 25

I believe that’s what Robert Kern’s line_profiler is intended for. From the link:

File: pystone.py
Function: Proc2 at line 149
Total time: 0.606656 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   149                                           @profile
   150                                           def Proc2(IntParIO):
   151     50000        82003      1.6     13.5      IntLoc = IntParIO + 10
   152     50000        63162      1.3     10.4      while 1:
   153     50000        69065      1.4     11.4          if Char1Glob == 'A':
   154     50000        66354      1.3     10.9              IntLoc = IntLoc - 1
   155     50000        67263      1.3     11.1              IntParIO = IntLoc - IntGlob
   156     50000        65494      1.3     10.8              EnumLoc = Ident1
   157     50000        68001      1.4     11.2          if EnumLoc == Ident1:
   158     50000        63739      1.3     10.5              break
   159     50000        61575      1.2     10.1      return IntParIO

Hope that helps!

Question 26

You could also use pprofile(pypi). If you want to profile the entire execution, it does not require source code modification. You can also profile a subset of a larger program in two ways:

toggle profiling when reaching a specific point in the code, such as:

import pprofile
profiler = pprofile.Profile()
with profiler:
    some_code
# Process profile content: generate a cachegrind file and send it to user.

# You can also write the result to the console:
profiler.print_stats()

# Or to a file:
profiler.dump_stats("/tmp/profiler_stats.txt")

toggle profiling asynchronously from call stack (requires a way to trigger this code in considered application, for example a signal handler or an available worker thread) by using statistical profiling:

import pprofile
profiler = pprofile.StatisticalProfile()
statistical_profiler_thread = pprofile.StatisticalThread(
    profiler=profiler,
)
with statistical_profiler_thread:
    sleep(n)
# Likewise, process profile content

Code annotation output format is much like line profiler:

$ pprofile --threads 0 demo/threads.py
Command line: ['demo/threads.py']
Total duration: 1.00573s
File: demo/threads.py
File duration: 1.00168s (99.60%)
Line #|      Hits|         Time| Time per hit|      %|Source code
------+----------+-------------+-------------+-------+-----------
     1|         2|  3.21865e-05|  1.60933e-05|  0.00%|import threading
     2|         1|  5.96046e-06|  5.96046e-06|  0.00%|import time
     3|         0|            0|            0|  0.00%|
     4|         2|   1.5974e-05|  7.98702e-06|  0.00%|def func():
     5|         1|      1.00111|      1.00111| 99.54%|  time.sleep(1)
     6|         0|            0|            0|  0.00%|
     7|         2|  2.00272e-05|  1.00136e-05|  0.00%|def func2():
     8|         1|  1.69277e-05|  1.69277e-05|  0.00%|  pass
     9|         0|            0|            0|  0.00%|
    10|         1|  1.81198e-05|  1.81198e-05|  0.00%|t1 = threading.Thread(target=func)
(call)|         1|  0.000610828|  0.000610828|  0.06%|# /usr/lib/python2.7/threading.py:436 __init__
    11|         1|  1.52588e-05|  1.52588e-05|  0.00%|t2 = threading.Thread(target=func)
(call)|         1|  0.000438929|  0.000438929|  0.04%|# /usr/lib/python2.7/threading.py:436 __init__
    12|         1|  4.79221e-05|  4.79221e-05|  0.00%|t1.start()
(call)|         1|  0.000843048|  0.000843048|  0.08%|# /usr/lib/python2.7/threading.py:485 start
    13|         1|  6.48499e-05|  6.48499e-05|  0.01%|t2.start()
(call)|         1|   0.00115609|   0.00115609|  0.11%|# /usr/lib/python2.7/threading.py:485 start
    14|         1|  0.000205994|  0.000205994|  0.02%|(func(), func2())
(call)|         1|      1.00112|      1.00112| 99.54%|# demo/threads.py:4 func
(call)|         1|  3.09944e-05|  3.09944e-05|  0.00%|# demo/threads.py:7 func2
    15|         1|  7.62939e-05|  7.62939e-05|  0.01%|t1.join()
(call)|         1|  0.000423908|  0.000423908|  0.04%|# /usr/lib/python2.7/threading.py:653 join
    16|         1|  5.26905e-05|  5.26905e-05|  0.01%|t2.join()
(call)|         1|  0.000320196|  0.000320196|  0.03%|# /usr/lib/python2.7/threading.py:653 join

Note that because pprofile does not rely on code modification it can profile top-level module statements, allowing to profile program startup time (how long it takes to import modules, initialise globals, …).

It can generate cachegrind-formatted output, so you can use kcachegrind to browse large results easily.

Disclosure: I am pprofile author.

Question 27

You can take help of line_profiler package for this

1. 1st install the package:

    pip install line_profiler

2. Use magic command to load the package to your python/notebook environment

    %load_ext line_profiler

3. If you want to profile the codes for a function then
do as follows:

    %lprun -f demo_func demo_func(arg1, arg2)

you will get a nice formatted output with all the details if you follow these steps :)

Line #      Hits      Time    Per Hit   % Time  Line Contents
 1                                           def demo_func(a,b):
 2         1        248.0    248.0     64.8      print(a+b)
 3         1         40.0     40.0     10.4      print(a)
 4         1         94.0     94.0     24.5      print(a*b)
 5         1          1.0      1.0      0.3      return a/b

Question 28

Just to improve @Joe Kington ‘s above-mentioned answer.

For Python 3.x, use line_profiler:

Installation:

pip install line_profiler

Usage:

Suppose you have the program main.py and within it, functions fun_a() and fun_b() that you want to profile with respect to time; you will need to use the decorator @profile just before the function definitions. For e.g.,

@profile
def fun_a():
    #do something

@profile
def fun_b():
    #do something more

if __name__ == '__main__':
    fun_a()
    fun_b()

The program can be profiled by executing the shell command:

$ kernprof -l -v main.py

The arguments can be fetched using $ kernprof -h

Usage: kernprof [-s setupfile] [-o output_file_path] scriptfile [arg] ...

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -l, --line-by-line    Use the line-by-line profiler from the line_profiler
                        module instead of Profile. Implies --builtin.
  -b, --builtin         Put 'profile' in the builtins. Use 'profile.enable()'
                        and 'profile.disable()' in your code to turn it on and
                        off, or '@profile' to decorate a single function, or
                        'with profile:' to profile a single section of code.
  -o OUTFILE, --outfile=OUTFILE
                        Save stats to <outfile>
  -s SETUP, --setup=SETUP
                        Code to execute before the code to profile
  -v, --view            View the results of the profile in addition to saving
                        it.

The results will be printed on the console as:

Total time: 17.6699 s
File: main.py
Function: fun_a at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    5                                           @profile
    6                                           def fun_a():
...

EDIT: The results from the profilers can be parsed using the TAMPPA package. Using it, we can get line-by-line desired plots as

Question 29

PyVmMonitor has a live-view which can help you there (you can connect to a running program and get statistics from it).

See: http://www.pyvmmonitor.com/

Question 30

Any help on this problem will be greatly appreciated.

So basically I want to run a query to my SQL database and store the returned data as Pandas data structure.

I have attached code for query.

I am reading the documentation on Pandas, but I have problem to identify the return type of my query.

I tried to print the query result, but it doesn’t give any useful information.

Thanks!!!!

from sqlalchemy import create_engine

engine2 = create_engine('mysql://THE DATABASE I AM ACCESSING')
connection2 = engine2.connect()
dataid = 1022
resoverall = connection2.execute("
  SELECT 
      sum(BLABLA) AS BLA,
      sum(BLABLABLA2) AS BLABLABLA2,
      sum(SOME_INT) AS SOME_INT,
      sum(SOME_INT2) AS SOME_INT2,
      100*sum(SOME_INT2)/sum(SOME_INT) AS ctr,
      sum(SOME_INT2)/sum(SOME_INT) AS cpc
   FROM daily_report_cooked
   WHERE campaign_id = '%s'", %dataid)

So I sort of want to understand what’s the format/datatype of my variable “resoverall” and how to put it with PANDAS data structure.

Question 31

Here’s the shortest code that will do the job:

from pandas import DataFrame
df = DataFrame(resoverall.fetchall())
df.columns = resoverall.keys()

You can go fancier and parse the types as in Paul’s answer.

Question 32

Edit: Mar. 2015

As noted below, pandas now uses SQLAlchemy to both read from (read_sql) and insert into (to_sql) a database. The following should work

import pandas as pd

df = pd.read_sql(sql, cnxn)

Previous answer: Via mikebmassey from a similar question

import pyodbc
import pandas.io.sql as psql

cnxn = pyodbc.connect(connection_info) 
cursor = cnxn.cursor()
sql = "SELECT * FROM TABLE"

df = psql.frame_query(sql, cnxn)
cnxn.close()

Question 33

If you are using SQLAlchemy’s ORM rather than the expression language, you might find yourself wanting to convert an object of type sqlalchemy.orm.query.Query to a Pandas data frame.

The cleanest approach is to get the generated SQL from the query’s statement attribute, and then execute it with pandas’s read_sql() method. E.g., starting with a Query object called query:

df = pd.read_sql(query.statement, query.session.bind)

Question 34

Edit 2014-09-30:

pandas now has a read_sql function. You definitely want to use that instead.

Original answer:

I can’t help you with SQLAlchemy — I always use pyodbc, MySQLdb, or psychopg2 as needed. But when doing so, a function as simple as the one below tends to suit my needs:

import decimal

import pydobc
import numpy as np
import pandas

cnn, cur = myConnectToDBfunction()
cmd = "SELECT * FROM myTable"
cur.execute(cmd)
dataframe = __processCursor(cur, dataframe=True)

def __processCursor(cur, dataframe=False, index=None):
    '''
    Processes a database cursor with data on it into either
    a structured numpy array or a pandas dataframe.

    input:
    cur - a pyodbc cursor that has just received data
    dataframe - bool. if false, a numpy record array is returned
                if true, return a pandas dataframe
    index - list of column(s) to use as index in a pandas dataframe
    '''
    datatypes = []
    colinfo = cur.description
    for col in colinfo:
        if col[1] == unicode:
            datatypes.append((col[0], 'U%d' % col[3]))
        elif col[1] == str:
            datatypes.append((col[0], 'S%d' % col[3]))
        elif col[1] in [float, decimal.Decimal]:
            datatypes.append((col[0], 'f4'))
        elif col[1] == datetime.datetime:
            datatypes.append((col[0], 'O4'))
        elif col[1] == int:
            datatypes.append((col[0], 'i4'))

    data = []
    for row in cur:
        data.append(tuple(row))

    array = np.array(data, dtype=datatypes)
    if dataframe:
        output = pandas.DataFrame.from_records(array)

        if index is not None:
            output = output.set_index(index)

    else:
        output = array

    return output

Question 35

MySQL Connector

For those that works with the mysql connector you can use this code as a start. (Thanks to @Daniel Velkov)

Used refs:

import pandas as pd
import mysql.connector

# Setup MySQL connection
db = mysql.connector.connect(
    host="<IP>",              # your host, usually localhost
    user="<USER>",            # your username
    password="<PASS>",        # your password
    database="<DATABASE>"     # name of the data base
)   

# You must create a Cursor object. It will let you execute all the queries you need
cur = db.cursor()

# Use all the SQL you like
cur.execute("SELECT * FROM <TABLE>")

# Put it all to a data frame
sql_data = pd.DataFrame(cur.fetchall())
sql_data.columns = cur.column_names

# Close the session
db.close()

# Show the data
print(sql_data.head())

Question 36

Here’s the code I use. Hope this helps.

import pandas as pd
from sqlalchemy import create_engine

def getData():
  # Parameters
  ServerName = "my_server"
  Database = "my_db"
  UserPwd = "user:pwd"
  Driver = "driver=SQL Server Native Client 11.0"

  # Create the connection
  engine = create_engine('mssql+pyodbc://' + UserPwd + '@' + ServerName + '/' + Database + "?" + Driver)

  sql = "select * from mytable"
  df = pd.read_sql(sql, engine)
  return df

df2 = getData()
print(df2)

Question 37

This is a short and crisp answer to your problem:

from __future__ import print_function
import MySQLdb
import numpy as np
import pandas as pd
import xlrd

# Connecting to MySQL Database
connection = MySQLdb.connect(
             host="hostname",
             port=0000,
             user="userID",
             passwd="password",
             db="table_documents",
             charset='utf8'
           )
print(connection)
#getting data from database into a dataframe
sql_for_df = 'select * from tabledata'
df_from_database = pd.read_sql(sql_for_df , connection)

Question 38

1. Using MySQL-connector-python

# pip install mysql-connector-python

import mysql.connector
import pandas as pd

mydb = mysql.connector.connect(
    host = 'host',
    user = 'username',
    passwd = 'pass',
    database = 'db_name'
)
query = 'select * from table_name'
df = pd.read_sql(query, con = mydb)
print(df)

2. Using SQLAlchemy

# pip install pymysql
# pip install sqlalchemy

import pandas as pd
import sqlalchemy

engine = sqlalchemy.create_engine('mysql+pymysql://username:password@localhost:3306/db_name')

query = '''
select * from table_name
'''
df = pd.read_sql_query(query, engine)
print(df)

Question 39

Like Nathan, I often want to dump the results of a sqlalchemy or sqlsoup Query into a Pandas data frame. My own solution for this is:

query = session.query(tbl.Field1, tbl.Field2)
DataFrame(query.all(), columns=[column['name'] for column in query.column_descriptions])

Question 40

resoverall is a sqlalchemy ResultProxy object. You can read more about it in the sqlalchemy docs, the latter explains basic usage of working with Engines and Connections. Important here is that resoverall is dict like.

Pandas likes dict like objects to create its data structures, see the online docs

Good luck with sqlalchemy and pandas.

Question 41

Simply use pandas and pyodbc together. You’ll have to modify your connection string (connstr) according to your database specifications.

import pyodbc
import pandas as pd

# MSSQL Connection String Example
connstr = "Server=myServerAddress;Database=myDB;User Id=myUsername;Password=myPass;"

# Query Database and Create DataFrame Using Results
df = pd.read_sql("select * from myTable", pyodbc.connect(connstr))

I’ve used pyodbc with several enterprise databases (e.g. SQL Server, MySQL, MariaDB, IBM).

Question 42

This question is old, but I wanted to add my two-cents. I read the question as ” I want to run a query to my [my]SQL database and store the returned data as Pandas data structure [DataFrame].”

From the code it looks like you mean mysql database and assume you mean pandas DataFrame.

import MySQLdb as mdb
import pandas.io.sql as sql
from pandas import *

conn = mdb.connect('<server>','<user>','<pass>','<db>');
df = sql.read_frame('<query>', conn)

For example,

conn = mdb.connect('localhost','myname','mypass','testdb');
df = sql.read_frame('select * from testTable', conn)

This will import all rows of testTable into a DataFrame.

Question 43

Here is mine. Just in case if you are using “pymysql”:

import pymysql
from pandas import DataFrame

host   = 'localhost'
port   = 3306
user   = 'yourUserName'
passwd = 'yourPassword'
db     = 'yourDatabase'

cnx    = pymysql.connect(host=host, port=port, user=user, passwd=passwd, db=db)
cur    = cnx.cursor()

query  = """ SELECT * FROM yourTable LIMIT 10"""
cur.execute(query)

field_names = [i[0] for i in cur.description]
get_data = [xx for xx in cur]

cur.close()
cnx.close()

df = DataFrame(get_data)
df.columns = field_names

Question 44

pandas.io.sql.write_frame is DEPRECATED. https://pandas.pydata.org/pandas-docs/version/0.15.2/generated/pandas.io.sql.write_frame.html

Should change to use pandas.DataFrame.to_sql https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html

There is another solution. PYODBC to Pandas – DataFrame not working – Shape of passed values is (x,y), indices imply (w,z)

As of Pandas 0.12 (I believe) you can do:

import pandas
import pyodbc

sql = 'select * from table'
cnn = pyodbc.connect(...)

data = pandas.read_sql(sql, cnn)

Prior to 0.12, you could do:

import pandas
from pandas.io.sql import read_frame
import pyodbc

sql = 'select * from table'
cnn = pyodbc.connect(...)

data = read_frame(sql, cnn)

Question 45

Long time from last post but maybe it helps someone…

Shorted way than Paul H:

my_dic = session.query(query.all())
my_df = pandas.DataFrame.from_dict(my_dic)

Question 46

best way I do this

db.execute(query) where db=db_class() #database class
    mydata=[x for x in db.fetchall()]
    df=pd.DataFrame(data=mydata)

Question 47

If the result type is ResultSet, you should convert it to dictionary first. Then the DataFrame columns will be collected automatically.

This works on my case:

df = pd.DataFrame([dict(r) for r in resoverall])

Question 48

I have a dict and would like to remove all the keys for which there are empty value strings.

metadata = {u'Composite:PreviewImage': u'(Binary data 101973 bytes)',
            u'EXIF:CFAPattern2': u''}

What is the best way to do this?

Question 49

Python 2.X

dict((k, v) for k, v in metadata.iteritems() if v)

Python 2.7 – 3.X

{k: v for k, v in metadata.items() if v is not None}

Note that all of your keys have values. It’s just that some of those values are the empty string. There’s no such thing as a key in a dict without a value; if it didn’t have a value, it wouldn’t be in the dict.

Question 50

It can get even shorter than BrenBarn’s solution (and more readable I think)

{k: v for k, v in metadata.items() if v}

Tested with Python 2.7.3.

Question 51

If you really need to modify the original dictionary:

empty_keys = [k for k,v in metadata.iteritems() if not v]
for k in empty_keys:
    del metadata[k]

Note that we have to make a list of the empty keys because we can’t modify a dictionary while iterating through it (as you may have noticed). This is less expensive (memory-wise) than creating a brand-new dictionary, though, unless there are a lot of entries with empty values.

Question 52

BrenBarn’s solution is ideal (and pythonic, I might add). Here is another (fp) solution, however:

from operator import itemgetter
dict(filter(itemgetter(1), metadata.items()))

Question 53

If you want a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles, I recommend looking at the remap utility from the boltons utility package.

After pip install boltons or copying iterutils.py into your project, just do:

from boltons.iterutils import remap

drop_falsey = lambda path, key, value: bool(value)
clean = remap(metadata, visit=drop_falsey)

This page has many more examples, including ones working with much larger objects from Github’s API.

It’s pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn’t handle, you can bug me to fix it right here.

Question 54

Based on Ryan’s solution, if you also have lists and nested dictionaries:

For Python 2:

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.iteritems() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d

For Python 3:

def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.items() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d

Question 55

If you have a nested dictionary, and you want this to work even for empty sub-elements, you can use a recursive variant of BrenBarn’s suggestion:

def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d

Question 56

Quick Answer (TL;DR)

Example01

### example01 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",                        
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(vdata) ])
print newdict

### result01 -------------------
result01 ='''
{'foxy': 'False', 'charlie': 'three', 'bravo': '0'}
'''

Detailed Answer

Problem

Context: Python 2.x
Scenario: Developer wishes modify a dictionary to exclude blank values
- aka remove empty values from a dictionary
- aka delete keys with blank values
- aka filter dictionary for non-blank values over each key-value pair

Solution

example01 use python list-comprehension syntax with simple conditional to remove “empty” values

Pitfalls

example01 only operates on a copy of the original dictionary (does not modify in place)
example01 may produce unexpected results depending on what developer means by “empty”
- Does developer mean to keep values that are falsy?
- If the values in the dictionary are not gauranteed to be strings, developer may have unexpected data loss.
- result01 shows that only three key-value pairs were preserved from the original set

Alternate example

example02 helps deal with potential pitfalls
The approach is to use a more precise definition of “empty” by changing the conditional.
Here we only want to filter out values that evaluate to blank strings.
Here we also use .strip() to filter out values that consist of only whitespace.

Example02

### example02 -------------------

mydict  =   { "alpha":0,
              "bravo":"0",
              "charlie":"three",
              "delta":[],
              "echo":False,
              "foxy":"False",
              "golf":"",
              "hotel":"   ",
            }
newdict =   dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(str(vdata).strip()) ])
print newdict

### result02 -------------------
result02 ='''
{'alpha': 0,
  'bravo': '0', 
  'charlie': 'three', 
  'delta': [],
  'echo': False,
  'foxy': 'False'
  }
'''

Dicts mixed with Arrays

The answer at Attempt 3: Just Right (so far) from BlissRage’s answer does not properly handle arrays elements. I’m including a patch in case anyone needs it. The method is handles list with the statement block of if isinstance(v, list):, which scrubs the list using the original scrub_dict(d) implementation.

    @staticmethod
    def scrub_dict(d):
        new_dict = {}
        for k, v in d.items():
            if isinstance(v, dict):
                v = scrub_dict(v)
            if isinstance(v, list):
                v = scrub_list(v)
            if not v in (u'', None, {}):
                new_dict[k] = v
        return new_dict

    @staticmethod
    def scrub_list(d):
        scrubbed_list = []
        for i in d:
            if isinstance(i, dict):
                i = scrub_dict(i)
            scrubbed_list.append(i)
        return scrubbed_list

Question 61

An alternative way you can do this, is using dictionary comprehension. This should be compatible with 2.7+

result = {
    key: value for key, value in
    {"foo": "bar", "lorem": None}.items()
    if value
}

Question 62

Here is an option if you are using pandas:

import pandas as pd

d = dict.fromkeys(['a', 'b', 'c', 'd'])
d['b'] = 'not null'
d['c'] = ''  # empty string

print(d)

# convert `dict` to `Series` and replace any blank strings with `None`;
# use the `.dropna()` method and
# then convert back to a `dict`
d_ = pd.Series(d).replace('', None).dropna().to_dict()

print(d_)

Question 63

Some of Methods mentioned above ignores if there are any integers and float with values 0 & 0.0

If someone wants to avoid the above can use below code(removes empty strings and None values from nested dictionary and nested list):

def remove_empty_from_dict(d):
    if type(d) is dict:
        _temp = {}
        for k,v in d.items():
            if v == None or v == "":
                pass
            elif type(v) is int or type(v) is float:
                _temp[k] = remove_empty_from_dict(v)
            elif (v or remove_empty_from_dict(v)):
                _temp[k] = remove_empty_from_dict(v)
        return _temp
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if( (str(v).strip() or str(remove_empty_from_dict(v)).strip()) and (v != None or remove_empty_from_dict(v) != None))]
    else:
        return d

Question 64

“As I also currently write a desktop application for my work with Python, I found in data-entry application when there is lots of entry and which some are not mandatory thus user can left it blank, for validation purpose, it is easy to grab all entries and then discard empty key or value of a dictionary. So my code above a show how we can easy take them out, using dictionary comprehension and keep dictionary value element which is not blank. I use Python 3.8.3

data = {'':'', '20':'', '50':'', '100':'1.1', '200':'1.2'}

dic = {key:value for key,value in data.items() if value != ''}

print(dic)

{'100': '1.1', '200': '1.2'}

问题：从字符串中删除前x个字符？

回答 0

回答 1

回答 2

回答 3

回答 4

问题：“无法解包的值太多”异常

回答 0

回答 1

回答 2

回答 3

回答 4

问题：Python函数作为函数参数吗？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

输出量

Output

问题：如何逐行描述Python代码？

回答 0

回答 1

回答 2

回答 3

安装：

用法：

Installation:

Usage:

回答 4

问题：如何将SQL查询结果转换为PANDAS数据结构？

回答 0

回答 1

回答 2

回答 3

编辑2014-09-30：

原始答案：

Edit 2014-09-30:

Original answer:

回答 4

MySQL连接器

MySQL Connector

回答 5

回答 6

回答 7

1.使用MySQL-connector-python

2.使用SQLAlchemy

1. Using MySQL-connector-python

2. Using SQLAlchemy

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

回答 16

问题：从字典中删除带有空字符串的键的有效方法

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

快速解答（TL; DR）

范例01

详细答案

问题

解

陷阱

替代示例

示例02