Python 实用宝典

Question 1

How to retrieve inserted id after inserting row in SQLite using Python? I have table like this:

id INT AUTOINCREMENT PRIMARY KEY,
username VARCHAR(50),
password VARCHAR(50)

I insert a new row with example data username="test" and password="test". How do I retrieve the generated id in a transaction safe way? This is for a website solution, where two people may be inserting data at the same time. I know I can get the last read row, but I don’t think that is transaction safe. Can somebody give me some advice?

Question 2

You could use cursor.lastrowid (see “Optional DB API Extensions”):

connection=sqlite3.connect(':memory:')
cursor=connection.cursor()
cursor.execute('''CREATE TABLE foo (id integer primary key autoincrement ,
                                    username varchar(50),
                                    password varchar(50))''')
cursor.execute('INSERT INTO foo (username,password) VALUES (?,?)',
               ('test','test'))
print(cursor.lastrowid)
# 1

If two people are inserting at the same time, as long as they are using different cursors, cursor.lastrowid will return the id for the last row that cursor inserted:

cursor.execute('INSERT INTO foo (username,password) VALUES (?,?)',
               ('blah','blah'))

cursor2=connection.cursor()
cursor2.execute('INSERT INTO foo (username,password) VALUES (?,?)',
               ('blah','blah'))

print(cursor2.lastrowid)        
# 3
print(cursor.lastrowid)
# 2

cursor.execute('INSERT INTO foo (id,username,password) VALUES (?,?,?)',
               (100,'blah','blah'))
print(cursor.lastrowid)
# 100

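Note that executemany does not give you a usable lastrowid when you insert more than one row at a time. Older Python versions reset it to None; recent Python 3 releases leave the previous value unchanged:

cursor.executemany('INSERT INTO foo (username,password) VALUES (?,?)',
               (('baz','bar'),('bing','bop')))
print(cursor.lastrowid)
# None on older Pythons; on recent Python 3 the previous value (100 here) is kept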
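
For the website scenario in the question, one way to package this is a small helper that creates its own cursor for each insert and returns that cursor's lastrowid. This is only a sketch: the function name insert_user is made up for illustration, and the table mirrors the foo table used above.

```python
import sqlite3

def insert_user(connection, username, password):
    """Insert one row and return the id SQLite generated for it."""
    # A fresh cursor per call, so lastrowid reflects only this insert.
    cursor = connection.cursor()
    cursor.execute(
        "INSERT INTO foo (username, password) VALUES (?, ?)",
        (username, password),
    )
    connection.commit()
    return cursor.lastrowid

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " username VARCHAR(50), password VARCHAR(50))"
)
first_id = insert_user(connection, "test", "test")
second_id = insert_user(connection, "blah", "blah")
print(first_id, second_id)  # 1 2
```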

Question 3

I am trying to run a Django app on my VPS running Debian 5. When I run a demo app, it comes back with this error:

  File "/usr/local/lib/python2.5/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)

  File "/usr/local/lib/python2.5/site-packages/django/db/backends/sqlite3/base.py", line 30, in <module>
    raise ImproperlyConfigured, "Error loading %s: %s" % (module, exc)

ImproperlyConfigured: Error loading either pysqlite2 or sqlite3 modules (tried in that order): No module named _sqlite3

Looking at the Python install, it gives the same error:

Python 2.5.2 (r252:60911, May 12 2009, 07:46:31) 
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/sqlite3/__init__.py", line 24, in <module>
    from dbapi2 import *
  File "/usr/local/lib/python2.5/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ImportError: No module named _sqlite3
>>>

Reading on the web, I learn that Python 2.5 should come with all the necessary SQLite wrappers included. Do I need to reinstall Python, or is there another way to get this module up and running?

Question 4

It seems your makefile didn’t include the appropriate .so file. You can correct this problem with the steps below:

Install sqlite-devel (or libsqlite3-dev on some Debian-based systems)
Reconfigure and recompile Python with ./configure --enable-loadable-sqlite-extensions && make && sudo make install

Note

The sudo make install part will set that Python version to be the system-wide standard, which can have unforeseen consequences. If you run this command on your workstation, you’ll probably want to have it installed alongside the existing Python, which can be done with sudo make altinstall.

Question 5

I had the same problem (building python2.5 from source on Ubuntu Lucid), and import sqlite3 threw this same exception. I’ve installed libsqlite3-dev from the package manager, recompiled python2.5, and then the import worked.

Question 6

I had the same problem with Python 3.5 on Ubuntu while using pyenv.

If you’re installing the python using pyenv, it’s listed as one of the common build problems. To solve it, remove the installed python version, install the requirements (for this particular case libsqlite3-dev), then reinstall the python version.

Question 7

This is what I did to get it to work.

I am using pythonbrew (which uses pip) with Python 2.7.5 installed.

I first did what Zubair (above) said and ran this command:

sudo apt-get install libsqlite3-dev

Then I ran this command:

pip install pysqlite

This fixed the database problem and I got confirmation of this when I ran:

python manage.py syncdb

Question 8

Install the sqlite-devel package:

yum install sqlite-devel -y
Recompile python from the source:
```
./configure
make
make altinstall
```

Question 9

My _sqlite3.so is in /usr/lib/python2.5/lib-dynload/_sqlite3.so. Judging from your paths, you should have the file /usr/local/lib/python2.5/lib-dynload/_sqlite3.so.

Try the following:

find /usr/local -name _sqlite3.so

If the file isn’t found, something may be wrong with your Python installation. If it is, make sure the path it’s installed to is in the Python path. In the Python shell,

import sys
print sys.path

In my case, /usr/lib/python2.5/lib-dynload is in the list, so it’s able to find /usr/lib/python2.5/lib-dynload/_sqlite3.so.
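
On Python 3 the same check can be scripted. The sketch below asks the import system whether it can locate the _sqlite3 extension and lists any lib-dynload directories on sys.path. The variable names are arbitrary, and the assumption that compiled standard-library extensions live in a directory named lib-dynload holds for typical CPython builds on Linux but not necessarily everywhere.

```python
import importlib.util
import sys

# Ask the import system where (or whether) it can find the C extension.
spec = importlib.util.find_spec("_sqlite3")

if spec is None:
    print("_sqlite3 was not found; directories searched:")
    for entry in sys.path:
        print("   ", entry)
else:
    print("_sqlite3 found at:", spec.origin)

# lib-dynload is where CPython usually installs compiled stdlib extensions.
dynload_dirs = [p for p in sys.path if p.endswith("lib-dynload")]
print("lib-dynload entries on sys.path:", dynload_dirs)
```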

Question 10

I recently tried installing Python 2.6.7 on my Ubuntu 11.04 desktop for some dev work and came across problems similar to the ones in this thread. I managed to fix it by:

Adjusting the setup.py file to include the correct sqlite dev path. Code snippet from setup.py:

if sqlite_incdir:
    sqlite_dirs_to_check = [
        os.path.join(sqlite_incdir, '..', 'lib64'),
        os.path.join(sqlite_incdir, '..', 'lib'),
        os.path.join(sqlite_incdir, '..', '..', 'lib64'),
        os.path.join(sqlite_incdir, '..', '..', 'lib'),
        '/usr/lib/x86_64-linux-gnu/'
    ]

With the bit that I added being ‘/usr/lib/x86_64-linux-gnu/’.

After running make I did not get any warnings saying the sqlite support was not built (i.e., it built correctly :P ), but after running make install, import sqlite3 still failed with the same "ImportError: No module named _sqlite3".

So, the library was compiled, but not moved to the correct installation path, so I copied the .so file (cp /usr/src/python/Python-2.6.7/build/lib.linux-x86_64-2.6/_sqlite3.so /usr/local/python-2.6.7/lib/python2.6/sqlite3/ — these are my build paths, you will probably need to adjust them to your setup).

Voila! SQLite3 support now works.

Question 11

Many people seem to hit this problem because they have multiple Python versions installed. On my own VPS (CentOS 7 x64), I solved it this way:

Find the file “_sqlite3.so”
```
find / -name _sqlite3.so
```
out: /usr/lib64/python2.7/lib-dynload/_sqlite3.so
Find the dir of python Standard library you want to use,

for me /usr/local/lib/python3.6/lib-dynload

Copy the file:

cp   /usr/lib64/python2.7/lib-dynload/_sqlite3.so /usr/local/lib/python3.6/lib-dynload

Finally, everything will be ok.

Question 12

This worked for me in Redhat Centos 6.5:

yum install sqlite-devel
pip install pysqlite

Question 13

My Python (version 3.7.4) is built from source; the cause was missing options when running configure:

./configure --enable-loadable-sqlite-extensions --enable-optimizations
make
make install

fixed

Question 14

I have the problem in FreeBSD 8.1:

- No module named _sqlite3 -

It is solved by installing the port:

/usr/ports/databases/py-sqlite3

After this one can see:

>>> import sqlite3
>>> sqlite3.apilevel
'2.0'

Question 15

Is the python-pysqlite2 package installed?

sudo apt-get install python-pysqlite2

Question 16

Check your settings.py file. Did you perhaps write “sqlite” instead of “sqlite3” for the database engine?
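
For reference, here is roughly what the relevant fragment looks like in current Django releases, where the backend is named by its full dotted path. The file name db.sqlite3 is only a placeholder. Older Django versions contemporary with the question used a flat DATABASE_ENGINE = 'sqlite3' setting instead, so treat this as a sketch rather than the asker's actual configuration.

```python
# settings.py (fragment)
DATABASES = {
    "default": {
        # Note the trailing "3": the backend is called sqlite3, not sqlite.
        "ENGINE": "django.db.backends.sqlite3",
        # Placeholder path to the database file.
        "NAME": "db.sqlite3",
    }
}
```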

Question 17

sqlite3 ships with Python. I also had the same problem, I just uninstalled python3.6 and installed it again.

Uninstall existing python:

sudo apt-get remove --purge python3.6

Install python3.6:

sudo apt install build-essential checkinstall
sudo apt install libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev
wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0.tar.xz
tar xvf Python-3.6.0.tar.xz
cd Python-3.6.0/
./configure
sudo make altinstall

Question 18

You are probably on CentOS or Red Hat and compiled Python yourself. This is a bug in Python's build; run the following in your Python source directory:

curl -sk https://gist.github.com/msabramo/2727063/raw/59ea097a1f4c6f114c32f7743308a061698b17fd/gistfile1.diff | patch -p1

Question 19

I had the same problem, and none of the answers above worked for me. I fixed it by

removing python.pip and sqlite3 and reinstalling them:

sudo apt-get remove python.pip
sudo apt-get remove sqlite3

now install it again

sudo apt-get install python.pip
sudo apt-get install sqlite3

In my case, reinstalling sqlite3 showed an error at first, so I typed

sqlite3

in the terminal to check whether it had been removed, and it started unpacking it.

Once sqlite3 is installed, open a terminal and run

sqlite3
database.db (to create a database)

I hope this helps.

Question 20

Putting answer for anyone who lands on this page searching for a solution for Windows OS:

You have to install pysqlite3 or db-sqlite3 if it is not already installed. You can use the following commands:

pip install pysqlite3
pip install db-sqlite3

For me the issue was with DLL file of sqlite3.

Solution:

I took DLL file from sqlite site. This might vary based on your version of python installation.
I pasted it in the DLL directory of the env. for me it was “C:\Anaconda\Lib\DLLs”, but check for yours.

Question 21

I was disappointed that this issue still exists today. I was recently trying to install vCD CLI on CentOS 8.1 and was greeted with the same error when I tried to run it. This is how I resolved it in my case:

Install SQLite3 from scratch with the proper prefix
Run make clean on my Python installation
Run make install to reinstall Python

I was doing this while writing a different blog post about how to install vCD CLI and VMware Container Service Extension, so I ended up capturing the steps I used to fix the issue in a separate blog post at:

http://www.virtualizationteam.com/cloud/running-vcd-cli-fail-with-the-following-error-modulenotfounderror-no-module-named-_sqlite3.html

I hope this is helpful. While the tips above helped me get to a solution, I had to combine a few of them and modify them a bit.

Question 22

Download sqlite3:

wget http://www.sqlite.org/2016/sqlite-autoconf-3150000.tar.gz

Follow these steps to install:

$ tar xvfz sqlite-autoconf-3150000.tar.gz
$ cd sqlite-autoconf-3150000
$ ./configure --prefix=/usr/local
$ make install

Question 23

You need to install pysqlite in your python environment:

    $ pip install pysqlite

Question 24

Try copying _sqlite3.so so that Python can find it.

It should be as simple as:

cp /usr/lib64/python2.6/lib-dynload/_sqlite3.so /usr/local/lib/python2.7/

Trust me, try it.

Question 25

I’m completely new to Python’s sqlite3 module (and SQL in general for that matter), and this just completely stumps me. The abundant lack of descriptions of cursor objects (rather, their necessity) also seems odd.

This snippet of code is the preferred way of doing things:

import sqlite3
conn = sqlite3.connect("db.sqlite")
c = conn.cursor()
c.execute("insert into users values ('Jack Bauer', '555-555-5555')")
conn.commit()
c.close()

This one isn’t, even though it works just as well and without the (seemingly pointless) cursor:

import sqlite3
conn = sqlite3.connect("db.sqlite")
conn.execute("insert into users values ('Jack Bauer', '555-555-5555')")
conn.commit()

Can anyone tell me why I need a cursor?
It just seems like pointless overhead. For every method in my script that accesses a database, I’m supposed to create and destroy a cursor?
Why not just use the connection object?

Question 26

Just a misapplied abstraction it seems to me. A db cursor is an abstraction, meant for data set traversal.

From Wikipedia article on subject:

In computer science and technology, a database cursor is a control structure that enables traversal over the records in a database. Cursors facilitate subsequent processing in conjunction with the traversal, such as retrieval, addition and removal of database records. The database cursor characteristic of traversal makes cursors akin to the programming language concept of iterator.

And:

Cursors can not only be used to fetch data from the DBMS into an application but also to identify a row in a table to be updated or deleted. The SQL:2003 standard defines positioned update and positioned delete SQL statements for that purpose. Such statements do not use a regular WHERE clause with predicates. Instead, a cursor identifies the row. The cursor must be opened and already positioned on a row by means of FETCH statement.

If you check the docs on Python sqlite module, you can see that a python module cursor is needed even for a CREATE TABLE statement, so it’s used for cases where a mere connection object should suffice – as correctly pointed out by the OP. Such abstraction is different from what people understand a db cursor to be and hence, the confusion/frustration on the part of users. Regardless of efficiency, it’s just a conceptual overhead. Would be nice if it was pointed out in the docs that the python module cursor is bit different than what a cursor is in SQL and databases.

Question 27

You need a cursor object to fetch results. Your example works because it’s an INSERT and thus you aren’t trying to get any rows back from it, but if you look at the sqlite3 docs, you’ll notice that there aren’t any .fetchXXXX methods on connection objects, so if you tried to do a SELECT without a cursor, you’d have no way to get the resulting data.

Cursor objects allow you to keep track of which result set is which, since it’s possible to run multiple queries before you’re done fetching the results of the first.
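
A small sketch of that point: two cursors on one connection, each holding its own result set and its own position in it. The table and variable names are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (n INTEGER)")
conn.executemany("INSERT INTO nums (n) VALUES (?)", [(1,), (2,), (3,)])

# Two independent queries are open at the same time.
ascending = conn.cursor()
descending = conn.cursor()
ascending.execute("SELECT n FROM nums ORDER BY n ASC")
descending.execute("SELECT n FROM nums ORDER BY n DESC")

# Fetch alternately; each cursor remembers where it is in its own results.
pairs = []
for _ in range(3):
    pairs.append((ascending.fetchone()[0], descending.fetchone()[0]))

print(pairs)  # [(1, 3), (2, 2), (3, 1)]
```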

Question 28

According to the official docs connection.execute() is a nonstandard shortcut that creates an intermediate cursor object:

Connection.execute
This is a nonstandard shortcut that creates a cursor object by calling the cursor() method, calls the cursor’s execute() method with the parameters given, and returns the cursor.

Question 29

12.6.8. Using sqlite3 efficiently

12.6.8.1. Using shortcut methods

Using the nonstandard execute(), executemany() and executescript() methods of the Connection object, your code can be written more concisely because you don’t have to create the (often superfluous) Cursor objects explicitly. Instead, the Cursor objects are created implicitly and these shortcut methods return the cursor objects. This way, you can execute a SELECT statement and iterate over it directly using only a single call on the Connection object.

(sqlite3 documentation; emphasis mine.)

Why not just use the connection object?

Because those methods of the connection object are nonstandard, i.e. they are not part of Python Database API Specification v2.0 (PEP 249).

As long as you use the standard methods of the Cursor object, you can be sure that if you switch to another database implementation that follows the above specification, your code will be fully portable. Perhaps you will only need to change the import line.

But if you use the connection.execute there is a chance that switching won’t be that straightforward. That’s the main reason you might want to use cursor.execute instead.

However if you are certain that you’re not going to switch, I’d say it’s completely OK to take the connection.execute shortcut and be “efficient”.
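
To make the trade-off concrete, here is a sketch with a hypothetical helper, fetch_usernames, that touches only calls defined by PEP 249 (cursor(), execute(), fetchall(), close()), next to setup code that uses the sqlite3-only shortcut. The query has no parameters on purpose, since placeholder style is one of the things that can differ between drivers.

```python
import sqlite3

def fetch_usernames(connection):
    """Use only PEP 249 calls, so any compliant driver's connection would do."""
    cursor = connection.cursor()
    try:
        cursor.execute("SELECT username FROM users ORDER BY username")
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()

conn = sqlite3.connect(":memory:")
# sqlite3-specific shortcut: execute() on the connection returns the
# cursor it created implicitly.
shortcut_cursor = conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("bob",), ("alice",)])

usernames = fetch_usernames(conn)
print(usernames)  # ['alice', 'bob']
```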

Question 30

It gives us the ability to have multiple separate working environments through the same connection to the database.

Question 31

db = sqlite.connect("test.sqlite")
res = db.execute("select * from table")

With iteration I get lists corresponding to the rows.

for row in res:
    print row

I can get name of the columns

col_name_list = [tuple[0] for tuple in res.description]

But is there some function or setting to get dictionaries instead of list?

{'col1': 'value', 'col2': 'value'}

Or do I have to do it myself?

Question 32

You could use row_factory, as in the example in the docs:

import sqlite3

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

con = sqlite3.connect(":memory:")
con.row_factory = dict_factory
cur = con.cursor()
cur.execute("select 1 as a")
print(cur.fetchone()["a"])

or follow the advice that’s given right after this example in the docs:

If returning a tuple doesn’t suffice and you want name-based access to columns, you should consider setting row_factory to the highly-optimized sqlite3.Row type. Row provides both index-based and case-insensitive name-based access to columns with almost no memory overhead. It will probably be better than your own custom dictionary-based approach or even a db_row based solution.
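
A brief sketch of what sqlite3.Row offers, using a throwaway in-memory database and made-up column aliases:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row
row = con.execute("SELECT 1 AS a, 'x' AS b").fetchone()

by_index = row[0]     # positional access still works
by_name = row["a"]    # access by column name
by_upper = row["A"]   # name lookup is case-insensitive
names = row.keys()    # column names, in query order
as_dict = dict(row)   # explicit conversion when a real dict is needed

print(by_index, by_name, by_upper, names, as_dict)
```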

Question 33

I thought I would answer this question even though the answer is partly mentioned in both Adam Schmideg’s and Alex Martelli’s answers, so that others like me who have the same question can find the answer easily.

conn = sqlite3.connect(":memory:")

#This is the important part, here we are setting row_factory property of
#connection object to sqlite3.Row(sqlite3.Row is an implementation of
#row_factory)
conn.row_factory = sqlite3.Row
c = conn.cursor()
c.execute('select * from stocks')

result = c.fetchall()
#returns a list of sqlite3.Row objects; each item in the list represents
#a row of the table and supports dictionary-style access by column name

Question 34

Even using the sqlite3.Row class– you still can’t use string formatting in the form of:

print "%(id)i - %(name)s: %(value)s" % row

In order to get past this, I use a helper function that takes the row and converts to a dictionary. I only use this when the dictionary object is preferable to the Row object (e.g. for things like string formatting where the Row object doesn’t natively support the dictionary API as well). But use the Row object all other times.

def dict_from_row(row):
    return dict(zip(row.keys(), row))
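
A Python 3 sketch of the helper in use, with the same format string as above. The row is produced by a literal SELECT so the example stays self-contained; the column values are invented.

```python
import sqlite3

def dict_from_row(row):
    # Pair each column name with the value at the same position.
    return dict(zip(row.keys(), row))

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
row = conn.execute(
    "SELECT 7 AS id, 'widget' AS name, 'blue' AS value"
).fetchone()

line = "%(id)i - %(name)s: %(value)s" % dict_from_row(row)
print(line)  # 7 - widget: blue
```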

Question 35

After you connect to SQLite: con = sqlite3.connect(.....) it is sufficient to just run:

con.row_factory = sqlite3.Row

Voila!

Question 36

From PEP 249:

Question: 

   How can I construct a dictionary out of the tuples returned by
   .fetch*():

Answer:

   There are several existing tools available which provide
   helpers for this task. Most of them use the approach of using
   the column names defined in the cursor attribute .description
   as basis for the keys in the row dictionary.

   Note that the reason for not extending the DB API specification
   to also support dictionary return values for the .fetch*()
   methods is that this approach has several drawbacks:

   * Some databases don't support case-sensitive column names or
     auto-convert them to all lowercase or all uppercase
     characters.

   * Columns in the result set which are generated by the query
     (e.g.  using SQL functions) don't map to table column names
     and databases usually generate names for these columns in a
     very database specific way.

   As a result, accessing the columns through dictionary keys
   varies between databases and makes writing portable code
   impossible.

So yes, do it yourself.

Question 37

Shorter version:

db.row_factory = lambda c, r: dict([(col[0], r[idx]) for idx, col in enumerate(c.description)])

Question 38

Fastest on my tests:

conn.row_factory = lambda c, r: dict(zip([col[0] for col in c.description], r))
c = conn.cursor()

%timeit c.execute('SELECT * FROM table').fetchall()
19.8 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

vs:

conn.row_factory = lambda c, r: dict([(col[0], r[idx]) for idx, col in enumerate(c.description)])
c = conn.cursor()

%timeit c.execute('SELECT * FROM table').fetchall()
19.4 µs ± 75.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

You decide :)

Question 39

Similar to the solutions above, but more compact:

db.row_factory = lambda C, R: { c[0]: R[i] for i, c in enumerate(C.description) }

Question 40

As mentioned by @gandalf’s answer, one has to use conn.row_factory = sqlite3.Row, but the results are not directly dictionaries. One has to add an additional “cast” to dict in the last loop:

import sqlite3
conn = sqlite3.connect(":memory:")
conn.execute('create table t (a text, b text, c text)')
conn.execute('insert into t values ("aaa", "bbb", "ccc")')
conn.execute('insert into t values ("AAA", "BBB", "CCC")')
conn.row_factory = sqlite3.Row
c = conn.cursor()
c.execute('select * from t')
for r in c.fetchall():
    print(dict(r))

# {'a': 'aaa', 'b': 'bbb', 'c': 'ccc'}
# {'a': 'AAA', 'b': 'BBB', 'c': 'CCC'}

Question 41

I think you were on the right track. Let’s keep this very simple and complete what you were trying to do:

import sqlite3
db = sqlite3.connect("test.sqlite3")
cur = db.cursor()
res = cur.execute("select * from table").fetchall()
data = dict(zip([c[0] for c in cur.description], res[0]))

print(data)

The downside is .fetchall(), which is murder on your memory consumption if your table is very large. But for trivial applications dealing with a mere few thousand rows of text and numeric columns, this simple approach is good enough.

For serious stuff, you should look into row factories, as proposed in many other answers.
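
If memory is the concern but a row factory feels like too much, a middle ground is to pull rows in batches with fetchmany and build dictionaries lazily. The generator below is a sketch; the name iter_dict_rows and the batch size are arbitrary choices.

```python
import sqlite3

def iter_dict_rows(cursor, batch_size=1000):
    """Yield one dict per row without loading the whole result set."""
    names = [col[0] for col in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield dict(zip(names, row))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (id INTEGER, label TEXT)")
db.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, "item%d" % i) for i in range(5)],
)

cur = db.cursor()
cur.execute("SELECT id, label FROM items ORDER BY id")
rows = list(iter_dict_rows(cur, batch_size=2))
print(rows[0])  # {'id': 0, 'label': 'item0'}
```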

Question 42

Or you could convert the sqlite3.Rows to a dictionary as follows. This will give a dictionary with a list for each row.

def from_sqlite_Row_to_dict(list_with_rows):
    ''' Turn a list with sqlite3.Row objects into a dictionary'''
    d = {} # the dictionary to be filled with the row data and to be returned

    for i, row in enumerate(list_with_rows): # iterate through the sqlite3.Row objects
        l = [] # for each Row use a separate list
        for col in range(0, len(row)): # copy over the row data (i.e. column data) to a list
            l.append(row[col])
        d[i] = l # add the list to the dictionary
    return d

Question 43

A generic alternative, using just three lines

def select_column_and_value(db, sql, parameters=()):
    execute = db.execute(sql, parameters)
    fetch = execute.fetchone()
    return {k[0]: v for k, v in list(zip(execute.description, fetch))}

con = sqlite3.connect('/mydatabase.db')
c = con.cursor()
print(select_column_and_value(c, 'SELECT * FROM things WHERE id=?', (id,)))

But if your query returns nothing, this will raise an error. In that case…

def select_column_and_value(self, sql, parameters=()):
    execute = self.execute(sql, parameters)
    fetch = execute.fetchone()

    if fetch is None:
        return {k[0]: None for k in execute.description}

    return {k[0]: v for k, v in list(zip(execute.description, fetch))}

or

def select_column_and_value(self, sql, parameters=()):
    execute = self.execute(sql, parameters)
    fetch = execute.fetchone()

    if fetch is None:
        return {}

    return {k[0]: v for k, v in list(zip(execute.description, fetch))}

Question 44

import sqlite3

db = sqlite3.connect('mydatabase.db')
cursor = db.execute('SELECT * FROM students ORDER BY CREATE_AT')
studentList = cursor.fetchall()

columnNames = list(map(lambda x: x[0], cursor.description)) #students table column names list
studentsAssoc = {} #Assoc format is dictionary similarly


#THIS IS ASSOC PROCESS
for lineNumber, student in enumerate(studentList):
    studentsAssoc[lineNumber] = {}

    for columnNumber, value in enumerate(student):
        studentsAssoc[lineNumber][columnNames[columnNumber]] = value


print(studentsAssoc)

The result is definitely correct, but I do not know whether this is the best approach.

Question 45

Dictionaries in python provide arbitrary access to their elements. So any dictionary with “names” although it might be informative on one hand (a.k.a. what are the field names) “un-orders” the fields, which might be unwanted.

Best approach is to get the names in a separate list and then combine them with the results by yourself, if needed.

try:
         mycursor = self.memconn.cursor()
         mycursor.execute('''SELECT * FROM maintbl;''')
         #first get the names, because they will be lost after retrieval of rows
         names = list(map(lambda x: x[0], mycursor.description))
         manyrows = mycursor.fetchall()

         return manyrows, names

Also remember that the names, in all approaches, are the names you provided in the query, not the names in database. Exception is the SELECT * FROM

If your only concern is to get the results using a dictionary, then definitely use the conn.row_factory = sqlite3.Row (already stated in another answer).

Question 46

Can someone tell me how to install the sqlite3 module alongside the most recent version of Python? I am using a Macbook, and on the command line, I tried:

pip install sqlite

but an error pops up.

Question 47

You don’t need to install sqlite3 module. It is included in the standard library (since Python 2.5).

Question 48

I have python 2.7.3 and this solved my problem:

pip install pysqlite

Question 49

For Python version 3:

pip install pysqlite3

Question 50

Normally, it is included. However, as @ngn999 said, if your python has been built from source manually, you’ll have to add it.

Here is an example of a script that will setup an encapsulated version (virtual environment) of Python3 in your user directory with an encapsulated version of sqlite3.

INSTALL_BASE_PATH="$HOME/local"
cd ~
mkdir build
cd build
[ -f Python-3.6.2.tgz ] || wget https://www.python.org/ftp/python/3.6.2/Python-3.6.2.tgz
tar -zxvf Python-3.6.2.tgz

[ -f sqlite-autoconf-3240000.tar.gz ] || wget https://www.sqlite.org/2018/sqlite-autoconf-3240000.tar.gz
tar -zxvf sqlite-autoconf-3240000.tar.gz

cd sqlite-autoconf-3240000
./configure --prefix=${INSTALL_BASE_PATH}
make
make install

cd ../Python-3.6.2
LD_RUN_PATH=${INSTALL_BASE_PATH}/lib configure
LDFLAGS="-L ${INSTALL_BASE_PATH}/lib"
CPPFLAGS="-I ${INSTALL_BASE_PATH}/include"
LD_RUN_PATH=${INSTALL_BASE_PATH}/lib make
./configure --prefix=${INSTALL_BASE_PATH}
make
make install

cd ~
LINE_TO_ADD="export PATH=${INSTALL_BASE_PATH}/bin:\$PATH"
if grep -q -v "${LINE_TO_ADD}" $HOME/.bash_profile; then echo "${LINE_TO_ADD}" >> $HOME/.bash_profile; fi
source $HOME/.bash_profile

Why do this? You might want a modular python environment that you can completely destroy and rebuild without affecting your operating system–for an independent development environment. In this case, the solution is to install sqlite3 modularly too.

Question 51

I have a CSV file and I want to bulk-import it into my sqlite3 database using Python. The command is “.import …..”, but it seems that it does not work like this from Python. Can anyone give me an example of how to do it in sqlite3? I am using Windows, just in case. Thanks

Question 52

import csv, sqlite3

con = sqlite3.connect(":memory:") # change to 'your_filename.db' to use a file on disk
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here

with open('data.csv','r') as fin: # `with` statement available in 2.5+
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin) # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
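
A variation on the same idea, for files too large to hold in a list: executemany accepts any iterable, so a generator expression can feed it rows as they are read. In this sketch an in-memory string stands in for data.csv so that it runs on its own.

```python
import csv
import io
import sqlite3

csv_text = "col1,col2\na,1\nb,2\n"  # stand-in for the contents of data.csv

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (col1, col2)")

reader = csv.DictReader(io.StringIO(csv_text))
# Generator expression: rows are produced one at a time as executemany
# consumes them, instead of being collected into a list first.
rows = ((r["col1"], r["col2"]) for r in reader)
con.executemany("INSERT INTO t (col1, col2) VALUES (?, ?)", rows)
con.commit()

imported = con.execute("SELECT col1, col2 FROM t ORDER BY col1").fetchall()
print(imported)  # [('a', '1'), ('b', '2')]
```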

Question 53

Creating an sqlite connection to a file on disk is left as an exercise for the reader … but there is now a two-liner made possible by the pandas library

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)

Question 54

My 2 cents (more generic):

import csv, sqlite3
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes.keys()]
        if not fieldsLeft: break # We're done
        for field in fieldsLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    if len(fieldsLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)

        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()

    return con

Question 55

The .import command is a feature of the sqlite3 command-line tool. To do it in Python, you should simply load the data using whatever facilities Python has, such as the csv module, and insert the data as usual.

This way, you also have control over what types are inserted, rather than relying on sqlite3’s seemingly undocumented behaviour.
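
As a sketch of what that control can look like, the example below converts each CSV field to the Python type it should have before inserting, then asks SQLite how the values were stored. The parts table, its columns and the sample data are invented for the illustration.

```python
import csv
import io
import sqlite3

csv_text = "name,qty,price\nbolt,10,0.25\nnut,4,0.10\n"  # sample data

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE parts (name TEXT, qty INTEGER, price REAL)")

reader = csv.DictReader(io.StringIO(csv_text))
# Convert explicitly; int() or float() raises ValueError on bad input,
# so malformed rows are caught here rather than stored silently.
con.executemany(
    "INSERT INTO parts (name, qty, price) VALUES (?, ?, ?)",
    ((r["name"], int(r["qty"]), float(r["price"])) for r in reader),
)
con.commit()

types = con.execute(
    "SELECT typeof(name), typeof(qty), typeof(price) FROM parts LIMIT 1"
).fetchone()
print(types)  # ('text', 'integer', 'real')
```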

Question 56

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1]) # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """) # checks to see if table exists and makes a fresh table.

    with open(sys.argv[2], "rb") as f: # CSV file input
        reader = csv.reader(f, delimiter=',') # no header information with delimiter
        for row in reader:
            to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")] # Appends data from CSV file representing and handling of text
            cur.execute("INSERT INTO t (COL1, COL2) VALUES(?, ?);", to_db)
            con.commit()
    con.close() # closes connection to database

if __name__=='__main__':
    main()

Question 57

Many thanks for bernie’s answer! Had to tweak it a bit – here’s what worked for me:

import csv, sqlite3
conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")
reader = csv.reader(open('PC.txt', 'r'), delimiter='|')
for row in reader:
    to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8"), unicode(row[2], "utf8")]
    curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

My text file (PC.txt) looks like this:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3

Question 58

You’re right that .import is the way to go, but that’s a command from the SQLite3.exe shell. A lot of the top answers to this question involve native Python loops, but if your files are large (mine are 10^6 to 10^7 records), you want to avoid reading everything into pandas or using a native Python list comprehension/loop (though I did not time them for comparison).

For large files, I believe the best option is to create the empty table in advance with the sqlite3 module (for example, conn.execute("CREATE TABLE...")), strip the headers from your CSV files, and then use subprocess.run() to execute sqlite’s import statement. Since I believe the last part is the most pertinent, I will start with that.

`subprocess.run()`

import subprocess
from pathlib import Path
db_name = Path('my.db').resolve()
csv_file = Path('file.csv').resolve()
result = subprocess.run(['sqlite3',
                         str(db_name),
                         '-cmd',
                         '.mode csv',
                         '.import '+str(csv_file).replace('\\','\\\\')
                                 +' <table_name>'],
                        capture_output=True)

Explanation
From the command line, the command you’re looking for is sqlite3 my.db -cmd ".mode csv" ".import file.csv table". subprocess.run() runs a command-line process. The argument to subprocess.run() is a sequence of strings, which are interpreted as a command followed by all of its arguments.

sqlite3 my.db opens the database.
The -cmd flag after the database allows you to pass multiple follow-on commands to the sqlite program. In the shell, each command has to be in quotes, but here they just need to be their own element of the sequence.
'.mode csv' does what you’d expect
'.import '+str(csv_file).replace('\\','\\\\')+' <table_name>' is the import command.
Unfortunately, since subprocess passes all follow-ons to -cmd as quoted strings, you need to double up your backslashes if you have a windows directory path.

Stripping Headers

Not really the main point of the question, but here’s what I used. Again, I didn’t want to read the whole files into memory at any point:

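import shutil

with open(csv_file, "r") as source:
    source.readline()  # read and discard the header line
    with open(str(csv_file)+"_nohead", "w") as target:
        shutil.copyfileobj(source, target)  # stream the rest without loading it into memory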
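
The two preparation steps described above (creating the empty table in advance and writing a header-less copy of the CSV) can be sketched together. When the target table already exists, the sqlite3 shell's .import treats every line of the file as data, which is why the header has to go. The helper name prepare_import, the table my_table, and the temporary file names are assumptions made for this illustration, not part of the original answer:

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def prepare_import(db_path, csv_path, create_sql):
    """Create the empty table, then write a header-less copy of the CSV.

    Hypothetical helper: returns the path of the header-less copy.
    """
    conn = sqlite3.connect(str(db_path))
    with conn:  # commits the CREATE TABLE on success
        conn.execute(create_sql)
    conn.close()

    nohead = Path(str(csv_path) + "_nohead")
    with open(csv_path, "r", newline="") as source, \
            open(nohead, "w", newline="") as target:
        source.readline()                   # discard the header line
        shutil.copyfileobj(source, target)  # stream the remainder
    return nohead


# Illustration with a tiny temporary CSV file
workdir = Path(tempfile.mkdtemp())
csv_path = workdir / "file.csv"
csv_path.write_text("a,b\n1,2\n3,4\n")

nohead = prepare_import(
    workdir / "my.db",
    csv_path,
    "CREATE TABLE IF NOT EXISTS my_table (a TEXT, b TEXT)",
)
print(nohead.read_text())
```

The header-less file is what would then be passed to the .import command shown earlier.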

Question 59

Based on Guy L’s solution (love it), but this version can handle escaped fields.

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft: break # We're done
        for field in fieldsLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Re-check after the loop so that types found in the last row are counted
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if len(fieldsLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)

        reader = csv.reader(escapingGenerator(fin))
        next(reader, None) # skip the header row so it is not inserted as data

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()
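
To make the behaviour of the two helpers above more concrete, here is a small sketch of what the escaping step and the isdigit() type test do to a few sample values. The sample strings and the lower-case function name are made up for illustration; the encode/decode call is the same one used in escapingGenerator:

```python
def escaping_generator(lines):
    # Same transformation as escapingGenerator above: characters outside
    # ASCII are replaced with XML numeric character references.
    for line in lines:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


escaped = list(escaping_generator(["café,1\n", "plain,2\n"]))
print(escaped)

# The type test relies on str.isdigit(), which accepts only digit characters,
# so negative numbers and decimals end up classified as TEXT.
samples = ["42", "-5", "3.14", "abc"]
inferred = {s: ("INTEGER" if s.isdigit() else "TEXT") for s in samples}
print(inferred)
```

Note that the escaped form is what gets stored, so a value such as "café" is saved in the database as "caf&#233;".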

Question 60

You can do this efficiently using blaze and odo:

import blaze as bz
csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo will store the CSV file in data.db (a SQLite database) in a table named data.

Alternatively, you can use odo directly, without blaze. Either way is fine. Read the documentation for details.

Question 61

If the CSV file must be imported as part of a Python program, then for simplicity and efficiency, you could use os.system along these lines:

import os

cmd = """sqlite3 database.db <<< ".import input.csv mytable" """

rc = os.system(cmd)

print(rc)

The point is that by specifying the filename of the database, the data will automatically be saved, assuming there are no errors reading it.
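
Note that <<< is a bash here-string, and os.system hands the command to the system shell, which is not necessarily bash. A sketch of a variant that avoids the shell altogether by passing the dot-commands to the sqlite3 program on standard input with subprocess.run. The helper names build_import_script and import_csv are hypothetical, the added .mode csv line is an assumption about the input format, and the sketch assumes the sqlite3 command-line shell is installed and that the path and table name contain no spaces:

```python
import shutil
import subprocess


def build_import_script(csv_path, table):
    """Return the dot-commands the sqlite3 shell will read from stdin."""
    # Assumes csv_path and table contain no whitespace or quotes.
    return f".mode csv\n.import {csv_path} {table}\n"


def import_csv(db_path, csv_path, table):
    """Run the sqlite3 command-line shell and return its exit code."""
    if shutil.which("sqlite3") is None:
        raise RuntimeError("the sqlite3 command-line shell was not found on PATH")
    result = subprocess.run(
        ["sqlite3", str(db_path)],
        input=build_import_script(csv_path, table),  # sent on stdin
        capture_output=True,
        text=True,
    )
    return result.returncode


print(build_import_script("input.csv", "mytable"))
```

Calling import_csv("database.db", "input.csv", "mytable") would then play the role of the os.system call above, with the return code available for error checking.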

Question 62

import csv, sqlite3

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft: break # We're done
        for field in fieldsLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Re-check after the loop so that types found in the last row are counted
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if len(fieldsLeft) > 0:
        raise Exception("Failed to find all the columns data types - Maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile,dbFile,tablename, outputToFile = False):

    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("\"%s\" %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "create table if not exists \"" + tablename + "\" (%s)" % ",".join(cols)
        print(stmt)
        con = sqlite3.connect(dbFile)
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)

        reader = csv.reader(escapingGenerator(fin))
        next(reader, None) # skip the header row so it is not inserted as data

        # Generate insert statement:
        stmt = "INSERT INTO \"" + tablename + "\" VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()
        con.close()

Question 63

In the interest of simplicity, you could use the sqlite3 command-line tool from the Makefile of your project.

%.sql3: %.csv
    rm -f $@
    sqlite3 $@ -echo -cmd ".mode csv" ".import $< $*"
%.dump: %.sql3
    sqlite3 $< "select * from $*"

Running make test.sql3 then creates the SQLite database from an existing test.csv file, with a single table “test”. You can then run make test.dump to verify the contents.

Question 64

I’ve found that it can be necessary to break up the transfer of data from the CSV to the database into chunks so as not to run out of memory. This can be done like this:

import csv
import sqlite3
from operator import itemgetter

# Establish connection
conn = sqlite3.connect("mydb.db")

# Create the table 
conn.execute(
    """
    CREATE TABLE persons(
        person_id INTEGER,
        last_name TEXT, 
        first_name TEXT, 
        address TEXT
    )
    """
)

# These are the columns from the csv that we want
cols = ["person_id", "last_name", "first_name", "address"]

# If the csv file is huge, we instead add the data in chunks
chunksize = 10000

# Parse csv file and populate db in chunks
with conn, open("persons.csv") as f:
    reader = csv.DictReader(f)

    chunk = []
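    for i, row in enumerate(reader):

        if i % chunksize == 0 and i > 0:
            conn.executemany(
                """
                INSERT INTO persons
                    VALUES(?, ?, ?, ?)
                """, chunk
            )
            chunk = []

        items = itemgetter(*cols)(row)
        chunk.append(items)

    # Insert whatever is left in the final, partial chunk
    if chunk:
        conn.executemany(
            """
            INSERT INTO persons
                VALUES(?, ?, ?, ?)
            """, chunk
        )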
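
The same chunking idea can also be sketched with itertools.islice, which pulls a fixed number of rows from the reader at a time and handles a final, partial chunk without extra bookkeeping. The helper name insert_in_chunks and the sample data are made up for illustration, and the in-memory database stands in for mydb.db:

```python
import csv
import io
import sqlite3
from itertools import islice


def insert_in_chunks(conn, rows, chunksize=10000):
    """Insert rows from any iterable, at most `chunksize` rows per executemany call."""
    rows = iter(rows)
    total = 0
    while True:
        chunk = list(islice(rows, chunksize))  # next batch; empty when exhausted
        if not chunk:
            break
        conn.executemany("INSERT INTO persons VALUES (?, ?, ?, ?)", chunk)
        total += len(chunk)
    return total


# Small demonstration with made-up data
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons(person_id INTEGER, last_name TEXT, first_name TEXT, address TEXT)"
)
sample = io.StringIO(
    "person_id,last_name,first_name,address\n"
    "1,Doe,Jane,1 Main St\n"
    "2,Roe,Rich,2 Side St\n"
    "3,Poe,Ed,3 Back St\n"
)
cols = ["person_id", "last_name", "first_name", "address"]
reader = csv.DictReader(sample)

with conn:  # commit once at the end, roll back on error
    inserted = insert_in_chunks(
        conn, (tuple(row[c] for c in cols) for row in reader), chunksize=2
    )
print(inserted)
```

Only one chunk is held in memory at a time, since the generator expression reads rows from the CSV lazily.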
