SQLAlchemy:打印实际查询

问题:SQLAlchemy:打印实际查询

我真的很希望能够为我的应用程序打印出有效的SQL,包括值,而不是绑定参数,但是如何在SQLAlchemy中做到这一点并不明显(我很确定)。

有人以一般方式解决了这个问题吗?

I’d really like to be able to print out valid SQL for my application, including values, rather than bind parameters, but it’s not obvious how to do this in SQLAlchemy (by design, I’m fairly sure).

Has anyone solved this problem in a general way?


回答 0

在大多数情况下,SQLAlchemy语句或查询的“字符串化”很简单:

print str(statement)

这适用于ORM Query以及任何其他select()或其他语句。

注意:以下详细答案正在sqlalchemy文档中维护

要获得已编译为特定方言或引擎的语句,如果该语句本身尚未绑定到某个方言,则可以将此语句传递给compile()

print statement.compile(someengine)

或没有引擎:

from sqlalchemy.dialects import postgresql
print statement.compile(dialect=postgresql.dialect())

当给定一个ORM Query对象时,为了获得该compile()方法,我们只需要首先访问.statement访问器即可:

statement = query.statement
print statement.compile(someengine)

关于将绑定参数“内联”到最终字符串的原始规定,这里的挑战是SQLAlchemy通常不承担此任务,因为这是由Python DBAPI适当处理的,更不用说绕过绑定参数了可能是现代Web应用程序中利用最广泛的安全漏洞。SQLAlchemy在某些情况下(例如,发出DDL的情况)进行这种字符串化的能力有限。为了访问此功能,可以使用传递给的’literal_binds’标志compile_kwargs

from sqlalchemy.sql import table, column, select

t = table('t', column('x'))

s = select([t]).where(t.c.x == 5)

print s.compile(compile_kwargs={"literal_binds": True})

上述方法有一个警告:仅基本类型(例如int和string)支持该方法,此外,如果 bindparam 直接使用不带预设值的a,则也不能将其字符串化。

要为不支持的类型支持内联文字呈现,请TypeDecorator为目标类型实现,其中包括一种 TypeDecorator.process_literal_param方法:

from sqlalchemy import TypeDecorator, Integer


class MyFancyType(TypeDecorator):
    impl = Integer

    def process_literal_param(self, value, dialect):
        return "my_fancy_formatting(%s)" % value

from sqlalchemy import Table, Column, MetaData

tab = Table('mytable', MetaData(), Column('x', MyFancyType()))

print(
    tab.select().where(tab.c.x > 5).compile(
        compile_kwargs={"literal_binds": True})
)

产生如下输出:

SELECT mytable.x
FROM mytable
WHERE mytable.x > my_fancy_formatting(5)

In the vast majority of cases, the “stringification” of a SQLAlchemy statement or query is as simple as:

print(str(statement))

This applies both to an ORM Query as well as any select() or other statement.

Note: the following detailed answer is being maintained on the sqlalchemy documentation.

To get the statement as compiled to a specific dialect or engine, if the statement itself is not already bound to one you can pass this in to compile():

print(statement.compile(someengine))

or without an engine:

from sqlalchemy.dialects import postgresql
print(statement.compile(dialect=postgresql.dialect()))

When given an ORM Query object, in order to get at the compile() method we only need access the .statement accessor first:

statement = query.statement
print(statement.compile(someengine))

with regards to the original stipulation that bound parameters are to be “inlined” into the final string, the challenge here is that SQLAlchemy normally is not tasked with this, as this is handled appropriately by the Python DBAPI, not to mention bypassing bound parameters is probably the most widely exploited security holes in modern web applications. SQLAlchemy has limited ability to do this stringification in certain circumstances such as that of emitting DDL. In order to access this functionality one can use the ‘literal_binds’ flag, passed to compile_kwargs:

from sqlalchemy.sql import table, column, select

t = table('t', column('x'))

s = select([t]).where(t.c.x == 5)

print(s.compile(compile_kwargs={"literal_binds": True}))

the above approach has the caveats that it is only supported for basic types, such as ints and strings, and furthermore if a bindparam without a pre-set value is used directly, it won’t be able to stringify that either.

To support inline literal rendering for types not supported, implement a TypeDecorator for the target type which includes a TypeDecorator.process_literal_param method:

from sqlalchemy import TypeDecorator, Integer


class MyFancyType(TypeDecorator):
    impl = Integer

    def process_literal_param(self, value, dialect):
        return "my_fancy_formatting(%s)" % value

from sqlalchemy import Table, Column, MetaData

tab = Table('mytable', MetaData(), Column('x', MyFancyType()))

print(
    tab.select().where(tab.c.x > 5).compile(
        compile_kwargs={"literal_binds": True})
)

producing output like:

SELECT mytable.x
FROM mytable
WHERE mytable.x > my_fancy_formatting(5)

回答 1

这可以在python 2和3中运行,并且比以前更干净,但是需要SA> = 1.0。

from sqlalchemy.engine.default import DefaultDialect
from sqlalchemy.sql.sqltypes import String, DateTime, NullType

# python2/3 compatible.
PY3 = str is not bytes
text = str if PY3 else unicode
int_type = int if PY3 else (int, long)
str_type = str if PY3 else (str, unicode)


class StringLiteral(String):
    """Teach SA how to literalize various things."""
    def literal_processor(self, dialect):
        super_processor = super(StringLiteral, self).literal_processor(dialect)

        def process(value):
            if isinstance(value, int_type):
                return text(value)
            if not isinstance(value, str_type):
                value = text(value)
            result = super_processor(value)
            if isinstance(result, bytes):
                result = result.decode(dialect.encoding)
            return result
        return process


class LiteralDialect(DefaultDialect):
    colspecs = {
        # prevent various encoding explosions
        String: StringLiteral,
        # teach SA about how to literalize a datetime
        DateTime: StringLiteral,
        # don't format py2 long integers to NULL
        NullType: StringLiteral,
    }


def literalquery(statement):
    """NOTE: This is entirely insecure. DO NOT execute the resulting strings."""
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        statement = statement.statement
    return statement.compile(
        dialect=LiteralDialect(),
        compile_kwargs={'literal_binds': True},
    ).string

演示:

# coding: UTF-8
from datetime import datetime
from decimal import Decimal

from literalquery import literalquery


def test():
    from sqlalchemy.sql import table, column, select

    mytable = table('mytable', column('mycol'))
    values = (
        5,
        u'snowman: ☃',
        b'UTF-8 snowman: \xe2\x98\x83',
        datetime.now(),
        Decimal('3.14159'),
        10 ** 20,  # a long integer
    )

    statement = select([mytable]).where(mytable.c.mycol.in_(values)).limit(1)
    print(literalquery(statement))


if __name__ == '__main__':
    test()

给出以下输出:(在python 2.7和3.4中测试)

SELECT mytable.mycol
FROM mytable
WHERE mytable.mycol IN (5, 'snowman: ☃', 'UTF-8 snowman: ☃',
      '2015-06-24 18:09:29.042517', 3.14159, 100000000000000000000)
 LIMIT 1

This works in python 2 and 3 and is a bit cleaner than before, but requires SA>=1.0.

from sqlalchemy.engine.default import DefaultDialect
from sqlalchemy.sql.sqltypes import String, DateTime, NullType

# python2/3 compatible.
PY3 = str is not bytes
text = str if PY3 else unicode
int_type = int if PY3 else (int, long)
str_type = str if PY3 else (str, unicode)


class StringLiteral(String):
    """Teach SA how to literalize various things."""
    def literal_processor(self, dialect):
        super_processor = super(StringLiteral, self).literal_processor(dialect)

        def process(value):
            if isinstance(value, int_type):
                return text(value)
            if not isinstance(value, str_type):
                value = text(value)
            result = super_processor(value)
            if isinstance(result, bytes):
                result = result.decode(dialect.encoding)
            return result
        return process


class LiteralDialect(DefaultDialect):
    colspecs = {
        # prevent various encoding explosions
        String: StringLiteral,
        # teach SA about how to literalize a datetime
        DateTime: StringLiteral,
        # don't format py2 long integers to NULL
        NullType: StringLiteral,
    }


def literalquery(statement):
    """NOTE: This is entirely insecure. DO NOT execute the resulting strings."""
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        statement = statement.statement
    return statement.compile(
        dialect=LiteralDialect(),
        compile_kwargs={'literal_binds': True},
    ).string

Demo:

# coding: UTF-8
from datetime import datetime
from decimal import Decimal

from literalquery import literalquery


def test():
    from sqlalchemy.sql import table, column, select

    mytable = table('mytable', column('mycol'))
    values = (
        5,
        u'snowman: ☃',
        b'UTF-8 snowman: \xe2\x98\x83',
        datetime.now(),
        Decimal('3.14159'),
        10 ** 20,  # a long integer
    )

    statement = select([mytable]).where(mytable.c.mycol.in_(values)).limit(1)
    print(literalquery(statement))


if __name__ == '__main__':
    test()

Gives this output: (tested in python 2.7 and 3.4)

SELECT mytable.mycol
FROM mytable
WHERE mytable.mycol IN (5, 'snowman: ☃', 'UTF-8 snowman: ☃',
      '2015-06-24 18:09:29.042517', 3.14159, 100000000000000000000)
 LIMIT 1

回答 2

鉴于您想要的仅在调试时才有意义,因此可以使用来启动SQLAlchemy echo=True,以记录所有SQL查询。例如:

engine = create_engine(
    "mysql://scott:tiger@hostname/dbname",
    encoding="latin1",
    echo=True,
)

也可以仅针对单个请求进行修改:

echo=False–如果True,引擎将所有语句repr()及其参数列表中的一个记录到引擎记录器中,默认为sys.stdout。可以随时修改的echo属性Engine以打开和关闭登录。如果设置为string "debug",结果行也将被打印到标准输出。该标志最终控制着Python记录器;请参阅配置日志有关如何直接的信息,。

来源:SQLAlchemy引擎配置

如果与Flask一起使用,则只需设置

app.config["SQLALCHEMY_ECHO"] = True

获得相同的行为。

Given that what you want makes sense only when debugging, you could start SQLAlchemy with echo=True, to log all SQL queries. For example:

engine = create_engine(
    "mysql://scott:tiger@hostname/dbname",
    encoding="latin1",
    echo=True,
)

This can also be modified for just a single request:

echo=False – if True, the Engine will log all statements as well as a repr() of their parameter lists to the engines logger, which defaults to sys.stdout. The echo attribute of Engine can be modified at any time to turn logging on and off. If set to the string "debug", result rows will be printed to the standard output as well. This flag ultimately controls a Python logger; see Configuring Logging for information on how to configure logging directly.

Source: SQLAlchemy Engine Configuration

If used with Flask, you can simply set

app.config["SQLALCHEMY_ECHO"] = True

to get the same behaviour.


回答 3

为此,我们可以使用编译方法。从文档

from sqlalchemy.sql import text
from sqlalchemy.dialects import postgresql

stmt = text("SELECT * FROM users WHERE users.name BETWEEN :x AND :y")
stmt = stmt.bindparams(x="m", y="z")

print(stmt.compile(dialect=postgresql.dialect(),compile_kwargs={"literal_binds": True}))

结果:

SELECT * FROM users WHERE users.name BETWEEN 'm' AND 'z'

来自文档的警告:

切勿将此技术与从不受信任的输入(例如从Web表单或其他用户输入应用程序)接收的字符串内容一起使用。SQLAlchemy的将Python值强制转换为直接SQL字符串值的功能对于不受信任的输入是不安全的,并且无法验证传递的数据类型。以编程方式对关系数据库调用非DDL SQL语句时,请始终使用绑定参数。

We can use compile method for this purpose. From the docs:

from sqlalchemy.sql import text
from sqlalchemy.dialects import postgresql

stmt = text("SELECT * FROM users WHERE users.name BETWEEN :x AND :y")
stmt = stmt.bindparams(x="m", y="z")

print(stmt.compile(dialect=postgresql.dialect(),compile_kwargs={"literal_binds": True}))

Result:

SELECT * FROM users WHERE users.name BETWEEN 'm' AND 'z'

Warning from docs:

Never use this technique with string content received from untrusted input, such as from web forms or other user-input applications. SQLAlchemy’s facilities to coerce Python values into direct SQL string values are not secure against untrusted input and do not validate the type of data being passed. Always use bound parameters when programmatically invoking non-DDL SQL statements against a relational database.


回答 4

因此,在@zzzeek对@bukzor的代码的注释的基础上,我想到了这一点,以轻松获得“可打印的”查询:

def prettyprintable(statement, dialect=None, reindent=True):
    """Generate an SQL expression string with bound parameters rendered inline
    for the given SQLAlchemy statement. The function can also receive a
    `sqlalchemy.orm.Query` object instead of statement.
    can 

    WARNING: Should only be used for debugging. Inlining parameters is not
             safe when handling user created data.
    """
    import sqlparse
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        if dialect is None:
            dialect = statement.session.get_bind().dialect
        statement = statement.statement
    compiled = statement.compile(dialect=dialect,
                                 compile_kwargs={'literal_binds': True})
    return sqlparse.format(str(compiled), reindent=reindent)

我个人很难阅读未缩进的代码,因此我习惯于sqlparse重新缩进SQL。可以使用进行安装pip install sqlparse

So building on @zzzeek’s comments on @bukzor’s code I came up with this to easily get a “pretty-printable” query:

def prettyprintable(statement, dialect=None, reindent=True):
    """Generate an SQL expression string with bound parameters rendered inline
    for the given SQLAlchemy statement. The function can also receive a
    `sqlalchemy.orm.Query` object instead of statement.
    can 

    WARNING: Should only be used for debugging. Inlining parameters is not
             safe when handling user created data.
    """
    import sqlparse
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        if dialect is None:
            dialect = statement.session.get_bind().dialect
        statement = statement.statement
    compiled = statement.compile(dialect=dialect,
                                 compile_kwargs={'literal_binds': True})
    return sqlparse.format(str(compiled), reindent=reindent)

I personally have a hard time reading code which is not indented so I’ve used sqlparse to reindent the SQL. It can be installed with pip install sqlparse.


回答 5

该代码基于@bukzor 提供的出色答案。我刚刚将自定义渲染datetime.datetime类型添加到Oracle的TO_DATE()

随时更新代码以适合您的数据库:

import decimal
import datetime

def printquery(statement, bind=None):
    """
    print a query, with values filled in
    for debugging purposes *only*
    for security, you should always separate queries from their values
    please also note that this function is quite slow
    """
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        if bind is None:
            bind = statement.session.get_bind(
                    statement._mapper_zero_or_none()
            )
        statement = statement.statement
    elif bind is None:
        bind = statement.bind 

    dialect = bind.dialect
    compiler = statement._compiler(dialect)
    class LiteralCompiler(compiler.__class__):
        def visit_bindparam(
                self, bindparam, within_columns_clause=False, 
                literal_binds=False, **kwargs
        ):
            return super(LiteralCompiler, self).render_literal_bindparam(
                    bindparam, within_columns_clause=within_columns_clause,
                    literal_binds=literal_binds, **kwargs
            )
        def render_literal_value(self, value, type_):
            """Render the value of a bind parameter as a quoted literal.

            This is used for statement sections that do not accept bind paramters
            on the target driver/database.

            This should be implemented by subclasses using the quoting services
            of the DBAPI.

            """
            if isinstance(value, basestring):
                value = value.replace("'", "''")
                return "'%s'" % value
            elif value is None:
                return "NULL"
            elif isinstance(value, (float, int, long)):
                return repr(value)
            elif isinstance(value, decimal.Decimal):
                return str(value)
            elif isinstance(value, datetime.datetime):
                return "TO_DATE('%s','YYYY-MM-DD HH24:MI:SS')" % value.strftime("%Y-%m-%d %H:%M:%S")

            else:
                raise NotImplementedError(
                            "Don't know how to literal-quote value %r" % value)            

    compiler = LiteralCompiler(dialect, statement)
    print compiler.process(statement)

This code is based on brilliant existing answer from @bukzor. I just added custom render for datetime.datetime type into Oracle’s TO_DATE().

Feel free to update code to suit your database:

import decimal
import datetime

def printquery(statement, bind=None):
    """
    print a query, with values filled in
    for debugging purposes *only*
    for security, you should always separate queries from their values
    please also note that this function is quite slow
    """
    import sqlalchemy.orm
    if isinstance(statement, sqlalchemy.orm.Query):
        if bind is None:
            bind = statement.session.get_bind(
                    statement._mapper_zero_or_none()
            )
        statement = statement.statement
    elif bind is None:
        bind = statement.bind 

    dialect = bind.dialect
    compiler = statement._compiler(dialect)
    class LiteralCompiler(compiler.__class__):
        def visit_bindparam(
                self, bindparam, within_columns_clause=False, 
                literal_binds=False, **kwargs
        ):
            return super(LiteralCompiler, self).render_literal_bindparam(
                    bindparam, within_columns_clause=within_columns_clause,
                    literal_binds=literal_binds, **kwargs
            )
        def render_literal_value(self, value, type_):
            """Render the value of a bind parameter as a quoted literal.

            This is used for statement sections that do not accept bind paramters
            on the target driver/database.

            This should be implemented by subclasses using the quoting services
            of the DBAPI.

            """
            if isinstance(value, basestring):
                value = value.replace("'", "''")
                return "'%s'" % value
            elif value is None:
                return "NULL"
            elif isinstance(value, (float, int, long)):
                return repr(value)
            elif isinstance(value, decimal.Decimal):
                return str(value)
            elif isinstance(value, datetime.datetime):
                return "TO_DATE('%s','YYYY-MM-DD HH24:MI:SS')" % value.strftime("%Y-%m-%d %H:%M:%S")

            else:
                raise NotImplementedError(
                            "Don't know how to literal-quote value %r" % value)            

    compiler = LiteralCompiler(dialect, statement)
    print compiler.process(statement)

回答 6

我想指出的是,以上给出的解决方案不适用于非平凡的查询。我遇到的一个问题是更复杂的类型,例如导致问题的pgsql ARRAY。我确实找到了一个对我来说甚至可以与pgsql ARRAY一起使用的解决方案:

借用:https : //gist.github.com/gsakkis/4572159

链接的代码似乎基于旧版本的SQLAlchemy。您会收到一条错误消息,指出_mapper_zero_or_none属性不存在。这是一个更新的版本,将与较新的版本一起使用,您只需将_mapper_zero_or_none替换为bind即可。此外,它还支持pgsql数组:

# adapted from:
# https://gist.github.com/gsakkis/4572159
from datetime import date, timedelta
from datetime import datetime

from sqlalchemy.orm import Query


try:
    basestring
except NameError:
    basestring = str


def render_query(statement, dialect=None):
    """
    Generate an SQL expression string with bound parameters rendered inline
    for the given SQLAlchemy statement.
    WARNING: This method of escaping is insecure, incomplete, and for debugging
    purposes only. Executing SQL statements with inline-rendered user values is
    extremely insecure.
    Based on http://stackoverflow.com/questions/5631078/sqlalchemy-print-the-actual-query
    """
    if isinstance(statement, Query):
        if dialect is None:
            dialect = statement.session.bind.dialect
        statement = statement.statement
    elif dialect is None:
        dialect = statement.bind.dialect

    class LiteralCompiler(dialect.statement_compiler):

        def visit_bindparam(self, bindparam, within_columns_clause=False,
                            literal_binds=False, **kwargs):
            return self.render_literal_value(bindparam.value, bindparam.type)

        def render_array_value(self, val, item_type):
            if isinstance(val, list):
                return "{%s}" % ",".join([self.render_array_value(x, item_type) for x in val])
            return self.render_literal_value(val, item_type)

        def render_literal_value(self, value, type_):
            if isinstance(value, long):
                return str(value)
            elif isinstance(value, (basestring, date, datetime, timedelta)):
                return "'%s'" % str(value).replace("'", "''")
            elif isinstance(value, list):
                return "'{%s}'" % (",".join([self.render_array_value(x, type_.item_type) for x in value]))
            return super(LiteralCompiler, self).render_literal_value(value, type_)

    return LiteralCompiler(dialect, statement).process(statement)

已测试到两个级别的嵌套数组。

I would like to point out that the solutions given above do not “just work” with non-trivial queries. One issue I came across were more complicated types, such as pgsql ARRAYs causing issues. I did find a solution that for me, did just work even with pgsql ARRAYs:

borrowed from: https://gist.github.com/gsakkis/4572159

The linked code seems to be based on an older version of SQLAlchemy. You’ll get an error saying that the attribute _mapper_zero_or_none doesn’t exist. Here’s an updated version that will work with a newer version, you simply replace _mapper_zero_or_none with bind. Additionally, this has support for pgsql arrays:

# adapted from:
# https://gist.github.com/gsakkis/4572159
from datetime import date, timedelta
from datetime import datetime

from sqlalchemy.orm import Query


try:
    basestring
except NameError:
    basestring = str


def render_query(statement, dialect=None):
    """
    Generate an SQL expression string with bound parameters rendered inline
    for the given SQLAlchemy statement.
    WARNING: This method of escaping is insecure, incomplete, and for debugging
    purposes only. Executing SQL statements with inline-rendered user values is
    extremely insecure.
    Based on http://stackoverflow.com/questions/5631078/sqlalchemy-print-the-actual-query
    """
    if isinstance(statement, Query):
        if dialect is None:
            dialect = statement.session.bind.dialect
        statement = statement.statement
    elif dialect is None:
        dialect = statement.bind.dialect

    class LiteralCompiler(dialect.statement_compiler):

        def visit_bindparam(self, bindparam, within_columns_clause=False,
                            literal_binds=False, **kwargs):
            return self.render_literal_value(bindparam.value, bindparam.type)

        def render_array_value(self, val, item_type):
            if isinstance(val, list):
                return "{%s}" % ",".join([self.render_array_value(x, item_type) for x in val])
            return self.render_literal_value(val, item_type)

        def render_literal_value(self, value, type_):
            if isinstance(value, long):
                return str(value)
            elif isinstance(value, (basestring, date, datetime, timedelta)):
                return "'%s'" % str(value).replace("'", "''")
            elif isinstance(value, list):
                return "'{%s}'" % (",".join([self.render_array_value(x, type_.item_type) for x in value]))
            return super(LiteralCompiler, self).render_literal_value(value, type_)

    return LiteralCompiler(dialect, statement).process(statement)

Tested to two levels of nested arrays.