SQLAlchemy ORM conversion to pandas DataFrame

Question: SQLAlchemy ORM conversion to pandas DataFrame

This topic hasn't been addressed in a while, here or elsewhere. Is there a way to convert a SQLAlchemy <Query object> to a pandas DataFrame?

pandas offers pandas.read_sql, but this requires the use of raw SQL. I have two reasons for wanting to avoid it: 1) everything I have already uses the ORM (a good reason in and of itself), and 2) I'm using Python lists as part of the query, e.g. db.session.query(Item).filter(Item.symbol.in_(add_symbols)), where Item is my model class and add_symbols is a list. This is the equivalent of SQL SELECT ... FROM ... WHERE ... IN.

Is anything possible?


Answer 0

The following should work in most cases:

df = pd.read_sql(query.statement, query.session.bind)

See the pandas.read_sql documentation for more information on the parameters.
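
To try the one-liner without a real database, here is a minimal sketch on in-memory SQLite. The Item model, its columns, and the sample rows are hypothetical stand-ins for the question's setup, and the import of declarative_base from sqlalchemy.orm assumes SQLAlchemy 1.4 or later.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Item(Base):
    """Hypothetical model standing in for the question's Item class."""
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    symbol = Column(String)
    price = Column(Integer)


# In-memory SQLite keeps the sketch self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add_all([
    Item(symbol="AAA", price=10),
    Item(symbol="BBB", price=20),
    Item(symbol="CCC", price=30),
])
session.commit()

# A plain Python list drives the IN clause, as in the question.
add_symbols = ["AAA", "CCC"]
query = (
    session.query(Item)
    .filter(Item.symbol.in_(add_symbols))
    .order_by(Item.id)
)

# query.statement is the Core SELECT behind the ORM query; pandas executes it
# on the bound engine and builds the DataFrame from the result columns.
df = pd.read_sql(query.statement, query.session.bind)
print(df)
```

Because pandas receives the statement object rather than a SQL string, the list passed to in_() is bound as parameters by SQLAlchemy and no raw SQL has to be written by hand.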


Answer 1

To make this clearer for novice pandas programmers, here is a concrete example:

pd.read_sql(session.query(Complaint).filter(Complaint.id == 2).statement, session.bind)

Here we select the complaint with id = 2 from the complaints table (the SQLAlchemy model is Complaint).


Answer 2

The selected solution didn't work for me, as I kept getting the error

AttributeError: 'AnnotatedSelect' object has no attribute 'lower'

I found that the following worked:

df = pd.read_sql_query(query.statement, engine)

Answer 3

If you want to compile a query with parameters and dialect-specific arguments, use something like this:

c = query.statement.compile(query.session.bind)
df = pandas.read_sql(c.string, query.session.bind, params=c.params)
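
To see what the compile step hands over, the sketch below only inspects the compiled object and does not pass it to pandas. It uses in-memory SQLite and a hypothetical Complaint model with made-up columns. Whether the c.params dict can be passed to pandas unchanged depends on the placeholder style of the database driver, so treat that part as something to verify against your own backend.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Complaint(Base):
    """Hypothetical model; only the columns needed for the sketch."""
    __tablename__ = "complaints"
    id = Column(Integer, primary_key=True)
    body = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

query = session.query(Complaint).filter(Complaint.id == 2)

# Compile against the session's engine so the SQL uses that dialect's syntax.
compiled = query.statement.compile(query.session.bind)

sql_text = compiled.string  # SQL text with placeholders instead of literal values
bound = compiled.params     # dict mapping bind-parameter names to their values
print(sql_text)
print(bound)
```

The value 2 lives in the params dict and the SQL text carries only a placeholder for it, which is why both pieces have to be given to pandas.read_sql together.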

Answer 4

import pandas as pd
from sqlalchemy import Column, Date, Integer, create_engine, select
from sqlalchemy.dialects.postgresql import DOUBLE_PRECISION
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine('postgresql://postgres:postgres@localhost:5432/DB', echo=False)
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()

# The session is bound to the engine, so session.bind is that engine
conn = session.bind

class DailyTrendsTable(Base):

    __tablename__ = 'trends'
    __table_args__ = ({"schema": 'mf_analysis'})

    company_code = Column(DOUBLE_PRECISION, primary_key=True)
    rt_bullish_trending = Column(Integer)
    rt_bearish_trending = Column(Integer)
    rt_bullish_non_trending = Column(Integer)
    rt_bearish_non_trending = Column(Integer)
    gen_date = Column(Date, primary_key=True)

# Select every column of the mapped table
# (SQLAlchemy 1.4+ call style; older releases use select([DailyTrendsTable]))
df_query = select(DailyTrendsTable)

# Pass the statement defined above (the original referenced an undefined name)
df_data = pd.read_sql(df_query, con=conn)