Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
560 views
in Technique[技术] by (71.8m points)

python - Query SQLAlchemy taking too long

folks. I got a query on SQLalchemy that is taking too long. I don't if it's because the db is too big (6.5 millions of rows, so I don't think so)or if I'm doing something wrong. The following are the tables.

class A(Base):
    __tablename__ = 'tbl_a'

    id = Column(Integer, primary_key=True, autoincrement=True)
    
    a = Column(CHAR(3))
    b = Column(DATE)
    c = Column(Integer)
    d = Column(Integer)
    e = Column(Integer)
    

class B(Base):
    __tablename__ = 'tbl_b'

    id = Column(Integer, primary_key=True, autoincrement=True)
    a = Column(Integer)
    b  = Column(DATE)
    c  = Column(Integer)
    d  = Column(Integer)

And this is the query:

row  = session.query(A).get(id)

value = session.query(B.d).filter((B.b == row.b)&
                                  (B.d == row.c)&
                                  (B.e == row.d)).first()

    

With just one condition B.b == row.b it takes 2 minutes, with more I didn't saw it end after 10 minutes. It takes 0.5 seconds to get the row value. One thing that I saw was that indexing columns you intend to use the where clause one speed up things. If so, can I do it after I uploaded the records to my db?


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

ORM is not related to this issue, so it's not about SQLAlchemy, but how can we solve this? well, 6 million data is so much! that's normal to take time. if you wanna fix the problem you should use timescaledb which is an extension on SQL databases.

Timescaledb documents

What does timesacledb do?

It chunks your database tables into separated smaller tables based on time created or the id which is indexing in other words. when you ask for 6 million data it doesn't need to process all the rows, because it has indexing and chunks, so data will be provided in a blink of an eye!!

How can I use timescaledb?

Well, as the issue is not about Python or ORM, the solution is not also related to them, you need to configure timescaledb and somehow manage how many chunks should be created and how many rows should a chunk contain. that is up to you and your data.

Good luck pal.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...