PSYCOPG2 - Paging

2

In the search for paging using psycopg2 I discovered the attributes cursor.itersize and cursor.arraysize , and the methods cursor.fetchmany () and cursor.scroll ()

I wanted to check with the staff here if my understanding is correct about these methods and attributes. fetchmany() returns many rows, whose quantity can be passed as parameter. scroll() by what I understand is to scroll the cursor to some position among the lines loaded by the query, but I did not understand the parameter that it receives. The itersize attribute, as I understand it, controls the number of lines that the cursor should receive from the database. Maybe behind the driver is making use of the parameter LIMIT of postgresql that limits the amount of rows that should be returned by the query. The arraysize attribute defines the number of rows that will be returned by the fetchmany() method. I do not understand the purpose of this attribute on the object, since the fetchmany() method already receives a parameter to define this.

I need to display the rows of a table in a gui table component, some api graphics that I have not decided yet. But I think it is more appropriate that I make use of these attributes and methods to create a pagination, because creating a cursor object with all the data of a table that has many records can consume many resources.     

asked by anonymous 03.02.2016 / 21:51

1 answer

1

In Python we use a maxim: "premature optimization is the root of all evil": You do not know, and you have no way of knowing a priori, before your system works if there is going to be a memory or I / O bottleneck there.

What's more, you're using a database connector that has been in development for a number of years, and has great potential for using best practices in Python, which involves yes, fetching results in a "lazy" way - that is, search results only as they are needed.

So the recommendation is to believe that the default parameters of fetchmany will be good in most cases, and use your results as recommended in Python: as an iterator.

for result in cursor.fetchmany():
      # your code here

In this way, you leave the job of bringing an optimum amount of records to the connector, and work with one result at a time. Once the code is ready, you can benchmark, load, and use memory to see how this is going - and then worry about those parameters.

Disclaimer: I've seen people having problems with fetchmany in an old version of a MySQL Connector - in that case, the internal implementation was really childish and it was much better for the person to get all the results all at once from a list than using an interaction like the one above. But I believe that in the case of psycopg2, fetchmany is better.

If you want pagination on your side, you can do this simply by creating a Python iterator, which is independent of paging on the side of the database: you can do this:

def get_paged_results(cursor, limit=50):
    while True:
        page = list(cursor.fetchmany(limit))
        if not page:
              break
        yield page

...
for page in get_paged_results(cursor):
     ....
    
04.02.2016 / 15:40