Save reports to a table or execute a query to the database?

0

I am setting up an e-commerce where I have an area to generate sales reports. These reports are broken down by year and month.

Currently I have a query sql where I segment the results by year and month and then I get the results I want, for example:

  • Sales
    • Total sales;
    • Total sales in R $;
  • Product
    • List of products;
    • Best selling product;
  • Shipping
    • Total sales sent by mail;
    • Total sales shipped by carrier;
  • Payment
    • Total online payments;
    • Total deposit payments;

And so on ...

My question is the following, as the reports are fixed, that is, when the month ends, you no longer have the data for that month, would it be better for me to save a table with the reports? Or should I keep them through a new query and filter this information every time the user accesses the reporting page?

When doing the query every time I need to display the report, I feel it is a very heavy and time consuming query because there are several related tables, for example, tipoPagamento , tipoEntrega , cliente , etc. .

Already registering the report data in a table, the query is simpler, but I will have more information in the database. Note: Within the save scheme in a new table, only the list of products is that would remain through query, as it would generate unnecessary duplication.

I am using PHP and database MySql .

    
asked by anonymous 11.11.2016 / 14:39

2 answers

2

Saving the results is a good option to optimize because it is common to query the same condition several times. For example, consulting the whole month is something obvious and common. This saves a lot of repetitive processing. Consulting a fortnight or a week is also common. Find out what the "standard" or most relevant frequency of users is.

Of course this will not make a difference to a not too large amount. Let's say up to 800,000 records will not have that much difference. But it is very relative to say such a number since it can vary greatly depending on the environment and complexity of the queries. The ideal (particularly speaking) is just to observe. When you realize you're starting to slow down, it's time to optimize.

Example of a static filter

ThisisthesearchfilterfortheitunnesConnect(Apple)salesreportpage.

Therearestaticoptions"yesterday", "7 days," "30 days", etc.
These are the options that users are likely to query more often and so the internal system probably automatically generates the query cache, returning the data very quickly. And it actually returns very fast considering the amount of data they have.

Alternatively it has the custom search where the user sets the period that they want. In this case, if the chosen period has no cache, the query takes a little time depending on the period. But if you repeat the same query, it returns pretty fast because a cache is probably generated.

For a small system with little volume you do not have to worry about that much. It can be implemented on demand. Most of the time it is "in vain" to build something optimized right from the front, not knowing if the project will have a great volume.

The rules vary from business to business. Here another example, from Amazon.co.jp

In Amazon, you do not have the option to customize the date range. It's all static.

What static filter rules to choose, is something based on opinion, that is, depends on the business model of each case.

I have used examples from two large companies to emphasize and show that even the giants apply this method. There is no magic, for example, using fulltext, using innoDB or MyISAM, applying indexes and keys here and there. It does not make that much difference. But the way you assemble the queries can present a big difference just like modeling. And obviously the results caching techniques.

    
11.11.2016 / 16:07
1

I recommend leaving the report through SQL query. Once your report can have start and end date filters, the user could access at any time and make the necessary comparisons. Regarding the queries, try to optimize the maximum your query, but I do not see problem in that once it is well elaborated.

Regarding the data, it would have to have a validation, perhaps a trigger so that it is not allowed to change data after a specific time.

    
11.11.2016 / 14:47