MySQL Query - Count of crosstable records

5

I have the following tables:

EachtupleofthePageViewstable(main)containsasingle-pageview.Ifasinglepersoninthesamesession(Sessions)viewsthesamepage(Pages),anewtupleisnotaddedinPageViews>.InsteadthePageViews.quantityfieldisincremented.

Inonereport,Ineedthenumberofuniquevisitors(COUNTDISTINCTVisitors.id)andpageviews(SUMPageViews.quantity)inagiventime,week,month,andyear)fromaspecificstore(whichstoresStores.code).

ThequeriesI'mcurrentlydoingare:

Visitors

SELECTDISTINCTVisitors.idFROMPageViewsINNERJOINVisitorsStoresONVisitorsStores.id=PageViews.visitorINNERJOINStoresONStores.id=VisitorsStores.storeINNERJOINSessionsONSessions.id=VisitorsStores.sessionINNERJOINVisitorsONVisitors.id=Sessions.visitorWHEREDATE(PageViews.createdAt)BETWEEN'2017-11-01'AND'2017-11-30'ANDStores.code='loja1';

PageViews

SELECTSUM(quantity)ASqtdeFROMPageViewsINNERJOINVisitorsStoresONVisitorsStores.id=PageViews.visitorINNERJOINStoresONStores.id=VisitorsStores.storeWHEREDATE(PageViews.createdAt)BETWEEN'2017-11-01'AND'2017-11-30'ANDStores.code='loja1';

Both

SET@loja:='loja1';SET@inicio:='2017-11-01';SET@fim:='2017-11-31';SELECTCOUNT(tVisitors.qtdVisitors)ASvisitors,SUM(tVisitors.qtdPageViews)aspageViewsFROM(SELECTVisitors.idASqtdVisitors,SUM(quantity)ASqtdPageViewsFROMPageViewsINNERJOINVisitorsStoresONVisitorsStores.id=PageViews.visitorINNERJOINStoresONStores.id=VisitorsStores.storeINNERJOINSessionsONSessions.id=VisitorsStores.sessionINNERJOINVisitorsONVisitors.id=Sessions.visitorWHEREDATE(PageViews.createdAt)BETWEEN@[email protected]=@lojaGROUPBYqtdVisitors)astVisitors;

Itworksperfectlywhenastorehasfewaccessesperday.Butwhenastorehasalotofhits(somearound300visitorsand5,000pageviews)thequerytakesalongtimetocomplete(32swasthelongestinthementionedsituation.Lotswithlessthan1,000pageviewsusuallytakelessthan3s.)

Thequestionis:HowelsecouldIdothisquerytobefaster?

EXPLAINofthequeryBothforasingleday(05/12)

(visitor:696|pageViews:5809|queryexecutiontime:32s)

[{"id": 1,
        "select_type": "PRIMARY",
        "table": "<derived2>",
        "type": "ALL",
        "possible_keys": null,
        "key": null,
        "key_len": null,
        "ref": null,
        "rows": 6192,
        "Extra": ""
    },
    {
        "id": 1,
        "select_type": "PRIMARY",
        "table": "<derived3>",
        "type": "ALL",
        "possible_keys": null,
        "key": null,
        "key_len": null,
        "ref": null,
        "rows": 6192,
        "Extra": ""
    },
    {
        "id": 3,
        "select_type": "DERIVED",
        "table": "Stores",
        "type": "index",
        "possible_keys": "PRIMARY",
        "key": "code",
        "key_len": "62",
        "ref": null,
        "rows": 1,
        "Extra": "Using where; Using index"
    },
    {
        "id": 3,
        "select_type": "DERIVED",
        "table": "VisitorsStores",
        "type": "ref",
        "possible_keys": "PRIMARY,FK_VisitorsStores_Stores",
        "key": "FK_VisitorsStores_Stores",
        "key_len": "5",
        "ref": "tmw_views.Stores.id",
        "rows": 6192,
        "Extra": "Using index"
    },
    {
        "id": 3,
        "select_type": "DERIVED",
        "table": "PageViews",
        "type": "ref",
        "possible_keys": "FK_PageViews_VisitorsStores",
        "key": "FK_PageViews_VisitorsStores",
        "key_len": "8",
        "ref": "tmw_views.VisitorsStores.id",
        "rows": 1,
        "Extra": "Using where"
    },
    {
        "id": 2,
        "select_type": "DERIVED",
        "table": "Stores",
        "type": "index",
        "possible_keys": "PRIMARY",
        "key": "code",
        "key_len": "62",
        "ref": null,
        "rows": 1,
        "Extra": "Using where; Using index; Using temporary"
    },
    {
        "id": 2,
        "select_type": "DERIVED",
        "table": "VisitorsStores",
        "type": "ref",
        "possible_keys": "PRIMARY,FK_VisitorsStores_Sessions,FK_VisitorsStores_Stores",
        "key": "FK_VisitorsStores_Stores",
        "key_len": "5",
        "ref": "tmw_views.Stores.id",
        "rows": 6192,
        "Extra": "Using where"
    },
    {
        "id": 2,
        "select_type": "DERIVED",
        "table": "Sessions",
        "type": "eq_ref",
        "possible_keys": "PRIMARY,FK_Sessions_Visitors",
        "key": "PRIMARY",
        "key_len": "8",
        "ref": "tmw_views.VisitorsStores.session",
        "rows": 1,
        "Extra": ""
    },
    {
        "id": 2,
        "select_type": "DERIVED",
        "table": "Visitors",
        "type": "eq_ref",
        "possible_keys": "PRIMARY",
        "key": "PRIMARY",
        "key_len": "8",
        "ref": "tmw_views.Sessions.visitor",
        "rows": 1,
        "Extra": "Using index"
    },
    {
        "id": 2,
        "select_type": "DERIVED",
        "table": "PageViews",
        "type": "ref",
        "possible_keys": "FK_PageViews_VisitorsStores",
        "key": "FK_PageViews_VisitorsStores",
        "key_len": "8",
        "ref": "tmw_views.VisitorsStores.id",
        "rows": 1,
        "Extra": "Using where; Distinct"
    }
]

SHOW TABLE STATUS FROM tmw_views ;

{
    "table": "TABLES",
    "rows":
    [
        {
            "Name": "PageViews",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 1560181,
            "Avg_row_length": 84,
            "Data_length": 131694592,
            "Max_data_length": 0,
            "Index_length": 194314240,
            "Data_free": 12582912,
            "Auto_increment": null,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "PageViewsYear",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 25,
            "Avg_row_length": 655,
            "Data_length": 16384,
            "Max_data_length": 0,
            "Index_length": 32768,
            "Data_free": 0,
            "Auto_increment": 62,
            "Create_time": "2017-11-27 11:17:44",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "Pages",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 94648,
            "Avg_row_length": 116,
            "Data_length": 11026432,
            "Max_data_length": 0,
            "Index_length": 0,
            "Data_free": 4194304,
            "Auto_increment": 205925,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "Sessions",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 366180,
            "Avg_row_length": 73,
            "Data_length": 26804224,
            "Max_data_length": 0,
            "Index_length": 41058304,
            "Data_free": 3145728,
            "Auto_increment": 531262,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "Stores",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 0,
            "Avg_row_length": 0,
            "Data_length": 16384,
            "Max_data_length": 0,
            "Index_length": 16384,
            "Data_free": 0,
            "Auto_increment": 31,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "SystemLogs",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 5913,
            "Avg_row_length": 1687,
            "Data_length": 9977856,
            "Max_data_length": 0,
            "Index_length": 540672,
            "Data_free": 4194304,
            "Auto_increment": null,
            "Create_time": "2017-11-20 15:21:31",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "Visitors",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 31067,
            "Avg_row_length": 51,
            "Data_length": 1589248,
            "Max_data_length": 0,
            "Index_length": 1589248,
            "Data_free": 4194304,
            "Auto_increment": 102579,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        },
        {
            "Name": "VisitorsStores",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Compact",
            "Rows": 297216,
            "Avg_row_length": 44,
            "Data_length": 13107200,
            "Max_data_length": 0,
            "Index_length": 17842176,
            "Data_free": 4194304,
            "Auto_increment": 562767,
            "Create_time": "2017-10-27 23:31:14",
            "Update_time": null,
            "Check_time": null,
            "Collation": "utf8_general_ci",
            "Checksum": null,
            "Create_options": "",
            "Comment": ""
        }
    ]
}

SHOW VARIABLES LIKE 'query_cache%';

query_cache_limit             1048576
query_cache_min_res_unit      4096
query_cache_size              33554432
query_cache_strip_comments    OFF
query_cache_type              ON
query_cache_wlock_invalidate  OFF
    
asked by anonymous 06.12.2017 / 11:21

2 answers

5

The problem is with the use of the DATE() function, since it ends up disrupting the use of the index.

Consider the following: For each tuple, MySQL has to perform the function before checking if the date is within the limit.

I recommend that you instead set the start time ( 2017-12-31 00:00:00 ) to end date ( 2017-12-31 23:59:59 ) on the start date.

So you would not have to use DATE() and the index will improve the speed of the query, as it should be.

SET @loja := 'loja1';
SET @inicio := '2017-11-01 00:00:00';
SET @fim := '2017-11-31 23:59:59';

SELECT
  COUNT(tVisitors.qtdVisitors) AS visitors,
  SUM(tVisitors.qtdPageViews) as pageViews
FROM (
  SELECT
    Visitors.id   AS qtdVisitors,
    SUM(quantity) AS qtdPageViews
  FROM PageViews
    INNER JOIN VisitorsStores ON VisitorsStores.id = PageViews.visitor
    INNER JOIN Stores ON Stores.id = VisitorsStores.store
    INNER JOIN Sessions ON Sessions.id = VisitorsStores.session
    INNER JOIN Visitors ON Visitors.id = Sessions.visitor
  WHERE
    PageViews.createdAt BETWEEN @inicio AND @fim AND Stores.code = @loja
  GROUP BY qtdVisitors
) as tVisitors;
    
08.12.2017 / 11:39
2

Being straightforward: an implicit JOIN is currently being done between tVisitor and tPageViews in the Both query.

We can improve and optimize with Semi-Join [ WHERE Stores.id IN (Subqueries) ] because both use the same data in the WHERE clause and the Subqueries tPageViews is only a COUNT() .

-- Teste de pageviews
SELECT SUM(quantity) AS qtdePageViews, COUNT(Visitors.id) AS visitor
FROM PageViews
  INNER JOIN VisitorsStores ON VisitorsStores.id = PageViews.visitor
  INNER JOIN Stores ON Stores.id = VisitorsStores.store
  INNER JOIN Sessions ON Sessions.id = VisitorsStores.session
  INNER JOIN Visitors ON Visitors.id = Sessions.visitor
WHERE Stores.id IN (
    SELECT Stores.id
    FROM PageViews
      INNER JOIN VisitorsStores ON VisitorsStores.id = PageViews.visitor
      INNER JOIN Stores ON Stores.id = VisitorsStores.store
    WHERE
      DATE(PageViews.createdAt) BETWEEN '2017-11-01' AND '2017-11-30' 
      AND Stores.code = 'loja1'
);

Just to contribute and facilitate the verification, follow the tables with SQL in the dialect of MySQL :

  

Table Visitors :

CREATE TABLE 'Visitors' (
  'id' bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  'ip' varchar(50) NOT NULL,
  'createdAt' timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY ('id')
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
  

Table Stores :

CREATE TABLE 'Stores' (
  'id' int(10) unsigned NOT NULL AUTO_INCREMENT,
  'code' varchar(250) NOT NULL,
  'visitor' bigint(20) unsigned DEFAULT NULL,
  'createdAt' timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY ('id'),
  UNIQUE KEY 'code_UNIQUE' ('code')
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
  

Table Pages :

CREATE TABLE 'Pages' (
  'id' int(10) unsigned NOT NULL AUTO_INCREMENT,
  'url' varchar(500) DEFAULT NULL,
  PRIMARY KEY ('id')
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
  

Table VisitorsStores :

CREATE TABLE 'VisitorsStores' (
  'id' bigint(20) unsigned NOT NULL,
  'session' bigint(20) unsigned DEFAULT NULL,
  'store' int(10) unsigned DEFAULT NULL,
  PRIMARY KEY ('id'),
  KEY 'session_FKIndex_idx' ('session'),
  KEY 'store_FKIndex_idx' ('store'),
  CONSTRAINT 'session_FKIndex' FOREIGN KEY ('session') 
    REFERENCES 'Sessions' ('id') ON DELETE 
        NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT 'store_FKIndex' FOREIGN KEY ('store') 
    REFERENCES 'Stores' ('id') 
        ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
  

Table PageViews :

CREATE TABLE 'PageViews' (
  'page' int(10) unsigned NOT NULL,
  'visitor' bigint(20) unsigned NOT NULL,
  'quantity' int(10) unsigned DEFAULT NULL,
  'updatedAt' timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  'createdAt' datetime DEFAULT NULL,
  PRIMARY KEY ('page','visitor'),
  KEY 'visitor_FKIndex_idx' ('visitor'),
  CONSTRAINT 'page_FKIndex' FOREIGN KEY ('page') 
    REFERENCES 'Pages' ('id') 
        ON DELETE NO ACTION ON UPDATE NO ACTION,
  CONSTRAINT 'visitor_FKIndex' FOREIGN KEY ('visitor') 
    REFERENCES 'VistitorsStores' ('id') 
        ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
  

Table Sessions :

CREATE TABLE 'Sessions' (
  'id' bigint(20) unsigned NOT NULL,
  'code' varchar(250) DEFAULT NULL,
  'visitor' bigint(20) unsigned DEFAULT NULL,
  'createdAt' timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY ('id'),
  KEY 'session_FKIndex_visitor_idx' ('visitor'),
  CONSTRAINT 'session_FKIndex_visitor' FOREIGN KEY ('visitor') 
    REFERENCES 'Visitors' ('id') 
        ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

There are several optimization strategies that we can use to make queries faster. And what I consider to be a priority is:

  

Queries work more efficiently when you choose a Storage Engine with properties that best meet the requirements of the application.

Optimizing InnoDB queries [ MySQL 5.6 - 2015 , 1085 p.]

  • If you frequently have recurring queries for tables that do not refresh frequently, enable the query cache:

    [mysqld]
    query_cache_type = 1
    query_cache_size = 10M
    
  • Do not create a separate secondary index for each column, as each query can only use one index. Indexes on rarely tested columns or columns with only a few different values may not be useful for any queries. If you have many queries for the same table, testing different column combinations, try to create a small number of concatenated indexes instead of a large number of indexes in a single column . If an index contains all the columns needed for the result set ( known as a coverage index ), the query may prevent you from reading the table data.

  • If an indexed column can not contain NULL values, declare it as NOT NULL when you create the table. The optimizer can best determine which index is most effective for a query, when it knows whether each column contains NULL values or not.

    In the context of the question here, it would be the tuple: 'Stores'. 'code' (varchar (250)) that is too large should be indexed as below, using coverage index by combining the 'Stores'. 'id' with the 'Stores'. 'code':
    ALTER TABLE 'Stores' 
    DROP INDEX 'code_UNIQUE' ,
    ADD UNIQUE INDEX 'code_UNIQUE' USING BTREE ('id' ASC, 'code'(62) ASC);
    

Optimizing the Server: Memory Parameters [ Paul DuBois et al - 2006 , 543 p.]

Interesting to keep in mind that when thinking about performance and optimizations we can not forget the server. I know it may come out of the scope of the question, but it would be an incomplete response if we did not optimize the server. In the context here, let's focus on creating a my-large.cnf file on grazing /usr/share/mysql with the following values:

[mysqld]
key_buffer_size = 256M
table_cache = 256
sort_buffer_size = 1M
query_cache_type = ON
query_cache_size = 16M

Optimizing Server :: Query Cache [ Paul DuBois et al - 2006 , 549 p.]

[mysqld]
query_cache_type = 1
query_cache_size = 10M
query_cache_limit = 2M

MySQL : MySQL

To check the current Query Cache values of your environment:

SHOW VARIABLES LIKE ‘query_cache%’;

Reference : [MySQL 5.6 - 2015], MySQL TM , MySQL 5.6 Reference Manual : Including MySQL Cluster NDB 7.3-7.4 Reference Guide. Document generated on: 2015. May 08. (revision: 43103)
[Paul DuBois et al - 2006], Copyright © 2006 by MySQL AB,
[Sheeri Cabral, Keith Murphy - 2009], Published by Wiley Publishing, Inc., MySQL® Administrator's Bible

    
06.12.2017 / 22:20