Capture the WHERE clause of an SQL query

4

I need to capture only the where clause of multiple queries to parse the filters used. For example:

select "DIM_1"."col1",
"DIM_2"."col2" ,
"DIM_3"."col3" , 
"DIM_4"."col4" , 
"FAT_1"."col5"     
from "FAT_1",
 "DIM_1",
 "DIM_2",
 "DIM_3",
 "DIM_4"
where  "DIM_1"."col1" IS NOT NULL 
AND "DIM_2"."col2" LIKE ('SUCCESS')
AND "DIM_3"."col3" BETWEEN 20161213 AND 20161222
AND "DIM_4"."col4" > 0

I created a list with SQL, and then tried to apply regular expressions to extract the part of the where, but without success, it follows what I tried:

'for line in sql:'
    'if re.search(r'[where]\W',line):'
        'where.append(line)'

Unfortunately, I could not extract only the part of the where, can you tell me what error I made and how to fix it?

    
asked by anonymous 22.12.2016 / 19:05

3 answers

3

I believe that what you want is to get the where clauses.

First let's check what can come after the command where .

Subsequent commands from where

Second postgres . Seeing only the most common

  • GROUP BY
  • HAVING
  • ORDER BY
  • LIMIT
  • OFFSET
  • "NOTHING" - Because it can only be where without any subsequent commands.

REGEX

  • pattern: (?<=where)(.*?)((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*)
  • flags: si

Explanation

  • (?<=where) - ensures that what we are searching comes after where
  • (.*?) - everything that follows will be the clauses.
  • ((ORDER BY|GROUP BY|HAVING|LIMIT|OFFSET|$).*) - ensures that it will terminate at one of the commands or at the end ( $ ).
  • Flag: s - says . (dot) should include \n in search.
  • Flag: i - case-insensitive - allows you to search uppercase or lowercase.

Examples

23.12.2016 / 17:46
0

I'm not sure how to do this in Python, but I'm not sure how to do this.

According to your case, I believe it is not working because you are checking line by line, that is, you will find select in the where line, but in the next they will not.

Now if you get the group that corresponds to the regular expression, then I believe it works.

   import re

    text = 'select "DIM_1"."col1",
    "DIM_2"."col2" ,
    "DIM_3"."col3" , 
    "DIM_4"."col4" , 
    "FAT_1"."col5"     
    from "FAT_1",
     "DIM_1",
     "DIM_2",
     "DIM_3",
     "DIM_4"
    where  "DIM_1"."col1" IS NOT NULL 
    AND "DIM_2"."col2" LIKE ('SUCCESS')
    AND "DIM_3"."col3" BETWEEN 20161213 AND 20161222
    AND "DIM_4"."col4" > 0'

    m = re.search('where.*', text)
    if m:
        found = m.group(1)

If someone wants to edit for code correction, or if they do not sense, comment that I delete the answer.

Source: 7.2. re - Regular expression operations

    
22.12.2016 / 19:34
0
select = 'select * from tabela where campo = 1 and and campo2 = 2'
print (select[select.find('where'):])  # com o where incluso

or

 print (select[select.find('where') + 5:]) # ou para excluir a palavra where
    
22.12.2016 / 19:55