Annotate: Returning the cheapest product and vendor name (Django)

5

Given the following table

product;company;price 
AAAAA;forn1;395.69 
BBBBB;forn1;939.45 
CCCCC;forn1;480.33 
DDDDD;forn1;590.59 
EEEEE;forn1;847.69 
AAAAA;forn2;227.31 
BBBBB;forn2;375.90 
CCCCC;forn2;602.18 
DDDDD;forn2;615.02 
EEEEE;forn2;845.20 
AAAAA;forn3;120.00 
BBBBB;forn3;379.30 
CCCCC;forn3;227.98 
DDDDD;forn3;522.33 
EEEEE;forn3;193.51 
AAAAA;forn4;869.01 
BBBBB;forn4;287.48 
CCCCC;forn4;405.49 
DDDDD;forn4;477.97 
EEEEE;forn4;971.85 

How to return the lowest price of each product and return the vendor name?

I've tried:

from django.db.models import Min

q=Quotation.objects.values('product').annotate(m=Min('price'))
for i in q: print(i)

And it returns

{'product': 'AAAAA', 'm': Decimal('120.00')}
{'product': 'BBBBB', 'm': Decimal('287.48')}
{'product': 'CCCCC', 'm': Decimal('227.98')}
{'product': 'DDDDD', 'm': Decimal('477.97')}
{'product': 'EEEEE', 'm': Decimal('193.51')}

But it does not return the vendor. Give me when I try

q=Quotation.objects.values('product').annotate(m=Min('price')).values_list('product','company','m')
for i in q: print(i)

It returns

('AAAAA', 'forn1', Decimal('395.69'))
('AAAAA', 'forn2', Decimal('227.31'))
('AAAAA', 'forn3', Decimal('120.00'))
('AAAAA', 'forn4', Decimal('869.01'))
('BBBBB', 'forn1', Decimal('939.45'))
('BBBBB', 'forn2', Decimal('375.90'))
('BBBBB', 'forn3', Decimal('379.30'))
('BBBBB', 'forn4', Decimal('287.48'))
('CCCCC', 'forn1', Decimal('480.33'))
('CCCCC', 'forn2', Decimal('602.18'))
('CCCCC', 'forn3', Decimal('227.98'))
('CCCCC', 'forn4', Decimal('405.49'))
('DDDDD', 'forn1', Decimal('590.59'))
('DDDDD', 'forn2', Decimal('615.02'))
('DDDDD', 'forn3', Decimal('522.33'))
('DDDDD', 'forn4', Decimal('477.97'))
('EEEEE', 'forn1', Decimal('847.69'))
('EEEEE', 'forn2', Decimal('845.20'))
('EEEEE', 'forn3', Decimal('193.51'))
('EEEEE', 'forn4', Decimal('971.85'))

And it's not what I want, I need to

{'product': 'AAAAA', 'company': 'forn2' 'm': Decimal('120.00')}
{'product': 'BBBBB', 'company': 'forn1' 'm': Decimal('287.48')}
{'product': 'CCCCC', 'company': 'forn1' 'm': Decimal('227.98')}
{'product': 'DDDDD', 'company': 'forn3' 'm': Decimal('477.97')}
{'product': 'EEEEE', 'company': 'forn4' 'm': Decimal('193.51')}

In other words, how do I return the cheapest product and supplier name?

    
asked by anonymous 03.09.2015 / 18:32

3 answers

3

( TL; DR )
If the table does not have 'zillions' of records, you can load the entire table into memory and filter with the pandas, see the example below.

import pandas as pd
import io

# Simulando a tabela
s = '''
product,company,price
AAAAA,forn1,395.69
BBBBB,forn1,939.45
CCCCC,forn1,480.33
DDDDD,forn1,590.59
EEEEE,forn1,847.69
AAAAA,forn2,227.31
BBBBB,forn2,375.90
CCCCC,forn2,602.18
DDDDD,forn2,615.02
EEEEE,forn2,845.20
AAAAA,forn3,120.00
BBBBB,forn3,379.30
CCCCC,forn3,227.98
DDDDD,forn3,522.33
EEEEE,forn3,193.51
AAAAA,forn4,869.01
BBBBB,forn4,287.48
CCCCC,forn4,405.49
DDDDD,forn4,477.97
EEEEE,forn4,971.85
'''
# Criando o pandas dataframe (Carregando a tabela)
df = pd.read_csv(io.StringIO(s), usecols=['product','company','price'])

# Agrupando os minimos
df.loc[df.groupby(['product'])['price'].idxmin()]

Output:

    product company price
10  AAAAA   forn3   120.00
16  BBBBB   forn4   287.48
12  CCCCC   forn3   227.98
18  DDDDD   forn4   477.97
14  EEEEE   forn3   193.51

Watch running on a jupyter notebook.

    
27.07.2017 / 06:51
2

Solved with a different solution (but I still accept a better and smaller solution).

In the end it was like this

from core.models import Quotation
from django.db.models import Min

q=Quotation.objects.values('product').annotate(m=Min('price'))
for i in q: print(i)

{'product': 'AAAAA', 'm': Decimal('120.00')}
{'product': 'BBBBB', 'm': Decimal('287.48')}
{'product': 'CCCCC', 'm': Decimal('227.98')}
{'product': 'DDDDD', 'm': Decimal('477.97')}
{'product': 'EEEEE', 'm': Decimal('193.51')}

Returns the cheapest products, but does not return the supplier.

from django.db import connection
len(connection.queries)
1

Solution:

q=Quotation.objects.values_list('product').annotate(m=Min('price'))
c=[]
for i in q: c.append(Quotation.objects.get(product=i[0],price=i[1]))

for co in c: co.product, co.company, co.price

('AAAAA', 'forn3', Decimal('120.00'))
('BBBBB', 'forn4', Decimal('287.48'))
('CCCCC', 'forn3', Decimal('227.98'))
('DDDDD', 'forn4', Decimal('477.97'))
('EEEEE', 'forn3', Decimal('193.51'))

Returns a list of the product, the vendor, and the lowest price. OK , that's all I wanted, but ...

Problem : low performance.

from django.db import connection
len(connection.queries)
6

Question: How can I improve this?

    
03.09.2015 / 21:44
1

If I understand the right subqueries manual >, you could implement the query this way:

outros_precos_menores = Quotation.objects.filter(
    product=OuterRef('product'),
    price__lt=OuterRef('price'),
)

precos_menores = Quotation.objects.annotate(nao_tem_menor=~Exists(outros_precos_menores)).filter(nao_tem_menor=True)

for p in precos_menores:
    print(p)

An alternative would be to use SQL directly. Take a look at the Django documentation at Performing raw SQL queries

SQL using derived table:

select
    qt.product,
    qt.company,
    qt.price

from (
    select qt2.product, Min(qt2.price) as min_price
    from Quotation qt2
    group by qt2.product
) preco

inner join Quotation qt on  qt.product = preco.product
                        and qt.price   = preco.min_price

SQL using subselect:

select
    qt.product,
    qt.company,
    qt.price

from Quotation qt
where qt.price = (  
    select Min(qt2.price)
    from Quotation qt2
    where qt2.product = qt.product
)

You could try something like:

sql = """
select
    qt.product,
    qt.company,
    qt.price

from Quotation qt
where qt.price = (  
    select Min(qt2.price)
    from Quotation qt2
    where qt2.product = qt.product
)
"""

for p in Quotation.objects.raw(sql):
    print(p)
    
26.07.2017 / 18:11