How to sort a Django query ignoring accents?

I am running the query Carro.objects.all().order_by(Lower('marca')), but the ordering does not handle names that start with an accented letter: those results end up at the end of the list. Is there a function that ignores accents when sorting results? Something like unaccent, which is available for filtering in Django 1.8.

    
asked by anonymous 04.05.2015 / 21:29

1 answer

I can't speak for recent versions of Django, but in the last system I developed (on 1.4 or 1.5, I think) I found nothing built in, so I ended up using this workaround:

  • Set the application's locale. The way to do this differs slightly between Unix/Linux and Windows, so I ended up with the following code:

    import locale

    locale_set_correctly = False
    try:
        locale.setlocale(locale.LC_ALL, "pt_BR.UTF-8")  # Unix/Linux
        locale_set_correctly = True
    except locale.Error:
        try:
            locale.setlocale(locale.LC_ALL, "Portuguese_Brazil.1252")  # Windows
            locale_set_correctly = True
        except locale.Error:
            try:
                locale.setlocale(locale.LC_ALL, "")  # Try the default locale
                locale_set_correctly = True
            except locale.Error:
                pass
    
  • Read the records from the database and then sort them in Python. If you intend to use all(), no problem; but if you want, say, only the first 100 in alphabetical order, then unfortunately this option is not for you.

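    import functools

    def locale_sort(result, field):
        if locale_set_correctly:
            def collation(a, b):
                # Objects that have the field sort before those that lack it
                if hasattr(a, field):
                    if hasattr(b, field):
                        fa = getattr(a, field)
                        fb = getattr(b, field)
                        return locale.strcoll(fa, fb)
                    else:
                        return -1
                elif hasattr(b, field):
                    return 1
                else:
                    # Neither has the field: fall back to primary key order
                    return -1 if a.pk < b.pk else 1 if b.pk < a.pk else 0
            # cmp_to_key works on Python 2.7 and 3; sort(cmp) is Python 2 only
            result.sort(key=functools.cmp_to_key(collation))
        return result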
    

    Use it like this (the database ordering is applied first, as a crude fallback in case the locale could not be set):

    resultado = locale_sort(list(Carro.objects.all().order_by(Lower('marca'))), 'marca')
    
  • locale.strcoll sorts so that all variations of the same letter (uppercase, lowercase, accented, unaccented) stay together in the ordering. Only when everything else is equal and just that letter differs does it order them as unaccented lowercase < unaccented uppercase < accented lowercase < accented uppercase. Example (from a Python 2 session):

    >>> sorted([u"Alberto", u"Álvaro", u"avião", u"águia"], cmp=locale.strcoll)
    [u'\xe1guia', u'Alberto', u'\xc1lvaro', u'avi\xe3o']
    >>> sorted([u"A", u"Á", u"a", u"á"], cmp=locale.strcoll)
    [u'a', u'A', u'\xe1', u'\xc1']
    

    Note: this is a very defensive, "robust" solution that I have been using in practice. If you are sure that the locale will be supported and that all your objects always have the marca attribute, you can simplify the code to:

    # Requires "import functools"; sorted(cmp=...) exists only on Python 2
    resultado = sorted(Carro.objects.all(), key=functools.cmp_to_key(lambda a, b: locale.strcoll(a.marca, b.marca)))
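
If depending on the system locale is a problem, one locale-independent alternative (not what I used above, just a sketch) is to build the sort key yourself by stripping the combining accents with unicodedata. This assumes Python 3; accent_insensitive_key is a hypothetical helper name, and the result is a plain accent- and case-insensitive order, not true pt_BR collation.

```python
import unicodedata


def accent_insensitive_key(text):
    """Return a sort key that ignores accents and case (hypothetical helper)."""
    # NFKD splits an accented letter such as "Á" into "A" plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case; the original string breaks ties so the order is deterministic.
    return (stripped.casefold(), text)


marcas = ["Alberto", "Álvaro", "avião", "águia"]
ordenadas = sorted(marcas, key=accent_insensitive_key)
print(ordenadas)  # ['águia', 'Alberto', 'Álvaro', 'avião']
```

With the model from the question this would be sorted(Carro.objects.all(), key=lambda c: accent_insensitive_key(c.marca)). Like the locale approach, it sorts in Python after reading the rows, so it does not help if you only want the first 100 in alphabetical order.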
    
        
    answered 05.05.2015 / 14:09