Sort Varchar field taking into account groups and subgroups

10

I need to sort the values below in ascending order. They are of type Varchar and the solutions I have found so far for MySQL did not work.

An example of the data type would be 1.12.02 , as these values serve as specific code for a chart of accounts chart. The problem is that values such as 1.2 are coming after 1.15 and so on.

Numbers have up to 3 sub-levels after the first point, and each sub-level can be up to 3 characters long.

For example, instead of:

1.1   
1.1.01   
1.10   
1.11   
1.12   
1.12.01   
1.15.1   
1.2   

Should be:

1.1    
1.1.01   
1.2   
1.10   
1.11    
1.12   
1.12.01   
1.15.1   
    
asked by anonymous 26.09.2016 / 19:38

1 answer

9

The biggest problem was in the modeling decision, which prioritized the storage of the formatted value, not the semantic value.


Preferred solution: remodel

The ideal solution would be to convert the column to some format that has the padding of the houses preserved, either string or preferably numeric if the maximum padding value "fits" the digits of the chosen type. p>

Instead of

1.15
1.2.17.01

would store, for example:

01015000000
01002017001

I used the AABBBCCCDDD format (two houses on the first level, and 3 on the following ones), but this should be adjusted according to your application.

The points would be placed only in the screen display, and removed in input , with their validation and formatting.


Intermediate solution

A possible solution is to create a function in MySQL that breaks the string into groups, multiplying its factors by a constant corresponding to the possible levels, effectively ordering.

Another one is to use substrings, but this involves extracting each group, appending zeros before and cutting to a fixed extent:

SELECT
  conta
FROM
  planodecontas
ORDER BY
  0 + SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 1 ),
  0 + SUBSTRING_INDEX( SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 2 ), '.', -1),
  0 + SUBSTRING_INDEX( SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 3 ), '.', -1),
  0 + SUBSTRING_INDEX( SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 4 ), '.', -1)

The problem in both cases is that there are "ugly and expensive" functions for use in sorting. Doing in the client application (sort after retrieving, with the main language of the code), can end up being better.


Solution "gambiarra oriented":

By chance, it has a function that is made to interpret IPV4 numbers, which are strings with format AAA.BBB.CCC.DDD , which is INET_ATON

SELECT
   conta
FROM
   planodecontas
ORDER BY
   INET_ATON( SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 4 ) );

The limitation is that INET_ATON needs fixed number of groups for a balanced comparison, and only works with values from 0 to 255 per group .

To align the groups, we use

SUBSTRING_INDEX( CONCAT( conta, '.0.0.0' ), '.', 4 )

That takes 4 separate groups with . of concatenated string .

I do not recommend it, but you should know that it exists.

    
26.09.2016 / 20:20