Sorting a list of strings in Python

0

I need to sort a list whose elements are themselves lists, each containing exactly one string and one number, but I am not getting the desired result. For example, given the list below

a = [
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s1', 0]
]

and I want it sorted this way

a = [
    ['c2sp1s1', 0],
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0]
]

I need the list sorted as in the example above so that I can compare it with another list that is already sorted this way and extract my desired result. If my list ends up sorted differently, as in the last example below, I will get an incorrect result.

More precisely, I need both lists in the same format, with each string at the same position in both, so that I can use the value at index 1 of each inner list to extract my result.

I cannot get the list sorted this way with sorted(a, key=itemgetter(0)), which produces the following:

a = [
    ['c2sp1s1', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s5', 0]
]

Is there a practical way to do this, or will I have to implement the sorting by hand?

    
asked by anonymous 13.12.2017 / 21:47

1 answer

2

Since the question does not make the exact format perfectly clear, and apparently the author could not quite explain it either, this answer treats the strings as format-independent, with one very specific condition: whenever a string contains an integer value, the sort should compare those characters by their numeric value rather than as text. This implies, for example, that the string c2sp1s5 should come before the string c10sp1s5, because the first numeric values in the two strings are 2 and 10, and 2 is less than 10.

To implement this logic, I will create a function called magic which, as the name suggests, does the magic for the sort. The function receives a string and splits it at each numeric value found, producing a list of strings, some containing only text and others only digits. For example, the input c2sp1s5 produces the list ['c', '2', 'sp', '1', 's', '5'], while the input c2sp1s11 produces the list ['c', '2', 'sp', '1', 's', '11'] (re.split also leaves an empty string at the end when the input ends in a digit; it is omitted here because it does not change the comparison). Comparing these two lists directly would reproduce the original problem: the terms are compared one by one as text, and '11' is still less than '5'. So, before comparing the lists, we must convert the numeric parts to integers, giving the lists ['c', 2, 'sp', 1, 's', 5] and ['c', 2, 'sp', 1, 's', 11]. Now the last terms are compared as the integers 5 and 11, and 5 is less than 11.
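The difference between comparing the parts as text and as integers can be checked directly. The snippet below is only an illustrative sketch: the variable names (as_text_5, as_int_5, and so on) are made up for this example, and it uses the strings c2sp1s5 and c2sp1s11 from the question.

```python
import re

# Split each string on runs of digits; the capturing group keeps the digits
as_text_5 = re.split(r'(\d+)', 'c2sp1s5')
as_text_11 = re.split(r'(\d+)', 'c2sp1s11')
print(as_text_5)   # ['c', '2', 'sp', '1', 's', '5', '']
print(as_text_11)  # ['c', '2', 'sp', '1', 's', '11', '']

# As text, '11' sorts before '5' because '1' comes before '5'
print('11' < '5')              # True
print(as_text_11 < as_text_5)  # True: the original problem remains

# Convert digit-only parts to integers, leave the text parts alone
as_int_5 = [int(p) if p.isdigit() else p for p in as_text_5]
as_int_11 = [int(p) if p.isdigit() else p for p in as_text_11]

# Now the last numeric terms are compared as 5 and 11
print(as_int_5 < as_int_11)    # True: the order we want
```

Because the split always alternates text and digit parts, starting with a (possibly empty) text part, the resulting lists compare text with text and integers with integers at each position.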

The code looks like this:

import re

def magic(value):
    parts = re.split(r'(\d+)', value)  # split on digit runs, keeping the digits
    return [int(part) if part.isdigit() else part for part in parts]  # digit parts become ints

The first line of the function splits the input at the numeric values, and the second returns a list with the numeric values converted to integers. Since the function expects a single string, to use it with the example from the question we need to indicate which string drives the sort. Here it is the string at index 0, so we do:

import re


def magic(value):
    parts = re.split(r'(\d+)', value)
    return [int(part) if part.isdigit() else part for part in parts]


a = [
    ['c2sp1s5', 0],
    ['c2sp1s10', 1],
    ['c2sp1s11', 0],
    ['c2sp1s1', 0]
]

print(sorted(a, key=lambda v: magic(v[0])))


Which produces the result:

[
    ['c2sp1s1', 0], 
    ['c2sp1s5', 0], 
    ['c2sp1s10', 1], 
    ['c2sp1s11', 0]
]
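The question mentions comparing the sorted list, position by position, with another list that is already in this order. A minimal sketch of that step follows; the list named reference and its values are invented for illustration, as is the rule of collecting the strings whose numbers differ, since the question does not say what the comparison should extract. It also uses a.sort(...) to sort in place rather than building a new list with sorted.

```python
import re


def magic(value):
    parts = re.split(r'(\d+)', value)
    return [int(part) if part.isdigit() else part for part in parts]


a = [['c2sp1s5', 0], ['c2sp1s10', 1], ['c2sp1s11', 0], ['c2sp1s1', 0]]

# Hypothetical second list, already in natural order (values are made up)
reference = [['c2sp1s1', 0], ['c2sp1s5', 1], ['c2sp1s10', 1], ['c2sp1s11', 0]]

# Sort in place using the same key as before
a.sort(key=lambda v: magic(v[0]))

# Pair the entries by position and keep the strings whose numbers differ
differing = [name for (name, n1), (other, n2) in zip(a, reference)
             if name == other and n1 != n2]
print(differing)  # ['c2sp1s5']
```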
    
13.12.2017 / 22:04