Since the question does not make the expected format entirely clear, and the author apparently could not explain it either, this answer treats the strings as format-independent, with one specific condition: wherever an integer appears in a string, the sort should consider its numeric value rather than treating the digits as text. This means, for example, that the string c2sp1s5 should appear before the string c10sp1s5, because the strings contain the numeric values 2 and 10, and 2 is less than 10.
To implement this logic, I will create a function called magic which, as the name suggests, does the magic for the sort. The function receives a string and splits it at each numeric value found, producing a list of strings, some containing only text and others only digits. For example, the input c2sp1s5 yields the list ['c', '2', 'sp', '1', 's', '5'], while the input c2sp1s11 yields the list ['c', '2', 'sp', '1', 's', '11']. Comparing these two lists as they are would reproduce the original problem: the lists are compared element by element, and the result would be exactly the same as before, since as text '11' is still less than '5'. So, before comparing the lists, we must convert the numeric parts to integers, giving the lists ['c', 2, 'sp', 1, 's', 5] and ['c', 2, 'sp', 1, 's', 11]. Now, when the lists are compared, the last elements compared are the integers 5 and 11, and 5 is correctly reported as less than 11.
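The difference between the two comparisons can be checked directly. The snippet below is only an illustrative sketch: the variable names are made up for this example, and the lists are written by hand rather than produced by any function.

```python
# Digit runs kept as strings: compared character by character, so '11' < '5'.
as_text_5 = ['c', '2', 'sp', '1', 's', '5']
as_text_11 = ['c', '2', 'sp', '1', 's', '11']
text_order = as_text_11 < as_text_5

# Digit runs converted to int: compared by value, so 5 < 11.
as_int_5 = ['c', 2, 'sp', 1, 's', 5]
as_int_11 = ['c', 2, 'sp', 1, 's', 11]
int_order = as_int_5 < as_int_11

print(text_order)  # True: as text, the list ending in '11' sorts first
print(int_order)   # True: as integers, the list ending in 5 sorts first
```

Python compares lists element by element and stops at the first difference, which is why only the last element matters here.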
The code looks like this:
    import re

    def magic(value):
        parts = re.split(r'(\d+)', value)
        return [int(part) if part.isdigit() else part for part in parts]
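As a quick check, here is a sketch of what the function actually returns for the example strings. The input '10abc' is a hypothetical one added only to show what happens when a string starts with digits.

```python
import re

def magic(value):
    # Split on runs of digits; the capture group keeps the digits in the result.
    parts = re.split(r'(\d+)', value)
    # Convert the digit runs to int and leave everything else as text.
    return [int(part) if part.isdigit() else part for part in parts]

first = magic('c2sp1s5')
second = magic('c2sp1s11')
leading = magic('10abc')  # hypothetical input that starts with digits

print(first)           # ['c', 2, 'sp', 1, 's', 5, '']
print(second)          # ['c', 2, 'sp', 1, 's', 11, '']
print(leading)         # ['', 10, 'abc']
print(first < second)  # True
```

Note that re.split also emits an empty string when the input starts or ends with digits, so the real lists carry an extra '' that the description leaves out; it does not change the order in these examples. For plain ASCII input the result always alternates text, number, text, number, so two keys are compared text against text and integer against integer, position by position.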
The first line of the function splits the input at each numeric value, and the second returns a list in which the numeric parts have been converted to integers. Since the function expects a single string, applying it to the example from the question requires indicating which string in each sublist should drive the sort. In this case it is the string at index 0, so we do:
    import re

    def magic(value):
        parts = re.split(r'(\d+)', value)
        return [int(part) if part.isdigit() else part for part in parts]

    a = [
        ['c2sp1s5', 0],
        ['c2sp1s10', 1],
        ['c2sp1s11', 0],
        ['c2sp1s1', 0]
    ]

    print(sorted(a, key=lambda v: magic(v[0])))
This produces the result:
    [
        ['c2sp1s1', 0],
        ['c2sp1s5', 0],
        ['c2sp1s10', 1],
        ['c2sp1s11', 0]
    ]
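For contrast, the sketch below sorts the same data with and without the key function. The names plain and natural are chosen only for this illustration, and only the string at index 0 of each sublist is printed.

```python
import re

def magic(value):
    parts = re.split(r'(\d+)', value)
    return [int(part) if part.isdigit() else part for part in parts]

a = [['c2sp1s5', 0], ['c2sp1s10', 1], ['c2sp1s11', 0], ['c2sp1s1', 0]]

# Default ordering: the strings are compared as plain text.
plain = [row[0] for row in sorted(a)]
# Ordering by the key: digit runs are compared as integers.
natural = [row[0] for row in sorted(a, key=lambda v: magic(v[0]))]

print(plain)    # ['c2sp1s1', 'c2sp1s10', 'c2sp1s11', 'c2sp1s5']
print(natural)  # ['c2sp1s1', 'c2sp1s5', 'c2sp1s10', 'c2sp1s11']
```

Without the key, c2sp1s5 ends up last because the character '5' is greater than '1'; with the key, it takes its place between 1 and 10.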