Sort list by similarity to a string

I have a list of strings:

AAA
BBB
CCC
ABB
ABC
ACC
ACD

The user types what they are looking for, and the items most similar to that text should move to the top of the list. Examples:

String: A

Result:

AAA
ABB
ABC
ACC
ACD
BBB
CCC

String: AB

Result:

ABB
ABC
AAA
ACC
ACD
BBB
CCC

String: C

Result:

CCC
AAA
ABB
ABC
ACC
ACD
BBB

String: AC

Result:

ACC
ACD
AAA
ABB
ABC
CCC
BBB

String: B

Result:

BBB
AAA
ABB
ABC
ACC
ACD
CCC

Edit:

Just building on @maniero's solution, which worked perfectly:

lista.OrderByDescending(x => (x.StartsWith(padrao))).ThenByDescending(x => (x.Contains(padrao)));

This gave me an even better result than I expected.
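
A minimal sketch of how that two-key ordering behaves on the sample list from the question. The names `OrdenacaoDemo` and `Ordenar` are hypothetical, introduced only for illustration, and `StringComparison.Ordinal` is an assumption added here to keep the prefix test culture-independent. LINQ ordering is stable, so items that tie on both keys keep their original list order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrdenacaoDemo {
    // Hypothetical helper: prefix matches first, then other substring matches,
    // then everything else. false < true, hence the descending order.
    public static List<string> Ordenar(IEnumerable<string> lista, string padrao) {
        return lista
            .OrderByDescending(x => x.StartsWith(padrao, StringComparison.Ordinal))
            .ThenByDescending(x => x.Contains(padrao))
            .ToList();
    }

    public static void Main() {
        var lista = new List<string> { "AAA", "BBB", "CCC", "ABB", "ABC", "ACC", "ACD" };
        Console.WriteLine(string.Join(" ", Ordenar(lista, "C")));
        // CCC ABC ACC ACD AAA BBB ABB
    }
}
```

Note that the tail of the result follows the order of the input list. If the non-matching items should come out alphabetically, as in the examples above, the list would need to be sorted before this ordering is applied.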

    
asked by anonymous 30.11.2017 / 18:37

1 answer

Similarity seems to be defined here as whether the substring occurs in each element of the list. So it is enough to put the elements that contain it first, which is what OrderByDescending() applied to Contains() does. It groups everything that contains the text pattern at the top, followed by everything that does not. Since the sort is stable, the items inside each group keep their original order.

using System;
using System.Collections.Generic;
using System.Linq;

public class Program {
    public static void Main() {
        var lista = new List<string> { "AAA", "BBB", "CCC", "ABB", "ABC", "ACC", "ACD" };
        Semelhante(lista, "A");
        Semelhante(lista, "B");
        Semelhante(lista, "C");
        Semelhante(lista, "AB");
        Semelhante(lista, "AC");
    }
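    // Prints the list with the items that contain the pattern first.
    // false < true, so the descending order puts the matches on top.
    public static void Semelhante(List<string> lista, string padrao) {
        foreach (var item in lista.OrderByDescending(x => (x.Contains(padrao)))) {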
            Console.WriteLine(item);
        }
        Console.WriteLine();
    }
}

See it running on .NET Fiddle and on Coding Ground. I also put it on GitHub for future reference.

The edited question now has an even better option, but only the OP could have known that this was what was needed.

I had previously interpreted similarity differently. I am keeping that version below in case it helps someone else.

I think it can be improved, and I have not fully tested it. Hack alert: this is a workaround just to use nothing but LINQ :P

OrderBy() expects the value each element should be sorted by, i.e. the key. So I give it the number of occurrences of the substring found in the element as the key; after all, the more occurrences, the closer the match. I used Count() on the string to find the number of occurrences.
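
To make the key concrete, here is a small sketch of the counting step on its own. The names `ContagemDemo` and `ContarOcorrencias` are hypothetical, introduced only for illustration, and the ordinal comparison is an assumption added here. Each character position yields the substring that starts there, and the count is how many of those substrings begin with the pattern, so overlapping occurrences are counted as well.

```csharp
using System;
using System.Linq;

public static class ContagemDemo {
    // Hypothetical helper: counts the positions in texto where padrao starts.
    public static int ContarOcorrencias(string texto, string padrao) {
        return texto
            .Select((c, i) => texto.Substring(i))   // every suffix of texto
            .Count(sub => sub.StartsWith(padrao, StringComparison.Ordinal));
    }

    public static void Main() {
        Console.WriteLine(ContarOcorrencias("AAA", "A"));  // 3
        Console.WriteLine(ContarOcorrencias("ABB", "B"));  // 2
        Console.WriteLine(ContarOcorrencias("ABC", "AB")); // 1
        Console.WriteLine(ContarOcorrencias("AAA", "AA")); // 2 (overlapping)
        Console.WriteLine(ContarOcorrencias("CCC", "A"));  // 0
    }
}
```

Creating one substring per character is quadratic in the length of the string, which is fine for short items like these but worth keeping in mind for long ones.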

The requirement for "similar" may not have been quite this, but the question does not make it clear. The result matches what was expected.

I do not know whether ABB should rank above ABC because it has two Bs, or simply because B comes before C (mine ranks it that way).

using System;
using System.Collections.Generic;
using System.Linq;

public class Program {
    public static void Main() {
        var lista = new List<string> { "AAA", "BBB", "CCC", "ABB", "ABC", "ACC", "ACD" };
        var padrao = "A";
        // Key: the number of positions in x where padrao starts (more occurrences first).
        foreach (var item in lista.OrderByDescending(x => x.Select((c, i) => x.Substring(i)).Count(sub => sub.StartsWith(padrao)))) {
            Console.WriteLine(item);
        }
        Console.WriteLine();
        padrao = "AB";
        foreach (var item in lista.OrderByDescending(x => x.Select((c, i) => x.Substring(i)).Count(sub => sub.StartsWith(padrao)))) {
            Console.WriteLine(item);
        }
    }
}

See it running on .NET Fiddle and on Coding Ground. I also put it on GitHub for future reference.

    
30.11.2017 / 19:01