Checking whether a string is empty

9
While reading a post about good programming practices, specifically about checking whether a string is filled, I came across the following:

Very slow check:

string ret = String.Empty;

if (string.IsNullOrEmpty(ret))

Slow check:

string ret = String.Empty;

if (ret == "")

Performant check:

string ret = String.Empty;

if (ret.Length == 0)

Seeing this left me in doubt: where I work, many people have told me to use string.IsNullOrEmpty, but now I am questioning that advice.

So if anyone can clarify why there is this difference in performance, or even whether this information is true, it would be interesting.

And if there really is an ideal way to check whether a string is filled, can the other two be "discarded", or does the usage vary from case to case?

asked by anonymous 29.08.2018 / 14:03

5 answers

9

Stop reading good practices! They only create programming vices and the illusion that you are learning to program better. Study the fundamentals, understand why things work the way they do, research and see for yourself, or ask experts who can be challenged and evaluated by others, as you are doing now (this used to be more reliable; things now get treated as if they were good, so it is not that reliable either).

Have you done the tests? The right way? Are you sure you saw which one is slower? In the right situation? Be very careful with tests that have no control over the environment. I see a lot of mistakes where someone runs a test and gets results that differ from reality. And even when the test is right, results in normal use may differ from the correct test, because the code does not run in isolation. I ran the test from Rodolfo's answer and, on my machine and under my solution's conditions, it gave a different result; the result was not even very consistent across executions.

It does not happen today, but a compiler could look at the whole context (which is not that hard in certain cases), see that most of the code is unnecessary, and eliminate all of it.
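To illustrate the concerns above, here is a minimal sketch of a slightly more controlled timing loop. The class EmptyCheckTiming and the helper Measure are hypothetical names, and the warm-up count and iteration count are arbitrary assumptions. It warms up before timing, counts the results so the work has an observable effect, and repeats each measurement. It is still not a substitute for a proper benchmarking tool.

```csharp
using System;
using System.Diagnostics;

public static class EmptyCheckTiming
{
    // Hypothetical helper: times `check` over `iterations` calls and returns
    // the elapsed milliseconds. Results are counted into `hits` so the calls
    // have an observable effect.
    public static double Measure(Func<string, bool> check, string input, int iterations, out int hits)
    {
        // Warm-up so JIT compilation is not part of the measurement.
        for (int i = 0; i < 1000; i++) check(input);

        hits = 0;
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (check(input)) hits++;
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        string empty = string.Empty;

        // Repeat each measurement; a single run says very little.
        for (int run = 1; run <= 3; run++)
        {
            double a = Measure(s => s == "", empty, iterations, out int hitsA);
            double b = Measure(s => s.Length == 0, empty, iterations, out int hitsB);
            double c = Measure(s => string.IsNullOrEmpty(s), empty, iterations, out int hitsC);
            Console.WriteLine($"run {run}: == {a:F3} ms, Length {b:F3} ms, IsNullOrEmpty {c:F3} ms (hits {hitsA}/{hitsB}/{hitsC})");
        }
    }
}
```

The timings will vary between runs and machines, which is exactly the point: compare several runs before drawing any conclusion.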

Okay, intuitively I do think the claim is right, but only a correct test can guarantee it.

The first one does something different from the others, so we are already comparing oranges with bananas. It checks whether the string is null before checking the contents. Different semantics already complicate the comparison. So I have to ask: can you guarantee that the string is not null?

I have already given an answer on this subject: the method does only two things, checking whether the string is null and whether its length is 0. We can conclude that .NET itself prefers checking the length, which makes sense, because it avoids a comparison with memory indirection and simply compares a number against a constant. It also shows that IsNullOrWhiteSpace() is potentially much slower and wastes resources if it is not what you need, and its semantics differ in certain situations.
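The difference in semantics can be seen with a small sketch. NullOrZeroLength is a hypothetical hand-written equivalent of the two checks described above, and the sample strings are assumptions chosen for illustration.

```csharp
using System;

public static class EmptyCheckSemantics
{
    // Hypothetical hand-written equivalent: null check, then length check.
    public static bool NullOrZeroLength(string value)
    {
        return value == null || value.Length == 0;
    }

    public static void Main()
    {
        string[] samples = { null, "", "   ", "abc" };
        foreach (string s in samples)
        {
            string shown = s == null ? "null" : "\"" + s + "\"";
            Console.WriteLine(
                shown + ": IsNullOrEmpty=" + string.IsNullOrEmpty(s) +
                ", NullOrZeroLength=" + NullOrZeroLength(s) +
                ", IsNullOrWhiteSpace=" + string.IsNullOrWhiteSpace(s));
        }
    }
}
```

Only the whitespace-only sample gets a different answer from IsNullOrWhiteSpace; the other two checks agree on every sample.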

If you can guarantee that it is not null, then the third option is better. I can state this without testing, from what I know, but some optimization could make them no different. Nothing prevents the compiler or the JITter from identifying what you want and switching to faster code. This can change from version to version, so if you want accurate information you need to test the version you are going to use, on the platform you are going to use. In the end, anything can influence the result.

If you want to ensure the best performance, do not count on an optimization being applied. But this is rarely necessary.

Of course, I would avoid the second whenever possible because it tends not to be optimized. I would discard the first if I can ensure that the string is not null, which I usually guarantee already, and even more so in C# 8.

If someone finds a reason to use another form, they need to justify it.

As a useful note, in C# 8 it will be possible to ensure at compile time that string and other reference types are never null, so any null comparison becomes unnecessary unless the type is declared nullable (string?).
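A minimal sketch of how this looks with nullable reference types enabled. The class and method names are hypothetical. Note that, as the feature shipped, the compiler reports these cases as warnings at compile time; it does not add runtime checks.

```csharp
#nullable enable
using System;

public static class NullableSketch
{
    // `string` is declared non-nullable here: the compiler warns callers
    // that pass a value that may be null.
    public static bool IsEmpty(string value)
    {
        return value.Length == 0;
    }

    // `string?` may be null, so the null check is still needed.
    public static bool IsNullOrEmptyText(string? value)
    {
        return value == null || value.Length == 0;
    }

    public static void Main()
    {
        string? maybe = null;
        Console.WriteLine(IsEmpty(""));
        Console.WriteLine(IsNullOrEmptyText(maybe));
        // Calling IsEmpty(maybe) here would produce a compiler warning.
    }
}
```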

Note

This part does not make sense anymore because the question was edited, but it may be useful for other people.

Finally, in the specific case posted in the question, I would do this:

 
 
 

If I declare a variable of type string and assign no value, it is certainly null; there is nothing else it could be, so there is no need to check whether it is null, let alone whether it contains something. Of course, examples 2 and 3 would give an error. So this is not a matter of good practice; the only thing that makes sense is to do nothing. I answered on the assumption that the example was a mistake and the real intent lies in another context.

    
29.08.2018 / 15:08
6

The "performant" version of the code will throw a System.NullReferenceException if the variable is null; I believe you would have to change it to if (ret == null || ret.Length == 0) .
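A small sketch of that failure mode, with hypothetical helper names: LengthCheckThrows reports whether reading Length throws, and SafeIsEmpty applies the guarded check suggested above.

```csharp
using System;

public static class NullLengthDemo
{
    // Returns true when reading Length throws because the reference is null.
    public static bool LengthCheckThrows(string value)
    {
        try
        {
            // Length is never negative, so this is false unless the access throws.
            return value.Length < 0;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    // Guarded version: the null test short-circuits before Length is read.
    public static bool SafeIsEmpty(string value)
    {
        return value == null || value.Length == 0;
    }

    public static void Main()
    {
        string ret = null;
        Console.WriteLine(LengthCheckThrows(ret)); // unguarded access on null
        Console.WriteLine(SafeIsEmpty(ret));       // guarded check on null
        Console.WriteLine(LengthCheckThrows(""));  // unguarded access on an empty string
    }
}
```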

I did the following test:

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        string ret = null;

        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        for (int i = 0; i <= 100000; i++)
        {
            // Alternative check, swapped in for the other measurement:
            // if (ret == null || ret.Length == 0) Console.WriteLine($"{i} empty string");
            if (string.IsNullOrEmpty(ret)) Console.WriteLine($"{i} empty string");
        }

        stopWatch.Stop();
        // Get the elapsed time as a TimeSpan value.
        TimeSpan ts = stopWatch.Elapsed;

        // Format and display the TimeSpan value.
        string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
            ts.Hours, ts.Minutes, ts.Seconds,
            ts.Milliseconds / 10);
        Console.WriteLine("RunTime " + elapsedTime);

        Console.ReadKey();
    }
}

Here are the results:

if (ret == null || ret.Length == 0)

RunTime 00:00:07.95

if (string.IsNullOrEmpty(ret))

RunTime 00:00:08.43

I confess that I have never bothered with this kind of optimization. I usually use string.IsNullOrWhiteSpace(), which also checks for white space, but of course you have to decide what is best on a case-by-case basis.

    
29.08.2018 / 14:28
5

I think the performance of the first two examples comes down to the fact that they validate the content of the string, whereas the third, faster example only validates its length.

string.IsNullOrEmpty(ret)

In this example, the IsNullOrEmpty method internally amounts to ret == null || ret == string.Empty , so we are basically making two comparisons to get the result.

Here ret == string.Empty amounts to string.Equals(ret, String.Empty) , which basically checks whether the 1st object is equal to the 2nd, taking into account whether either of them is null and even comparing character by character in a loop.

ret == ""

In this case, as explained above, ret == "" amounts to string.Equals(ret, "") , which "drags" some complexity into the comparison and is therefore slow, but less so than the previous option because it does not have the "extra" validation of ret == null .

ret.Length == 0

This is the fastest option if we want to validate whether a string is empty (no characters) or not, but it is not the safest: if the string is null, it throws a System.NullReferenceException .

If we are absolutely sure that the string will never be null , then it is a good option.

Conclusion

Each case is a case, and everything depends on the context.

The safest way is certainly to use string.IsNullOrEmpty when we do not know whether the content will be null, and it is better to be safe than sorry, but ... we still need to evaluate which of the options, given the context we have, is the most practical to use.

    
29.08.2018 / 14:35
4

One way to understand performance is by testing:

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        string test = StringToTest();

        while(true)
        {
            Console.WriteLine("Press any key. ESC to exit");
            var key = Console.ReadKey();
            if (key.Key == ConsoleKey.Escape) break;

            // Time each way of checking the same non-empty string.
            Console.WriteLine(ElapsedTime("==", () => (test == "")));
            Console.WriteLine(ElapsedTime("Length", () => (test.Length == 0)));
            Console.WriteLine(ElapsedTime("IsNullOrEmpty", () => (string.IsNullOrEmpty(test))));
            Console.WriteLine(ElapsedTime("null or Len", () => (test == null || test.Length == 0)));
            Console.WriteLine();
        }

    }

    // Invokes the check one million times and reports the elapsed time.
    public static string ElapsedTime(string what, Func<bool> method)
    {
        var time = new Stopwatch();

        time.Start();
        long i = 0;
        long total = 1000000;
        while (i++ < total)
            method.Invoke();
        time.Stop();

        return string.Format("{0,15} - Elapsed {1,9:f5} ms", what, time.Elapsed.TotalMilliseconds);
    }

    public static string StringToTest()
    {
        return "The quick brown fox jumps over the lazy dog";
    }
}

On my machine the result was:

             == - Elapsed   7,48910 ms
         Length - Elapsed   6,62160 ms
  IsNullOrEmpty - Elapsed   5,81710 ms
    null or Len - Elapsed   7,22380 ms

It seems reasonable that checking the length is more efficient than comparing two strings.

In this test, calling the IsNullOrEmpty method of the string class took less time than the others. Looking at the .NET Framework source code for the IsNullOrEmpty method, we find the following:

public static bool IsNullOrEmpty(String value) {
    return (value == null || value.Length == 0);
}

So there seems to be some compiler optimization when you use the string class method. Could that be it?

As for performance, you need to run tests to choose what is best, but understand that performance is very relative. Running code depends on many factors in the runtime environment that you do not control, and most applications have no need to optimize performance at the level of individual instructions.

What remains are issues that have more to do with program and code organization: "clean code", "readable code", "organized code", or code that follows some pattern or style. This is what we end up calling code written according to good practice.

Moving on to personal opinion: I remember that in the past, every time we joined a development team, we looked for "the good practice guide". In fact, this guide was mostly meant to standardize the code, because people bring different experiences from different places and each ended up writing code differently. It was not a question of whether the code was good or bad, fast or slow; it was about the code being understood more quickly by anyone on the team. If your team can have that privilege, I do not see how it can be harmful, so before following good practices you may need to understand what good practices mean in the community or team you are part of.

29.08.2018 / 16:12
2

If you want to check whether a string has a value, C# provides a method for this on the string type itself.

Ex:

string.IsNullOrEmpty("abc");

If you want to check whether a string has a value while ignoring whitespace, C# also provides a method for this on the string type itself.

Ex:

string.IsNullOrWhiteSpace("abc");

Regardless of language, you can evaluate:

var mstr= ''; if(mstr == null) return true;

And the result will be false.

    
04.09.2018 / 19:53