Yield does not return data


Calling the method should return an enumerable of HTML elements.

I'm using the HTML Agility Pack to read an HTML file. The same method works as expected if I remove the yield and add the items to a list manually.

    HtmlNode slideCineAll = GetNodeById(cinema, "slide-cine-all");
    HtmlNode section = GetNodeByName(slideCineAll, "section");
    IEnumerable<HtmlNode> articles = GetNodesByName(section, "article");

    private static IEnumerable<HtmlNode> GetNodesByName(HtmlNode root, string node)
    {
        foreach (HtmlNode link in root.ChildNodes)
        {
            if (link.Name.Equals(node))
            {
                yield return link;
            }
        }
    }

    private static List<HtmlNode> GetNodesByNameList(HtmlNode root, string node)
    {
        List<HtmlNode> nodes = new List<HtmlNode>();
        foreach (HtmlNode link in root.ChildNodes)
        {
            if (link.Name.Equals(node))
            {
                nodes.Add(link);
            }
        }
        return nodes;
    }

This is what is stored in the variable after calling the method:

{ConsoleApplication1.Program.GetNodesByName}
node: null
root: null
System.Collections.Generic.IEnumerator<HtmlAgilityPack.HtmlNode>.Current: null
System.Collections.IEnumerator.Current: null

Expected result:

values
Count = 20
[0]: {Name: "article"}
.
.
.
values[0]
_attributes: {HtmlAgilityPack.HtmlAttributeCollection}
_childnodes: {HtmlAgilityPack.HtmlNodeCollection}
_endnode: {Name: "article"}
.
.
.

This is the structure I'm walking through. With either GetNodesByName or GetNodesByNameList I should be able to retrieve a list of children from any node of the HTML:

<div id="slide-cine-all">
<section>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
</section>
</div>

As described at the beginning, the GetNodesByNameList method returns all the items found in the file structure (in this case, those of type article), but the version that uses yield does not.

asked by anonymous 02.11.2014 / 01:51

1 answer


See the yield return documentation. It does not do what you seem to expect. The return part is important: when execution reaches that line, the method hands one element back to the caller and pauses there instead of carrying on with the loop. So each step of your code returns only one of the nodes in your HTML.

You can get more, but you have to keep asking for them. Each request (an iteration of a foreach, or a call to MoveNext on the enumerator) resumes execution from where it left off. yield creates something called a generator. It controls execution through hidden state that records which point of the enumeration the program has reached, so the next call can continue from there. Note that the method returns an enumerable type, not the type of the elements you want out of it. It is this enumerable structure that controls resuming execution from where it left off.
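A minimal sketch of that pause-and-resume behaviour, driving the enumerator by hand instead of through foreach. The names GeneratorDemo, Numbers, Log and Run are made up for illustration. The point to watch is that none of the method body runs until the first MoveNext call.

```csharp
using System;
using System.Collections.Generic;

public static class GeneratorDemo
{
    // Hypothetical log, used only to observe the order of execution.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        Log.Add("start");
        yield return 1;      // hands back 1 and pauses here
        Log.Add("resumed");
        yield return 2;      // hands back 2 and pauses here
        Log.Add("end");
    }

    public static void Run()
    {
        Log.Clear();
        IEnumerable<int> numbers = Numbers();   // no body code has run yet
        Console.WriteLine(Log.Count);           // 0

        using (IEnumerator<int> e = numbers.GetEnumerator())
        {
            e.MoveNext();                       // runs up to the first yield return
            Console.WriteLine(e.Current);       // 1
            e.MoveNext();                       // resumes, runs up to the second
            Console.WriteLine(e.Current);       // 2
            Console.WriteLine(e.MoveNext());    // False: the method has finished
        }
    }
}
```

A foreach loop performs the same GetEnumerator, MoveNext and Current calls for you.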

So your problem is that you take a single link from the node and then try to search inside it as if the other elements of the node were there. Of course they are not: you have not read them yet. That is where the problem comes from.

The other method works because it has no yield: the loop runs to completion, scans all the nodes, and returns a complete list that can then be searched without any problem. Everything you need is there.
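To compare the two styles without needing the HTML Agility Pack, here is a sketch that uses plain strings in place of HtmlNode. FilterLazy and FilterEager are hypothetical stand-ins for GetNodesByName and GetNodesByNameList, not the asker's actual code. Once the lazy sequence is enumerated (here with ToList()), it holds the same elements as the list version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // Lazy version: elements are produced one at a time, on demand.
    public static IEnumerable<string> FilterLazy(IEnumerable<string> names, string wanted)
    {
        foreach (string name in names)
        {
            if (name.Equals(wanted))
            {
                yield return name;
            }
        }
    }

    // Eager version: the loop runs to completion before anything is returned.
    public static List<string> FilterEager(IEnumerable<string> names, string wanted)
    {
        List<string> result = new List<string>();
        foreach (string name in names)
        {
            if (name.Equals(wanted))
            {
                result.Add(name);
            }
        }
        return result;
    }

    public static void Run()
    {
        string[] children = { "article", "#text", "article", "div", "article" };

        // ToList() enumerates the lazy sequence fully, giving something with a Count.
        List<string> fromLazy = FilterLazy(children, "article").ToList();
        List<string> fromEager = FilterEager(children, "article");

        Console.WriteLine(fromLazy.Count);                      // 3
        Console.WriteLine(fromLazy.SequenceEqual(fromEager));   // True
    }
}
```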

I cannot understand the problem as a whole, but I think that in this case yield is getting in the way. I advise you to use it only once you fully understand how it works. It is excellent, but it is not the solution to every problem. I am not saying this problem cannot benefit from it (it can be more efficient, since it visits only what is needed at the moment instead of sweeping the entire structure), but some things would have to change in the code that consumes this method. In practice, when you use yield you will, roughly speaking, have another loop on the outside that walks the structure you are searching (and of course you can run it again and again).
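The efficiency point can be sketched as follows. The Visited counter and the Evens method are hypothetical, added only to show how much of the source the generator actually touches when the consumer stops early.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitDemo
{
    // Hypothetical counter: how many source items the generator has examined.
    public static int Visited;

    public static IEnumerable<int> Evens(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            Visited++;
            if (n % 2 == 0)
            {
                yield return n;
            }
        }
    }

    public static void Run()
    {
        Visited = 0;
        int[] data = { 1, 2, 3, 4, 5, 6, 7, 8 };

        // First() asks for one element and then stops enumerating.
        int first = Evens(data).First();

        Console.WriteLine(first);     // 2
        Console.WriteLine(Visited);   // 2: only the items 1 and 2 were examined
    }
}
```

A version that built a full list first would have examined all eight items before returning anything.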

I suggest inspecting the data and stepping through the execution in the debugger to better understand what the code is doing. That can help you learn how yield works, see your problem clearly, and maybe even find a better solution. Or you could explain the problem in another way, since I could not understand it better than this.

To help you understand, run the following code, taken from this answer on SO:

public void Consumer() {
    foreach(int i in Integers()) {
        Console.WriteLine(i.ToString());
    }
}

public IEnumerable<int> Integers() {
    yield return 1;
    yield return 2;
    yield return 4;
    yield return 8;
    yield return 16;
    yield return 16777216;
}

Or this one, from another SO answer:

// Display powers of 2 up to the exponent 8:
foreach (int i in Power(2, 8)) {
    Console.Write("{0} ", i);
}

public static IEnumerable<int> Power(int number, int exponent) {
    int counter = 0;
    int result = 1;
    while (counter++ < exponent) {
        result = result * number;
        yield return result;
    }
}

I put it on GitHub for future reference.

Did you notice that the consumer code always ends up having to repeat its calls, in a way mirroring the repetitions in the generator code (the yield)? The great advantage of yield in most situations is that it lets you create better abstractions.

See a quick explanation of how the command works internally, and another, more complete explanation.
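As a rough picture of those internals, here is a hand-written enumerator that behaves like a method containing yield return 1; yield return 2; yield return 4;. This is only an approximation under stated assumptions: the class the compiler really generates differs in naming and detail, and IntegersEnumerator and StateMachineDemo are hypothetical names. What it illustrates is the hidden state field that lets MoveNext continue from where it left off.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written approximation of an iterator; not actual compiler output.
public sealed class IntegersEnumerator : IEnumerator<int>
{
    private int _state;   // records which yield return comes next

    public int Current { get; private set; }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0: Current = 1; _state = 1; return true;
            case 1: Current = 2; _state = 2; return true;
            case 2: Current = 4; _state = 3; return true;
            default: return false;   // past the last yield return
        }
    }

    public void Reset() { throw new NotSupportedException(); }

    public void Dispose() { }
}

public static class StateMachineDemo
{
    public static void Run()
    {
        using (IntegersEnumerator e = new IntegersEnumerator())
        {
            while (e.MoveNext())
            {
                Console.WriteLine(e.Current);   // 1, then 2, then 4
            }
        }
    }
}
```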

02.11.2014 / 16:57