Retrieve values from an XML contained in a String


I have a String that contains XML with a structure similar to this:

<TAG0>
   <TAG1>
      <TAG2>valor1</TAG2>
   </TAG1>
   <TAG1>
      <TAG2>valor2</TAG2>
   </TAG1>
</TAG0>

Some tags with the same name are repeated in the XML body, as in a sale that contains several items. The String holds the XML as plain text, with no whitespace between the tags. For example:

String VARIAVEL = "<TAG0><TAG1><TAG2>valor1</TAG2></TAG1><TAG1><TAG2>valor2</TAG2></TAG1></TAG0>";

What I need to do, as in this example, is retrieve the values of the "TAG2" elements, given that there can be N "TAG1" elements. The real case is retrieving all the CFOPs from the items in an NFe.

    
asked by anonymous 14.10.2015 / 20:45

2 answers


It is possible to get the values with a regex, but the drawback is that you have to change the regex whenever you need to look for a different element. I suggest you build a Document, which makes it easier to get elements by name.

I renamed the tags to keep the example short, but the structure is the same and the idea is to get the text inside the <c> elements:

<?xml version="1.0" encoding="UTF-8"?>
<a>
  <b>
     <c>Valor 1</c>
  </b>
  <b>
     <c>Valor 2</c>
  </b>
</a>

Here is a solution using these Java classes:

import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

import org.xml.sax.InputSource;

And the code to parse the XML string and get the content of the elements:

String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a><b><c>valor 1</c></b><b><c>valor 2</c></b></a>";

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

Document xmlDocument = builder.parse(new InputSource(new StringReader(xml)));

// Get the elements whose tag name is "c":
NodeList nodes = xmlDocument.getElementsByTagName("c");
for(int i = 0; i < nodes.getLength(); i++)
    System.out.println(nodes.item(i).getTextContent());

Output:

valor 1
valor 2
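
To tie this back to the question, the same approach can be wrapped in a small helper that takes the tag name as a parameter, so only the argument changes when you look for a different element. This is just a sketch: the class name TagValues and the method valuesOf are made up for illustration, and it assumes the XML is well formed and that the tag name is written exactly as in the document (the lookup is case-sensitive).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagValues {

    // Hypothetical helper (not part of any library): returns the text
    // content of every element named tagName, in document order.
    static List<String> valuesOf(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = document.getElementsByTagName(tagName);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // The XML string from the question, with its original tag names.
        String xml = "<TAG0><TAG1><TAG2>valor1</TAG2></TAG1>"
                   + "<TAG1><TAG2>valor2</TAG2></TAG1></TAG0>";
        System.out.println(valuesOf(xml, "TAG2")); // prints [valor1, valor2]
    }
}
```

For the NFe case the call would presumably be valuesOf(xml, "CFOP"), assuming that is the element name used in your document.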

    
15.10.2015 / 01:24

You can solve the problem with regular expressions. If the text is well defined, in the format you showed, the code below will do the job.

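import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String texto = "<TAG0><TAG1><TAG2>valor1 x</TAG2>" +
                          "</TAG1><TAG1><TAG2>valor2</TAG2></TAG1></TAG0>";
        // Capture whatever sits between <tag2> and </tag2>, ignoring case.
        Pattern p = Pattern.compile("<tag2>(.+?)</tag2>", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(texto);
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}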

I split the string across two lines only to make it easier to read here.

The code identifies the pattern you asked for with the regular expression <tag2>(.+?)</tag2>, which means:

Find the text <tag2>, followed by one or more characters of any kind matched non-greedily, followed by the text </tag2>. The Pattern.CASE_INSENSITIVE flag makes the match ignore case.

while( m.find() ) walks through the text for as long as the pattern keeps matching, and m.group(1) returns group 1, the part captured by the parentheses in (.+?), which is then printed.
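
To see why the non-greedy quantifier matters, here is a small sketch that runs the same text through both (.+?) and the greedy (.+). The class name GreedyDemo and the helper captures are made up for this illustration; everything else is the standard java.util.regex API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {

    // Hypothetical helper: collects group 1 of every match of regex in text,
    // ignoring case.
    static List<String> captures(String regex, String text) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String texto = "<TAG0><TAG1><TAG2>valor1</TAG2></TAG1>"
                     + "<TAG1><TAG2>valor2</TAG2></TAG1></TAG0>";

        // Non-greedy: stops at the first </TAG2> after each <TAG2>.
        System.out.println(captures("<tag2>(.+?)</tag2>", texto));
        // prints [valor1, valor2]

        // Greedy: runs from the first <TAG2> to the last </TAG2>.
        System.out.println(captures("<tag2>(.+)</tag2>", texto));
        // prints [valor1</TAG2></TAG1><TAG1><TAG2>valor2]
    }
}
```

With the greedy version the whole middle of the document ends up in a single match, which is why the ? after .+ is needed here.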

To see the regex working, see here:

Link to the tool I used, with the pattern already set up: link

    
14.10.2015 / 21:21