Read complex XML with ASP to insert into a database

0

My main problem is that the XML is not uniform: it does not always have the same sequence of nodes.

It is a huge XML file, about 600 thousand lines per day. The structure is always the same down to a certain depth, up to the item code node. From there, IF the item was traded, it MAY have price, date, quantity, previous price, etc. (not necessarily the same set every time). A large number of the items listed in the XML are of no interest to me because they had no movement, so they will not have any of those nodes (they may have some others, which also do not interest me).

What I want is to read the XML node by node until I reach the item code, then check the nodes below it to see whether the node I need exists, and from there select what should go into the DB. But I don't think I'm going about this the best way, and since the file is very big, reading everything and inserting it into the DB will be very slow.

I'm doing this:

    <%
Set objXMLDoc = CreateObject("MSXML2.DOMDocument.6.0")
objXMLDoc.async = False
objXMLDoc.load(Server.MapPath("arquivo.xml"))

'Target path in the XML
'//Document/BizFileHdr/Xchg/BizGrp/Document/PricRpt (TradDt/1|SctyId/1|TradDtls/2|FinInstrmAttrbts/*)
Set objNodeList = objXMLDoc.SelectNodes("*")
response.write objNodeList.length
on Error resume next
 For Each objNode In objNodeList
    if objNode.nodeName="Document"  then    response.write "<br>"
    response.write "0- "&objNode.nodeName &" : "& objNode.firstChild.nodevalue &"<br>"
    For Each objNode2 In objNode.childNodes
        response.write "1- "&objNode2.nodeName &" : "& objNode2.firstChild.nodevalue &"<br>"
        For Each objNode3 In objNode2.childNodes
            response.write "2- "&objNode3.nodeName &" : "& objNode3.firstChild.nodevalue &"<br>"
            For Each objNode4 In objNode3.childNodes
                response.write "3- "&objNode4.nodeName &" : "& objNode4.firstChild.nodevalue &"<br>"
                For Each objNode5 In objNode4.childNodes
                    if objNode5.nodeName="Document" then    
                        response.write "<br><b>"
                        response.write "4- "&objNode5.nodeName &" : "& objNode5.firstChild.nodevalue &"</b><br>"
                        For Each objNode6 In objNode5.childNodes
                            response.write "5- "&objNode6.nodeName &" : "& objNode6.firstChild.nodevalue &"<br>"
                            For Each objNode7 In objNode6.childNodes
                                response.write "<b>A "& objNode7.childNodes.item(objNode7).text &"</b><br>"
                                if objNode7.nodeName="TradDtls" then 
                                    if objNode7.length=0 then exit for
                                end if
                                response.write "6- "&objNode7.nodeName &" : "& objNode7.firstChild.nodevalue &"<br>"
                                For Each objNode8 In objNode7.childNodes
                                    response.write "7- "&objNode8.nodeName &" : "& objNode8.firstChild.nodevalue &"<br>"
                                Next
                            Next
                        Next
                    end if  'if objNode5.nodeName="Document" then  
                Next
            Next
        Next
    Next
 Next
on error goto 0
%>

In SelectNodes("*") I could not specify anything else, such as "PricRpt", and I do not know why! Any suggestions for improvement are welcome. What would make things easier is if, at the point where I test if objNode5.nodeName="Document" then, I could already check whether the node //Document/BizFileHdr/Xchg/BizGrp/Document/PricRpt/TradDtls exists. Or, if it is possible, filter when loading the XML so that entries without that node are not listed at all.

Since the XML is huge, here is a heavily trimmed version with just one entry that has these nodes and one that does not (the parts that matter are marked with **).

    <?xml version="1.0" encoding="utf-8"?>
<Document xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:bvmf.052.01.xsd bvmf.052.01.xsd" xmlns="urn:bvmf.052.01.xsd">
  <BizFileHdr>
    <Xchg>
      <BizGrpDesc>
        <Fr>
          <OrgId>
            <Id>
              <OrgId>
                <Othr>
                  <Id>BVMF</Id>
                  <Issr>40</Issr>
                  <SchmeNm>
                    <Prtry>39</Prtry>
                  </SchmeNm>
                </Othr>
              </OrgId>
            </Id>
          </OrgId>
        </Fr>
        <To>
          <OrgId>
            <Id>
              <OrgId>
                <Othr>
                  <Id>PUBLIC</Id>
                  <Issr>40</Issr>
                  <SchmeNm>
                    <Prtry>39</Prtry>
                  </SchmeNm>
                </Othr>
              </OrgId>
            </Id>
          </OrgId>
        </To>
        <BizGrpDtls>
          <BizGrpIdr>BV000328201708290328000002001525443</BizGrpIdr>
          <TtlNbOfMsg>8733</TtlNbOfMsg>
          <BizGrpTp>BVBG.086.01</BizGrpTp>
          <CreDtAndTm>2017-08-29T20:01:52</CreDtAndTm>
        </BizGrpDtls>
        <MsgTpDef>
          <MsgDefIdr>BVMF.217.01</MsgDefIdr>
          <NbOfMsg>8733</NbOfMsg>
        </MsgTpDef>
      </BizGrpDesc>
      <BizGrp>
        <AppHdr xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:iso:std:iso:20022:tech:xsd:head.001.001.01">
          <BizMsgIdr>BV000328201708290328000002001525443</BizMsgIdr>
          <MsgDefIdr>BVMF.217.01</MsgDefIdr>
          <CreDt>2017-08-29T23:01:52Z</CreDt>
          <Fr>
            <OrgId>
              <Id>
                <OrgId>
                  <Othr>
                    <Id>BVMF</Id>
                    <SchmeNm>
                      <Prtry>39</Prtry>
                    </SchmeNm>
                    <Issr>40</Issr>
                  </Othr>
                </OrgId>
              </Id>
            </OrgId>
          </Fr>
          <To>
            <OrgId>
              <Id>
                <OrgId>
                  <Othr>
                    <Id>PUBLIC</Id>
                    <SchmeNm>
                      <Prtry>39</Prtry>
                    </SchmeNm>
                    <Issr>40</Issr>
                  </Othr>
                </OrgId>
              </Id>
            </OrgId>
          </To>
        </AppHdr>
        **<Document xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:bvmf.217.01.xsd">**
          **<PricRpt>**
            <TradDt>
              <Dt>2017-08-29</Dt>
            </TradDt>
            <SctyId>
              <TckrSymb>BGIV17P013450</TckrSymb>
            </SctyId>
            <FinInstrmId>
              <OthrId>
                <Id>100000087017</Id>
                <Tp>
                  <Prtry>8</Prtry>
                </Tp>
              </OthrId>
              <PlcOfListg>
                <MktIdrCd>BVMF</MktIdrCd>
              </PlcOfListg>
            </FinInstrmId>
            **<TradDtls />**
            <FinInstrmAttrbts>
              <MaxTradLmt Ccy="BRL">999999.01</MaxTradLmt>
              <MinTradLmt Ccy="BRL">0.01</MinTradLmt>
            </FinInstrmAttrbts>
          </PricRpt>
        </Document>
      </BizGrp>
      <BizGrp>
        <AppHdr xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:iso:std:iso:20022:tech:xsd:head.001.001.01">
          <BizMsgIdr>BV000328201708290328000002001558725</BizMsgIdr>
          <MsgDefIdr>BVMF.217.01</MsgDefIdr>
          <CreDt>2017-08-29T23:01:55Z</CreDt>
          <Fr>
            <OrgId>
              <Id>
                <OrgId>
                  <Othr>
                    <Id>BVMF</Id>
                    <SchmeNm>
                      <Prtry>39</Prtry>
                    </SchmeNm>
                    <Issr>40</Issr>
                  </Othr>
                </OrgId>
              </Id>
            </OrgId>
          </Fr>
          <To>
            <OrgId>
              <Id>
                <OrgId>
                  <Othr>
                    <Id>PUBLIC</Id>
                    <SchmeNm>
                      <Prtry>39</Prtry>
                    </SchmeNm>
                    <Issr>40</Issr>
                  </Othr>
                </OrgId>
              </Id>
            </OrgId>
          </To>
        </AppHdr>
        **<Document xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:bvmf.217.01.xsd">**
          **<PricRpt>**
            <TradDt>
              <Dt>2017-08-30</Dt>
            </TradDt>
            <SctyId>
              **<TckrSymb>ICFZ17</TckrSymb>**
            </SctyId>
            <FinInstrmId>
              <OthrId>
                <Id>100000070706</Id>
                <Tp>
                  <Prtry>8</Prtry>
                </Tp>
              </OthrId>
              <PlcOfListg>
                <MktIdrCd>BVMF</MktIdrCd>
              </PlcOfListg>
            </FinInstrmId>
            **<TradDtls>
              <TradQty>26</TradQty>
            </TradDtls>**
            **<FinInstrmAttrbts>**
              <MktDataStrmId>E</MktDataStrmId>
              <NtlFinVol Ccy="BRL">1952230.59</NtlFinVol>
              <IntlFinVol Ccy="USD">617150</IntlFinVol>
              <OpnIntrst>6413</OpnIntrst>
              <FinInstrmQty>39</FinInstrmQty>
              <BestBidPric Ccy="USD">157.7</BestBidPric>
              <BestAskPric Ccy="USD">158.2</BestAskPric>
              <FrstPric Ccy="USD">157.85</FrstPric>
              <MinPric Ccy="USD">157.7</MinPric>
              <MaxPric Ccy="USD">159.7</MaxPric>
              <TradAvrgPric Ccy="USD">158.24</TradAvrgPric>
              <LastPric Ccy="USD">157.7</LastPric>
              <RglrTxsQty>26</RglrTxsQty>
              <RglrTraddCtrcts>39</RglrTraddCtrcts>
              <NtlRglrVol Ccy="BRL">1952230.59</NtlRglrVol>
              <IntlRglrVol Ccy="USD">617150</IntlRglrVol>
              <AdjstdQt Ccy="USD">156.9</AdjstdQt>
              <AdjstdQtStin>F</AdjstdQtStin>
              <PrvsAdjstdQt Ccy="USD">159.1</PrvsAdjstdQt>
              <PrvsAdjstdQtStin>F</PrvsAdjstdQtStin>
              <OscnPctg>0.5</OscnPctg>
              <VartnPts Ccy="USD">-2.2</VartnPts>
              <EqvtVal Ccy="BRL">496.32</EqvtVal>
              <AdjstdValCtrct Ccy="USD">695.92</AdjstdValCtrct>
              <MaxTradLmt Ccy="USD">171</MaxTradLmt>
              <MinTradLmt Ccy="USD">142.8</MinTradLmt>
            **</FinInstrmAttrbts>**
          </PricRpt>
        </Document>
      </BizGrp>
    </Xchg>
  </BizFileHdr>
</Document>
    
asked by anonymous 31.08.2017 / 18:04

1 answer

0

I suggest you use version 3.0 of MSXML and use XPath to search for the XML nodes more precisely.

See an example:

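// JScript. The variable xml is assumed to already hold the XML text.
var xmlDoc = Server.CreateObject("Msxml2.DOMDocument.3.0");
xmlDoc.loadXML(xml);
// documentElement is already the root <Document>, so the relative path starts at its child.
var expNode = xmlDoc.documentElement.selectSingleNode("BizFileHdr/Xchg/BizGrp/Document/PricRpt/TradDtls");
if (expNode)
    Response.Write(expNode.text);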
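
If you would rather stay on MSXML 6.0 as in the question, here is a rough VBScript sketch of the same idea. A likely reason SelectNodes("PricRpt") finds nothing is that the sample file declares default namespaces (urn:bvmf.052.01.xsd on the outer Document, urn:bvmf.217.01.xsd on the inner one), and XPath in MSXML 6.0 only matches namespaced elements through a prefix registered via SelectionNamespaces. The prefixes o and i below are arbitrary names picked for this example, and the file name is taken from the question. The predicate [i:TradDtls/*] keeps only the PricRpt elements whose TradDtls has at least one child, so the untraded items are skipped without nested loops.

    <%
    ' Sketch only: assumes MSXML 6.0 is installed and that arquivo.xml has the
    ' structure shown in the question. The prefixes "o" and "i" are arbitrary;
    ' only the namespace URIs come from the sample file.
    Dim xmlDoc, reports, report, ticker, qty

    Set xmlDoc = Server.CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    xmlDoc.setProperty "SelectionNamespaces", _
        "xmlns:o='urn:bvmf.052.01.xsd' xmlns:i='urn:bvmf.217.01.xsd'"

    If Not xmlDoc.load(Server.MapPath("arquivo.xml")) Then
        Response.Write "Load failed: " & Server.HTMLEncode(xmlDoc.parseError.reason)
        Response.End
    End If

    ' Select only the PricRpt elements whose TradDtls has at least one child element.
    Set reports = xmlDoc.selectNodes( _
        "/o:Document/o:BizFileHdr/o:Xchg/o:BizGrp/i:Document/i:PricRpt[i:TradDtls/*]")

    For Each report In reports
        Set ticker = report.selectSingleNode("i:SctyId/i:TckrSymb")
        Set qty = report.selectSingleNode("i:TradDtls/i:TradQty")
        If Not ticker Is Nothing Then Response.Write Server.HTMLEncode(ticker.text)
        If Not qty Is Nothing Then Response.Write " : " & Server.HTMLEncode(qty.text)
        Response.Write "<br>"
    Next
    %>

With the trimmed sample from the question, this should list only the ICFZ17 entry. Note that the DOM still loads the whole file into memory, so this removes the nested loops but not the memory cost of a very large file.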
    
31.08.2017 / 18:18