Problem parsing HTML with regex

1

My question is the following: I need to extract the balance values from the HTML snippet below:

HTML:

        <section class="ovw-summary">

                <div class="ovw-summary__balance balance-amounts">
                    <header><h3>Meu dinheiro</h3></header>
                    <div class="box-container mp-box-shadow bg-trama">

                        <dl class="balance-amounts__list available-money">
                            <dt>Disponível</dt>
                            <dd class="price price-large mlb">
                            <span class="price-symbol">R$</span> <span class="price-integer">0</span><span class="price-decimal-mark">,</span><span class="price-decimal">00</span>
                            </dd>
                        </dl>
                        <dl class="balance-amounts__list account-money">
                            <dt>Em conta</dt>
                            <dd class="price">
                            <span class="price-symbol">R$</span> <span class="price-integer">24</span><span class="price-decimal-mark">,</span><span class="price-decimal">99</span> 
                            </dd>
                        </dl>

I wrote the following PHP code, which uses regular expressions to read the HTML and return the values I need:

$SaldoEmConta = '~<dl class="account-money">\s*<dt>Em conta<\/dt>\s*<dd class="ch-price" name="balance_total" value=".*?">R\$ (.*?)<sup>(.*?)<\/sup>\s*<a href=".*?" class="icon-info-balance">\s*<i class="ch-icon-help-sign">\s*<\/i>\s*<\/a>\s*<\/dd>\s*<\/dl>~';
preg_match($SaldoEmConta, $RetornoSaldo, $ArrayConta);

$SaldoDisponivel = '~<dl class="open-detail">\s*<dt class="available-label">Dispon&iacute;vel<\/dt>\s*<dd class="ch-price available-price" name="balance_available" value=".*">R\$ (.*?)<sup>(.*?)<\/sup>\s*<\/dd>~';
preg_match($SaldoEmConta, $RetornoSaldo, $ArrayDisponivel);

echo 'Em conta: R$ ' . $ArrayConta[1].','.$ArrayConta[2]  . ' Disponivel: R$ ' . $ArrayDisponivel[1].','.$ArrayDisponivel[2] .'<hr>';

But for some reason I cannot get these values. Can someone help me correct my regular expression?

    
asked by anonymous 07.03.2015 / 00:04

2 answers

1

Do not use regex for this; use an HTML parser such as simpleparser:

link

or ganon:

link

Both are much easier to work with than regex when handling HTML.

An example with simpleparser:

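include 'simple_html_dom.php'; // assumption: the library file is on the include path
$html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');
$html->find('div', 1)->class = 'bar'; // second div (index 1)
$html->find('div[id=hello]', 0)->innertext = 'foo'; // the div whose id is "hello"
echo $html; // Output: <div id="hello">foo</div><div id="world" class="bar">World</div>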
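
Applied to the markup in the question, the same parser-based idea can also be sketched with PHP's built-in DOMDocument and DOMXPath, without a third-party library. This is only a sketch under assumptions: the sample HTML is a trimmed copy of the question's snippet, `readBalance` is a hypothetical helper name, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical sample input: a trimmed copy of the snippet from the question.
$html = <<<'HTML'
<section class="ovw-summary">
  <dl class="balance-amounts__list available-money">
    <dt>Disponível</dt>
    <dd class="price price-large mlb">
      <span class="price-symbol">R$</span> <span class="price-integer">0</span><span class="price-decimal-mark">,</span><span class="price-decimal">00</span>
    </dd>
  </dl>
  <dl class="balance-amounts__list account-money">
    <dt>Em conta</dt>
    <dd class="price">
      <span class="price-symbol">R$</span> <span class="price-integer">24</span><span class="price-decimal-mark">,</span><span class="price-decimal">99</span>
    </dd>
  </dl>
</section>
HTML;

// Hypothetical helper: finds the <dl> carrying the given class and joins its
// integer and decimal spans. Returns null when either span is missing.
function readBalance(DOMXPath $xpath, string $listClass): ?string
{
    // Match the class as a whole word inside a multi-class attribute.
    $base = sprintf(
        '//dl[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $listClass
    );
    $integer = $xpath->query($base . '//span[@class="price-integer"]')->item(0);
    $decimal = $xpath->query($base . '//span[@class="price-decimal"]')->item(0);
    if ($integer === null || $decimal === null) {
        return null;
    }
    return trim($integer->textContent) . ',' . trim($decimal->textContent);
}

libxml_use_internal_errors(true); // silence warnings about HTML5 tags such as <section>
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$emConta = readBalance($xpath, 'account-money');
$disponivel = readBalance($xpath, 'available-money');

echo 'Em conta: R$ ' . $emConta . ' Disponivel: R$ ' . $disponivel . '<hr>';
```

Because the lookup keys on the `account-money` and `available-money` classes rather than on the exact tag sequence, it should keep working if whitespace or unrelated attributes around the values change.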

As a complement, this is always a good read: link

    
07.03.2015 / 00:45
0

You can use this regular expression; I think it matches what you need:

balance_unavailable['"]\s+value=['"](\d*\.?\d*).*balance_dispute['"]\s+value=['"](\d*\.?\d*)
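
For the markup actually posted in the question, a pattern along the following lines may be closer. The patterns in the question look for `ch-price`, `<sup>` and `name="balance_total"`, none of which appear in the posted HTML, and the second `preg_match` call passes `$SaldoEmConta` instead of `$SaldoDisponivel`. This is only a sketch: `matchBalance` is a hypothetical helper, the sample string is a trimmed copy of the question's snippet, and it assumes PHP 7.1 or later and that the `price-integer` and `price-decimal` spans keep exactly those class attributes.

```php
<?php
// Trimmed copy of the question's snippet, used as hypothetical sample input.
$RetornoSaldo = <<<'HTML'
<dl class="balance-amounts__list available-money">
    <dt>Disponível</dt>
    <dd class="price price-large mlb">
    <span class="price-symbol">R$</span> <span class="price-integer">0</span><span class="price-decimal-mark">,</span><span class="price-decimal">00</span>
    </dd>
</dl>
<dl class="balance-amounts__list account-money">
    <dt>Em conta</dt>
    <dd class="price">
    <span class="price-symbol">R$</span> <span class="price-integer">24</span><span class="price-decimal-mark">,</span><span class="price-decimal">99</span>
    </dd>
</dl>
HTML;

// Hypothetical helper: returns [integer part, decimal part], or null when
// the pattern does not match.
function matchBalance(string $html, string $listClass): ?array
{
    // Anchor on the <dl> whose class list contains $listClass, then lazily
    // skip ahead to the first integer and decimal spans. The "s" flag lets
    // "." cross line breaks.
    $pattern = '~<dl class="[^"]*\b' . preg_quote($listClass, '~') . '\b[^"]*">.*?'
        . '<span class="price-integer">(\d+)</span>.*?'
        . '<span class="price-decimal">(\d+)</span>~s';

    return preg_match($pattern, $html, $m) === 1 ? [$m[1], $m[2]] : null;
}

$ArrayConta = matchBalance($RetornoSaldo, 'account-money');
$ArrayDisponivel = matchBalance($RetornoSaldo, 'available-money');

if ($ArrayConta !== null && $ArrayDisponivel !== null) {
    echo 'Em conta: R$ ' . $ArrayConta[0] . ',' . $ArrayConta[1]
        . ' Disponivel: R$ ' . $ArrayDisponivel[0] . ',' . $ArrayDisponivel[1] . '<hr>';
}
```

A regex like this still depends on attribute order and exact class strings, so it is more fragile than a parser if the page layout changes.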
    
07.03.2015 / 00:45