How do I extract this information from a string?


Below is an output string generated by ffmpeg for some video file:

  

Stream #0:0(jpn): Video: h264 ...
Stream #0:1(jpn): Audio: mp3 ...
Stream #0:2(por): Subtitle ...

In most cases ffmpeg lets me select the streams I want to convert with -map and a language code, but for some video files that language-based mapping is not possible, so I want to use PHP to select the stream by its number rather than by its language.

How can I use PHP to get the stream identifiers for the video (0:0), audio (0:1) and subtitle (0:2) streams, given that their position and even their language can change?

    
asked by anonymous 06.04.2017 / 22:34

2 answers


You can use preg_match_all, like this:

<?php

$resposta = 'Stream #0:0(jpn): Video: h264 ...
Stream #0:1(jpn): Audio: mp3 ...
Stream #0:2(por): Subtitle ...';

if (preg_match_all('/Stream\s[#](\d+:\d+)\((\w+)\):\s(\w+)[ :]+(\w+|)/i', $resposta, $output) == 0) {
    echo 'No information found';
} else {
    print_r($output);
}

Then just work with the resulting array. You can see the result on ideone.
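As one possible way to work with the matches, the sketch below groups the stream identifiers by stream type, so the video, audio and subtitle identifiers can be looked up regardless of their position or language. It is only a sketch under some assumptions: the regular expression is a simplified variant with named groups and an optional language tag, and the names $regex, $linhas and $porTipo are made up for this illustration.

```php
<?php
// Sample input copied from the question; in practice this would be ffmpeg's output.
$resposta = 'Stream #0:0(jpn): Video: h264 ...
Stream #0:1(jpn): Audio: mp3 ...
Stream #0:2(por): Subtitle ...';

// Named groups: "id" is the stream identifier, "tipo" is the stream type.
// The language tag is optional here (an assumption, since some files omit it).
$regex = '/Stream\s#(?<id>\d+:\d+)(?:\(\w+\))?:\s(?<tipo>\w+)/i';

$porTipo = array(); // hypothetical name: maps stream type => list of stream ids

// PREG_SET_ORDER yields one array per matched line instead of one array per group.
if (preg_match_all($regex, $resposta, $linhas, PREG_SET_ORDER)) {
    foreach ($linhas as $linha) {
        $porTipo[strtolower($linha['tipo'])][] = $linha['id'];
    }
}

print_r($porTipo);
```

Each type maps to a list because a file may carry several audio or subtitle streams; $porTipo['audio'][0] would then be the first audio stream found.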

Edited

I put together an easier-to-use example; read the comments:

<?php
function extrairDados($dados) {
    if (preg_match_all('/Stream\s[#](\d+:\d+)(\(\w+\))?:(\s\w+|)[ :]+(\w+|)/i', $dados, $output) == 0) {
        echo 'No information found', PHP_EOL;
    } else {
        $reorganizado = array(); // Array that will hold the final result

        // Keys used to make each item's meaning more intuitive
        // ('tempo' holds the stream identifier, e.g. 0:0)
        $chaves = array(
            'tempo',
            'idioma',
            'formato',
            'codec'
        );

        // Remove the first item generated by preg_match_all (the full matches); it is not needed
        array_shift($output);

        // Count the capture groups
        $y = count($output);

        for ($x = 0; $x < $y; $x++) {
            $item = $output[$x]; // Current capture group
            $chave = $chaves[$x]; // Key that identifies this group in the result
            $j = count($item); // Number of matched streams

            for ($i = 0; $i < $j; $i++) {

                // Create the sub-array if it does not exist yet
                if (isset($reorganizado[$i]) === false) {
                    $reorganizado[$i] = array();
                }

                $str = trim($item[$i]); // Strip surrounding whitespace
                $str = trim($str, '()'); // Strip ( and ) from both ends

                // Store the value under the corresponding key
                $reorganizado[$i][$chave] = $str;
            }
        }

        // Return the reorganized array
        return $reorganizado;
    }

    return false;
}

To use it, just do this:

$resposta = 'Stream #0:0(jpn): Video: h264 ...
Stream #0:1(jpn): Audio: mp3 ...
Stream #0:2(por): Subtitle ...';

$dados = extrairDados($resposta);

if ($dados) {
    foreach ($dados as $item) {
        echo 'Stream: ', $item['tempo'], PHP_EOL;
        echo 'Language: ', $item['idioma'], PHP_EOL;
        echo 'Format: ', $item['formato'], PHP_EOL;
        echo 'Codec: ', $item['codec'], PHP_EOL, PHP_EOL;
    }
}
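Since the goal in the question is to select streams by number with -map, here is a rough sketch of how the array returned by extrairDados could be turned into -map arguments. The $dados array is written by hand in the same shape the function returns, and primeiroStream is a hypothetical helper introduced only for this example; picking the first stream of each type is an assumption, not a rule.

```php
<?php
// Hand-written sample in the shape returned by extrairDados() above.
$dados = array(
    array('tempo' => '0:0', 'idioma' => 'jpn', 'formato' => 'Video', 'codec' => 'h264'),
    array('tempo' => '0:1', 'idioma' => 'jpn', 'formato' => 'Audio', 'codec' => 'mp3'),
    array('tempo' => '0:2', 'idioma' => 'por', 'formato' => 'Subtitle', 'codec' => ''),
);

// Hypothetical helper: returns the identifier of the first stream
// of the given type (case-insensitive), or null when there is none.
function primeiroStream(array $dados, $formato) {
    foreach ($dados as $item) {
        if (strcasecmp($item['formato'], $formato) === 0) {
            return $item['tempo'];
        }
    }
    return null;
}

$args = array();
foreach (array('Video', 'Audio', 'Subtitle') as $formato) {
    $id = primeiroStream($dados, $formato);
    if ($id !== null) {
        // The identifier only contains digits and a colon, as enforced by the regex.
        $args[] = '-map ' . $id;
    }
}

echo implode(' ', $args), PHP_EOL;
```

A type that is missing from the file is simply skipped, so the resulting argument list only references streams that actually exist.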

Examples with different inputs:

$resposta1 = '    Stream #0:0(und): Video: h264 (High) ...
    Stream #0:1(und): Audio: aac (LC) ...';

$resposta2 = '    Stream #0:0: Video: mpeg4 ...
    Stream #0:1: Audio: mp3 ...';

$resposta3 = '    Stream #0:0(und): Video: mpeg4 ...
    Stream #0:1(jpn): Audio: mp3 ...
    Stream #0:1(por): Subtitle:';

print_r(extrairDados($resposta1));
print_r(extrairDados($resposta2));
print_r(extrairDados($resposta3));

Example on ideone

    
06.04.2017 / 22:57

A solution very similar to that presented by Guilherme, but with a little less code, can be implemented as:

if (preg_match_all('/Stream\s[#](\d+:\d+)(\(\w+\))?:(\s\w+|)[ :]+(\w+|)/i', $resposta, $output) == 0) {
    echo 'No information found';
} else {
    print_r(array_map(null, ...$output));
}

In fact, the approach is exactly the same: use a regular expression to extract the data from the text. The only thing that changes is how that data is grouped, which here is done with the array_map function.

The array_map function takes a callback as its first parameter; when that callback is null, the array's own values are returned. If you pass multiple arrays, they are zipped together, so to speak, similar to Python's native zip function.

  

The ... operator used in the call to array_map causes each value of $output to be passed as a separate parameter. The equivalent code would be: array_map(null, $output[0], $output[1], $output[2], $output[3], $output[4]). This operator is known as splat; it supports arrays and Traversable objects and is available since PHP 5.6.
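The zipping behaviour of array_map with a null callback, and its equivalence with the ... form, can be checked with a small example. The arrays below are made up and merely stand in for the per-group arrays that preg_match_all returns; the variable names are hypothetical.

```php
<?php
// Made-up arrays standing in for the per-group arrays from preg_match_all.
$ids   = array('0:0', '0:1');
$tipos = array('Video', 'Audio');

// With a null callback, array_map pairs up the elements at the same position.
$explicito = array_map(null, $ids, $tipos);

// The splat operator expands the outer array into separate arguments.
$grupos   = array($ids, $tipos);
$comSplat = array_map(null, ...$grupos);

var_dump($explicito === $comSplat); // both calls build the same array
print_r($comSplat);
```

Each element of the result is one row holding the first, second, and so on, value of every input array, which is why the regex groups end up grouped per stream.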

For the entry:

Stream #0:0(und): Video: mpeg4 ...
Stream #0:1(jpn): Audio: mp3 ...
Stream #0:1(por): Subtitle:

The output would be:

Array
(
    [0] => Array
        (
            [0] => Stream #0:0(und): Video: mpeg4
            [1] => 0:0
            [2] => (und)
            [3] =>  Video
            [4] => mpeg4
        )

    [1] => Array
        (
            [0] => Stream #0:1(jpn): Audio: mp3
            [1] => 0:1
            [2] => (jpn)
            [3] =>  Audio
            [4] => mp3
        )

    [2] => Array
        (
            [0] => Stream #0:1(por): Subtitle:
            [1] => 0:1
            [2] => (por)
            [3] =>  Subtitle
            [4] => 
        )

)

This is very similar to the result produced by Guilherme's code, but with a few fewer lines.

You can see the code working on Repl.it or Ideone.

    
07.04.2017 / 01:31