Is there an audio transcription API that can be used from PHP, a C library, or Java?


6

I'm looking for an API that takes an audio file and tries to recognize the text in it. Does anyone know of an open-source API for this? There are several for the opposite direction (taking text and generating audio). The intention is to install it on a local Linux server and use it together with PHP. If one exists, which would it be?

The first goal is an anti-captcha tool. Many websites today offer an audio option in addition to the captcha image; once I have the audio, I can send it to an API and submit the captcha. That makes it easier to consume services such as lookup and validation of CPF, CNPJ, etc.

Today I can use cURL to capture the HTML without problems, but breaking image captchas runs into many issues: the algorithm is not always effective, and I still need to develop separate algorithms for different kinds of images.

In my searches I found many tools that generate audio from text, but for the opposite, generating text from audio, I only found closed applications that have to be driven with a shortcut key. Since this will run on a web server, I did not find a good solution.

I found GoogleSpeech at link; however, I did not implement it, because when I started reading I saw that it works through a keyboard shortcut and the microphone. Unless there is an easier way to use it, that would undoubtedly be one of the worst possible implementations: "When you want to turn on the Google2Ubuntu voice recognition system, press the keyboard shortcut that you have set. When you press the keyboard shortcut ..."

asr

asked by anonymous 31.07.2014 / 18:59

2 answers

0

AT&T has a Speech to Text API here: link

To use it you need to sign up as a developer, create an app, and define a grammar. Use Speech to Text Custom.

10.08.2015 / 19:29


score 5 · Accepted Answer

You are looking for ASR (automatic speech recognition).

Open-source options are hard to find, because these algorithms have strong commercial appeal. There are some very old projects, and I think they only support transcription in English:

Sphinx

freespeech

I have already tested verbia. It is not open source, but you can install a demo and leave it running in evaluation mode, and it supports Brazilian Portuguese (pt-BR).

I would not think twice about using GoogleSpeech. With cURL you build the appropriate header, send the audio file to Google, and get the transcript back. You must first convert the audio file to FLAC and resample it to 8000 Hz; after that, just send the file to Google. In PHP you would do something like this:

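// Read the FLAC audio (already resampled to 8000 Hz) as raw bytes.
$audio = file_get_contents($filename . '.flac');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.google.com/speech-api/v2/recognize?output=json&lang=pt-BR&key=___my_api_key___");
curl_setopt($ch, CURLOPT_POST, true);
// The rate in the header must match the sample rate of the file.
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: audio/x-flac; rate=8000"));
// Send the audio as the raw request body; the old '@' upload prefix was removed in PHP 7.
curl_setopt($ch, CURLOPT_POSTFIELDS, $audio);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch); // response body as a string, or false on failure
curl_close($ch);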
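
Once the request returns, the transcript still has to be pulled out of the response. Here is a rough sketch of that step. The function name extract_transcript is made up for illustration, and the response layout is an assumption on my part: one JSON object per line, where the first line may be an empty {"result":[]} and the text sits under result[0].alternative[0].transcript. Check it against what the service actually returns before relying on it.

```php
<?php
// Hypothetical helper: extract the first transcript from the raw response body.
// ASSUMPTION: the body holds one JSON object per line, and the first line
// may be an empty {"result":[]} placeholder.
function extract_transcript(string $body): ?string
{
    foreach (preg_split('/\R/', trim($body)) as $line) {
        $data = json_decode($line, true);
        if (!is_array($data) || empty($data['result'])) {
            continue; // skip blank lines, invalid JSON, and empty results
        }
        // ASSUMED layout: result[0].alternative[0].transcript
        $transcript = $data['result'][0]['alternative'][0]['transcript'] ?? null;
        if (is_string($transcript)) {
            return $transcript;
        }
    }
    return null; // nothing was recognized
}
```

With the cURL snippet above, it could be called as `$text = is_string($result) ? extract_transcript($result) : null;`, since curl_exec returns false when the request fails.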

I do this with Python and it works beautifully :-)
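
For the conversion step mentioned earlier (FLAC, resampled to 8000 Hz), one option is to call ffmpeg from PHP. This is only a sketch under some assumptions: ffmpeg is installed on the server and on the PATH, the function names build_flac_command and convert_to_flac are made up for illustration, and downmixing to mono (-ac 1) is my own addition rather than something stated above.

```php
<?php
// Hypothetical helper: build the ffmpeg command that converts an input
// audio file to FLAC at the given sample rate.
function build_flac_command(string $input, string $output, int $rate = 8000): string
{
    return sprintf(
        'ffmpeg -y -i %s -ar %d -ac 1 %s', // -ac 1 (mono) is an ASSUMPTION
        escapeshellarg($input),   // source audio, e.g. the downloaded captcha file
        $rate,                    // must match "rate=" in the Content-Type header
        escapeshellarg($output)   // target file; the .flac extension selects the format
    );
}

// Hypothetical helper: run the conversion and report whether it succeeded.
function convert_to_flac(string $input, string $output, int $rate = 8000): bool
{
    exec(build_flac_command($input, $output, $rate) . ' 2>&1', $log, $exitCode);
    return $exitCode === 0 && is_file($output);
}
```

For example, `convert_to_flac('captcha.wav', 'captcha.flac')` would produce the file that the upload code then reads. Building the command in a separate function keeps the shell escaping in one place and makes it easy to log the exact command when a conversion fails.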