A question about how the AudioRecord class works


I am a beginner trying to understand how to record audio on Android and how that audio is processed. While searching I found this site: Audio Record.
The example works perfectly, but I did not really understand how this class captures sound and records it. I know that in digital processing the signal is sampled (a certain number of samples per second) and the sampled values are then quantized into binary codes according to voltage ranges, so the higher the number of bits, the more accurately the sound is represented. The example below uses 16 bits. So I am unsure how to relate what the code does to actual digital audio processing.

  • In the example it declares a variable named samplerate = 8000. Does this variable represent the number of samples per second that the handset's microphone will capture?
  • Is the variable RECORDER_AUDIO_ENCODING the number of possibilities (2¹⁶) that I can use to relate the voltage values, in digital form, to binary numbers?
  • At some point it gets the minimum buffer with the getMinBufferSize() method. Why does it have to do this? Can't I use whatever buffer size I want?

    int bufferSize = AudioRecord.getMinBufferSize(RECORDER_SAMPLERATE, RECORDER_CHANNELS, RECORDER_AUDIO_ENCODING);

  • In the documentation it says:

    Returns the minimum buffer size required for the successful creation of an AudioRecord object, in byte units. Note that this size doesn't guarantee a smooth recording under load, and higher values should be chosen according to the expected frequency at which the AudioRecord instance will be polled for new data. See AudioRecord(int, int, int, int, int) for more information on valid configuration values.

    I did not understand why it has to give me the minimum buffer. What prevents me from setting the buffer size myself?

  • When it creates the recorder object of the AudioRecord class, it passes some parameters to the constructor, among them "BufferElements2Rec" and "BytesPerElement". In the documentation this parameter is called "bufferSizeInBytes", and the documentation says the following:

    int: the total size (in bytes) of the buffer where audio data is written to during the recording. New audio data can be read from this buffer in smaller chunks than this size. See getMinBufferSize(int, int, int) to determine the minimum required buffer size for the successful creation of an AudioRecord instance. Using values smaller than getMinBufferSize() will result in an initialization failure.

    What does this parameter represent, given that the code first computes bufferSize = getMinBufferSize() and then uses "BufferElements2Rec" and "BytesPerElement" instead?

    In my understanding, it takes samples of the sound and records the voltage levels converted to binary numbers. Did I get that right, or is that not what it does? If anyone can help me understand this, I would be grateful!

    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.Arrays;

    import android.app.Activity;
    import android.media.AudioFormat;
    import android.media.AudioRecord;
    import android.media.MediaRecorder;
    import android.os.Bundle;
    import android.view.KeyEvent;
    import android.view.View;
    import android.widget.Button;

    /**
     * @author RAHUL BARADIA
     */
    public class Audio_Record extends Activity {
        private static final int RECORDER_SAMPLERATE = 8000;
        private static final int RECORDER_CHANNELS = AudioFormat.CHANNEL_IN_MONO;
        private static final int RECORDER_AUDIO_ENCODING = AudioFormat.ENCODING_PCM_16BIT;

        private AudioRecord recorder = null;
        private Thread recordingThread = null;
        private boolean isRecording = false;

        int BufferElements2Rec = 1024; // 1024 elements per read; each element is 2 bytes
        int BytesPerElement = 2;       // 2 bytes per sample in 16-bit format

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.main);

            setButtonHandlers();
            enableButtons(false);

            // Note: computed here but never used below; the constructor call in
            // startRecording() passes BufferElements2Rec * BytesPerElement instead.
            int bufferSize = AudioRecord.getMinBufferSize(RECORDER_SAMPLERATE,
                    RECORDER_CHANNELS, RECORDER_AUDIO_ENCODING);
        }

        private void setButtonHandlers() {
            ((Button) findViewById(R.id.btnStart)).setOnClickListener(btnClick);
            ((Button) findViewById(R.id.btnStop)).setOnClickListener(btnClick);
        }

        private void enableButton(int id, boolean isEnable) {
            ((Button) findViewById(id)).setEnabled(isEnable);
        }

        private void enableButtons(boolean isRecording) {
            enableButton(R.id.btnStart, !isRecording);
            enableButton(R.id.btnStop, isRecording);
        }

        private void startRecording() {
            recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                    RECORDER_SAMPLERATE, RECORDER_CHANNELS,
                    RECORDER_AUDIO_ENCODING, BufferElements2Rec * BytesPerElement);

            recorder.startRecording();
            isRecording = true;

            recordingThread = new Thread(new Runnable() {
                public void run() {
                    writeAudioDataToFile();
                }
            }, "AudioRecorder Thread");
            recordingThread.start();
        }

        // Converts 16-bit samples to a little-endian byte array for writing to disk.
        private byte[] short2byte(short[] sData) {
            int shortArrsize = sData.length;
            byte[] bytes = new byte[shortArrsize * 2];

            for (int i = 0; i < shortArrsize; i++) {
                bytes[i * 2] = (byte) (sData[i] & 0x00FF);   // low byte first
                bytes[(i * 2) + 1] = (byte) (sData[i] >> 8); // then high byte
                sData[i] = 0;
            }
            return bytes;
        }

        private void writeAudioDataToFile() {
            // Writes the raw PCM output to a file.
            String filePath = "/sdcard/8k16bitMono.pcm";

            short sData[] = new short[BufferElements2Rec];

            FileOutputStream os = null;
            try {
                os = new FileOutputStream(filePath);
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }

            while (isRecording) {
                // Reads the microphone output into the short buffer.
                recorder.read(sData, 0, BufferElements2Rec);
                System.out.println("Short writing to file " + Arrays.toString(sData));
                try {
                    // Converts the buffer to bytes and appends it to the file.
                    byte bData[] = short2byte(sData);
                    os.write(bData, 0, BufferElements2Rec * BytesPerElement);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

            try {
                os.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private void stopRecording() {
            // Stops the recording and releases the native resources.
            if (null != recorder) {
                isRecording = false;
                recorder.stop();
                recorder.release();
                recorder = null;
                recordingThread = null;
            }
        }

        private View.OnClickListener btnClick = new View.OnClickListener() {
            public void onClick(View v) {
                switch (v.getId()) {
                case R.id.btnStart: {
                    enableButtons(true);
                    startRecording();
                    break;
                }
                case R.id.btnStop: {
                    enableButtons(false);
                    stopRecording();
                    break;
                }
                }
            }
        };

        // Pressing the back button finishes the activity.
        @Override
        public boolean onKeyDown(int keyCode, KeyEvent event) {
            if (keyCode == KeyEvent.KEYCODE_BACK) {
                finish();
            }
            return super.onKeyDown(keyCode, event);
        }
    }
    
        
    asked by anonymous 22.09.2017 / 02:04

    1 answer


      

    In the example it declares a variable named samplerate = 8000. Does this variable represent the number of samples that the mobile's microphone will capture per second?

    A: samplerate = 8000 Hz means that every second 8000 samples are captured from the microphone; at 44100 Hz, 44100 samples would be captured every second...
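    To make the arithmetic concrete, here is a small sketch (plain Java, no Android APIs; the class and method names are mine, invented for illustration) of how the sample rate translates into data volume:

```java
// Illustrative helper: how much raw PCM data one second of audio occupies.
class PcmMath {
    // bytes/second = samples/second * channels * bytes per sample
    static int bytesPerSecond(int sampleRate, int channels, int bytesPerSample) {
        return sampleRate * channels * bytesPerSample;
    }

    public static void main(String[] args) {
        // 8000 Hz, mono, 16-bit (2 bytes) as in the question's example
        System.out.println(bytesPerSecond(8000, 1, 2));   // 16000 bytes/s
        // 44100 Hz, stereo, 16-bit (CD-like rate)
        System.out.println(bytesPerSecond(44100, 2, 2));  // 176400 bytes/s
    }
}
```

    So at 8000 Hz mono 16-bit, one minute of raw PCM is about 960 KB.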

    The converse is also true: after you have saved and encoded your audio (mp3, wav, flac, etc.), to play it back at the correct speed and frequencies you have to feed the samples out at the same rate at which the audio was recorded/generated. To play audio sampled at 8000 Hz, your player has to deliver 8000 samples per second to the sound card. What happens if you have audio recorded at 44100 Hz and play it at 8000 Hz (only 8000 samples per second delivered to the sound card)?

    Something the answer calls a downsample happens: you are playing fewer samples per second than the rate at which the audio was generated, so the audio plays much slower and sounds like it is coming from hell (very low frequencies).

    If there is a downsample process, there is also an upsample process: audio generated at 8000 Hz and played at 16000 Hz, for example, plays twice as fast with the frequencies an octave up (frequencies twice as high); your audio will sound like a chipmunk.
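    As a toy illustration of this speed/pitch effect (this is not the Android API, and real resamplers interpolate rather than drop samples): keeping every other sample halves the duration, which is effectively what playing 8000 Hz audio back at 16000 Hz does.

```java
// Toy sketch: discarding every second sample makes the audio play twice as
// fast (and an octave higher) when sent to the sound card at the same rate.
class NaiveResample {
    static short[] dropHalf(short[] in) {
        short[] out = new short[in.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = in[i * 2]; // keep samples 0, 2, 4, ...
        }
        return out;
    }

    public static void main(String[] args) {
        short[] out = dropHalf(new short[] {10, 20, 30, 40});
        System.out.println(java.util.Arrays.toString(out)); // [10, 30]
    }
}
```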

    The digital process shares some characteristics with the analog process, and this upsample/downsample phenomenon also happened with vinyl records. I don't know how old I was, but when I went to my grandfather's house I did not understand what was happening: the discs rotated at 78 RPM, 78 revolutions per minute (mechanically, a motor turned the disc shaft at that speed), and I kept playing with the discs with my finger, changing the rotation speed. The sound became deeper or sharper depending on the speed at which I spun the discs. The same thing was really happening: the discs were recorded to be played at 78 RPM, and if you change the playback speed, downsample/upsample happens...

      

    Is the variable RECORDER_AUDIO_ENCODING the number of possibilities (2¹⁶) that I can use to relate the voltage values, in digital form, to binary numbers?

    It is really the way you will represent the voltage values. In the PCM format they will be short int or float point values that you can plot; the representation depends on the format you chose. For example, if you encode audio with ENCODING_PCM_16BIT, your voltages will be represented by integers ranging from -32768 to 32767; with audio encoded as float point (ENCODING_PCM_FLOAT), the representations vary from -1 to 1. It is just a form of representation; some prefer short int, others float point, and some systems can only play one of them. On Android, for example, there is a performance difference: low-end ARM processors have worse performance with float point audio...
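    A minimal sketch of the relationship between the two representations (the conversion helper is my own illustration, not part of the Android API):

```java
// The same voltage, two PCM representations: 16-bit signed integer vs.
// floating point in [-1, 1]. Dividing by 32768 maps one onto the other.
class PcmFormats {
    static float shortToFloat(short s) {
        return s / 32768f;
    }

    public static void main(String[] args) {
        System.out.println(shortToFloat((short) -32768)); // -1.0 (negative full scale)
        System.out.println(shortToFloat((short) 0));      // 0.0 (silence)
        System.out.println(shortToFloat((short) 32767));  // just below 1.0
    }
}
```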

      

    At some point it gets the minimum buffer with the getMinBufferSize() method. Why does it have to do this? Can't I use whatever buffer size I want?

    Android is a system known for latency problems; the OS and the hardware are problematic, and working with audio on Android takes a bit of magic hehe. getMinBufferSize() is a value provided by Google's developers to try to guarantee a minimum acceptable size so that you can record your audio with as little latency as possible. If you pass a buffer size smaller than the one returned by getMinBufferSize(), initialization will fail: such small buffers would require more processing, and for that reason they are blocked. And what happens if you want a larger buffer? Will it work? That depends on your hardware and OS: does the device have enough memory to work with the buffer of your choice? Those are questions I cannot answer...
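    A small sketch of that rule, assuming only that values below the API minimum are rejected (the class and method names here are hypothetical, not Android APIs; on a real device the minimum comes from AudioRecord.getMinBufferSize):

```java
// Illustrative policy: never hand the recorder a buffer smaller than the
// platform minimum, since that would fail initialization; larger is allowed.
class BufferCheck {
    static int chooseBufferSize(int requested, int minFromApi) {
        return Math.max(requested, minFromApi);
    }

    public static void main(String[] args) {
        System.out.println(chooseBufferSize(1024, 4096)); // clamped up to 4096
        System.out.println(chooseBufferSize(8192, 4096)); // 8192 kept as-is
    }
}
```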

    If you look at your code, the variable bufferSize (the value returned by getMinBufferSize()) is not being used anywhere. What is actually passed as the last parameter (int bufferSizeInBytes) of your AudioRecord constructor:

    AudioRecord (int audioSource, 
                    int sampleRateInHz, 
                    int channelConfig, 
                    int audioFormat, 
                    int bufferSizeInBytes)
    

    is the product of:

    int BufferElements2Rec = 1024; // 1024 elements per read
    int BytesPerElement = 2;       // 2 bytes per element in 16-bit format
    

    see:

    BufferElements2Rec * BytesPerElement
    
    

    Just remember that AudioRecord's getMinBufferSize() automatically returns different minimum buffer values depending on the device, so in theory you do not have to calculate this or worry about it.
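    Plugging the example's own constants in (plain Java arithmetic, just to show the numbers the constructor actually receives):

```java
// Worked numbers for the question's code: the constructor gets
// 1024 * 2 = 2048 bytes, which at 8000 Hz mono 16-bit holds 128 ms of audio.
class ExampleBuffer {
    public static void main(String[] args) {
        int bufferElements2Rec = 1024;  // samples per read
        int bytesPerElement = 2;        // 16-bit PCM
        int bufferSizeInBytes = bufferElements2Rec * bytesPerElement; // 2048
        int bytesPerSecond = 8000 * 1 * 2; // sampleRate * channels * bytes/sample
        double seconds = (double) bufferSizeInBytes / bytesPerSecond;
        System.out.println(bufferSizeInBytes + " bytes = " + seconds + " s");
    }
}
```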

        
    27.09.2017 / 14:32