Periklis Ntanasis:
Master's Touch


Android: Crafting a Metronome with Audio Synthesis

Where is the source code

Currently I am working on a university project where I need to extend an Android metronome with some additional functionality. At first I thought it would be very easy to find plenty of open source metronomes out there.

As it turns out, I was wrong. The only code I found was android-metronome by Akshat Aranya. After compiling and running the project with my colleague, we saw that the beat wasn't stable. To my disappointment, this stackoverflow thread confirmed it.

The funny thing is that the Google Play version of it seems stable, but in any case we weren't able to produce a stable build from the source code.

Crafting a Metronome day 0

Amazed that the code for such a trivial thing wasn't available, we started our own implementation.

In the beginning we used timers to play a sound every x seconds. It was a great failure!

We tried threads and the like, but the metronome was far from accurate. After another stackoverflow search we saw that this path was a dead end.

Just for the story, it was around then that my colleague had to quit for personal reasons.

So, convinced by the first stackoverflow thread that Android's AudioTrack was the right way to go, I started implementing a metronome with it.

At last a Metronome!

So, here we are now with a working metronome :D

Let’s see the core metronome class.

AudioGenerator
package pntanasis.master_ex.android;

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

public class AudioGenerator {

    private int sampleRate;
    private AudioTrack audioTrack;

    public AudioGenerator(int sampleRate) {
        this.sampleRate = sampleRate;
    }

    public double[] getSineWave(int samples, int sampleRate, double frequencyOfTone) {
        double[] sample = new double[samples];
        for (int i = 0; i < samples; i++) {
            sample[i] = Math.sin(2 * Math.PI * i / (sampleRate / frequencyOfTone));
        }
        return sample;
    }

    public byte[] get16BitPcm(double[] samples) {
        byte[] generatedSound = new byte[2 * samples.length];
        int index = 0;
        for (double sample : samples) {
            // scale to maximum amplitude
            short maxSample = (short) ((sample * Short.MAX_VALUE));
            // in 16 bit wav PCM, first byte is the low order byte
            generatedSound[index++] = (byte) (maxSample & 0x00ff);
            generatedSound[index++] = (byte) ((maxSample & 0xff00) >>> 8);

        }
        return generatedSound;
    }

    public void createPlayer() {
        // Ask the device for the smallest buffer it supports; passing a
        // too-small buffer size to the constructor can fail on some devices.
        int bufferSize = AudioTrack.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);

        audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
                sampleRate, AudioFormat.CHANNEL_OUT_MONO,
                AudioFormat.ENCODING_PCM_16BIT, Math.max(bufferSize, sampleRate),
                AudioTrack.MODE_STREAM);

        audioTrack.play();
    }

    public void writeSound(double[] samples) {
        byte[] generatedSnd = get16BitPcm(samples);
        audioTrack.write(generatedSnd, 0, generatedSnd.length);
    }

    public void destroyAudioTrack() {
        audioTrack.stop();
        audioTrack.release();
    }

}

Here we create an AudioTrack object configured for streaming audio, using mono sound and the 16-bit PCM audio format.

Audio 101

Sound is actually a wave. We call it a soundwave. There are many different kinds of soundwaves, named after their shape: sine wave, sawtooth wave, square wave, etc.

Like all waves, soundwaves have a frequency: the rate at which the waveform repeats, measured in cycles per second (Hertz).

So, different pitches are produced by different frequencies. For example, the note C (in octave 0) corresponds to a frequency of 16.35 Hz and D to 18.35 Hz.

Something else that matters when capturing or creating sound on a digital device is the sample rate: how many samples we capture per second.

A sample is the value of the sound (in the waveform) at a specific moment.

According to the Nyquist theorem, the sample rate has to be at least twice the highest frequency we want to reproduce; at 8000 Hz we can faithfully represent frequencies up to 4000 Hz.

Anyway, enough with the theory.

Back to the code

The getSineWave() method returns an array with samples of a sine wave with the given frequency and sample rate.

The get16BitPcm() gets the samples and returns a PCM byte array. PCM is a method used to digitally represent sampled analog signals. It stands for Pulse Code Modulation.

Notice that every sample is represented by a double, and to encode it as 16-bit PCM we only have 16 bits, i.e. 2 bytes. So we scale the double sample to the short range (up to 2^15 - 1, or 32767) and then split the result into 2 bytes.

To do that we perform a bitwise AND with the mask 0x00ff to get the low-order byte, and then a bitwise AND with the mask 0xff00 followed by a shift 8 bits to the right to get the high-order byte.

samples and bitmasks

So we populate an array of bytes that contains the waveform in PCM.
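To make the conversion concrete, here is one sample value pushed through the same masking and shifting that get16BitPcm() performs (the helper name toPcmBytes is mine, just for illustration):

```java
public class PcmExample {
    // Convert one double sample in [-1, 1] to the two little-endian
    // bytes used by get16BitPcm().
    static byte[] toPcmBytes(double sample) {
        short scaled = (short) (sample * Short.MAX_VALUE);
        byte low  = (byte) (scaled & 0x00ff);          // low-order byte first
        byte high = (byte) ((scaled & 0xff00) >>> 8);  // then high-order byte
        return new byte[] { low, high };
    }

    public static void main(String[] args) {
        // 0.5 scales to 16383 (0x3FFF): low byte 0xFF, high byte 0x3F
        byte[] bytes = toPcmBytes(0.5);
        System.out.println(bytes[0] + " " + bytes[1]); // -1 63 (as signed bytes)
    }
}
```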

Finally, the writeSound() method writes the sound directly to the audio device buffer.

The other important class of our metronome is the Metronome class. Here it is:

Metronome
package pntanasis.master_ex.android;

public class Metronome {

    private double bpm;
    private int beat;
    private int noteValue;
    private int silence;

    private double beatSound;
    private double sound;
    private final int tick = 1000; // samples of tick

    // volatile so a stop() call from another thread is seen by the play() loop
    private volatile boolean play = true;

    private AudioGenerator audioGenerator = new AudioGenerator(8000);

    public Metronome() {
        audioGenerator.createPlayer();
    }

    public void calcSilence() {
        // samples per beat, minus the samples taken up by the tick itself
        silence = (int) (((60 / bpm) * 8000) - tick);
    }

    public void play() {
        calcSilence();
        double[] tick =
                audioGenerator.getSineWave(this.tick, 8000, beatSound);
        double[] tock =
                audioGenerator.getSineWave(this.tick, 8000, sound);
        double silence = 0; // a single zero-valued sample
        double[] sound = new double[8000];
        int t = 0, s = 0, b = 0;
        do {
            for (int i = 0; i < sound.length && play; i++) {
                if (t < this.tick) {
                    // accented sound on the first beat, plain tick otherwise
                    if (b == 0)
                        sound[i] = tock[t];
                    else
                        sound[i] = tick[t];
                    t++;
                } else {
                    sound[i] = silence;
                    s++;
                    if (s >= this.silence) {
                        t = 0;
                        s = 0;
                        b++;
                        if (b > (this.beat - 1))
                            b = 0;
                    }
                }
            }
            audioGenerator.writeSound(sound);
        } while (play);
    }

    public void stop() {
        play = false;
        audioGenerator.destroyAudioTrack();
    }

    /* Getters and Setters ... */
}

In a nutshell, it creates 2 sounds, or rather 2 arrays of samples, one for the first beat and one for the rest, and sends them to the audio device via the writeSound() method. Between the beats it writes a zero waveform, which is simply silence.

So we produce sound all the time in a loop but this sound may be silence.

The only other significant detail is that the beat sound lasts 1000 samples, which is 1/8 of a second because a second consists of 8000 samples (or whatever sample rate we choose).

So, the remaining space between two beats is filled with "silence" according to the formula silence = (60/bpm) * 8000 - tick.
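Plugging in some numbers makes the formula easy to check (the helper name silenceSamples is mine, for illustration):

```java
public class SilenceExample {
    // One beat lasts (60 / bpm) seconds, i.e. (60 / bpm) * sampleRate samples;
    // subtracting the tick's own samples leaves the silence between beats.
    static int silenceSamples(double bpm, int sampleRate, int tickSamples) {
        return (int) ((60 / bpm) * sampleRate - tickSamples);
    }

    public static void main(String[] args) {
        System.out.println(silenceSamples(120, 8000, 1000)); // 4000 - 1000 = 3000
        System.out.println(silenceSamples(60, 8000, 1000));  // 8000 - 1000 = 7000
    }
}
```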

Some more Audio Synthesis Magic

As you may have noticed, I don't load any sound file for the metronome; I create the sound programmatically.

In the same way one could synthesize a whole song. As a proof of concept I used my fine AudioGenerator class to play the first few notes of the traditional song Oh, Susanna.

Oh, Susanna score

This score was created with TuxGuitar.

Here is my first attempt:

AudioSynthesisDemoActivity
package pntanasis.master_ex.android;

import android.app.Activity;
import android.os.Bundle;

public class AudioSynthesisDemoActivity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        AudioGenerator audio = new AudioGenerator(8000);

        double[] silence = audio.getSineWave(200, 8000, 0);

        int noteDuration = 2400;

        double[] doNote = audio.getSineWave(noteDuration/2, 8000, 523.25);
        double[] reNote = audio.getSineWave(noteDuration/2, 8000, 587.33);
        double[] faNote = audio.getSineWave(noteDuration, 8000, 698.46);
        double[] laNote = audio.getSineWave(noteDuration, 8000, 880.00);
        double[] laNote2 =
                audio.getSineWave((int) (noteDuration*1.25), 8000, 880.00);
        double[] siNote = audio.getSineWave(noteDuration/2, 8000, 987.77);
        double[] doNote2 =
                audio.getSineWave((int) (noteDuration*1.25), 8000, 523.25);
        double[] miNote = audio.getSineWave(noteDuration/2, 8000, 659.26);
        double[] miNote2 = audio.getSineWave(noteDuration, 8000, 659.26);
        double[] doNote3 = audio.getSineWave(noteDuration, 8000, 523.25);
        double[] miNote3 = audio.getSineWave(noteDuration*3, 8000, 659.26);
        double[] reNote2 = audio.getSineWave(noteDuration*4, 8000, 587.33);

        audio.createPlayer();
        audio.writeSound(doNote);
        audio.writeSound(silence);
        audio.writeSound(reNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(laNote);
        audio.writeSound(silence);
        audio.writeSound(laNote2);
        audio.writeSound(silence);
        audio.writeSound(siNote);
        audio.writeSound(silence);
        audio.writeSound(laNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(doNote2);
        audio.writeSound(silence);
        audio.writeSound(miNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(miNote2);
        audio.writeSound(silence);
        audio.writeSound(doNote3);
        audio.writeSound(silence);
        audio.writeSound(miNote3);
        audio.writeSound(silence);
        audio.writeSound(doNote);
        audio.writeSound(silence);
        audio.writeSound(reNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(laNote);
        audio.writeSound(silence);
        audio.writeSound(laNote2);
        audio.writeSound(silence);
        audio.writeSound(siNote);
        audio.writeSound(silence);
        audio.writeSound(laNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(doNote2);
        audio.writeSound(silence);
        audio.writeSound(miNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(faNote);
        audio.writeSound(silence);
        audio.writeSound(miNote2);
        audio.writeSound(silence);
        audio.writeSound(miNote2);
        audio.writeSound(silence);
        audio.writeSound(reNote2);

        audio.destroyAudioTrack();

    }
}

And here is my hopefully more human readable version:

AudioSynthesisDemoActivity
package pntanasis.master_ex.android;

import pntanasis.master_ex.android.Synthesizer.Note0;
import android.app.Activity;
import android.os.Bundle;

public class AudioSynthesisDemoActivity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        Synthesizer synthesizer = new Synthesizer();

        synthesizer.play(Note0.C, 5, 1.0/8);
        synthesizer.play(Note0.E, 5, 1.0/8);

        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.A, 5, 1.0/4);
        synthesizer.play(Note0.A, 5, 3.0/8);
        synthesizer.play(Note0.B, 5, 1.0/8);

        synthesizer.play(Note0.A, 5, 1.0/4);
        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.C, 5, 3.0/8);
        synthesizer.play(Note0.E, 5, 1.0/8);

        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.E, 5, 1.0/4);
        synthesizer.play(Note0.C, 5, 1.0/4);

        synthesizer.play(Note0.E, 5, 3.0/4);
        synthesizer.play(Note0.C, 5, 1.0/8);
        synthesizer.play(Note0.E, 5, 1.0/8);

        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.A, 5, 1.0/4);
        synthesizer.play(Note0.A, 5, 3.0/8);
        synthesizer.play(Note0.B, 5, 1.0/8);

        synthesizer.play(Note0.A, 5, 1.0/4);
        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.C, 5, 3.0/8);
        synthesizer.play(Note0.E, 5, 1.0/8);

        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.F, 5, 1.0/4);
        synthesizer.play(Note0.E, 5, 1.0/4);
        synthesizer.play(Note0.C, 5, 1.0/4);

        synthesizer.play(Note0.C, 5, 1);

        synthesizer.stop();

    }
}

The signature of the play method is play(Note0 note, int octave, double duration).
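The Synthesizer class isn't listed here, but a minimal sketch of how play() could map a note and octave to a frequency might look like the following. The Note0 values are the octave-0 frequencies from the MTU table; the enum layout and the frequency() helper are my assumptions, not necessarily how the real class works:

```java
public class NoteFrequencyExample {
    // Hypothetical Note0 enum holding octave-0 frequencies in Hz.
    enum Note0 {
        C(16.35), D(18.35), E(20.60), F(21.83), G(24.50), A(27.50), B(30.87);
        final double baseFreq;
        Note0(double f) { baseFreq = f; }
    }

    // Each octave doubles the frequency, so octave 5 multiplies by 2^5 = 32.
    static double frequency(Note0 note, int octave) {
        return note.baseFreq * Math.pow(2, octave);
    }

    public static void main(String[] args) {
        System.out.println(frequency(Note0.A, 5)); // 880.0
        System.out.println(frequency(Note0.C, 5)); // 523.2, matching the ~523.25 used above
    }
}
```

The resulting frequency and the note's duration (as a fraction of a whole note) would then feed straight into getSineWave().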

In both cases I used the frequencies corresponding to the notes, which I found on the Physics of Music Notes page of MTU.

The code of the above snippets is available here.

Path of Illumination

To sum up, the trick was to use an everlasting while loop that writes sound to the audio device all the time, rather than trying to play a sound every x seconds with timers and the like.

I speculate that this happens because the Android OS cannot guarantee that a process will start at a specified time. However, once a process has started, the system will split the resources equally.

Note that if we stop writing data to the audio device buffer, we'll get an empty-buffer warning.

If you don't like that, just release the device when you don't need it.

Share love

I am relatively new to android and to multimedia stuff so I may very likely be wrong here or there.

Feel free to correct me or share your thoughts :)

cheers xD
