Added sound recognition: emoji and morse code #99

Open
wants to merge 28 commits into
base: master
28 commits
71cec58
Added FFT Files to V2 Repo
JoshuaAHill Mar 3, 2021
11bbf4f
Merge pull request #1 from JoshuaAHill/input_pipeline
JoshuaAHill Mar 3, 2021
0dde8b9
Merge pull request #2 from JoshuaAHill/input_pipeline
JoshuaAHill Mar 3, 2021
a561c1d
Updated FFT Files
JoshuaAHill Mar 3, 2021
82f3cfe
Add MicroBitSoundRecogniser and changed MicroBitAudioProcessor
vladturcuman Apr 15, 2021
75234cb
Add Morse code classes and bug fixed audio processor and sound recogn…
vladturcuman Apr 30, 2021
012e975
Re-sampled the soaring and twinkle sounds and added comments to the a…
vladturcuman May 1, 2021
7415cf3
Added comments to the sound recogniser
vladturcuman May 1, 2021
d15525a
added comments to the emoji recogniser
vladturcuman May 1, 2021
7597059
Re-sampled the happy sound
vladturcuman May 3, 2021
af924cb
Made EmojiRecogniser to output on the message bus instead of having a…
vladturcuman May 3, 2021
93e3973
Get recogniser and interpreter working
mateiBanu May 3, 2021
0d48ede
Merge pull request #1 from vladturcuman/morse_recognition
vladturcuman May 4, 2021
46650a2
Changed the microphone sampling rate to allow for frequency shift mor…
vladturcuman May 4, 2021
a62ab0f
Added comments to morse
mateiBanu May 5, 2021
e9425a7
Merged with sound_recognition
mateiBanu May 5, 2021
78cd8fe
Changed the algorithm for morse recogniser, sampled the sounds for th…
vladturcuman May 5, 2021
4e25ef6
updated description for some constants
vladturcuman May 5, 2021
06d53fb
Bug fixed audio processor when sampling at 11000 hz
vladturcuman May 5, 2021
859c6c6
Changed MorseInterpreter to send message on bus instead of having a c…
vladturcuman May 5, 2021
97c0b4b
Added documentation for the MorseRecogniser
vladturcuman May 5, 2021
8f175b9
Mereg recogniser docs
mateiBanu May 6, 2021
fef4f28
More documentation
mateiBanu May 6, 2021
e10543a
Final touches to morse documentation
mateiBanu May 6, 2021
c0a711a
Merge pull request #2 from vladturcuman/morse_recognition
vladturcuman May 6, 2021
6833bba
Corrected spelling
vladturcuman May 6, 2021
0fef6d8
Merge branch 'sound_recognition' of https://github.com/vladturcuman/c…
vladturcuman May 6, 2021
02e3c51
Added different threshold for zeroes than ones in morse recogniser
vladturcuman May 11, 2021
6 changes: 6 additions & 0 deletions .gitmodules
@@ -0,0 +1,6 @@
[submodule "CMSIS_5"]
path = CMSIS_5
url = git@github.com:ARM-software/CMSIS_5.git
[submodule "CMSIS"]
path = CMSIS
url = https://github.com/ARM-software/CMSIS_5.git
1 change: 1 addition & 0 deletions CMSIS
Submodule CMSIS added at 4ed573
20 changes: 20 additions & 0 deletions CMakeLists.txt
@@ -18,15 +18,35 @@ set(CMAKE_LINKER_FLAGS "${CMAKE_LINKER_FLAGS} -T\"${CMAKE_CURRENT_LIST_DIR}/ld/n
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -T\"${CMAKE_CURRENT_LIST_DIR}/ld/nrf52833.ld\"" PARENT_SCOPE)
set(CMAKE_SYSTEM_PROCESSOR "armv7-m" PARENT_SCOPE)


set(ROOT "${CMAKE_CURRENT_LIST_DIR}/CMSIS/")
list(APPEND INCLUDE_DIRS "${ROOT}/CMSIS/Core/Include/")


# Define the path to CMSIS-DSP (ROOT is defined on command line when using cmake)
set(DSP ${ROOT}/CMSIS/DSP)

include(${DSP}/Toolchain/GCC.cmake)

# add them

include_directories(${INCLUDE_DIRS})

# Load CMSIS-DSP definitions. Libraries will be built in bin_dsp
add_subdirectory(${DSP}/Source bin_dsp)

# create our target
add_library(codal-microbit-v2 ${SOURCE_FILES})

target_link_libraries(
codal-microbit-v2
codal-nrf52
CMSISDSPSupport
CMSISDSPTransform
CMSISDSPCommon
CMSISDSPComplexMath
CMSISDSPFastMath
CMSISDSPStatistics
codal-core
codal-microbit-nrf5sdk
${LIB_OBJECT_FILES}
78 changes: 78 additions & 0 deletions inc/EmojiRecogniser.h
@@ -0,0 +1,78 @@

#ifndef EMOJI_RECOGNISER_H
#define EMOJI_RECOGNISER_H

/*
*
* The emoji recogniser is a subclass of the sound recogniser that
* defines the actual samples for the emoji sounds. The samples are
* the parts of each emoji sound that can be recognised because they
* remain quite consistent across multiple plays of the sound.
*
*
* Example
*
* Taking the happy sound as an example, there are a few constants defined:
*
* happy_sequences the number of sequences in the happy sound
*
* happy_max_deviations the maximum number of deviations allowed in
* the sound; a deviation is a data point that
* differs from the sampled frequency by more
* than the allowed threshold
*
* happy_samples a 3-dimensional array with the sampled sound:
* - the first dimension is the different
* sequences
* - the second is the samples in each sequence
* - the third is the data points in each sample
* of each sequence
*
* happy_thresholds an array with the thresholds for each of the
* sequences
*
* happy_deviations an array with the maximum deviations for each
* sequence
*
* happy_nr_samples an array with the number of samples in each
* sequence
*
* All these are packaged in a Sound struct.
*/

#include "MicroBitSoundRecogniser.h"

#define DEVICE_EMOJI_RECOGNISER_EVT_HAPPY 1
#define DEVICE_EMOJI_RECOGNISER_EVT_HELLO 2
#define DEVICE_EMOJI_RECOGNISER_EVT_SAD 3
#define DEVICE_EMOJI_RECOGNISER_EVT_SOARING 4
#define DEVICE_EMOJI_RECOGNISER_EVT_TWINKLE 5

// 37 is the first unused id in CodalComponent.
// It might be better for this to be in CodalComponent.
#define DEVICE_ID_EMOJI_RECOGNISER 37

class EmojiRecogniser : public MicroBitSoundRecogniser
{
void addHappySound();
void addHelloSound();
void addSadSound();
void addSoaringSound();
void addTwinkleSound();

protected:
/*
* The function to call when a sound is recognised.
*/
void recognisedSound(uint16_t id);

public:
EmojiRecogniser(MicroBitAudioProcessor& processor);

/*
* Converts from id to sound name.
*/
static ManagedString getSoundName(Event& evnt);
};

#endif
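The header comment describes a "deviation" as a data point that is further than the allowed threshold from the sampled frequency, with a per-sequence cap on how many deviations are tolerated. The matching code itself lives in `MicroBitSoundRecogniser` and is not shown in this part of the diff, so the following is only a minimal sketch of that idea. The names `countDeviations` and `matchesSample` are hypothetical and do not appear in the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of the PR): counts how many heard data points
// differ from the reference sample by more than `threshold` Hz.
// Only the overlapping prefix of the two vectors is compared.
std::size_t countDeviations(const std::vector<uint16_t>& heard,
                            const std::vector<uint16_t>& sampled,
                            uint16_t threshold)
{
    const std::size_t n = heard.size() < sampled.size() ? heard.size() : sampled.size();
    std::size_t deviations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int diff = std::abs(static_cast<int>(heard[i]) - static_cast<int>(sampled[i]));
        if (diff > static_cast<int>(threshold)) {
            ++deviations;
        }
    }
    return deviations;
}

// Hypothetical helper (not part of the PR): a heard sequence matches a sample
// when both have the same length and the number of deviations stays within
// the per-sequence limit (the role played by e.g. happy_deviations).
bool matchesSample(const std::vector<uint16_t>& heard,
                   const std::vector<uint16_t>& sampled,
                   uint16_t threshold,
                   std::size_t maxDeviations)
{
    if (heard.size() != sampled.size()) {
        return false;
    }
    return countDeviations(heard, sampled, threshold) <= maxDeviations;
}
```

With a threshold of 50 Hz, the heard points `{2000, 2100, 3000}` against the sample `{2010, 2500, 3005}` contain one deviation (the middle point), so the sequence matches if one deviation is allowed and fails if none are.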
188 changes: 188 additions & 0 deletions inc/MicroBitAudioProcessor.h
@@ -0,0 +1,188 @@
/*
The MIT License (MIT)
Copyright (c) 2020 Arm Limited.
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
*/

#include "MicroBit.h"
#include "DataStream.h"
#define ARM_MATH_CM4
#include "arm_math.h"

#ifndef MICROBIT_AUDIO_PROCESSOR_H
#define MICROBIT_AUDIO_PROCESSOR_H

/*
* Provides the fundamental frequencies in the microphone data.
*
* It takes in the microphone data (sampled at MIC_SAMPLE_RATE Hz
* which is ~11000 Hz now) and produces AudioFrameAnalysis data.
*
*/

// Default configuration values
#define MIC_SAMPLE_RATE (1000000 / MIC_SAMPLE_DELTA)

#define RECOGNITION_START_FREQ 1700
#define RECOGNITION_END_FREQ 5400
#define ANALYSIS_STD_MULT_THRESHOLD 2

#define MAXIMUM_NUMBER_OF_FREQUENCIES 4
#define SIMILAR_FREQ_THRESHOLD 100

// Sampling more often (~22000 Hz) allows better detection of emoji
// sounds and gives reliable results for morse code
#if MIC_SAMPLE_DELTA == 45

#define DEFAULT_AUDIO_SAMPLES_NUMBER 512
#define EMOJI_AUDIO_SAMPLES_NUMBER 512
#define MORSE_AUDIO_SAMPLES_NUMBER 256

#define DEFAULT_STD_THRESHOLD 140
#define EMOJI_STD_THRESHOLD 80
#define MORSE_STD_THRESHOLD 200

// If sampling at 11000 Hz the emoji detection would still work, although
// not as well, but the morse code would require longer durations.
#elif MIC_SAMPLE_DELTA == 91

#define DEFAULT_AUDIO_SAMPLES_NUMBER 512
#define EMOJI_AUDIO_SAMPLES_NUMBER 512
#define MORSE_AUDIO_SAMPLES_NUMBER 256

#define DEFAULT_STD_THRESHOLD 70
#define EMOJI_STD_THRESHOLD 45
#define MORSE_STD_THRESHOLD 75

#endif

class MicroBitAudioProcessor : public DataSink, public DataSource
{
public:

/*
* An AudioFrameAnalysis has the fundamental frequencies of a
* frame - maximum MAXIMUM_NUMBER_OF_FREQUENCIES and ordered
* from the most likely to the least.
*/
struct AudioFrameAnalysis {
uint8_t size;
uint16_t buf[MAXIMUM_NUMBER_OF_FREQUENCIES];
};

private:

DataSource &audiostream; // the stream of data to analyse
DataSink *recogniser; // the recogniser the frequencies should be sent to
uint16_t audio_samples_number; // the number of samples to collect before analysing a frame
uint16_t std_threshold; // the threshold for the standard deviation
arm_rfft_fast_instance_f32 fft_instance; // the instance of CMSIS fft that is used to run fft
float *buf; // the buffer to store the incoming data
float *fft_output; // an array to store the result of the fft
float *mag; // an array to store the magnitudes of the frequencies

uint16_t buf_len; // the length of the incoming buffer
bool recording; // whether it should analyse the data or be idle

AudioFrameAnalysis output; // the result of the analysis

/*
* Converts from frequency to the index in the array.
*
* @param freq a frequency in the range 0 - 5000 Hz.
*
* @return the index to the frequency bucket freq is in
* as it comes out of the fft
*/
uint16_t frequencyToIndex(int freq);

/*
* Converts from the index in the array to frequency.
*
* @param index an index in the range 0 - audio_samples_number / 2.
*
* @return the avg frequency in the bucket
*/
float32_t indexToFrequency(int index);

public:

/*
* Constructor.
*
* Initialize the MicroBitAudioProcessor.
*/
MicroBitAudioProcessor(DataSource& source, uint16_t audio_samples_number = DEFAULT_AUDIO_SAMPLES_NUMBER, uint16_t std_threshold = DEFAULT_STD_THRESHOLD);

/*
* Destructor.
*
* Deallocates all the memory allocated dynamically.
*/
~MicroBitAudioProcessor();

/*
* A callback for when the data is ready.
*
* Analyses the data when enough of it comes in, using
* the following algorithm:
*
* The audio processor accumulates microphone data as it comes
* in and, after getting audio_samples_number of them, it processes
* the frame.
*
* It transforms the data from the time domain to the frequency
* domain using the CMSIS fft.
*
* If the mean of the magnitudes of the frequencies is lower than
* ANALYSIS_MEAN_THRESHOLD or the standard deviation (std) is
* lower than ANALYSIS_STD_THRESHOLD then the frame is considered
* silence - no fundamental frequency.
*
* It then filters out the frequencies that have a magnitude lower
* than the mean + ANALYSIS_STD_MULT_THRESHOLD * std. This ensures
* that only outlier frequencies are being considered.
*
* It then filters out the neighbour frequencies around the peaks.
*
* Some of these operations are implemented together to optimize the
* algorithm.
*/
virtual int pullRequest();

/*
* Allow our downstream component to register itself with us
*/
void connect(DataSink *downstream);

/*
* Provides the next available data to the downstream caller.
*/
virtual ManagedBuffer pull();

/*
* Starts recording and analysing.
*/
void startRecording();

/*
* Stops recording and analysing.
*/
void stopRecording();
};

#endif
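The `pullRequest()` comment outlines the frame analysis: treat low-variance frames as silence, keep only bins whose magnitude exceeds mean + `ANALYSIS_STD_MULT_THRESHOLD` * std, then drop neighbours around each peak. The implementation file is not part of this section, so the sketch below only illustrates those steps on a magnitude array that has already been computed. It does not call CMSIS, it applies only the standard-deviation check (the only threshold stored as a class member), and it assumes that bin `i` of an `N`-point FFT sits at `i * sampleRate / N`. The names `binToFrequency` and `analyseFrame` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values copied from the defines in MicroBitAudioProcessor.h.
constexpr std::size_t kMaxFrequencies = 4;  // MAXIMUM_NUMBER_OF_FREQUENCIES
constexpr float kSimilarFreqHz = 100.0f;    // SIMILAR_FREQ_THRESHOLD
constexpr float kStdMult = 2.0f;            // ANALYSIS_STD_MULT_THRESHOLD

// Assumed mapping (not confirmed by the diff): bin i of an N-point FFT
// corresponds to the frequency i * sampleRate / N.
float binToFrequency(std::size_t index, float sampleRate, std::size_t fftSize)
{
    return static_cast<float>(index) * sampleRate / static_cast<float>(fftSize);
}

// Hypothetical sketch: returns up to kMaxFrequencies peak frequencies,
// strongest first, or an empty vector when the frame looks like silence.
std::vector<uint16_t> analyseFrame(const std::vector<float>& mag,
                                   float sampleRate,
                                   std::size_t fftSize,
                                   float stdThreshold)
{
    std::vector<uint16_t> result;
    if (mag.empty() || fftSize == 0) {
        return result;
    }

    // Mean and (population) standard deviation of the magnitudes.
    float mean = 0.0f;
    for (float m : mag) mean += m;
    mean /= static_cast<float>(mag.size());

    float variance = 0.0f;
    for (float m : mag) variance += (m - mean) * (m - mean);
    variance /= static_cast<float>(mag.size());
    const float sd = std::sqrt(variance);

    // A flat spectrum has no outliers: treat the frame as silence.
    if (sd < stdThreshold) {
        return result;
    }

    // Keep only bins that stand out from the rest of the spectrum.
    const float cutoff = mean + kStdMult * sd;
    std::vector<std::size_t> candidates;
    for (std::size_t i = 0; i < mag.size(); ++i) {
        if (mag[i] > cutoff) candidates.push_back(i);
    }
    std::stable_sort(candidates.begin(), candidates.end(),
                     [&mag](std::size_t a, std::size_t b) { return mag[a] > mag[b]; });

    // Walk from strongest to weakest, skipping bins close to a kept peak.
    for (std::size_t idx : candidates) {
        if (result.size() >= kMaxFrequencies) break;
        const float f = binToFrequency(idx, sampleRate, fftSize);
        bool nearKeptPeak = false;
        for (uint16_t kept : result) {
            if (std::fabs(f - static_cast<float>(kept)) < kSimilarFreqHz) {
                nearKeptPeak = true;
            }
        }
        if (!nearKeptPeak) {
            result.push_back(static_cast<uint16_t>(f + 0.5f));
        }
    }
    return result;
}
```

Sorting the candidates by magnitude before suppressing neighbours is one simple way to get the "most likely first" ordering that `AudioFrameAnalysis` documents; the comment in the header notes that the real code fuses some of these steps for speed.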
62 changes: 62 additions & 0 deletions inc/MicroBitMorseInterpreter.h
@@ -0,0 +1,62 @@
#ifndef MICROBIT_MORSE_INTERPRETER_H
#define MICROBIT_MORSE_INTERPRETER_H

#include "MicroBit.h"
#include "DataStream.h"
#include "MicroBitMorseRecogniser.h"
#include "MorseEncoder.h"

#define DEVICE_MORSE_INTERPRETER_EVT_NEW_MESSAGE 1

// It might be better for this to be in CodalComponent.
#define DEVICE_ID_MORSE_INTERPRETER 38

/*
* This class takes morse data from a MicroBitMorseRecogniser and uses a MorseEncoder to decode it.
* It then raises an event to signal that it has finished processing some data.
* The last processed data can then be read from lastMessage.
*/
class MicroBitMorseInterpreter: public DataSink {

private:

MicroBitMorseRecogniser& recogniser; // recogniser that this takes data from
MicroBit& uBit; // the microbit - used in order to send an event in the message bus
MorseEncoder encoder; // encoder used for decoding received data
bool interpreting; // whether the Interpreter is currently interpreting or not

public:

/*
* Last processed message
*/
ManagedString lastMessage;

/*
* Constructor.
*
* Initializes the interpreter.
*
* @param rec is the recogniser this will receive data from
*
* @param bit is the micro:bit
*/
MicroBitMorseInterpreter(MicroBitMorseRecogniser& rec, MicroBit& bit);

/*
* Callback for when the data is ready.
*/
virtual int pullRequest();

/*
* Starts interpreting and also starts the associated recogniser.
*/
void startInterpreting();

/*
* Stops interpreting and also stops the associated recogniser.
*/
void stopInterpreting();
};

#endif
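The interpreter hands recognised morse data to a `MorseEncoder` for decoding and stores the result in `lastMessage`. `MorseEncoder` is not included in this section, and the symbol format passed between the recogniser and the interpreter is not visible here, so the following is a standalone sketch of the decoding step only. It assumes a text representation in which letters are separated by spaces and words by `/`; the function `decodeMorse` is hypothetical and is not the PR's API.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical decoder (not the PR's MorseEncoder): translates
// space-separated International Morse letters into text.
// A lone "/" token is treated as a word gap; unknown tokens become '?'.
std::string decodeMorse(const std::string& code)
{
    static const std::map<std::string, char> table = {
        {".-", 'A'},   {"-...", 'B'}, {"-.-.", 'C'}, {"-..", 'D'},  {".", 'E'},
        {"..-.", 'F'}, {"--.", 'G'},  {"....", 'H'}, {"..", 'I'},   {".---", 'J'},
        {"-.-", 'K'},  {".-..", 'L'}, {"--", 'M'},   {"-.", 'N'},   {"---", 'O'},
        {".--.", 'P'}, {"--.-", 'Q'}, {".-.", 'R'},  {"...", 'S'},  {"-", 'T'},
        {"..-", 'U'},  {"...-", 'V'}, {".--", 'W'},  {"-..-", 'X'}, {"-.--", 'Y'},
        {"--..", 'Z'}};

    std::istringstream in(code);
    std::string token;
    std::string out;
    while (in >> token) {
        if (token == "/") {
            out += ' ';
            continue;
        }
        const auto it = table.find(token);
        out += (it != table.end()) ? it->second : '?';
    }
    return out;
}
```

For example, `decodeMorse("... --- ...")` yields `SOS`. In the PR the decoded text would end up in `lastMessage`, with `DEVICE_MORSE_INTERPRETER_EVT_NEW_MESSAGE` raised on the message bus to tell listeners it is ready.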