Deployment of KWS

Overview

Keyword Spotting Components

Arduino Example

- File/Examples/Harvard_TinyMLx/micro_speech

Initialization

Overview

General steps required to deploy any ML application

Steps

Declare variables
- micro_speech.ino
  - namespace
  - TFL-related variables
  - Tensor Arena Size
  - …
Load model
- micro_features_model.cpp
  - Byte array describing the model
- micro_speech.ino
  - setup()
  - Load model into a model variable
Resolve Operators
- micro_speech.ino
  - setup()
  - Use AddOPERATOR(), where OPERATOR is the name of the operator to be added (for example, AddFullyConnected()).
  - List of default Ops
Allocate/instantiate the Interpreter
- micro_speech.ino
  - setup()
Allocate memory for the TensorArena
- micro_speech.ino
  - setup()
Define model inputs
- micro_speech.ino
  - setup()
- model_input->dims->size
  - See kws-training-... notebook.
  - model_input has the shape FLATTENED_SPECTROGRAM_SHAPE.
  - The input has two dimensions. The first is a wrapper dimension of size 1, because the spectrogram is flattened.
  - The second dimension is the size of the flattened spectrogram, FEATURE_BIN_COUNT * OVERLAPPING_WINDOWS = 40 * 49 = 1960 (see the kws-training notebook):
    - FEATURE_BIN_COUNT = 40
    - OVERLAPPING_WINDOWS = window_counter(CLIP_DURATION_MS, int(WINDOW_SIZE_MS), WINDOW_STRIDE), which evaluates to 49
  - micro_features_micro_model_settings.h
    - constexpr int kFeatureSliceSize = 40;
    - constexpr int kFeatureSliceCount = 49;
    - FEATURE_BIN_COUNT in the notebook corresponds to kFeatureSliceSize
    - OVERLAPPING_WINDOWS in the notebook corresponds to kFeatureSliceCount
Setup Main loop
- micro_speech.ino
  - setup()
- Setting up the feature provider: static_feature_provider
  - Accesses the audio buffer and generates spectrograms.
- Setting up the command recognizer, which runs after inference: static_recognizer
  - Recognize commands based on inference results and facilitate device responses.
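The input-shape check described under "Define model inputs" can be sketched in plain C++ as follows. This is a minimal sketch, not the code from micro_speech.ino: the Dims struct is a hypothetical stand-in for the tensor's dims field so that the example compiles without the TensorFlow Lite headers, and InputShapeIsValid is an illustrative name. The two slice constants are the ones quoted above from micro_features_micro_model_settings.h.

```cpp
#include <cassert>

// Values quoted above from micro_features_micro_model_settings.h.
constexpr int kFeatureSliceSize = 40;   // FEATURE_BIN_COUNT in the notebook
constexpr int kFeatureSliceCount = 49;  // OVERLAPPING_WINDOWS in the notebook
constexpr int kFeatureElementCount = kFeatureSliceSize * kFeatureSliceCount;

// Hypothetical stand-in for a tensor's dims field (size = number of dimensions).
struct Dims {
  int size;
  int data[4];
};

// Returns true if the shape is [1, 40 * 49], i.e. a flattened spectrogram.
bool InputShapeIsValid(const Dims& dims) {
  assert(dims.size >= 0 && dims.size <= 4);
  return dims.size == 2 && dims.data[0] == 1 &&
         dims.data[1] == kFeatureElementCount;
}
```

In setup(), a check of this kind would typically report an error and return early if the model loaded from the byte array does not have the expected input shape.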

Pre-processing

Overview

micro_speech.ino
- void loop()

Audio provider

Continuously read the audio signal from the microphone on the Arduino and convert it into a digital representation, which is then converted into a spectrogram that the neural network consumes.
void loop() calls PopulateFeatureData (right-click the call and choose "Go to definition" to view its source)
- Goal: get the audio samples in spectrogram format.
- Previous time is the time at which PopulateFeatureData was last called.
- Current time is the time of the current call.
- Identify the (old) slices to drop.
- Identify the new slices that need to be computed.
GetAudioSamples
- InitAudioRecording/CaptureSamples
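The slice bookkeeping that PopulateFeatureData performs (previous time, current time, slices to drop, slices to compute) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the library source: the 20 ms stride between slices is assumed, and SlicePlan and PlanSlices are hypothetical names introduced for this example.

```cpp
#include <cassert>

constexpr int kFeatureSliceCount = 49;     // from the settings header quoted earlier
constexpr int kFeatureSliceStrideMs = 20;  // assumption: time step between slices

// How many existing slices can be kept and how many must be computed.
struct SlicePlan {
  int keep;
  int needed;
};

SlicePlan PlanSlices(int previous_time_ms, int current_time_ms, bool first_run) {
  assert(current_time_ms >= previous_time_ms);
  const int last_step = previous_time_ms / kFeatureSliceStrideMs;
  const int current_step = current_time_ms / kFeatureSliceStrideMs;
  int needed = current_step - last_step;
  // On the first run, or after a long gap, the whole spectrogram is recomputed.
  if (first_run || needed > kFeatureSliceCount) {
    needed = kFeatureSliceCount;
  }
  // The oldest `needed` slices are dropped and the rest shift to the front.
  return SlicePlan{kFeatureSliceCount - needed, needed};
}
```

For example, if 100 ms have passed since the previous call, five new slices are needed and the 44 most recent existing slices are kept.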

Feature Extractor

GenerateMicroFeatures
- Unlikely to need modification.
Performs an FFT on the audio sample data, then generates the MFCCs and extracts the spectrogram data.
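To make the frequency-transform step concrete, here is a minimal floating-point sketch that computes the power spectrum of one audio window with a naive O(N^2) DFT. It is only an illustration of the idea: the library's own FFT and filterbank code are not reproduced here, and PowerSpectrum, PeakBin, and MakeSine are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Power spectrum (bins 0..N/2) of one window, using a naive DFT.
std::vector<double> PowerSpectrum(const std::vector<double>& window) {
  const std::size_t n = window.size();
  assert(n > 0);
  const double kPi = 3.14159265358979323846;
  std::vector<double> power(n / 2 + 1, 0.0);
  for (std::size_t k = 0; k < power.size(); ++k) {
    double re = 0.0;
    double im = 0.0;
    for (std::size_t t = 0; t < n; ++t) {
      const double angle = 2.0 * kPi * static_cast<double>(k * t) / static_cast<double>(n);
      re += window[t] * std::cos(angle);
      im -= window[t] * std::sin(angle);
    }
    power[k] = re * re + im * im;
  }
  return power;
}

// Index of the strongest frequency bin.
std::size_t PeakBin(const std::vector<double>& power) {
  assert(!power.empty());
  std::size_t best = 0;
  for (std::size_t k = 1; k < power.size(); ++k) {
    if (power[k] > power[best]) best = k;
  }
  return best;
}

// Test signal: a sine wave with a whole number of cycles in the window.
std::vector<double> MakeSine(std::size_t n, std::size_t cycles) {
  const double kPi = 3.14159265358979323846;
  std::vector<double> samples(n);
  for (std::size_t t = 0; t < n; ++t) {
    samples[t] = std::sin(2.0 * kPi * static_cast<double>(cycles * t) / static_cast<double>(n));
  }
  return samples;
}
```

Each column of the spectrogram comes from one such window; stacking the per-window spectra over time gives the two-dimensional feature data that the model consumes.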

Inference

Recall: MNIST data

Spectrogram data

Inference process

Copy the collected feature data from the feature buffer into the model input buffer
- Manual loop (no fancy memcpy!)
Invoke interpreter
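The manual copy loop can be sketched as follows. This is a stand-alone illustration, assuming int8 features and a flattened input of 40 * 49 elements; CopyFeaturesToInput is a hypothetical helper name, and the call that invokes the interpreter is deliberately left out because it needs the TensorFlow Lite headers.

```cpp
#include <cassert>
#include <cstdint>

// 40 feature bins per slice times 49 slices, as described earlier.
constexpr int kFeatureElementCount = 40 * 49;

// Copies the spectrogram features into the model's input buffer, one element
// at a time, mirroring the manual loop in the sketch.
void CopyFeaturesToInput(const std::int8_t* feature_buffer,
                         std::int8_t* model_input_buffer, int count) {
  assert(feature_buffer != nullptr && model_input_buffer != nullptr);
  assert(count >= 0);
  for (int i = 0; i < count; ++i) {
    model_input_buffer[i] = feature_buffer[i];
  }
}
```

After the copy, the interpreter is invoked and the scores are read from the model's output tensor.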

Post-processing

Overview

Potential problems:

Is it Up?

Or is it Upward?

Process Latest Results

A solution to the false positive problem

micro_speech.ino
- void loop():recognizer->ProcessLatestResults
Calculate confidence score
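One way to picture how averaging reduces false positives is the sketch below, which averages the scores of the last few inferences and reports a label only when the averaged top score clears a threshold. It is a simplified illustration, not the recognizer shipped with micro_speech: ScoreSmoother, its parameters, and the threshold value used in the example are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

class ScoreSmoother {
 public:
  ScoreSmoother(std::size_t window, std::size_t min_count, double threshold)
      : window_(window), min_count_(min_count), threshold_(threshold) {
    assert(window_ > 0 && min_count_ > 0 && min_count_ <= window_);
  }

  // Adds one inference result. Returns the index of the detected label,
  // or -1 if no label is confident enough yet.
  int Update(const std::vector<double>& scores) {
    assert(!scores.empty());
    assert(history_.empty() || history_.front().size() == scores.size());
    history_.push_back(scores);
    if (history_.size() > window_) history_.pop_front();
    if (history_.size() < min_count_) return -1;  // too few results to trust

    // Average each label's score over the stored results.
    std::vector<double> average(scores.size(), 0.0);
    for (const std::vector<double>& past : history_) {
      for (std::size_t i = 0; i < past.size(); ++i) average[i] += past[i];
    }
    std::size_t best = 0;
    for (std::size_t i = 0; i < average.size(); ++i) {
      average[i] /= static_cast<double>(history_.size());
      if (average[i] > average[best]) best = i;
    }
    return average[best] >= threshold_ ? static_cast<int>(best) : -1;
  }

 private:
  std::size_t window_;
  std::size_t min_count_;
  double threshold_;
  std::deque<std::vector<double>> history_;
};
```

With a window of three results and a threshold of 0.8, a single high score for a keyword surrounded by silence is averaged away, while three consecutive high scores trigger a detection.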

Respond to Command

micro_speech.ino
- void loop():RespondToCommand
Configure Lights
Use TF_LITE_REPORT_ERROR to print out logs

Deployment

Preparation

Use https://netron.app to identify the Ops that the model uses, and modify the Ops added in micro_speech to match.
Use the Pretrained Notebook to convert the tflite model file into .cc format.
Open the .cc file and copy its contents to replace the byte array of the model in micro_speech.
Save micro_speech as a new Arduino project.
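After pasting a new byte array, a quick sanity check can catch a truncated or mis-copied model. The sketch below relies on the fact that a TensorFlow Lite flatbuffer carries the file identifier "TFL3" at byte offset 4. LooksLikeTfliteModel is a hypothetical helper name, and a passing check shows only that the header is plausible, not that the model is complete or valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if the array starts like a TensorFlow Lite flatbuffer:
// a 4-byte root offset followed by the file identifier "TFL3".
bool LooksLikeTfliteModel(const unsigned char* data, std::size_t length) {
  if (data == nullptr || length < 8) return false;
  return std::memcmp(data + 4, "TFL3", 4) == 0;
}
```

The check can be run on a desktop against the generated array, passing the array and its length, before uploading the sketch to the board.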