The story begins with The Things Network Conference. I signed up, got an Arduino Portenta board (camera shield included), and attended an inspiring Edge Impulse workshop on building an embedded, machine-learning-based elephant detector.
Finding elephants in my room is not a very useful thing. First: I never miss one when it shows up. Second: no elephant has ever shown up in my room. But there is a monkey. It eats one banana in the morning to get energy and one banana in the evening before it goes to bed*. Between eating its bananas, it sits, codes, drinks coffee, codes, does some exercise and codes even more. Let’s detect this monkey and make a fortune by selling its presence information to big data companies.
Checklist
Four years ago, I half-finished a “Machine Learning” online course (knowledge: checked). Once I ran an out-of-the-box cat detection model with the Caffe framework (experience: checked). I read the book** (book: checked). I have no clue what I will be doing, but hey, that is how code monkeys work.
Step one: get Arduino
I use an Arduino board, so I need the Arduino IDE. I follow the steps described here and end up with a working IDE. Now I probably need to install some extra stuff.
Step two: extra stuff
Following online tutorials, I add support for the Portenta board (Tools->Board->Boards Manager, type “mbed”, install the Arduino “mbed-enabled Boards” package). I struggle for some time with the camera example: the output image does not look right, and I am stuck trying to figure out why the image height should be 244 instead of 240. The camera code looks like a very beta thing. Maybe I am just using the wrong package version. I switch from 1.3.1 to 1.3.2 and the example works.
Next, I add the TensorFlow library: Tools->Manage Libraries, type “tensor”, and from the list install Arduino_TensorFlowLite.
Step three: run “Hello World”
The book starts with an elementary example that uses machine learning to control the PWM duty cycle of an LED. The example targets the Arduino Nano 33 BLE, but apparently it also works on the Portenta without any code modifications. I select it, upload it and sit there watching the LED change its brightness.
Step four: detect the monkey
After some time of watching the red LED, I am ready to do some actual coding. First, I switch to the person detection example (File->Examples->Arduino_TensorFlowLite->person_detection). Then I modify arduino_image_provider.cpp, the file containing the GetImage function that pulls an image out of the camera. I throw away all its content and replace it with a modified version of the Portenta CameraCaptureRawBytes example:
#include <mbed.h>
#include "image_provider.h"
#include "camera.h"

const uint32_t cImageWidth = 320;
const uint32_t cImageHeight = 240;
// Marker sent before every frame so the PC-side viewer can resynchronize.
uint8_t sync[] = {0xAA, 0xBB, 0xCC, 0xDD};
CameraClass cam;
uint8_t buffer[cImageWidth * cImageHeight];

// Get the camera module ready
void InitCamera(tflite::ErrorReporter* error_reporter) {
  TF_LITE_REPORT_ERROR(error_reporter, "Attempting to start Arducam");
  cam.begin();
}

// Get an image from the camera module
TfLiteStatus GetImage(tflite::ErrorReporter* error_reporter, int image_width,
                      int image_height, int channels, int8_t* image_data) {
  static bool g_is_camera_initialized = false;
  if (!g_is_camera_initialized) {
    InitCamera(error_reporter);
    g_is_camera_initialized = true;
  }
  cam.grab(buffer);
  Serial.write(sync, sizeof(sync));
  // Crop the central image_width x image_height window out of the full frame.
  auto xOffset = (cImageWidth - image_width) / 2;
  auto yOffset = (cImageHeight - image_height) / 2;
  for (int i = 0; i < image_height; i++) {
    for (int j = 0; j < image_width; j++) {
      image_data[(i * image_width) + j] =
          buffer[((i + yOffset) * cImageWidth) + (xOffset + j)];
    }
  }
  // Mirror the cropped frame to the serial port for the PC-side viewer.
  Serial.write(reinterpret_cast<uint8_t*>(image_data), image_width * image_height);
  return kTfLiteOk;
}
And a small change to the main program, so it shouts when the monkey is there:
...
// Process the inference results. The scores come out of a uint8 tensor,
// so read them as uint8_t (an int8_t would go negative above 127).
uint8_t person_score = output->data.uint8[kPersonIndex];
uint8_t no_person_score = output->data.uint8[kNotAPersonIndex];
if (person_score > 50 && no_person_score < 50) {
  TF_LITE_REPORT_ERROR(error_reporter, "MONKEY DETECTED!\n");
  TF_LITE_REPORT_ERROR(error_reporter, "Score %d %d.", person_score, no_person_score);
}
Besides passing the image data to the model, I also send it out via the serial port, so a simple OpenCV program can display the camera view (useful when testing):
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
#include <boost/asio.hpp>
#include <boost/asio/serial_port.hpp>

using namespace cv;

const uint8_t cHeight = 96;
const uint8_t cWidth = 96;
uint8_t cSyncWord[] = {0xAA, 0xBB, 0xCC, 0xDD};

int main(int, char**)
{
  boost::asio::io_service io;
  auto port = boost::asio::serial_port(io);
  port.open("/dev/ttyACM0");
  port.set_option(boost::asio::serial_port_base::baud_rate(115200));
  while (true) {
    uint8_t buffer[cWidth * cHeight];
    uint8_t syncByte = 0;
    uint8_t currentByte;
    // Scan the stream byte by byte until the full sync word is seen;
    // everything in between is debug output from the board.
    while (true) {
      boost::asio::read(port, boost::asio::buffer(&currentByte, 1));
      if (currentByte == cSyncWord[syncByte]) {
        syncByte++;
      } else {
        std::cerr << (char) currentByte;
        // Re-check the mismatched byte: it may already start a new sync word.
        syncByte = (currentByte == cSyncWord[0]) ? 1 : 0;
      }
      if (syncByte == 4) {
        std::cerr << std::endl;
        break;
      }
    }
    boost::asio::read(port, boost::asio::buffer(buffer, cHeight * cWidth));
    Mat frame(cHeight, cWidth, CV_8U, buffer);
    imshow("View", frame);
    if (waitKey(5) >= 0)
      break;
  }
  return 0;
}
I upload the sketch. Run the app. Point the camera at me. And… we’ve got him! Monkey detected. I think I deserve a banana.

* Bananas seem to possess some kind of fruit magic that makes them work according to the user’s (eater’s) needs. Please google “boost your energy with banana” and “get better sleep with banana” if you want more details.
** TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers by Pete Warden and Daniel Situnayake