# Tips on implementing TensorFlow Lite to run MoveNet models in C++

I built TensorFlow and TensorFlow Lite from source in order to write my own C++ implementation of a model that uses MoveNet to track up to 6 skeletons. This documentation includes some tips on how to implement and interpret the model.

There are several implementations of MoveNet in the wild built for Python, the web, Raspberry Pi, iOS, Android, etc., but finding a C++ implementation that shows how to open the model, copy an image into the input tensor, and reinterpret the results from the output tensor into skeleton nodes requires pulling together documentation from different sources. Below are the installation steps:

- Install TensorFlow from source (docs). Instead of following the last step for Python, build with `bazel build tensorflow_cc.dll` and `bazel build tensorflow_cc.lib`
- Install TensorFlow Lite with CMake (docs)
- Add the TensorFlow build products in the `bazel-bin` folder to the user PATH
- Change the TensorFlow project to use the C++20 standard
- Add all the libs and headers for TensorFlow and TensorFlow Lite to the VC project's properties for the VC++ header and linker paths

Once TensorFlow and TensorFlow Lite are installed, the next step is to set up the program to read a bitmap image file (exported from Photoshop at 24-bit depth with BGR uint8_t channels). The image file needs to be decoded into an RGB int32_t input tensor. The image height and width each need to be a multiple of 32 and no bigger than 256 on the largest side, so I resized the input image in Photoshop to a 256 x 256 bitmap.
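As a sketch of that decoding step (assuming a simple uncompressed 24-bit BMP with the standard 54-byte header; the helper name and the lack of error handling are mine, not part of any library):

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: load an uncompressed 24-bit BMP (BGR pixels, rows
// stored bottom-up and padded to 4 bytes) and return a flat, top-down RGB
// buffer widened to int32_t, ready to copy into the model's input tensor.
std::vector<int32_t> LoadBmpAsRgbInt32(const std::string& path,
                                       int& width, int& height) {
    std::ifstream file(path, std::ios::binary);
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(file)),
                               std::istreambuf_iterator<char>());

    // Pixel-data offset, width, and height sit at fixed header offsets.
    uint32_t dataOffset = 0;
    std::memcpy(&dataOffset, &bytes[10], 4);
    std::memcpy(&width, &bytes[18], 4);
    std::memcpy(&height, &bytes[22], 4);
    int rowSize = ((width * 3 + 3) / 4) * 4;  // each row is padded to 4 bytes

    std::vector<int32_t> rgb(static_cast<size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y) {
        // Rows are stored bottom-up; flip to top-down while copying.
        const uint8_t* row = &bytes[dataOffset + (height - 1 - y) * rowSize];
        for (int x = 0; x < width; ++x) {
            rgb[(y * width + x) * 3 + 0] = row[x * 3 + 2];  // R (BGR -> RGB)
            rgb[(y * width + x) * 3 + 1] = row[x * 3 + 1];  // G
            rgb[(y * width + x) * 3 + 2] = row[x * 3 + 0];  // B
        }
    }
    return rgb;
}
```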

After loading the model with the TensorFlow Lite interpreter (docs), we need to resize the input tensor before allocating the model's tensors.

```
// The input tensor shape is { batch, height, width, channels }.
interpreter->ResizeInputTensor(input, { 1, image_height, image_width, 3 });
```

Then, get a mutable pointer to the input tensor's typed data and copy the decoded bitmap RGB array into the typed input tensor.

```
// The model expects int32 input, so each decoded uint8_t channel value is
// widened as it is copied into the input tensor.
int32_t* typedInputTensor = interpreter->typed_input_tensor<int32_t>(input);
for (size_t ii = 0; ii < in.size(); ii++) {
    typedInputTensor[ii] = in[ii];
}
```

The output tensor has the shape [1, 6, 56], and most Python implementations reshape this into a [6, 17, 3] array using NumPy. Since that is not available in C++ without adding a library, my VS2019 solution first unravels the float data into a flat array and then copies those floats into a vector shaped in the [people / joints / coordinate + confidence] configuration.
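A minimal sketch of that unravel-and-reshape step in plain C++ (the struct and function names are mine; each person's 56 floats are 17 keypoints × [y, x, score] followed by 5 bounding-box/score values, which this sketch skips):

```cpp
#include <array>
#include <vector>

// One keypoint: y and x coordinates plus a confidence score.
struct Keypoint { float y, x, confidence; };

// Reshape the flat [1, 6, 56] output into [people][joints][y, x, confidence].
// Each person's 56 floats are 17 keypoints * 3 values, followed by 5 values
// (bounding box and overall score) that are ignored here.
std::vector<std::array<Keypoint, 17>> ReshapeOutput(const float* flat) {
    std::vector<std::array<Keypoint, 17>> people(6);
    for (int p = 0; p < 6; ++p) {
        const float* person = flat + p * 56;
        for (int j = 0; j < 17; ++j) {
            people[p][j] = { person[j * 3 + 0],    // y
                             person[j * 3 + 1],    // x
                             person[j * 3 + 2] };  // confidence
        }
    }
    return people;
}
```

The `flat` pointer would come from the interpreter's typed output tensor, e.g. `interpreter->typed_output_tensor<float>(0)`.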

Below is the sample output for the input image resized to a 256 x 256 square. Reshaped array size (number of people): 6. Person data size (number of joints): 17. The keypoints for person 0:

| joint | name | y coordinate | x coordinate | confidence |
|---|---|---|---|---|
| 0 | nose | 43.1959 | 104.527 | 0.71908 |
| 1 | left eye | 35.5635 | 111.077 | 0.663622 |
| 2 | right eye | 37.1297 | 102.99 | 0.734936 |
| 3 | left ear | 37.3665 | 129.487 | 0.913192 |
| 4 | right ear | 38.1035 | 110.545 | 0.733949 |
| 5 | left shoulder | 67.8285 | 148.238 | 0.760208 |
| 6 | right shoulder | 70.6572 | 113.235 | 0.782177 |
| 7 | left elbow | 103.372 | 116.976 | 0.771361 |
| 8 | right elbow | 99.093 | 75.52 | 0.599049 |
| 9 | left wrist | 68.1098 | 91.1617 | 0.6747 |
| 10 | right wrist | 69.5191 | 81.5573 | 0.488354 |
| 11 | left hip | 131.448 | 186.783 | 0.808038 |
| 12 | right hip | 132.244 | 149.793 | 0.74378 |
| 13 | left knee | 152.912 | 150.159 | 0.603105 |
| 14 | right knee | 144.769 | 105.615 | 0.863145 |
| 15 | left ankle | 224.04 | 170.388 | 0.88596 |
| 16 | right ankle | 201.919 | 121.049 | 0.406041 |

Finished running inference
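The report above pairs each joint index with a name; MoveNet's 17 keypoints follow the standard COCO keypoint ordering, so a fixed lookup table is enough. A sketch of such a report printer (the table and function are mine; the layout of one person's 51 floats as [y, x, confidence] per joint matches the reshape described earlier):

```cpp
#include <cstdio>

// MoveNet's 17 keypoints use the COCO keypoint ordering, so a joint index
// can be mapped to its name with a fixed table.
static const char* kJointNames[17] = {
    "nose", "left eye", "right eye", "left ear", "right ear",
    "left shoulder", "right shoulder", "left elbow", "right elbow",
    "left wrist", "right wrist", "left hip", "right hip",
    "left knee", "right knee", "left ankle", "right ankle"
};

// Print one labeled line per joint for a single person's keypoint data,
// where joints[j * 3 .. j * 3 + 2] holds { y, x, confidence } for joint j.
void PrintPerson(int person, const float* joints) {
    for (int j = 0; j < 17; ++j) {
        std::printf("person: %d joint: %d (%s) y: %g x: %g confidence: %g\n",
                    person, j, kJointNames[j],
                    joints[j * 3], joints[j * 3 + 1], joints[j * 3 + 2]);
    }
}
```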

©2023, Secret Atomics