At HD2i, one of our core missions is to engage in strategic partnerships with early-stage health technology companies. Our bet is that the technologies that will power next-generation healthcare are more likely to come from these folks than from established players. We aim to pair the best aspects of entrepreneurship with the expertise and rigor of an academic medical center to accelerate progress. Our first realization of that vision is the Health Entrepreneur Partners Program, in which we identify and build the algorithmic and statistical tools that help entrepreneurs use data to drive their products forward.
In this post I’m going to describe preliminary work we’ve completed with Fit3D, the first participant in the partners program.
Fit3D is a 3D body scanning company geared toward the fitness industry. It uses cameras to create a 3D model of your body. It is relatively inexpensive, completely safe, and fast: a complete scan takes less than a minute.
The Fit3D device.
From this 3D model, Fit3D derives a large number of measurements about the user, including circumferences, lengths, contours, widths, surface areas and volumes. They also provide users with rotating models of their bodies so they can check themselves out from multiple angles.
Building a better BFP model
One of the key metrics Fit3D reports to the user is body fat percentage (BFP). There are many different ways to estimate body fat, from bio-electrical impedance, which you might have experienced on a “smart scale”, to skinfold calipers. One of the gold standard techniques for measuring body composition is a DEXA scan, which is based on the absorptivity of X-rays at two different energies.
Fit3D has spent the last few years amassing a validation dataset of individuals who underwent both a DEXA and a Fit3D scan on the same day. They used an earlier version of this dataset to develop a proprietary algorithm to predict body fat percentage from Fit3D scans.
Their model was a few years old and they now have training data on many more users, so I decided to redo it from scratch to see if a different machine learning algorithm, or simply the presence of more training data, would improve model performance. We also wanted to do a formal evaluation of different scan features and how consistent each was from scan to scan.
The initial dataset had 435 features, consisting of circumferences (e.g. trunk, right bicep, left calf), lengths (e.g. elbow to wrist), and volumes (e.g. upper and lower torso volumes), as well as demographic data about the users. For the purposes of BFP, left and right side information is somewhat redundant, so we averaged those values, looking carefully for cases where the two sides disagreed. After some further data cleaning and outlier removal, we got the final feature set down to 214 predictors and 1041 training examples (i.e. scans).
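As a rough illustration of the left/right averaging step, here is a minimal sketch. It assumes paired features are named with `left_` and `right_` prefixes and flags a pair when the two sides differ by more than 10% of their mean; the naming scheme, the threshold, and the example values are all hypothetical, not Fit3D's actual schema or our exact cleaning rule.

```python
def merge_left_right(scan, tolerance=0.10):
    """Average paired left_/right_ measurements in a single scan.

    Returns (merged, flagged). `merged` maps feature names to values, with
    each left/right pair replaced by its mean. `flagged` lists the pairs
    whose sides differ by more than `tolerance` relative to their mean.
    The "left_"/"right_" prefixes are an assumed naming convention.
    """
    merged, flagged = {}, []
    for name, value in scan.items():
        if name.startswith("left_"):
            base = name[len("left_"):]
            other = scan.get("right_" + base)
            if other is None:
                merged[name] = value  # no right-side partner: keep as-is
                continue
            mean = (value + other) / 2
            merged[base] = mean
            if mean and abs(value - other) / abs(mean) > tolerance:
                flagged.append(base)  # sides disagree: inspect by hand
        elif name.startswith("right_"):
            # Paired right-side values were handled with their left partner.
            if "left_" + name[len("right_"):] not in scan:
                merged[name] = value
        else:
            merged[name] = value  # unpaired feature, e.g. height
    return merged, flagged


# Made-up measurements (cm) for illustration only.
scan = {
    "height": 175.0,
    "left_bicep": 32.0,
    "right_bicep": 33.0,
    "left_calf": 36.0,
    "right_calf": 42.0,
}
merged, flagged = merge_left_right(scan)
```

Flagging rather than silently averaging keeps the disagreeing pairs visible, so a bad scan can be checked before it reaches the model.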
Building a new model
Two initial observations: (1) many of the features were correlated (e.g. torso length is highly correlated with overall height), and (2) our ratio of subjects to predictors was only about 5:1. It was clear that we’d need to incorporate some kind of regularization or implicit/explicit feature selection.
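Both observations are easy to check numerically. The sketch below computes the examples-to-predictors ratio from the counts given above and scans a small, made-up feature table for highly correlated pairs using a hand-written Pearson correlation. The 0.9 threshold, the feature names, and the values are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Counts from the cleaned dataset described above.
n_examples, n_predictors = 1041, 214
ratio = n_examples / n_predictors  # a little under 5 examples per predictor

# Made-up columns (cm) for five hypothetical users.
features = {
    "height": [160.0, 165.0, 170.0, 175.0, 180.0],
    "torso_length": [48.0, 50.0, 51.0, 53.0, 54.0],
    "waist": [90.0, 70.0, 85.0, 72.0, 88.0],
}
pairs = correlated_pairs(features)
```

With this toy table, only the height/torso length pair crosses the threshold, mirroring the kind of redundancy that motivates regularization or feature selection.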
We experimented with a variety of models, including various forms of regularized or dimension-reducing regression (lasso, ridge regression, partial least squares, principal component regression, etc.) and a random forest. The winning model predicted BFP to within 5% over 87% of the time, compared to 78% for the old model. When the model does deviate from reality, it tends to happen on users with extremely low BFP, which is a group where DEXA also performs worse than usual.
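For readers who want to compute a similar accuracy figure on their own data, here is a minimal sketch of the metric. It assumes “within 5%” means within 5 percentage points of the DEXA value (an absolute difference in BFP), which is one plausible reading; the function name and the example numbers are hypothetical.

```python
def fraction_within(predicted, reference, tolerance=5.0):
    """Fraction of predictions within `tolerance` BFP points of the reference.

    Assumes both sequences hold body fat percentages on a 0-100 scale and
    that the tolerance is an absolute difference in percentage points.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("need at least one prediction")
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(predicted)


# Made-up values for five hypothetical users, not data from the study.
dexa_bfp = [12.0, 18.5, 25.0, 31.0, 8.0]
model_bfp = [14.0, 17.0, 31.5, 29.0, 15.0]
accuracy = fraction_within(model_bfp, dexa_bfp)
```

To compare models fairly, a metric like this should be computed on held-out scans (for example, via cross-validation) rather than on the scans used for fitting.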
Some examples of real users and their predicted BFPs are below. We can’t show you real vs. predicted BFP examples from the training set, unfortunately, because those users didn’t consent to have their images released online.
Examples of three different Fit3D users and their predicted BFP measurements from our model. The heights of the figures are proportional to the users’ real heights.
On to the partners program…
We had such a good experience working with Fit3D that we are engaging in a much more substantial project with them this year as part of the HD2i partners program. We all agree (Fit3D and MSSM scientists and clinicians) that there is a wealth of potentially useful clinical data available in these scans, such as information on fat distribution and posture.
Be on the lookout for more partners program blog posts in the next few months!
In the meantime, check out Fit3D, and email us if you have any general questions about our project with them or if you need help gauging whether the HD2i partners program is a good fit for your company.