Changepoints in Heart Rate Variability

In the course of a project on changepoint methods for a time series course, I found that there is not much literature documenting the analysis of heart rate and fitness tracking data. Presumably there are some neat analytics in a product like Strava Premium, but these methods are proprietary. Since the data are relatively straightforward (pace, heart rate, and altitude), it would be neat to explore whether there are some easy insights in the data.

For this project, I looked at heart rate variability (HRV), which is commonly known as a metric for recovery. It seems like it's also a good indicator of "activity types." See below for a quick demonstration of changepoint-in-variance methods on heart rate data I collected on the Mailbox Peak hike in North Bend, WA (8,690 m, 1,220 m elevation gain) using a Suunto Ambit2 HR watch and heart rate belt. Although heart rate data are discrete, I found that lag-1 differenced heart rate is approximately normal for a given activity (e.g., running), with minimal autocorrelation beyond lag 0. Thus the assumption of an i.i.d. Gaussian process appeared to be reasonable for these data.
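For anyone who wants to repeat the differencing check, here is a minimal Python sketch. It is not the code I used for the project: the function names (lag1_diff, autocorr) are made up for illustration, and the series is a synthetic rounded random walk standing in for real watch data.

```python
import numpy as np

def lag1_diff(hr):
    """Lag-1 differences of a heart rate series (beats per minute)."""
    return np.diff(np.asarray(hr, dtype=float))

def autocorr(x, max_lag=5):
    """Sample autocorrelation at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Assumption: synthetic stand-in for watch data, rounded to whole beats
# per minute so that it is discrete like the real recordings.
rng = np.random.default_rng(0)
hr = np.round(140 + np.cumsum(rng.normal(0.0, 1.5, size=2000)))

d = lag1_diff(hr)   # differenced series
acf = autocorr(d)   # acf[0] is 1 by construction; later lags should be small
```

If the i.i.d. assumption is reasonable, the values in acf beyond lag 0 should sit close to zero, and a histogram or normal Q-Q plot of d should look roughly Gaussian within a single activity.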

Applying the PELT method to detect changepoints in variance with a normal likelihood cost appeared to identify different activities and phases in the hike: an initial "try hard" phase where pace was erratic, a cardiovascular phase where heart rate variability was lower as pace settled, a rest phase (around the discontinuity in the data), and the descent phase. The binary segmentation method with a cumulative sum of squares cost identified a few additional changepoints, suggesting possible overfitting. Note that I omitted some outliers corresponding to short breaks (visible in the figure) to clean up the output.
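To make the PELT idea concrete, here is a rough sketch of a variance changepoint search with a Gaussian cost, written in plain NumPy. Everything specific in it is an assumption for illustration rather than the setup behind the results above: the function name pelt_variance, the BIC-style penalty of 2 log n, the minimum segment length of 10, and the synthetic two-regime series. The pruning step is lagged by the minimum segment length so that discarding a candidate stays valid.

```python
import numpy as np

def pelt_variance(x, penalty=None, min_len=10):
    """Changepoints in variance via a PELT-style search (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()                     # mean treated as fixed over the series
    if penalty is None:
        penalty = 2.0 * np.log(n)        # assumed BIC-style penalty
    css = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def cost(s, t):
        # Twice the negative Gaussian log-likelihood of x[s:t] at its MLE variance.
        m = t - s
        var = max((css[t] - css[s]) / m, 1e-8)   # floor guards against log(0)
        return m * (np.log(2.0 * np.pi * var) + 1.0)

    F = np.full(n + 1, np.inf)           # F[t]: best penalised cost of x[:t]
    F[0] = -penalty
    last = np.zeros(n + 1, dtype=int)    # start of the final segment ending at t
    candidates = []
    for t in range(min_len, n + 1):
        u = t - min_len                  # latest admissible segment start
        if u >= min_len:
            # Pruning: drop starts that already lose to a changepoint at u.
            candidates = [s for s in candidates
                          if u - s < min_len or F[s] + cost(s, u) <= F[u]]
        if np.isfinite(F[u]):
            candidates.append(u)
        vals = [F[s] + cost(s, t) + penalty for s in candidates]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = candidates[best]

    cps, t = [], n
    while last[t] > 0:                   # walk back through segment starts
        t = last[t]
        cps.append(t)
    return sorted(cps)

# Synthetic example: the standard deviation jumps from 1 to 4 at index 300.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, 300), rng.normal(0, 4, 300)])
cps = pelt_variance(series)
```

On real differenced heart rate data the penalty does most of the work: a larger value gives fewer, longer segments, so it is worth trying a few values before reading much into any single changepoint.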


Mapping Public Transit Accessibility

A few months ago, I moved almost exactly one mile up Capitol Hill, from an area nearly adjacent to downtown to a more residential and quiet area. My old apartment was kitty-corner to a dance club and Occupy Seattle and had experienced a break-in where the only thing stolen was my old bike. It was a crazy and fun place to live for nearly two years.

My new neighborhood is beautiful and peaceful, but the move has added considerably more travel time to any public transit travel outside of the downtown core of Seattle. Andrew Hardin at the University of Colorado recently created an interactive visualization that demonstrates exactly how the move affected my transit times.

Trips to the neighborhoods and cities of Fremont (north), Magnolia (west), West Seattle (southwest), Georgetown (south), Kirkland (east), and Issaquah (east) are now 50-minute long hauls; all of these places were previously within 40 minutes. (Ballard remains a long haul from Capitol Hill.)

Seattle is investing in express buses, streetcars, and rapid transit that should increase my transit range, but I still feel public transit in the city has a long way to go. These maps don't reflect the frequent bus delays that can make transfers difficult to time; avoiding such delays is one of the major advantages of rapid transit.

The King County transit system can boast, however, of being one of the few systems that allow for bus-to-hike. You can see some of these options off the 520 in Issaquah (southeast). In less than two hours, I can take a series of buses to Mt. Si and hike 3,500 feet to a snow-covered peak. Now that's a different kind of accessibility!

Interactive Storytelling

The New York Times has put together a superb archive of interactive infographics, visualizations, and photo/video journalism that they are calling 2013: The Year of Interactive Storytelling. This is, in fact, a misuse of the term: Wikipedia defines interactive storytelling as "a form of digital entertainment in which users create or influence a dramatic storyline through actions." But it is nonetheless a term that seems to encompass the experimental and innovative formats the Times has begun incorporating into their reporting.

My favorite piece was How Y’all, Youse and You Guys Talk, a combination of a survey and map depicting linguistic similarity to the user (within the US). I think it's fair to say this was the most talked about visualization of the year—and among my friends, probably the most discussed visualization ever. It popped up in social media, in my email inbox, and in conversations over beers.

Addendum: It turns out the piece was the most read article of the year, despite coming online on December 21st. Remarkable.

My map: born and raised in Berkeley, CA, college in Minnesota and Southern California, current resident of Seattle, WA. One of my biggest work clients is located in Michigan—how many milliseconds does it take them to realize I'm an out-of-towner?

I think this visualization succeeds because it reminds us, in a highly personal way, of the communities and cultures we come from, years after we have physically left them. My dad's map reflects the decade he spent growing up in Washington, D.C., despite the 40 years he's spent in California.

The results are memorable because they challenge some of our conventional notions of place divisions. In the West, the urban/rural divisions seen in voting patterns are not discernible (Minneapolis, Chicago, and Washington, D.C. are easily spotted, however). State lines seem to matter to some extent, but the trends bleed across the borders.

I do wish there was additional annotation and explanation. The visualization presents you with the words most definitive of the three most and three least similar cities. But I have no idea what pronunciations or vocabulary I share with South Carolina and Maine.

In a high school linguistics class, I remember being told the US has a remarkably low number of dialects given its size, which is of course a product of the country's short history. This visualization does not refute that, but does show a surprising amount of linguistic diversity in light of a dominant national media and high rates of mobility between states and regions.

In conclusion, it's a hella savage visualization.

Scientific Integrity

Scientific integrity ... corresponds to a kind of utter honesty--a kind of leaning over backwards. For example, if you're doing an experiment, you should report everything that you think might make it invalid--not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked--to make sure the other fellow can tell they have been eliminated.

Important words from Richard Feynman's 1974 Caltech Commencement Address.

Cited in Arthur Lupia's great plenary talk at Evaluation 2013.